COCAM: a cooperative video edge caching and multicasting approach based on multi-agent deep reinforcement learning in multi-clouds environment

Journal of Cloud Computing

Advances, Systems and Applications

Table 1 Summary of important notations

Notation	Definition
\(\beta\)	The hyperparameter of the entropy term
\(\gamma\)	The discount factor
\(N,\mathcal N\)	The number and set of edge clouds
G	The video set through the XC scheme
\(F,\mathcal F\)	The number and set of videos
\(x_{t,n}^f\)	Binary variable indicating whether the requested video f is transmitted from the remote cloud to edge cloud n at time t
\(y_{t,n}^f\)	Binary variable indicating whether the requested video f is stored in edge cloud n at time t
\(q_{t,n}^f\)	The request for video f received by edge cloud n at time t
C	The maximum capacity of an edge cloud
\({s}_{t,n}\)	The state of agent n at time t
\(\hat{s}_{t,n}\)	The joint observation state of agent n
\(\pi _{t,n}\)	The policy of agent n
\({a}_{t,n}\)	The action of agent n
\({\omega }_{n}\)	The parameter of the critic network for agent n
\(\theta _{n}\)	The parameter of the actor network for agent n
\(R_{t,n}\)	The expected value equation for edge cloud n
\(r_{n}\)	The global reward
B	The replay buffer memory
\(\zeta\)	The target network update parameter
V	The value function of the critic network
\(\tilde{A}_{t, n}\)	The advantage function
\(\mathcal N_n\)	The neighborhood set of agent n
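
To make the caching notation concrete, the following minimal sketch shows one way the per-edge-cloud quantities \(y_{t,n}^f\), \(q_{t,n}^f\) and C could be represented for a single edge cloud n at a single time t. It is an illustration under stated assumptions, not the formulation used in the paper: every video is assumed to occupy one unit of cache space, and the function names `within_capacity` and `cache_misses` are hypothetical.

```python
from typing import List, Sequence


def within_capacity(y_n: Sequence[int], capacity: int) -> bool:
    """Check that the number of cached videos does not exceed C.

    y_n[f] is 1 if video f is stored in edge cloud n, else 0.
    Assumption: each video takes one unit of cache space.
    """
    return sum(y_n) <= capacity


def cache_misses(q_n: Sequence[int], y_n: Sequence[int]) -> List[int]:
    """Return indices f that are requested (q > 0) but not cached (y == 0).

    These are the videos that cannot be served from the local cache.
    """
    return [f for f, (q, y) in enumerate(zip(q_n, y_n)) if q > 0 and y == 0]


# Example with F = 4 videos and capacity C = 2 (illustrative values only).
y = [1, 0, 1, 0]  # videos 0 and 2 are cached
q = [1, 1, 0, 0]  # videos 0 and 1 are requested
C = 2

print(within_capacity(y, C))  # True
print(cache_misses(q, y))     # [1]
```

How a miss is then served (from a neighboring edge cloud in \(\mathcal N_n\) or from the remote cloud, which would set \(x_{t,n}^f\)) depends on the cooperative policy and is deliberately left out of this sketch.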