Distributed reinforcement learning-based memory allocation for edge-PLCs in industrial IoT

Journal of Cloud Computing: Advances, Systems and Applications

Table 1 Symbolic variables used by the system model

Variable	Meaning
\(Tp_i\)	Data type i
\(l_t^i\)	Absolute memory capacity allocated to \(Tp_i\) at time t
\(x_i\)	Actual size of data of type \(Tp_i\)
\(n_t^i\)	Number of data units allocated to \(Tp_i\) at time t
\(m\)	Total number of data types in the system
\(P_t\)	Partition of memory at time t
\(S\)	Set of all states
\(A\)	Set of all actions
\(R_{t+1}\)	Reward corresponding to the state-action pair \((s_t,a_t)\)
\(Mem\)	Maximum memory capacity of the edge PLC
\(Ploss_t^i\)	Loss probability of \(Tp_i\) between t and \(t+1\)
\(Ar_t^i\)	Amount of \(Tp_i\) data that arrives between t and \(t+1\)
\(loss_t^i\)	Amount of \(Tp_i\) data that is lost between t and \(t+1\)
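
To make the notation in Table 1 concrete, the symbols can be sketched as a small Python data structure. This is an illustrative sketch only, not the model's definition: the class and function names (`DataType`, `partition`, `is_feasible`) are hypothetical, and the relations \(l_t^i = n_t^i \cdot x_i\), \(P_t = (l_t^1, \dots, l_t^m)\), \(\sum_i l_t^i \le Mem\) and \(Ploss_t^i = loss_t^i / Ar_t^i\) are assumptions that the table itself does not state.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DataType:
    """State of one data type Tp_i at time t (hypothetical container)."""
    unit_size: int          # x_i: size of Tp_i data
    units_allocated: int    # n_t^i: data units allocated at time t
    arrived: int = 0        # Ar_t^i: amount arriving between t and t+1
    lost: int = 0           # loss_t^i: amount lost between t and t+1

    @property
    def capacity(self) -> int:
        # ASSUMPTION: l_t^i = n_t^i * x_i (not stated in Table 1)
        return self.units_allocated * self.unit_size

    @property
    def loss_probability(self) -> float:
        # ASSUMPTION: Ploss_t^i = loss_t^i / Ar_t^i, taken as 0 when nothing arrives
        return self.lost / self.arrived if self.arrived else 0.0


def partition(types: List[DataType]) -> List[int]:
    # ASSUMPTION: P_t is the vector of per-type capacities (l_t^1, ..., l_t^m)
    return [t.capacity for t in types]


def is_feasible(types: List[DataType], mem: int) -> bool:
    # ASSUMPTION: a partition is valid when its total does not exceed Mem
    return sum(partition(types)) <= mem
```

For example, with m = 2 types, `partition([DataType(4, 10), DataType(8, 5)])` gives the two per-type capacities, and `is_feasible(..., mem)` checks them against the edge PLC's memory limit.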