Notation | Description
---|---
\(\mathcal {S}\) | the state space |
\(\mathcal {A}\) | the action space |
\(R(s_t,a_t)\) | the reward function |
\(P(s_{t+1}|s_t,a_t)\) | the transition dynamics, reflecting the time-variant behavior of the cluster (\(0 \le P(s_{t+1}|s_t,a_t) \le 1\))
\(s_t\) | the node and task state information during a scheduling interval |
\(v^w\) | a waiting task |
\(v^r\) | a running task |
\(a_t\) | an action, i.e., one possible combination of cluster scheduler configurations
\(V^{allocate}\) | the tasks that obtain resource allocations |
\(V^{complete}\) | the completed tasks |
\(V^{arrive}\) | the newly arrived tasks
JTL | the job tail latency
J | the set of jobs completed within period \((t-1, t]\) |
TTL | the tail latency of a task
\(V^{run}\) | the set of tasks running within period \((t-1, t]\) |
\(V^{wait}\) | the set of tasks waiting within period \((t-1, t]\) |
\(r^{job}\) | the reward of the job set \(J\)
\(r^{run}\) | the reward of the set \(V^{run}\) |
\(r^{wait}\) | the reward of the set \(V^{wait}\) |
\(r_t\) | the reward at time-step \(t\) (see the reward sketch after this table)
\(\alpha _1,\alpha _2,\alpha _3\) | negative coefficients in the reward function
\(\beta _1,\beta _2,\beta _3\) | positive coefficients in the reward function
B(Actor) | the maximum size of the Actor's local buffer
\(T_s(Actor)\) | the number of sampling steps in the Actor
N(Learner) | the number of experiences required to start training in the Learner
L(Learner) | the maximum size of the Learner's local buffer
\(T_s(Learner)\) | the maximum number of training steps in the Learner (see the configuration sketch after this table)
\(t^s\) | the simulation time |
\(\Delta t\) | the duration of one simulation iteration
\(|N|\) | the number of cluster nodes
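
To make the reward entries concrete, below is a minimal sketch of how a per-step reward \(r_t\) could be assembled from \(r^{job}\), \(r^{run}\), and \(r^{wait}\). The linear form, the percentile used for the tail latency, and all names and default coefficients (`tail_latency`, `step_reward`, `alpha`, `beta`) are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def tail_latency(latencies, q=0.99):
    """Tail latency as the q-th percentile of a latency sample.
    The 99th percentile is an assumed choice; the paper fixes its own."""
    if len(latencies) == 0:
        return 0.0
    return float(np.percentile(latencies, q * 100))

def step_reward(job_latencies, run_latencies, wait_times,
                alpha=(-1.0, -0.5, -0.5), beta=(1.0, 0.5, 0.5)):
    """Assumed per-step reward r_t = r_job + r_run + r_wait.

    Illustrative linear form (not taken from the paper):
      r_job  = alpha_1 * JTL(J)      + beta_1 * |J|
      r_run  = alpha_2 * TTL(V_run)  + beta_2 * |V_run|
      r_wait = alpha_3 * TTL(V_wait) + beta_3 * |V_wait|
    Negative alpha_i penalize high tail latency; positive beta_i
    reward completed and progressing work.
    """
    r_job = alpha[0] * tail_latency(job_latencies) + beta[0] * len(job_latencies)
    r_run = alpha[1] * tail_latency(run_latencies) + beta[1] * len(run_latencies)
    r_wait = alpha[2] * tail_latency(wait_times) + beta[2] * len(wait_times)
    return r_job + r_run + r_wait
```

A call such as `step_reward(job_latencies=[120.0, 300.0], run_latencies=[80.0], wait_times=[40.0, 60.0])` then yields the scalar reward handed to the DRL agent at time-step \(t\).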
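
The Actor/Learner entries parameterize a distributed actor-learner training loop: each Actor fills a local buffer of size B(Actor) while sampling for \(T_s(Actor)\) steps, and the Learner begins training once N(Learner) experiences have accumulated. As a reading aid only, the sketch below groups these hyperparameters into configuration records; every field name and default value is a placeholder assumption, not a setting reported in the paper.

```python
from dataclasses import dataclass

@dataclass
class ActorConfig:
    # B(Actor): maximum size of the Actor's local experience buffer
    buffer_size: int = 1_000
    # T_s(Actor): number of environment sampling steps per collection round
    sampling_steps: int = 50

@dataclass
class LearnerConfig:
    # N(Learner): experiences required before training starts
    min_experiences: int = 10_000
    # L(Learner): maximum size of the Learner's local buffer
    buffer_size: int = 100_000
    # T_s(Learner): maximum number of training steps
    max_train_steps: int = 1_000_000
```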