From: *A cloud-edge collaborative task scheduling method based on model segmentation* (Advances, Systems and Applications)
Notations | Description |
---|---|
\(num\) | the amount of data uploaded by users to the edge |
\(N\) | the number of layers in a neural network model |
\(T_{\delta }\) | task calculation delay on the cloud or edge |
\(\delta \in \{ e,c\}\) | index denoting the edge node (\(e\)) or the cloud node (\(c\)) |
\(T_{total}\) | transmission delay of intermediate data |
\(T_{c}\), \(T_{e}\) | cloud computing delay, edge computing delay |
\(F\) | FLOPs of the neural network model |
\(F_{c}\), \(F_{e}\) | cloud computing and edge computing capability |
\(F_{\delta }\) | FLOPS of the edge or cloud server |
\(F = \{ f_{1} ,f_{2} ,...,f_{n} \}\) | FLOPs of each layer of the neural network model |
\(C_{in}\) | the number of input characteristic matrices |
\(K_{w}\), \(K_{h}\) | width and height of convolution kernel |
\(C_{out}\) | the number of output characteristic matrices |
\(w\), \(h\) | width and height of output characteristic matrices |
\(FLOPs_{conv}\), \(FLOPs_{fc}\) | FLOPs of the convolution layer and the fully connected layer |
\(N_{In}\), \(N_{Out}\) | the number of input and output features |
\(N_{core}\), \(H_{c}\) | the number of cores and the clock frequency of the processor |
\(N_{float}\) | floating-point operations per cycle of the processor |
\(T_{e} = \{ t_{e,1} ,t_{e,2} ,...,t_{e,n} \}\) | computing delay of each layer at the edge node |
\(T_{c} = \{ t_{c,1} ,t_{c,2} ,...,t_{c,n} \}\) | computing delay of each layer at the cloud node |
\(V_{trans}\), \(V_{up}\), \(V_{down}\) | transmission rate, uplink rate, and downlink rate |
\(O\) | the size of data to be transmitted |
\(O = \{ o_{0} ,o_{1} ,...,o_{n - 1} \}\) | the data output of each layer |
\(T_{up} = \{ t_{up,0} ,t_{up,1} ,...,t_{up,n - 1} \}\) | uplink delay of the output data of each layer |
\(T_{down} = \{ t_{down,0} ,t_{down,1} ,...,t_{down,n - 1} \}\) | downlink delay of the output data of each layer |
\(T_{trans} = \{ t_{trans,0} ,t_{trans,1} ,...,t_{trans,n - 1} \}\) | transmission delay of the output data of each layer |
\(D_{size}\) | the size in bytes of the corresponding data type |
\((\dim_{1} ,\dim_{2} ,...,\dim_{m} )\) | the dimensions of the output tensor |
\(T_{total} = \{ t_{total,0} ,t_{total,1} ,...,t_{total,n - 1} \}\) | the transmission delay of the data of each layer |
\(T = \{ t_{0} ,t_{1} ,...,t_{N} \}\) | task completion time for each segmentation point |
\(M_{total}\) | the total memory of the system |
\(M_{wait}\) | system memory consumption when no task is executing |
\(M = \{ m_{0} ,m_{1} ,...,m_{n} \}\) | memory consumption of each layer when running task |
\(M_{cost} = \{ m_{cost,0} ,m_{cost,1} ,...,m_{cost,n} \}\) | memory cost of each layer |
\(R_{memory} = \{ r_{m,0} ,r_{m,1} ,...,r_{m,n} \}\) | memory occupancy of each layer |
\(split\) | the location of the segmentation point |
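The delay model these notations describe can be illustrated with a small sketch: per-layer computing delay is FLOPs divided by device capability (\(t_{\delta,i} = f_i / F_{\delta}\)), transmission delay is the byte size of the data crossing the split divided by the link rate, and the segmentation point is the candidate that minimizes the sum. All layer shapes, rates, and capability values below are made-up for illustration; the exhaustive search over split positions is only a sketch of how \(T\) is assembled, not the paper's scheduling algorithm.

```python
from math import prod

# Hypothetical per-layer description: (FLOPs f_i, output tensor dims).
# FLOPs use the common conventions f_conv = 2 * C_in * K_w * K_h * w * h * C_out
# and f_fc = 2 * N_in * N_out (multiply-add counted as two operations).
layers = [
    (2 * 3 * 3 * 3 * 32 * 32 * 16, (16, 32, 32)),    # conv: 3x3, 3 -> 16 channels
    (2 * 16 * 3 * 3 * 16 * 16 * 32, (32, 16, 16)),   # conv: 3x3, 16 -> 32 channels
    (2 * 32 * 16 * 16 * 10, (10,)),                  # fully connected: 8192 -> 10
]

D_SIZE = 4    # D_size: bytes per element (float32), assumed
F_E = 5e9     # F_e: edge capability in FLOPS, assumed
F_C = 50e9    # F_c: cloud capability in FLOPS, assumed
V_UP = 1e6    # V_up: uplink rate in bytes/s, assumed
IN_SIZE = D_SIZE * 3 * 32 * 32   # size of the raw input if everything runs in the cloud

def completion_time(split):
    """Total delay when layers [0, split) run at the edge and [split, n) in the cloud."""
    t_edge = sum(f for f, _ in layers[:split]) / F_E
    t_cloud = sum(f for f, _ in layers[split:]) / F_C
    # Data crossing the split: the raw input if split == 0, else o_{split-1},
    # whose byte size is D_size times the product of the output tensor dims.
    size = IN_SIZE if split == 0 else D_SIZE * prod(layers[split - 1][1])
    t_trans = size / V_UP if split < len(layers) else 0.0
    return t_edge + t_trans + t_cloud

# Evaluate every candidate segmentation point and keep the fastest.
best = min(range(len(layers) + 1), key=completion_time)
```

With this slow uplink the search keeps the whole model at the edge (`best == 3`); raising `V_UP` shifts the optimal split toward the cloud, which is the trade-off the notation table parameterizes.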