Collaborative on-demand dynamic deployment via deep reinforcement learning for IoV services in multi-edge clouds
Journal of Cloud Computing, volume 12, Article number: 119 (2023)
Abstract
In vehicular edge computing, vehicles invoke low-delay services from edge clouds while moving on the roads. Because the computing capacity and storage resources of edge clouds are limited, a single edge cloud cannot handle all services, so an efficient service deployment strategy across multiple edge clouds must be designed according to service demands. We notice that service demands are temporally dynamic, and that the interrelationship between services is a non-negligible factor for service deployment. To address the new challenges produced by these factors, we propose a collaborative on-demand dynamic service deployment approach based on deep reinforcement learning, named CODD-DQN. In our approach, the number of service requests on each edge cloud is forecasted by a time-aware service demand prediction algorithm, and the interacting services are discovered by analyzing service invocation logs. On this basis, service response time models are constructed to formulate the problem, aiming to minimize the service response time including the data transmission delay between services. Furthermore, a collaborative service dynamic deployment algorithm based on a DQN model is proposed to deploy the interacting services. Finally, experiments on a real-world dataset show that our approach achieves a lower service response time than other service deployment algorithms.
Introduction
The Internet of Vehicles (IoV) creates a bridge between vehicles and roadside units (RSUs) through wireless communication technologies [1]; it can be regarded as a typical IoT network and has been applied in urban transportation systems. The IoV system realizes data interaction between vehicles and RSUs and supports decision making for autonomous driving [2].
In intelligent transportation systems, vehicles are equipped with intelligent devices that collect the vehicles' moving status and traffic road condition data for analysis and computation [3]. Cloud computing based IoV systems can address the problems caused by the limited computing capacity of vehicles [4]. However, as the data collected by sensors increase, cloud computing may introduce high service delay and network congestion, making it difficult to satisfy the low-delay requirement of latency-sensitive services [5]. Besides the low-latency requirement, the mobility of vehicles is another important factor: it makes it difficult to provide all services to vehicles from a single cloud, which may result in serious performance degradation. To solve such problems, edge computing has emerged and attracted wide attention from researchers; it can not only deliver low-latency services to users efficiently, but also avoid single-cloud-provider lock-in and guarantee service performance [6, 7].
In edge computing, intelligent devices preprocess raw data and offload them to the edge clouds, which are closer to users and mainly undertake the data processing [8]. Edge computing thus enhances the computing capacity at the edge of the network [9, 10]. In reality, because the computation capability and storage resources of edge clouds are limited, executing IoT services on edge clouds requires a carefully designed service deployment strategy [11, 12]. Most studies concentrate on reducing the service response time and the energy consumption of intelligent devices [13,14,15]. As research deepened, some studies noticed the heterogeneity of service requests among multiple edge clouds, and efficient service deployment approaches with computation workload scheduling strategies have been proposed to address this imbalance [16,17,18]. However, most of these schemes assume that the service demands are known. In practice, service demands are generally unknown, which may result in unreasonable deployment strategies and large service delays during the deployment process. Thus, Hao et al. [19] presented a service deployment strategy with computation resource allocation under uncertain service demands in industrial cyber-physical systems.
As vehicular edge computing develops, service deployment meets new challenges due to the particularity of the IoV environment. First, with the dramatic increase in mobile vehicles, service demands are imbalanced and highly dynamic in time, which greatly influences the service delay [20]. Services should therefore be deployed according to service demands, and the temporal dynamics of those demands must be considered. Second, it has been demonstrated that a single atomic service cannot satisfy complex business requirements in IoV, so interacting services must collaborate to complete a business goal, and large amounts of data are transmitted between them [21]. The interrelationship between services is thus another non-negligible factor for service deployment.
To deal with the above challenges, we propose a collaborative on-demand dynamic service deployment approach, named CODD-DQN, to deploy interacting services on multiple edge clouds. In our approach, a time-aware service demand prediction algorithm forecasts the number of service requests for each edge cloud, and the interacting services are mined by a parallel algorithm. On this basis, the service response time models are formulated. Furthermore, we propose a collaborative service dynamic deployment algorithm via deep reinforcement learning to deploy the interacting services according to the forecasted number of service requests, minimizing the service response time including the data transmission delay between services. The contributions of this paper are threefold.

The number of service requests for each edge cloud is forecasted by a time-aware service demand prediction algorithm based on the ARIMA model, which captures the temporal dynamic characteristics of service demands.

Service response time models are formulated according to the interrelationship between interacting services, which are discovered by a parallel mining algorithm.

A collaborative on-demand dynamic service deployment algorithm via the DQN model is presented to deploy the interacting services according to the forecasted service demands, reducing the service response time including the data transmission delay between services.
The rest of this paper is organized as follows. We introduce the related work in the Related work section. The Framework of collaborative service dynamic deployment section presents the framework, and an ARIMA-based time-aware algorithm for forecasting the number of service requests is presented in the Time-aware service demands prediction section. The System model and problem formulation section constructs the service response time models and formulates the service deployment problem. The Algorithm for collaborative service dynamic deployment section then proposes a collaborative on-demand dynamic service deployment algorithm with DQN to deploy the interacting services according to service demands, solving the minimization problem of service response time with inter-service data transmission delay. Finally, we evaluate the efficiency of our algorithms in the Experimental evaluation section, and the Conclusion section concludes this paper.
Related work
In IoT environments, the volume of data produced by various intelligent devices keeps rising, which may lead to high latency and network congestion, so cloud computing cannot always provide low-latency services to users [9]. To address such problems, edge computing has been introduced and applied in many areas. In edge computing, intelligent devices offload preprocessed raw data to edge clouds near the users; the edge clouds are responsible for executing the services, while the cloud servers only execute data-intensive services and train deep neural networks [22].
Currently, many studies concentrate on task offloading, i.e., how to design efficient strategies for offloading tasks to edge clouds or a remote cloud server [23]. Existing task offloading strategies can be divided into 0/1 offloading and partial offloading [24, 25]. Considering the insufficient computing capacity of intelligent devices and the limited computation resources of edge clouds, partial offloading is the more reasonable manner; it can be formulated as a minimization problem of service request delay or device energy consumption [26, 27].
Given the global information, the computing capacity or storage resources of a single edge cloud are insufficient, so not all services can be executed on one edge cloud. An efficient service deployment strategy is therefore needed to place services on edge clouds or the remote cloud server. Several studies have proposed efficient deployment algorithms to reduce service response time or allocate computation resources in edge computing [28,29,30]. For example, a fog configuration scheme was presented to minimize energy consumption and request delay for industrial IoT [13, 31]. Wang et al. [14] proposed an edge server placement algorithm that optimizes multiple objectives and balances the workloads among edge clouds. Noticing that the imbalance of service demands is another non-negligible factor across multiple edge clouds, some researchers investigated service deployment jointly with resource scheduling. Ma et al. [17] introduced a cooperative scheme combining service placement and workload scheduling to minimize the service response time. Hao et al. [19] proposed an efficient service deployment strategy joint with resource allocation that considers uncertain service demands. In summary, service demand is another factor that must be taken into consideration for service deployment.
The Internet of Vehicles has been widely used in modern urban traffic systems, and edge computing based IoV has accordingly attracted wide research attention [32]. In vehicular edge computing, vehicles invoke low-delay services from nearby edge clouds. To our knowledge, service demands are uncertain and show temporal dynamic characteristics across the multiple edge clouds; a reasonable deployment strategy must take this uncertainty and these temporal dynamics into account [20]. It has also been demonstrated that a simple atomic service cannot satisfy complex business requirements, so interacting services must work collaboratively to complete a business goal, and the large amount of data transmitted between them is another non-negligible factor for service provisioning [21]. In our previous work [33], we studied the collaboration between interacting services for service offloading to minimize the service request delay and inter-service data transmission delay. Compared with existing studies, this paper studies the temporal dynamic characteristics of service demands and reveals the interrelationship between services, solving the minimization problem of service response time including inter-service data transmission delay.
Framework of collaborative service dynamic deployment
The architecture of the Internet of Vehicles is presented in Fig. 1. Typically, it is composed of three layers: the remote cloud layer, the edge network layer, and the vehicle user layer. The vehicle user layer contains numerous vehicles, which sense the road environment and collect data. Due to their limited computation capacity, the vehicles only preprocess the raw data and transmit them to the RSUs, which often act as edge clouds in IoV. Compared with vehicle devices, the edge clouds have richer communication, computation, and storage resources, so the RSUs are responsible for executing computation-intensive services. Deploying services on edge clouds helps meet strict latency requirements and deliver low-latency services to vehicle users. The cloud server, with higher computing capacity and more storage, provides global management and centralized decision control in the system. In this paper, we investigate the temporal dynamics of service demands and reveal the interrelationship between services; the interacting services are deployed on the multiple edge clouds according to the forecasted number of service requests. The cloud server is only responsible for training the deep reinforcement learning based service deployment model, after which the deployment strategy is sent to the edge to be performed, minimizing the service response time in the whole system.
As Fig. 2 shows, service invocation logs, which contain the service request sequences and the number of service requests on each edge cloud, are collected as the input of our approach. First, to capture the temporal dynamic characteristics of service demands, a time-aware service demand prediction algorithm based on the ARIMA model forecasts the number of service requests. Then, we employ a parallel algorithm to discover the interacting services [34, 35]. Finally, the interacting services are deployed by the DQN-based collaborative service dynamic deployment algorithm according to the forecasted number of service requests, optimizing the service response time including inter-service data transmission delay. The details are as follows.

Step 1.
Service invocation logs are extracted as the input of our approach, and an ARIMA-based algorithm forecasts the number of service requests for each edge cloud, capturing the temporal dynamic characteristics of service demands.

Step 2.
Service response time models are constructed according to the interrelationship between services, which are discovered by our previously proposed algorithm [34, 35].

Step 3.
A collaborative on-demand dynamic service deployment algorithm based on the DQN model is presented to deploy the interacting services, aiming to minimize the service response time including inter-service data transmission delay. The algorithm obtains the optimal deployment strategy by receiving the environment state and performing decision actions through iterative computation.
Timeaware service demands prediction
In vehicular edge computing, due to the mobility of vehicles, service demands are imbalanced and temporally dynamic. Exploiting these temporal characteristics, we put forward a time-aware algorithm based on the ARIMA model to forecast the number of service requests for each edge cloud, which is presented next.
In our system, the services \(K=\{1, 2, ..., k\}\) are deployed on the edge clouds \(E=\{1, 2, ..., i\}\). To investigate the temporal dynamics of service demands, the number of service requests for service k on edge cloud i is denoted as the time series \(\{c(i, k, t)\mid t=0,1,2,...,n\}\), which is forecasted by our algorithm. The ARIMA model integrates the autoregressive (AR) and moving average (MA) models to fit time series data [36]. If the original series is non-stationary, it is transformed into a stationary series through d-step differencing. The resulting series, denoted ARMA(p, q), can be modeled as follows.

\(x_{t} = \phi _{0} + \sum _{i=1}^{p}\phi _{i}x_{t-i} + a_{t} - \sum _{j=1}^{q}\theta _{j}a_{t-j}\)
where \(\phi _{0}\) is a constant term, \(\theta _{j}\) and \(\phi _{i}\) denote the parameters of the MA and AR models, respectively, \(a_{t}\) denotes the white noise, and p, q are non-negative integers denoting the orders of the AR and MA models, respectively.
The most important steps in ARIMA-based time series forecasting are constructing the ARIMA model and determining its order. In our algorithm, the precondition for constructing the ARIMA model is that the series is not white noise, which is checked by the Ljung-Box test. If the series passes this precondition, the ARIMA model is used for time series forecasting; otherwise, we employ a simple moving average to forecast the number of service requests, which can be formulated as
where \(\hat{c}(i, k, t+n)\) represents the n-th forecasted value of the number of service requests, and c(i, k, t) is the t-th observed value.
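As an illustration of this fallback, the moving-average forecast can be sketched as follows. This is a minimal sketch with our own naming; the window length is a caller-chosen parameter, and the multi-step variant feeds each forecast back into the history, which is why the forecast error accumulates with the horizon.

```python
def sma_forecast(history, window):
    """Forecast the next number of service requests as the mean of the
    last `window` observations (simple moving average fallback)."""
    recent = history[-window:]
    return sum(recent) / len(recent)

def sma_forecast_n(history, window, steps):
    """Multi-step forecast: each forecasted value is appended to the
    history before forecasting the next step."""
    h = list(history)
    out = []
    for _ in range(steps):
        f = sma_forecast(h, window)
        out.append(f)
        h.append(f)
    return out
```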
Based on the above discussion, the ARIMA-based service demand prediction algorithm proceeds in the following six steps.
Step 1: Stationarity Checking. After the white noise check is completed, the stationarity of the request-number series is determined by the unit root test. If the series is not stationary, it is differenced d times to obtain a stationary series.
Step 2: Model Identification. Model identification is the most important step in time series forecasting. In this step, the orders p and q of the ARMA model are determined with the help of the ACF (autocorrelation function) and PACF (partial autocorrelation function), which can be obtained by the following expressions:
where \(\rho _k\) represents the lag-k ACF, \(\gamma _k\) represents the lag-k autocovariance function, and the lag-k PACF is denoted by \(\phi _{kk}\).
Once the ACF and PACF are computed by the former equations, the order of the ARIMA model is selected accordingly. If the PACF cuts off at order p and the ACF tails off, an AR(p) model is selected. If the ACF cuts off at order q and the PACF tails off, an MA(q) model is used to fit the series. If both the ACF and PACF tail off, an ARMA(p, q) model is adopted to fit the series.
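The sample ACF used in this identification step can be computed directly from the definitions above; a minimal sketch with our own naming:

```python
def acf(x, max_lag):
    """Sample autocorrelation function: rho_k = gamma_k / gamma_0,
    where gamma_k is the lag-k sample autocovariance."""
    n = len(x)
    mean = sum(x) / n
    gamma0 = sum((v - mean) ** 2 for v in x) / n
    out = []
    for k in range(max_lag + 1):
        gk = sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k)) / n
        out.append(gk / gamma0)
    return out
```

A strongly trending series shows a slowly decaying ACF, while an alternating (negatively correlated) series shows a negative lag-1 value; both patterns guide the order selection described above.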
Step 3: Model Estimation. After the model order is selected, the parameters of the ARMA model are estimated. In this step, maximum likelihood estimation is adopted to determine the parameters via the following expression.
where l represents the likelihood function, and \(a_{t}\sim N(0, \sigma ^2)\) denotes the white noise.
Step 4: Model Checking. In this step, the significance of the model and its parameters is checked. If the significance test is passed, the model can be adopted to forecast the number of service requests.
Step 5: Model Selection. After model checking, the optimal model is selected from all candidate models that passed the significance test, according to the AIC (Akaike's Information Criterion): the model with the minimum AIC value is selected to forecast the future data.
Step 6: Forecasting the number of service requests. With the optimal model selected, the number of service requests is forecasted by the constructed model. In this algorithm, the \((n+1)\)-th value is calculated from the n-th forecasted value, so the forecast error accumulates as the number of steps increases. The details of the algorithm can be found in Algorithm 1. In our system, the prediction algorithm is deployed on each edge cloud, which forecasts the number of service requests for that edge cloud.
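Steps 2 to 5 can be illustrated with a simplified sketch that fits pure AR(p) models by least squares and selects the order by AIC. This is a simplification of ours for illustration, not the paper's estimator (which uses full ARIMA maximum likelihood); all names are assumptions.

```python
import numpy as np

def fit_ar(x, p):
    """Least-squares fit of AR(p): x_t = c + sum_i phi_i * x_{t-i}.
    Returns the coefficient vector [c, phi_1, ..., phi_p] and residuals."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    X = np.column_stack([np.ones(n - p)] +
                        [x[p - i:n - i] for i in range(1, p + 1)])
    y = x[p:]
    coef, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return coef, resid

def select_order_aic(x, max_p):
    """Pick the AR order minimizing AIC = n*ln(RSS/n) + 2*(p+1)."""
    best_p, best_aic = None, float("inf")
    for p in range(1, max_p + 1):
        _, resid = fit_ar(x, p)
        n = len(resid)
        rss = float(resid @ resid)
        aic = n * np.log(rss / n) + 2 * (p + 1)
        if aic < best_aic:
            best_p, best_aic = p, aic
    return best_p
```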
System model and problem formulation
In this section, the service response time models are presented and the service deployment problem of our approach is formulated.
System model
In reality, a complex service can be composed of a series of sub-services, each of which processes certain data and accomplishes one piece of a subtask. In that case, a precursor service must be executed first and transmit its processed data to the subsequent service, which then processes the transmitted data to accomplish its task. Data communication therefore exists between interacting services, and the interrelationship between services should be considered for service provisioning in edge computing.
In this paper, we construct the system model for service deployment over the time slots \(T=\{1, 2, ..., t\}\). In each time slot, the services are deployed on edge clouds and the computation resources are allocated. In our system, a finite set of services is deployed on multiple edge clouds with limited storage and computation resources, and users request services from the nearby edge clouds. We assume a series of services denoted as \(K=\{1, 2, ..., k\}\), which are deployed on the edge clouds \(S=\{1, 2, ..., s\}\). Let M(i) and D(i) denote the computing and storage capacity of edge cloud i, respectively. In contrast with previous works [19, 20], we study the temporal dynamics of service demands and consider the interrelationship between interacting services, so the interacting services are deployed collaboratively on the multiple edge clouds. The remote cloud is only responsible for training the deep reinforcement learning model that searches for the deployment strategy. In the following, we present the system model with service response time and formulate the service deployment problem. The important notations of this paper are shown in Table 1.
As mentioned above, we construct the system model for service deployment with computation resource allocation in multiple edge clouds. First, we define the binary service deployment variable \(b(k,i,t)\in \{0,1\}\): \(b(k,i,t)=1\) if service k is deployed on edge cloud i at time slot t, and \(b(k,i,t)=0\) otherwise. Because the storage capacity of an edge cloud is limited, the total data size of the deployed services cannot exceed it:

\(\sum ^{K}_{k=1}b(k,i,t)\,d(k)\le D(i)\)
where d(k) represents the data size of service k.
To improve the utilization of computation resources, a resource allocation scheme for service deployment is designed across the edge clouds. Let \(l(k,i,t)\in [0,1]\) denote the allocated proportion of computation resources; if the service is not deployed on the edge cloud at that time, \(l(k,i,t)=0\). The resource allocation function is defined as \(L(t)=\{l(k,i,t)\mid i\in S, k\in K\}\). Since the computing capacity of an edge cloud is limited, the total allocated proportion cannot exceed 1, which can be expressed as

\(\sum ^{K}_{k=1}l(k,i,t)\le 1\)
Once a service is deployed on an edge cloud, computation resources must be allocated to execute it: when a service is deployed, a fixed proportion of resources is allocated; otherwise, the allocation is 0. The relationship between l(k, i, t) and b(k, i, t) can be formulated as follows.

\(l(k,i,t)=g\cdot b(k,i,t)\)
where g denotes the proportion of computation resources allocated for executing the service.
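The storage constraint and the allocation coupling above can be checked per edge cloud with a small sketch; the function name and argument layout are our own assumptions.

```python
def feasible(b, l, d, D_i, g):
    """Check one edge cloud's deployment vector b[k] and allocation l[k]:
    - deployed services must fit in storage: sum_k b[k]*d[k] <= D_i
    - allocation is coupled to deployment: l[k] = g if b[k] == 1 else 0
    - the total allocated fraction cannot exceed 1: sum_k l[k] <= 1
    """
    if sum(bk * dk for bk, dk in zip(b, d)) > D_i:
        return False
    if any(lk != (g if bk else 0.0) for bk, lk in zip(b, l)):
        return False
    return sum(l) <= 1.0
```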
To analyze the service response time in this system, let c(k, i, t) denote the number of service requests, which is forecasted by our service demand prediction algorithm. If a service cannot be deployed on a given edge cloud, its requests are executed on another edge cloud through service scheduling. Since the backhaul delay of returning the results is much smaller than the service request delay and data transmission delay, it is ignored in this paper.
The edge clouds receive service requests whose data must be transmitted from the vehicles, so the data transmission delay between vehicles and edge clouds can be calculated by the following expression.
where C(k, t) denotes the total number of requests for service k on all edge clouds, computed as \(C(k,t)=\sum ^{S}_{i=1}c(k,i,t)\), and \(V_{v2e}\) denotes the network transmission rate between vehicles and edge clouds.
As mentioned above, some requests must be executed through service scheduling. The resulting data transmission delay between edge clouds can be computed as
where \(C(k,t)-c(k,i,t)\) is the number of service requests handled on other edge clouds, and \(V_{e2e}\) denotes the network transmission rate between edge clouds.
When the service is executed on the edge cloud, the computation delay can be calculated by the following expression.
where m(k) is the computation resource requirement of service k.
Compared with other studies, we also investigate the data transmission delay between services. Assume there exist interacting services, divided into a pre-service k and a successor service \(k^{*}\). In that case, the number of requests for service k handled on edge cloud i is denoted as \(c_{comp}(k,i,t)\), and the total number of requests for service k handled at time slot t is computed by \(C_{comp}=\sum ^{S}_{i=1}c_{comp}(k,i,t)\). Thus, the data transmission delay between services can be calculated by
where \(d(k,k^{*})\) denotes the size of the data transmitted between the interacting services. The computation delay for executing the successor service \(k^{*}\) can then be calculated by the following expression.
where \(m(k^*)\) is the computation resource requirement of successor service \(k^{*}\).
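The delay terms defined in this subsection can be sketched numerically as follows. This is an illustrative reading of the model rather than the paper's exact equations: we assume each request of service k carries d(k) units of input data, and that the c(k, i, t) local requests are served by the fraction l(k, i, t) of capacity M(i); all function and parameter names are ours.

```python
def delay_v2e(C_k, d_k, V_v2e):
    """Vehicle-to-edge transmission delay for all C_k requests of
    service k, each carrying d_k units of input data."""
    return C_k * d_k / V_v2e

def delay_e2e(C_k, c_ki, d_k, V_e2e):
    """Extra transmission delay for the C_k - c_ki requests scheduled
    to other edge clouds."""
    return (C_k - c_ki) * d_k / V_e2e

def delay_comp(c_ki, m_k, l_ki, M_i):
    """Computation delay: c_ki requests, each needing m_k cycles,
    served with the fraction l_ki of edge cloud i's capacity M_i."""
    return c_ki * m_k / (l_ki * M_i)

def delay_services(C_comp, d_kk, V_e2e):
    """Transmission delay of the intermediate data d_kk between a
    pre-service and its successor, over C_comp completed requests."""
    return C_comp * d_kk / V_e2e
```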
Problem formulation
With the system models constructed, the response time for handling the interacting services can be obtained as follows.
In addition, the service response time for handling a single atomic service can be obtained by the following expression.
In summary, the total delay for handling all services can be obtained as
Our purpose is to minimize the service response time for service deployment based on the predicted service demands. We therefore formulate the service deployment problem as
As formulated above, the service deployment problem is a mixed-integer nonlinear program, which is NP-hard. Deep reinforcement learning algorithms have natural advantages in solving this kind of problem [37], so a DQN-based algorithm is designed to address it, as described in the next section.
Algorithm for collaborative service dynamic deployment
In this section, the interacting services are deployed by a collaborative service dynamic deployment algorithm based on the DQN model. The details of this algorithm are as follows.
The DQN algorithm is a typical deep reinforcement learning algorithm derived from Q-learning [38]. As Fig. 3 shows, the DQN model contains two Q-networks with the same structure and the same initial parameters: the current value network and the target value network. The two networks are updated at different frequencies through an iterative computation process. During training, the model first obtains the initial state and an initial action selected by the \(\varepsilon\)-greedy policy, and the next state is obtained by calculating the reward. The transition \((s^{*}_t,a_t,R_t,s^{*}_{t+1})\) is then stored in the replay memory. As training proceeds, the parameters of the Q-network are updated and the action values are calculated to select the actions to perform. The details of the DQN model can be found in [38].
Based on the DQN model described above, we construct the state space and action space, and formulate the reward function for the MDP. These three elements are described below.
State space: In our vehicular edge computing system, the DQN model on cloud servers receives the state of edge clouds at each time slot. Thus, the state space can be expressed as
Action space: Assume there are K services to deploy on S edge clouds. As mentioned in the System model section, we defined the service deployment variable \(b(k,i,t)\in \{0,1\}\); therefore, the deployment action space has size \(2^{S\times K}\). Besides deployment, we also consider computation resource allocation during the deployment process. We define the minimum allocation unit as \(\Delta l(k,i,t)\), so the computation resource allocation follows the expression below.
Therefore, the action space of edge cloud i at time slot t can be formulated as
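Actions are selected from this space by the \(\varepsilon\)-greedy strategy (the experiments later use an initial \(\varepsilon\) of 0.9 with a 0.0005 decrement). A minimal sketch of the selection rule, with integer action indices abstracting the deployment/allocation combinations (an encoding of ours):

```python
import random

def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon pick a random action index, otherwise
    the action with the highest estimated Q-value."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])
```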
Reward: The goal is to search for the optimal deployment strategy that minimizes the service response time including inter-service data transmission delay. Let \(P=\sum ^{T}_{t=1}\sum ^{K}_{k=1}W^{sum}(k, t)\). The reward function can then be obtained as
After the action \(A_{t}\) is performed, the state \(s^{*}_{t+1}\) of the next time slot is obtained. We use \(\Delta w\) to denote the difference in response time between the two states, which can be calculated as
where a is a constant term.
As Equation 22 shows, our objective can be transformed into maximizing the reward function. The action value function \(Q(s^{*},a)\) can be calculated as
where \(\gamma \in [0,1]\) denotes the discount factor. The action \(a^{*}\) that yields the optimal service deployment can thus be expressed as the maximizer of the action value.
During this process, the loss function \(L(\theta _{t})\) can be obtained by
The gradient descent method is employed to update the parameter \(\theta\), which can be expressed as
where \(\eta\) denotes the learning rate, and the target network parameters are synchronized with \(\theta\) every \(\mathcal {C}\) steps.
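A minimal numerical sketch of this update scheme, substituting a linear Q-function for the paper's neural network (an assumption of ours for brevity): the current network is trained against a TD target computed by a frozen target network, which is hard-copied every \(\mathcal{C}\) steps.

```python
import numpy as np

class LinearDQN:
    """Sketch of the two-network update: Q(s, a) = W[a] @ s with a frozen
    target copy, squared TD-error loss, and a periodic hard update."""
    def __init__(self, n_actions, dim, lr=0.05, gamma=0.9, copy_every=10):
        self.W = np.zeros((n_actions, dim))
        self.W_target = self.W.copy()
        self.lr, self.gamma, self.copy_every = lr, gamma, copy_every
        self.steps = 0

    def update(self, s, a, r, s_next):
        y = r + self.gamma * np.max(self.W_target @ s_next)  # TD target
        q = self.W[a] @ s
        loss = (y - q) ** 2
        self.W[a] += self.lr * 2 * (y - q) * s  # gradient descent step
        self.steps += 1
        if self.steps % self.copy_every == 0:   # hard update every C steps
            self.W_target = self.W.copy()
        return loss
```

Repeated updates on the same transition drive the loss down as the current network tracks the slowly moving target, mirroring the iterative process described above.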
With the MDP formulated, the interacting services are deployed by the CODD-DQN algorithm through an iterative process. The details can be found in Algorithm 2.
Experimental evaluation
Next, we evaluate the efficacy of the proposed algorithms, including the service demand prediction algorithm and the CODD-DQN algorithm. First, the accuracy of the prediction algorithm is evaluated on a real-world dataset; then simulation experiments compare the efficiency of CODD-DQN with other baseline algorithms.
Experiment setting
A real-life ISP dataset from China is employed to evaluate the accuracy of service demand prediction; it contains more than 480,000 records of mobile users invoking about 16,000 base stations in three cities [39]. We randomly select 80 continuous hours of records from the dataset to obtain the service demands of these base stations. We evaluate the prediction accuracy with four metrics: root mean square error (RMSE), mean square error (MSE), mean absolute percentage error (MAPE), and mean absolute error (MAE). We vary the proportion of observation data from \(50\%\) to \(90\%\), forecast the remaining values, and compare with other common prediction algorithms: simple exponential smoothing (SES), moving average (MA), and autoregressive (AR).
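The four metrics can be computed directly from their standard definitions; a minimal sketch, where `y` are the observed values and `yhat` the forecasts (our own naming):

```python
import math

def mse(y, yhat):
    """Mean square error."""
    return sum((a - b) ** 2 for a, b in zip(y, yhat)) / len(y)

def rmse(y, yhat):
    """Root mean square error."""
    return math.sqrt(mse(y, yhat))

def mae(y, yhat):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y, yhat)) / len(y)

def mape(y, yhat):
    """Mean absolute percentage error, in percent."""
    return sum(abs((a - b) / a) for a, b in zip(y, yhat)) / len(y) * 100
```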
Besides the accuracy of our service demands prediction approach, we also evaluate the CODDDQN algorithm with simulation experiments and compare its average response time with the following algorithms.

Random: Deploying the services randomly under the constraints of the data sizes of the services and the storage capacities of the edge clouds.

Greedy: Deploying the services and allocating the computation resources according to the computation requirement for executing each service. Thus, services with high computation requirements are deployed on the edge clouds with priority.

Frequency: Deploying the services and allocating the computing resources according to the frequency of service requests.

Q-Learning: Q-Learning based service deployment algorithm [40].

DQN w.o. collaboration: DQN-based service deployment algorithm that does not consider the interrelationship between interacting services.
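As an illustration, the Greedy baseline described above can be sketched as follows. This is our own minimal formulation under the stated storage constraint; the data structures and first-fit tie-breaking are hypothetical.

```python
def greedy_deploy(services, clouds):
    """Place services in descending order of computation requirement
    onto the first edge cloud with enough remaining storage.

    services: list of (service_id, data_size_gb, gigacycles)
    clouds:   dict cloud_id -> remaining storage (GB), mutated in place
    Returns a mapping service_id -> cloud_id (unplaced services omitted).
    """
    placement = {}
    # Services with higher computation requirements get priority
    for sid, size, cycles in sorted(services, key=lambda s: -s[2]):
        for cid, free in clouds.items():
            if free >= size:
                placement[sid] = cid
                clouds[cid] = free - size
                break
    return placement
```

A service that fits nowhere is simply left unplaced, mirroring the storage-capacity constraint in the comparison setup.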
In this paper, we set both the network transmission rate between the edge clouds \(V_{e2e}\) and the network transmission rate between the vehicles and the edge clouds \(V_{v2e}\) to 100 Mbps. The data sizes of the services are random values from 2 GB to 8 GB, and the computation requirements for executing the services are random values from 1 gigacycle to 5 gigacycles. To reflect the heterogeneity of the edge clouds, the storage capacities of the edge clouds are set as random values from 10 GB to 30 GB, and the computing capacities of the edge clouds are random values from 5 GHz to 10 GHz. In the DQN algorithm, we set the size of the experience pool to 3000 and construct a neural network with a single hidden layer of 128 nodes. Our algorithm uses the \(\varepsilon\)-greedy strategy, where the initial value of \(\varepsilon\) is 0.9 and decreases with a 0.0005 decrement. After several tests, we set the batch size to 64. All of the simulation parameters can be found in Table 2.
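The heterogeneous simulation setting above can be generated as in this sketch. The parameter ranges follow Table 2; the uniform sampling distribution is our assumption where the text only says "random", and the function name is ours.

```python
import random

def sample_simulation_setting(n_services, n_clouds, seed=None):
    """Draw one simulation instance with the paper's parameter ranges."""
    rng = random.Random(seed)
    services = [
        {
            "data_size_gb": rng.uniform(2, 8),   # service data size: 2-8 GB
            "gigacycles": rng.uniform(1, 5),     # compute demand: 1-5 gigacycles
        }
        for _ in range(n_services)
    ]
    clouds = [
        {
            "storage_gb": rng.uniform(10, 30),   # edge storage: 10-30 GB
            "cpu_ghz": rng.uniform(5, 10),       # edge compute: 5-10 GHz
        }
        for _ in range(n_clouds)
    ]
    network = {"v_e2e_mbps": 100, "v_v2e_mbps": 100}  # fixed link rates
    return services, clouds, network
```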
Results analysis
First, we evaluate the accuracy of the service demands prediction using the real-life dataset, varying the proportion of observation data from \(50\%\) to \(90\%\) to forecast the future number of service requests. Our algorithm is compared with the baseline algorithms. From Fig. 4, we find that the accuracy of our algorithm is higher than that of the baselines. As Fig. 4a shows, as the training set increases from \(50\%\) to \(90\%\), the MSE decreases from 4489 to 100. When the training set is \(90\%\), the MSE remains at 100, which indicates that our service demand prediction algorithm obtains high accuracy and therefore leaves ample time to cache the services beforehand. Besides the MSE, we also conduct the experiments with the other metrics. In Fig. 4b, the RMSE decreases rapidly from 67 to 12 as the proportion increases from \(50\%\) to \(70\%\), and remains at 10 when the proportion is \(90\%\). Figure 4c and d show that the prediction accuracy increases with the proportion. From Fig. 4c, as the proportion increases from \(50\%\) to \(70\%\), the MAE of our algorithm decreases rapidly, reaching 11.1 at \(70\%\); as the proportion increases from \(70\%\) to \(90\%\), the MAE decreases more slowly, reaching 8.83 at \(90\%\). As Fig. 4d shows, as the proportion increases from \(50\%\) to \(70\%\), the MAPE of our algorithm decreases from \(19.8\%\) to \(3.64\%\), and reaches \(3.27\%\) at \(90\%\).
With the accuracy of the service demands prediction evaluated, we next evaluate the efficiency of the service dynamic deployment algorithm with simulation experiments. In the DQN model, we set the initial value of the greedy strategy parameter \(\varepsilon\) to 0.9 with a decrement of 0.0005. First, the hyperparameters of our algorithm are determined through the training process. As Fig. 5 shows, the algorithm obtains the best performance when the discount factor \(\gamma\) is 0.9, with the average response time reaching about 0.65 s at episode 400. So the optimal discount factor is set to 0.9.
Furthermore, we determine the learning rate through several experiments. Figure 6 shows the convergence performance of the algorithm with different learning rates \(\eta\). From this figure, we notice that CODDDQN performs best when \(\eta = 0.0001\), while the algorithm does not converge when \(\eta = 0.001\) or \(\eta = 0.0005\). Therefore, we set the learning rate to 0.0001.
With the hyperparameters determined, we compare the performance of our algorithm with the other algorithms. Figure 7 shows the average response time of the different algorithms. We can see that our CODDDQN algorithm achieves the lowest average response time. As Fig. 7 shows, as the number of episodes increases, the Q-learning algorithm does not converge, while our CODDDQN algorithm obtains an average response time of about 0.65 s at episode 400. Compared with the DQN w.o. collaboration algorithm, our algorithm achieves a lower average response time and converges at 400 episodes, while the DQN w.o. collaboration algorithm converges at about 600 episodes. This is because the DQN w.o. collaboration algorithm deploys the services without considering the relationships between interacting services, which increases the data communication delay between them.
We also conduct experiments under different system simulation parameters. Since the Q-learning algorithm cannot converge, we only compare the average response time of our algorithm with the other baseline algorithms. First, we evaluate the service response time with different values of storage capacity. Figure 8 shows the convergence performance and service response time comparison under different storage capacities. The performance of the CODDDQN algorithm and the DQN w.o. collaboration algorithm can be found in Fig. 8a. We notice that the smaller the storage capacity of the edge clouds, the higher the response time. The CODDDQN algorithm achieves a lower response time than the DQN w.o. collaboration algorithm, converging at about 0.7 s when the storage capacity is 20 GB. Figure 8b shows the service response time comparison between the CODDDQN algorithm and the other baseline algorithms under different storage capacities of the edge clouds. From the figure, we can see that the CODDDQN algorithm obtains the lowest response time among all algorithms. As the storage capacity increases from 10 GB to 30 GB, the response time decreases accordingly, and the response time of our CODDDQN algorithm reaches about 0.67 s when the storage capacity is 30 GB.
Figure 9 shows the convergence performance and service response time comparison of the algorithms under different numbers of services. From Fig. 9a, we see that the service response time of the two DRL-based algorithms when the number of services is 10 is higher than when the number of services is 8. Thus, the more services, the higher the response time in our system. We also find that the CODDDQN algorithm obtains a lower response time than the DQN w.o. collaboration algorithm, converging at about 0.59 s when the number of services is 8. Figure 9b shows the service response time comparison between the CODDDQN algorithm and the other baseline algorithms under different numbers of services. As the number of services increases from 4 to 12, the response time of the CODDDQN algorithm increases from 0.31 s to 1.28 s, remaining the lowest among all algorithms.
Besides these experiments, we also vary the computing capacities of the edge clouds and compare the performance of the different algorithms. Figure 10 shows the convergence performance and service response time comparison of the algorithms under different computing capacities of the edge clouds. From Fig. 10a, we see that the service response time of the two DRL-based algorithms when the computing capacity of the edge clouds is 6 GHz is higher than when the computing capacity is 8 GHz. Thus, the higher the computing capacity of the edge clouds, the lower the response time in our system. We also find that the CODDDQN algorithm obtains a lower response time than the DQN w.o. collaboration algorithm. Figure 10b shows the service response time comparison between the CODDDQN algorithm and the other baseline algorithms under different computing capacities. As the computing capacity increases from 6 GHz to 10 GHz, the response time of the CODDDQN algorithm decreases from 0.85 s to 0.74 s, remaining the lowest among all algorithms.
To examine the performance of the algorithms under different numbers of edge clouds, we also vary the number of edge clouds and compare the performance of the different algorithms. Figure 11 shows the convergence performance and service response time comparison of the algorithms under different numbers of edge clouds. From Fig. 11a, we see that the service response time of the two DRL-based algorithms with 3 edge clouds is higher than with 5 edge clouds. Thus, the more edge clouds, the lower the response time in our system. We also find that the CODDDQN algorithm obtains a lower response time than the DQN w.o. collaboration algorithm. Figure 11b shows the service response time comparison between the CODDDQN algorithm and the other baseline algorithms under different numbers of edge clouds. As the number of edge clouds increases from 3 to 7, the response time of the CODDDQN algorithm decreases from 0.81 s to 0.68 s, remaining the lowest among all algorithms.
Conclusion
In this paper, a collaborative service on-demand dynamic deployment approach via the DQN model, named CODDDQN, is proposed for vehicular edge computing. To capture the temporal dynamics of service requests, a time-aware service demands prediction algorithm based on the ARIMA model is proposed to forecast the number of service requests for each edge cloud, and the interacting services are discovered through analysis of the service invoking logs. Furthermore, the service response time models are constructed to formulate service deployment as an optimization problem, and a collaborative service deployment algorithm based on the DQN model is presented to deploy the interacting services, solving the minimization of service response time including the data transmission delay between services. Finally, real-life dataset based experiments are conducted to evaluate the efficiency of the algorithms. The results show that the proposed CODDDQN algorithm achieves a lower service response time than the other algorithms when deploying interacting services.
Note that our purpose is to design an efficient approach for dynamic service deployment by forecasting the number of service requests. To improve the utilization of computation resources, we also design a preliminary resource allocation function during service deployment. Resource allocation is itself a complex problem that requires further study, and a detailed resource allocation scheme should be designed. In the future, we plan to design such a strategy to improve resource utilization. Besides this, the efficacy of our algorithms is only evaluated by simulation experiments in laboratory environments due to hardware limitations. We will construct a real vehicular edge computing environment to evaluate and improve the performance of the algorithms.
Availability of data and materials
The datasets used during the current study are available from the corresponding author on reasonable request.
References
Contreras-Castillo J, Zeadally S, Ibáñez JAG (2018) Internet of Vehicles: Architecture, Protocols, and Security. IEEE Internet Things J. 5(5):3701–3709
Wang X, Ning Z, Hu X, Wang L, Hu B, Cheng J et al (2019) Optimizing Content Dissemination for Real-Time Traffic Management in Large-Scale Internet of Vehicle Systems. IEEE Trans Veh Technol. 68(2):1093–1105
Singh D, Singh M (2015) Internet of vehicles for smart and safe driving. International Conference on Connected Vehicles and Expo, ICCVE 2015, October 19-23, 2015. IEEE, Shenzhen, pp 328–329
Hussain R, Kim D, Son J, Lee J, Kerrache CA, Benslimane A et al (2018) Secure and Privacy-Aware Incentives-Based Witness Service in Social Internet of Vehicles Clouds. IEEE Internet Things J. 5(4):2441–2448
Zhang M, Wang S, Gao Q (2020) A joint optimization scheme of content caching and resource allocation for internet of vehicles in mobile edge computing. J Cloud Comput. 9:33
Wu L, Zhang R, Li Q, Ma C, Shi X (2022) A mobile edge computing-based applications execution framework for Internet of Vehicles. Frontiers Comput Sci. 16(5):165506
Zhang J, Letaief KB (2020) Mobile Edge Intelligence and Computing for the Internet of Vehicles. Proc IEEE. 108(2):246–261
Chen Y, Zhao J, Zhou X et al (2023) A Distributed Game Theoretical Approach for Credibility-guaranteed Multimedia Data Offloading in MEC. Inf Sci. 644:119306. https://doi.org/10.1016/j.ins.2023.119306
Zhang Y (2022) Mobile Edge Computing, vol 9. Springer, Cham
Ning Z, Huang J, Wang X, Rodrigues JJPC, Guo L (2019) Mobile Edge Computing-Enabled Internet of Vehicles: Toward Energy-Efficient Scheduling. IEEE Netw. 33(5):198–205
Wang S, Urgaonkar R, He T, Chan K, Zafer M, Leung KK (2017) Dynamic Service Placement for Mobile Micro-Clouds with Predicted Future Costs. IEEE Trans Parallel Distrib Syst. 28(4):1002–1016
Hao Y, Chen M, Cao D, Zhao W, Petrov I, Antonenko VA et al (2020) Cognitive-Caching: Cognitive Wireless Mobile Caching by Learning Fine-Grained Caching-Aware Indicators. IEEE Wirel Commun. 27(1):100–106
Chen L, Zhou P, Gao L, Xu J (2018) Adaptive Fog Configuration for the Industrial Internet of Things. IEEE Trans Ind Inform. 14(10):4656–4664
Wang L, Jiao L, He T, Li J, Mühlhäuser M (2018) Service Entity Placement for Social Virtual Reality Applications in Edge Computing. 2018 IEEE Conference on Computer Communications, INFOCOM 2018, April 16-19, 2018. IEEE, Honolulu, pp 468–476
Aït-Salaht F, Desprez F, Lebre A (2021) An Overview of Service Placement Problem in Fog and Edge Computing. ACM Comput Surv 53(3):65:1–65:35
Poularakis K, Llorca J, Tulino AM, Taylor IJ, Tassiulas L (2019) Joint Service Placement and Request Routing in Multi-cell Mobile Edge Computing Networks. 2019 IEEE Conference on Computer Communications, INFOCOM 2019, April 29 - May 2, 2019. IEEE, Paris, pp 10–18
Ma X, Zhou A, Zhang S, Wang S (2020) Cooperative Service Caching and Workload Scheduling in Mobile Edge Computing. 39th IEEE Conference on Computer Communications, INFOCOM 2020, July 6-9, 2020. IEEE, Toronto, pp 2076–2085
Chen Y, Zhao J, Hu J et al (2023) Distributed Task Offloading and Resource Purchasing in NOMA-enabled Mobile Edge Computing: Hierarchical Game Theoretical Approaches. ACM Trans Embed Comput Syst. early access. https://doi.org/10.1145/3597023
Hao Y, Chen M, Gharavi H, Zhang Y, Hwang K (2021) Deep Reinforcement Learning for Edge Service Placement in Softwarized Industrial Cyber-Physical System. IEEE Trans Ind Informatics. 17(8):5552–5561
Wang R, Kan Z, Cui Y, Wu D, Zhen Y (2021) Cooperative Caching Strategy With Content Request Prediction in Internet of Vehicles. IEEE Internet Things J. 8(11):8964–8975
Hui Y, Ma X, Su Z, Cheng N, Yin Z, Luan TH et al (2022) Collaboration as a Service: Digital-Twin-Enabled Collaborative and Distributed Autonomous Driving. IEEE Internet Things J. 9(19):18607–18619
Chen H, Qin W, Wang L (2022) Task partitioning and offloading in IoT cloud-edge collaborative computing framework: a survey. J Cloud Comput. 11:86
Huang J, Gao H, Wan S et al (2023) AoI-aware energy control and computation offloading for industrial IoT. Futur Gener Comput Syst. 139:29–37
Chen Y, Zhao J, Wu Y et al (2022) QoE-aware Decentralized Task Offloading and Resource Allocation for End-Edge-Cloud Systems: A Game-Theoretical Approach. IEEE Trans Mob Comput. early access. 1–17. https://doi.org/10.1109/TMC.2022.3223119
Chen Y, Hu J, Zhao J, Min G (2023) QoS-Aware Computation Offloading in LEO Satellite Edge Computing for IoT: A Game-Theoretical Approach. Chin J Electron. early access. https://doi.org/10.23919/cje.2022.00.412
LiWang M, Gao Z, Hosseinalipour S, Dai H (2020) Multi-Task Offloading over Vehicular Clouds under Graph-based Representation. 2020 IEEE International Conference on Communications, ICC 2020, June 7-11, 2020. IEEE, Dublin, pp 1–7
Chen Y, Gu W, Xu J et al (2022) Dynamic Task Offloading for Digital Twin-empowered Mobile Edge Computing via Deep Reinforcement Learning. Chin Commun. early access. 1–12. https://doi.org/10.23919/JCC.ea.20220372.202302
Hegyi P (2022) Service deployment design in latency-critical multi-cloud environment. Comput Netw. 213:108975
Lima D, Miranda H (2022) A geographical-aware state deployment service for Fog Computing. Comput Netw. 216:109208
Huang J, Lv B, Wu Y et al (2022) Dynamic Admission Control and Resource Allocation for Mobile Edge Computing Enabled Small Cell Network. IEEE Trans Veh Technol. 71(2):1964–1973
Chen Y, Xing H, Ma Z, et al (2022) Cost-Efficient Edge Caching for NOMA-enabled IoT Services. Chin Commun
Huang J, Wan J, Lv B, Ye Q et al (2023) Joint Computation Offloading and Resource Allocation for Edge-Cloud Collaboration in Internet of Vehicles via Deep Reinforcement Learning. IEEE Syst J. 17(2):2500–2511. https://doi.org/10.1109/JSYST.2023.3249217
Huang Y, Cao Y, Zhang M, Feng B, Guo Z (2022) CSODRL: A Collaborative Service Offloading Approach with Deep Reinforcement Learning in Vehicular Edge Computing. Sci Prog. 2022:1163177. https://doi.org/10.1155/2022/1163177
Huang Y, Huang J, Cheng B, Yao T, Chen J (2017) Poster: Interacting Data-Intensive Services Mining and Placement in Mobile Edge Clouds. Proceedings of the 23rd Annual International Conference on Mobile Computing and Networking, MobiCom 2017, October 16-20, 2017. ACM, Snowbird, pp 558–560
Huang Y, Huang J, Liu C, Zhang C (2020) PFPMine: A parallel approach for discovering interacting data entities in data-intensive cloud workflows. Future Gener Comput Syst. 113:474–487
Box GEP, Jenkins GM (2015) Time Series Analysis: Forecasting and Control, 5th edn. Wiley, Hoboken
Chen W, Qiu X, Cai T, Dai H, Zheng Z, Zhang Y (2021) Deep Reinforcement Learning for Internet of Things: A Comprehensive Survey. IEEE Commun Surv Tutorials. 23(3):1659–1692
Mnih V, Kavukcuoglu K, Silver D, Rusu AA, Veness J, Bellemare MG et al (2015) Human-level control through deep reinforcement learning. Nature. 518(7540):529–533
Liu H, Li Y, Wang S (2022) Request Scheduling Combined with Load Balancing in Mobile Edge Computing. IEEE Internet Things J. 9(21):20841–20852. https://doi.org/10.1109/JIOT.2022.3176631
Sutton RS, Barto AG (2018) Reinforcement Learning: An Introduction, 2nd edn. MIT Press, Cambridge
Acknowledgements
The authors would like to thank the anonymous reviewers for their insightful comments and suggestions on improving this paper.
Funding
This work is sponsored by Natural Science Foundation of Chongqing, China (No. CSTB2022NSCQMSX0368), and Young Project of Science and Technology Research Program of Chongqing Education Commission of China (No. KJQN202200702, No. KJQN201900708).
Author information
Authors and Affiliations
Contributions
Yuze Huang conceived the initial idea, designed the algorithms, and wrote the paper. Beipeng Feng designed the system model and carried out the experiments. Yuhui Cao analyzed the experimental data. Zhenzhen Guo contributed to data collection and analysis. Miao Zhang and Boren Zheng proofread the manuscript. The authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Huang, Y., Feng, B., Cao, Y. et al. Collaborative on-demand dynamic deployment via deep reinforcement learning for IoV service in multi edge clouds. J Cloud Comp 12, 119 (2023). https://doi.org/10.1186/s13677-023-00488-6
Keywords
 Service deployment
 Internet of vehicles
 Service demands
 Deep reinforcement learning
 Multi edge clouds