### System model

In the system, macro base stations are connected to the Internet through the core network of the cellular communication system. MEC servers are deployed at macro base stations and micro base stations [30]. Micro base stations are assumed to be connected to macro base stations by wired links. Since the interference between macro base stations is small, the analysis considers a single macro base station whose coverage contains *N* micro base stations, indexed by *n* ∈ {1, 2, ⋯, *N*}. Micro base station *n* serves *I* vehicles, indexed by *i* ∈ {1, 2, ⋯, *I*}. Only single-antenna vehicles and micro base stations are considered in this system. The system model for multiple base stations and multiple MEC servers is shown in Fig. 1.

It is assumed that each vehicle has one computationally intensive, delay-sensitive task that must be completed in unit time. Each vehicle can offload the computation to an MEC server through the micro base station or macro base station to which it is connected. The task uploaded by vehicle *i* is:

$$ {T}_i=\left\{{D}_i,{C}_i,{T}_i^{\mathrm{max}}\right\} $$

(1)

where *D*_{i} is the amount of data uploaded for the task, *C*_{i} is the number of CPU cycles the server needs to process the task, and \( {T}_i^{\mathrm{max}} \) is the maximum time allowed for the task to complete.

During task offloading the vehicle keeps moving, so its access base station may change. This system mainly considers computation-intensive, ultra-low-latency task offloading, in which \( {T}_i^{\mathrm{max}} \) is less than tens of milliseconds. Therefore, it is assumed that no base station handover occurs during task offloading.

### Vehicle distance prediction based on Kalman filtering

Kalman filtering involves three key random vectors: the predicted value \( {X}_t^p \) of the system state, the measured value \( {X}_t^m \) and the estimated value \( {X}_t^c \). \( {X}_t^c \) represents the final estimate of the system state in cycle *t*, which is obtained by data fusion between \( {X}_t^p \) and \( {X}_t^m \) [31]. The prediction process is:

$$ {\displaystyle \begin{array}{l}{x}_t^p={F}_t{x}_{t-1}^c+{B}_t{u}_{t-1}\\ {}{P}_t^p={F}_t{P}_{t-1}^c{F}_t^T+{Q}_{t-1}\end{array}} $$

(2)

where \( {x}_t^p \) is the mean of \( {X}_t^p \) and \( {P}_t^p \) is its covariance matrix; likewise, \( {x}_t^c \) is the mean of \( {X}_t^c \) and \( {P}_t^c \) is its covariance matrix. *F*_{t} is the transition matrix describing how the system state in cycle *t* − 1 affects the system state in cycle *t*, and *u*_{t − 1} is the control input. *B*_{t} is the matrix that maps the control input to its effect on the system state, and *Q*_{t − 1} is the covariance matrix of the prediction noise. The prediction noise is assumed to be Gaussian with zero mean, so it affects only the covariance matrix of the predicted value. The prediction noise also reflects the accuracy of the prediction model: the more accurate the model, the smaller the prediction noise.

In an actual system, the object of measurement may not be system states, but some measurement parameters related to it. The measured value of system states can be obtained indirectly by these measurement parameters. Let these measurement parameters be *Z*_{t}, and their relationship with the measured values is:

$$ {Z}_t={H}_t{x}_t^m+{s}_t $$

(3)

where *H*_{t} is the matrix that maps system states to the measurement parameters, and *s*_{t} is the measurement noise, which follows a Gaussian distribution with zero mean and covariance matrix *R*_{t}.

The process of Kalman filtering is shown in Fig. 2. The left half of the figure shows that, while the system is in period *t*, the system state of period *t* + 1 is predicted. The right half shows that, once period *t* + 1 is reached, the measured value of that period is obtained and fused with the prediction to give the estimated value of period *t* + 1, which serves as the input for the next round of prediction. This process is applied below to vehicle distance prediction.
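
For reference, the fusion step on the right half of Fig. 2 is commonly written in the standard Kalman form below. The text does not state these equations, so this is a sketch in the notation of Eqs. (2) and (3): the gain *K*_{t}, the identity matrix *I* and the observed value *z*_{t} of *Z*_{t} are symbols introduced here.

$$ {\displaystyle \begin{array}{l}{K}_t={P}_t^p{H}_t^T{\left({H}_t{P}_t^p{H}_t^T+{R}_t\right)}^{-1}\\ {}{x}_t^c={x}_t^p+{K}_t\left({z}_t-{H}_t{x}_t^p\right)\\ {}{P}_t^c=\left(I-{K}_t{H}_t\right){P}_t^p\end{array}} $$

A small *R*_{t} (accurate measurement) pushes the estimate toward the measurement, while a small \( {P}_t^p \) (accurate prediction) keeps it close to the predicted value.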

The system state is the location information of vehicle *i* (denoted *v*_{i}). Since the width of the road is negligible relative to its length, the vehicle position is modeled as a one-dimensional coordinate. To make the prediction model more accurate, speed is also included in the system state. Thus, the mean \( {x}_{i,t}^c \) of the estimated value \( {X}_{i,t}^c \) of *v*_{i} in period *t* is given by Eq. (4); the predicted and measured values have the same form.

$$ {x}_{i,t}^c=\left[\begin{array}{l}{loc}_{i,t}^c\\ {}{velocity}_{i,t}^c\end{array}\right] $$

(4)

The system is predicted with a uniformly accelerated linear motion model, with the period interval set to Δ*t*. Let the acceleration of *v*_{i} be *a*_{i, t}; then:

$$ {F}_t=\left[\begin{array}{cc}1& \Delta t\\ {}0& 1\end{array}\right],{B}_t=\left[\begin{array}{c}\frac{{\left(\Delta t\right)}^2}{2}\\ {}\Delta t\end{array}\right],{u}_t={a}_{i,t} $$

(5)

When directly measuring the position and speed, \( {X}_{i,t}^m={Z}_{i,t} \), that is:

$$ {H}_t=\left[\begin{array}{l}1\kern1em 0\\ {}0\kern1em 1\end{array}\right],{Z}_{i,t}={X}_{i,t}^m,{R}_{i,t}={P}_{i,t}^m $$

(6)

where *Z*_{i, t} is the measurement parameter, *H*_{t} represents the matrix that maps system states to measurement parameters, and *R*_{i, t} is the covariance matrix of measurement noises.
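
One filter cycle for a single vehicle can be sketched in plain Python as follows. This is a minimal illustration, assuming the constant-acceleration model of Eq. (5) and the identity measurement matrix of Eq. (6). The correction step uses the standard Kalman gain, which the text does not write out, and the function names `predict` and `update` and all numeric values are hypothetical.

```python
def matmul2(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def inv2(M):
    """Inverse of a 2x2 matrix (assumes a non-zero determinant)."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]


def predict(x, P, a, dt, Q):
    """Prediction step of Eq. (2) with F_t, B_t, u_t taken from Eq. (5).

    x = [location, velocity], P = 2x2 covariance, a = acceleration.
    """
    loc, vel = x
    # x^p = F x^c + B u
    x_p = [loc + dt * vel + 0.5 * dt * dt * a, vel + dt * a]
    # P^p = F P^c F^T + Q, expanded for F = [[1, dt], [0, 1]]
    P_p = [
        [P[0][0] + dt * (P[0][1] + P[1][0]) + dt * dt * P[1][1] + Q[0][0],
         P[0][1] + dt * P[1][1] + Q[0][1]],
        [P[1][0] + dt * P[1][1] + Q[1][0],
         P[1][1] + Q[1][1]],
    ]
    return x_p, P_p


def update(x_p, P_p, z, R):
    """Standard Kalman correction with H = I (assumption, see lead-in)."""
    S = [[P_p[i][j] + R[i][j] for j in range(2)] for i in range(2)]
    K = matmul2(P_p, inv2(S))                      # Kalman gain
    innov = [z[0] - x_p[0], z[1] - x_p[1]]         # measurement residual
    x_c = [x_p[i] + K[i][0] * innov[0] + K[i][1] * innov[1] for i in range(2)]
    I_minus_K = [[(1.0 if i == j else 0.0) - K[i][j] for j in range(2)]
                 for i in range(2)]
    P_c = matmul2(I_minus_K, P_p)
    return x_c, P_c


# Illustrative cycle: 1 s period, 2 m/s^2 acceleration, no process noise.
zero = [[0.0, 0.0], [0.0, 0.0]]
x_pred, P_pred = predict([0.0, 10.0], [[1.0, 0.0], [0.0, 1.0]], 2.0, 1.0, zero)
x_est, P_est = update(x_pred, P_pred, [13.0, 12.0], P_pred)
```

Setting the measurement covariance equal to the predicted covariance, as in the last line, weights prediction and measurement equally, so the estimate lands halfway between them.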

Substituting Eqs. (4)–(6) into Eqs. (2) and (3), Kalman filtering can be applied to vehicle position prediction. Since the system state follows a two-dimensional Gaussian distribution over position and velocity, the marginal distribution of each component is a one-dimensional Gaussian. Let \( {LOC}_{i,t}^c \) be the estimated value of the position of *v*_{i} in period *t*. Similarly, \( {LOC}_{i,t}^p \) is the predicted value and \( {LOC}_{i,t}^m \) is the measured value. They all follow one-dimensional Gaussian distributions, namely:

$$ {\displaystyle \begin{array}{l}{LOC}_{i,t}^m\sim N\left({\mu}_{i,t}^m,{\left({\sigma}_{i,t}^m\right)}^2\right),{LOC}_{i,t}^p\sim N\left({\mu}_{i,t}^p,{\left({\sigma}_{i,t}^p\right)}^2\right),\\ {}{LOC}_{i,t}^c\sim N\left({\mu}_{i,t}^c,{\left({\sigma}_{i,t}^c\right)}^2\right)\end{array}} $$

(7)

For two vehicles *v*_{i} and *v*_{j} in cycle *t*, the random variable *D*_{i, j, t} describing their separation is obtained by subtracting the position random variables *LOC*_{i, t} and *LOC*_{j, t}:

$$ {D}_{i,j,t}={LOC}_{i,t}-{LOC}_{j,t} $$

(8)

The above formula gives a random variable representing the distance between the two vehicles. Treating the two positions as independent Gaussian variables, *D*_{i, j, t} also follows a one-dimensional Gaussian distribution:

$$ {D}_{i,j,t}\sim N\left({\mu}_{i,t}-{\mu}_{j,t},{\left({\sigma}_{i,t}\right)}^2+{\left({\sigma}_{j,t}\right)}^2\right) $$

(9)
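
The parameters in Eq. (9) can be computed directly, as in the minimal Python sketch below. It assumes the two position estimates are independent (which is what allows the variances to be added). The helper `prob_within`, which evaluates how likely the signed separation is to lie within a given range, is an illustrative addition and not part of the original method.

```python
import math
from statistics import NormalDist


def distance_distribution(mu_i, sigma_i, mu_j, sigma_j):
    """Mean and standard deviation of D = LOC_i - LOC_j as in Eq. (9)."""
    mean = mu_i - mu_j
    std = math.hypot(sigma_i, sigma_j)  # sqrt(sigma_i^2 + sigma_j^2)
    return mean, std


def prob_within(mean, std, max_range):
    """Probability that the signed separation lies in [-max_range, max_range]."""
    d = NormalDist(mean, std)
    return d.cdf(max_range) - d.cdf(-max_range)


# Illustrative values: positions in metres along the road.
mean, std = distance_distribution(100.0, 3.0, 40.0, 4.0)
```

The mean of this distribution is one natural candidate for a single point value of the inter-vehicle distance, with the standard deviation indicating how much it can be trusted.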

Rather than a random variable, Vehicle-to-Vehicle (V2V) computing offloading and V2V communication resource allocation algorithms need a single exact value that directly represents the distance between two vehicles. With such a value, these algorithms can ignore mobility and focus on the problem itself, which decouples the complex problem [32].

### Participation in vehicle location privacy protection mechanism

Denote by \( p\left({l}_j^o\left|{l}_i^r\right.\right) \) the probability that the real position \( {l}_i^r \) of a participant is disturbed to position \( {l}_j^o \). Over all positions of the participant, these probabilities form the disturbance probability matrix *P* = {*p*_{i, j}}_{L × m}, which is expressed as follows:

$$ \mathbf{P}={\left[\begin{array}{l}p\left({l}_1^o\left|{l}_1^r\right.\right)\kern1em p\left({l}_1^o\left|{l}_2^r\right.\right)\kern0.5em \cdots \kern0.5em p\left({l}_1^o\left|{l}_m^r\right.\right)\ \\ {}p\left({l}_2^o\left|{l}_1^r\right.\right)\kern1em p\left({l}_2^o\left|{l}_2^r\right.\right)\kern0.5em \cdots \kern0.5em p\left({l}_2^o\left|{l}_m^r\right.\right)\\ {}\kern1.5em \vdots \kern4em \vdots \kern2em \vdots \kern2.25em \vdots \kern1.5em \\ {}p\left({l}_L^o\left|{l}_1^r\right.\right)\kern1.25em p\left({l}_L^o\left|{l}_2^r\right.\right)\kern0.5em \cdots \kern0.5em p\left({l}_L^o\left|{l}_m^r\right.\right)\end{array}\right]}_{L\times m} $$

(10)

Therefore, \( {p}_{i,j}=p\left({l}_j^o\left|{l}_i^r\right.\right) \) can be understood as the conditional probability that position \( {l}_j^o \) is published given that the real position is \( {l}_i^r \). Next, based on differential privacy, a location-indistinguishability disturbance mechanism is proposed.

The probability perturbation mechanism *P* satisfies position indistinguishability if and only if the following inequality holds:

$$ p\left({l}_j^o\left|{l}_{i_1}^r\right.\right)\le {e}^{ed\left({l}_{i_1}^r,{l}_{i_2}^r\right)}p\left({l}_j^o\left|{l}_{i_2}^r\right.\right) $$

(11)

where \( {l}_{i_1}^r \) and \( {l}_{i_2}^r \) belong to the set *l*^{R}. The differential privacy budget *e* (the coefficient of the distance in the exponent of formula (11)) represents the degree of privacy protection. Generally, the smaller *e* is, the higher the degree of privacy protection and the harder it is to distinguish \( {l}_{i_1}^r \) from \( {l}_{i_2}^r \); conversely, a larger *e* means weaker privacy protection and more distinguishable real locations. The function \( d\left({l}_{i_1}^r,{l}_{i_2}^r\right) \) represents the distance between position \( {l}_{i_1}^r \) and position \( {l}_{i_2}^r \), which can be the Euclidean distance or the Hamming distance; this chapter adopts the Euclidean distance. Formula (11) shows that, for a given differential privacy budget *e*, the smaller the distance between \( {l}_{i_1}^r \) and \( {l}_{i_2}^r \), the closer the probabilities of generating the disturbed position \( {l}_j^o \) from these two positions must be. In other words, the attacker cannot reliably tell the participant's real location apart from nearby locations.
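
Whether a given disturbance matrix meets formula (11) can be checked by brute force over all pairs of real locations, as sketched below in Python. The sketch assumes the indexing `P[i][j]` = \( p\left({l}_j^o\left|{l}_i^r\right.\right) \) (rows are real locations), Euclidean distance between coordinate tuples, and a hypothetical function name and tolerance parameter.

```python
import math


def satisfies_indistinguishability(P, locations, budget, tol=1e-12):
    """Check formula (11) for every pair of real locations and every output.

    P[i][j]      -- probability of publishing location j when the real one is i
    locations[i] -- coordinates of real location i (tuple of floats)
    budget       -- differential privacy budget (the symbol e in the text)
    """
    n_real = len(P)
    n_out = len(P[0])
    for i1 in range(n_real):
        for i2 in range(n_real):
            # Upper bound on the ratio p(l_j^o | l_i1^r) / p(l_j^o | l_i2^r)
            bound = math.exp(budget * math.dist(locations[i1], locations[i2]))
            for j in range(n_out):
                if P[i1][j] > bound * P[i2][j] + tol:
                    return False
    return True


# Illustrative mechanism over two locations one unit apart.
locs = [(0.0, 0.0), (1.0, 0.0)]
mech = [[0.6, 0.4],
        [0.4, 0.6]]
loose = satisfies_indistinguishability(mech, locs, 1.0)
tight = satisfies_indistinguishability(mech, locs, 0.1)
```

In this example the same mechanism passes with a budget of 1.0 but fails with 0.1, since a smaller budget forces the two rows of the matrix to be closer to each other.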

Because the participant publishes only the disturbed location, the attacker can observe that location but cannot obtain the real location directly. This chapter assumes that the attacker has background knowledge, that is, the attacker knows the disturbance mechanism *P* and the prior probability \( p\left({l}_i^r\right) \). The attacker can then apply Bayes' theorem to the observed disturbed location to infer the real location. The probability \( p\left({l}_i^r\left|{l}_j^o\right.\right) \) is the probability that the participant's real location is \( {l}_i^r \) given the observed disturbed location \( {l}_j^o \). From Bayes' theorem and the law of total probability, we get:

$$ p\left({l}_i^r\left|{l}_j^o\right.\right)=\frac{p\left({l}_i^r\right)p\left({l}_j^o\left|{l}_i^r\right.\right)}{p\left({l}_j^o\right)}=\frac{p\left({l}_i^r\right)p\left({l}_j^o\left|{l}_i^r\right.\right)}{\sum_{k=1}^mp\left({l}_j^o\left|{l}_k^r\right.\right)p\left({l}_k^r\right)} $$

(12)

From the above formula, the attacker can compute the posterior probability \( p\left({l}_i^r\left|{l}_j^o\right.\right) \) because both ingredients are available: the disturbance mechanism *P* (i.e. the probability of moving from the real location \( {l}_i^r \) to the disturbed location \( {l}_j^o \)) and the prior probability \( p\left({l}_i^r\right) \) of the real location, which can be estimated, for example, by applying a Markov model to public data sets. However, when *P* satisfies formula (11), \( p\left({l}_i^r\left|{l}_j^o\right.\right) \) is bounded. Therefore, a disturbance probability matrix satisfying formula (11) can realize the indistinguishability of participants' locations, resist attackers with prior knowledge, and protect the participants' location privacy.
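
The attacker's inference in formula (12) can be illustrated with a short Python sketch. It assumes the indexing `P[i][j]` = \( p\left({l}_j^o\left|{l}_i^r\right.\right) \) and a prior given as a list; the function name and example numbers are hypothetical.

```python
def posterior(prior, P, j):
    """Posterior over real locations given observed disturbed location j.

    prior[i] -- attacker's prior p(l_i^r)
    P[i][j]  -- disturbance probability p(l_j^o | l_i^r)
    Returns the list [p(l_i^r | l_j^o) for every i], following formula (12).
    """
    joint = [prior[i] * P[i][j] for i in range(len(prior))]
    evidence = sum(joint)  # p(l_j^o) by the law of total probability
    if evidence == 0:
        raise ValueError("observed location has zero probability under the model")
    return [value / evidence for value in joint]


# Illustrative case: uniform prior over two locations.
post = posterior([0.5, 0.5], [[0.6, 0.4], [0.4, 0.6]], 0)
```

With a uniform prior, the posterior ratio between two real locations equals the ratio of their disturbance probabilities, which is exactly the quantity that formula (11) bounds.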

### System model analysis

In the analysis of vehicle edge computing, it is assumed that the edge network node base station serves as the dispatch control center. The vehicle user equipment generates computing tasks, and vehicles and base stations process the offloaded tasks, as shown in Fig. 3. When a computing task is generated on the vehicle equipment side, a scheduling request first reaches the edge network node base station. The base station schedules the task, and the scheduling algorithm decides whether to place it in a service queue on the base station node side or in a service queue for vehicles [33]. Once the computing task enters a queue, it joins the end of that queue. It is also assumed that vehicle users have a total of *M* different computing tasks. Each computing task *m* has a fixed communication workload *f*_{m}, a fixed computing workload *d*_{m} and a fixed task time constraint *T*_{m}. The computing workload can be expressed as a number of CPU cycles.

Vehicles exchange state information periodically, so information such as the location, driving direction, speed and idle computing power of neighboring vehicles can be obtained through the communication network. When the vehicle equipment generates a computing task, it sends a computing offloading request to the edge nodes. The request describes the computing task and includes: the communication task size *f*_{m}, the computing task size *d*_{m}, the delay requirement *T*_{m}, and the idle computing capacity of the neighboring vehicles.

It is also assumed that vehicles on the road travel at a constant speed. The analysis of the vehicle communication mechanism in the previous two chapters shows that there is a communication link between edge node base stations and vehicles. Information such as a vehicle's computing power, location, driving direction and speed can be periodically exchanged with base stations through CAM messages. The system scheduling decision is \( {b}_k^t\in \left({b}_1,{b}_2,\cdots, {b}_m,{b}_{m+1}\right) \), where \( {b}_k^t \) indicates that the computing task arriving at time *t* is placed in computing processing queue *k* [9]. The problem is therefore, when a computing request from a vehicle user arrives, how to allocate the computing task to a computing service queue so that the delay requirement of the long message security service is met and the system obtains the greatest alarm revenue.

In the computing task scheduling model designed here, the scheduling process is regarded as a Markov decision process [34]. When a base station receives a computing offloading request from vehicle user equipment, it evaluates the status of its own computing processing queues. Based on this status, the state of the available vehicle computing processing queue, and the information about the computing task, the Markov decision model selects one computing processing queue as the offloading queue for the task. The system state at time *t* is defined as follows:

$$ {S}^t=\left({q}_1^t,{q}_2^t,\cdots, {q}_m^t,{q}_{m+1}^t,{v}_{m+1}^t,{d}^t,{f}^t\right) $$

(13)

where \( {q}_1^t,{q}_2^t,\cdots, {q}_m^t \) is the queue length (computing task size) of *m* computing processing queues at edge nodes at time *t*. \( {q}_{m+1}^t \) is the length of vehicles’ calculation processing queue, and *d*^{t} is the amount of computing task generated by users at time *t*. *f*^{t} is the size of communication task generated by users at time *t*. The value of \( {v}_{m+1}^t \) is the idle computing capacity of vehicles generating the emerging alarm service and its neighboring auxiliary vehicles.

The system state at time *t* is \( \left({q}_1^t,{q}_2^t,\cdots, {q}_m^t,{q}_{m+1}^t,{v}_{m+1}^t,{d}^t,{f}^t\right) \), and the scheduling decision is \( {b}_k^t\in \left({b}_1,{b}_2,\cdots, {b}_m,{b}_{m+1}\right) \). The amount of computation actually processed by each computing processing queue within time interval *τ* is given by the following formula:

$$ {\hat{S}}_k^t=\min \left({q}_k^t+{P}_k^t\times {d}^t,{v}_k\times \tau \right) $$

(14)

In the formula, *v*_{k} is the computing capacity of queue *k*. The scheduling probability \( {P}_k^t \) equals 1 when the computing task *d*^{t} that arrives at time *t* is scheduled to computing task processing queue *k*, and 0 when it is not. Therefore, the system state at *t* + 1 can be derived as shown in the following formula:

$$ {\displaystyle \begin{array}{l}{S}^{t+1}=\Big({q}_1^t+{P}_1^t\cdot {d}^t-{\hat{S}}_1^t,\cdots, {q}_m^t+{P}_m^t\cdot {d}^t-{\hat{S}}_m^t,\\ {}\kern2.5em {q}_{m+1}^t+{P}_{m+1}^t\cdot {d}^t-{\hat{S}}_{m+1}^t,{v}_{m+1}^{t+1},{d}^{t+1},{f}^{t+1}\Big)\end{array}} $$

(15)
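
The queue part of the transition in Eqs. (14) and (15) can be sketched in Python as follows. This is a minimal illustration in which the scheduling probability is a one-hot choice of queue index `k`; the function name, the list-based state and the example numbers are assumptions made here, and the exogenous components \( {v}_{m+1}^{t+1} \), \( {d}^{t+1} \) and \( {f}^{t+1} \) are left out.

```python
def step_queues(q, v, d, k, tau):
    """Advance all queue lengths by one time interval.

    q   -- current queue lengths q_1..q_{m+1} (CPU cycles)
    v   -- processing capacity of each queue (cycles per unit time)
    d   -- size of the computing task arriving at time t
    k   -- index of the queue the task is scheduled to (P_k^t = 1)
    tau -- length of the time interval
    Returns (next queue lengths, amount processed per queue).
    """
    next_q = []
    served = []
    for idx, (q_k, v_k) in enumerate(zip(q, v)):
        backlog = q_k + (d if idx == k else 0)   # q_k^t + P_k^t * d^t
        s_k = min(backlog, v_k * tau)            # Eq. (14)
        served.append(s_k)
        next_q.append(backlog - s_k)             # queue entries of Eq. (15)
    return next_q, served


# Illustrative step: two base-station queues and one vehicle queue.
next_q, served = step_queues(q=[4, 0, 10], v=[2, 2, 1], d=3, k=1, tau=2)
```

Because the processed amount is capped at both the backlog and the capacity `v_k * tau`, queue lengths never become negative.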

In addition, the impact of communication resource allocation on computing resource scheduling needs to be considered. If the scheduling behavior \( {b}_k^t \) schedules the computing task of the vehicle safety application to vehicle nodes, then neighboring vehicles cooperate in processing the task, and the processing delay is as follows:

$$ {T}_b^t=\frac{d_m^t+{q}_{m+1}^t}{v_{m+1}} $$

(16)

If the scheduling behavior \( {b}_k^t \) schedules computing tasks of the vehicle safety application to base stations, then the completion delay \( {T}_b^t \) of task *m* due to scheduling is:

$$ {T}_b^t=\frac{d_m^t+{q}_k^t}{v_k}+\frac{f_m^t}{C} $$

(17)

where *C* is the uplink communication rate between the vehicle user equipment and the edge node base station.
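
The two delay expressions in Eqs. (16) and (17) translate directly into code. The Python sketch below uses hypothetical function and parameter names; units are whatever the workloads and rates are expressed in (for example CPU cycles, cycles per second, bits and bits per second).

```python
def vehicle_delay(d_m, q_vehicle, v_vehicle):
    """Eq. (16): delay when the task joins the vehicle processing queue."""
    return (d_m + q_vehicle) / v_vehicle


def base_station_delay(d_m, q_k, v_k, f_m, uplink_rate):
    """Eq. (17): queueing/processing delay at queue k plus upload time."""
    return (d_m + q_k) / v_k + f_m / uplink_rate


def meets_deadline(delay, t_m):
    """True if the task completes within its time constraint T_m."""
    return delay <= t_m


# Illustrative numbers only.
t_vehicle = vehicle_delay(d_m=6, q_vehicle=2, v_vehicle=4)
t_station = base_station_delay(d_m=6, q_k=2, v_k=8, f_m=10, uplink_rate=5)
```

In this example the base station processes faster, but the upload term makes its total delay larger than that of the vehicle queue, which is the coupling between communication and computing resources that the text refers to.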

At this point, the return *r*_{t} from the state transition from *S*^{t} to *S*^{t + 1} caused by behavior decision \( {b}_k^t \) can be expressed as:

$$ {r}_t=r\left({s}^t,{b}^t,{s}^{t+1}\right)=\sum \limits_{k=0}^{m+1}\left(\frac{{\hat{S}}_k^t}{V_k}\cdot {\zeta}_k\right)-\alpha {\left({q}_k^{t+1}\right)}^2-\beta {F}_2\left({T}_b^t-{T}_m\right) $$

(18)

The first term of *r*_{t} is the total alarm revenue from the computing resources provided by each service queue within a time interval. The second term penalizes the square of the queue length in order to avoid a serious imbalance among service queue lengths. The last term penalizes tasks that are not completed within their delay requirement, which improves the alarm performance. To obtain better performance in the long term, computing resource providers must consider not only the return at the current moment but also the future return. The ultimate goal is to learn an optimal scheduling strategy that maximizes the cumulative discounted reward, as shown in the following formula:

$$ {\pi}^{\ast }=\arg \underset{\pi }{\max E}\left[\sum \limits_{t=0}^{\infty}\left({\eta}^t\cdot {r}_t\right)\right] $$

(19)

where *η* (0 ≤ *η* ≤ 1) is the discount factor. For *η* < 1, *η*^{t} tends to 0 when *t* is large enough, which means that rewards *r*_{t} far in the future have a small effect on the total return. The optimal scheduling strategy *π*^{∗} is the one that maximizes this long-term system revenue.
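
The quantity inside the expectation of Eq. (19) can be evaluated for one observed reward sequence as in the Python sketch below. Truncating the infinite sum to a finite list of rewards is an assumption made here for illustration, and the function name is hypothetical.

```python
def discounted_return(rewards, eta):
    """Sum of eta**t * r_t over a finite reward sequence r_0, r_1, ...

    Accumulates from the last reward backwards, so each step is
    G_t = r_t + eta * G_{t+1}.
    """
    total = 0.0
    for r in reversed(rewards):
        total = r + eta * total
    return total


# Illustrative sequence of three equal rewards with eta = 0.5.
g = discounted_return([1.0, 1.0, 1.0], 0.5)
```

With *η* = 0.5 the third reward contributes only a quarter of its value, which illustrates how the discount factor trades off immediate against future returns.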