Cloud-edge data encryption in the internet of vehicles using Zeckendorf representation

Wu, Yun; Wu, Liangshun; Cai, Hengjin

doi:10.1186/s13677-023-00417-7

Research
Open access
Published: 17 March 2023

Cloud-edge data encryption in the internet of vehicles using Zeckendorf representation

Yun Wu¹,
Liangshun Wu² &
Hengjin Cai¹

Journal of Cloud Computing volume 12, Article number: 39 (2023) Cite this article

1582 Accesses
3 Citations
1 Altmetric
Metrics details

Abstract

Cloud-edge data security is a key issue in the internet of vehicles (IoV), as the potential for data breaches increases as more vehicles are connected. As vehicles become smarter and more connected, the risk of unauthorized access to the data generated by the vehicles also increases. Data encryption is a highly effective security measure that is widely used to protect the IoV from malicious actors. By encrypting data, it becomes virtually impossible for unauthorized individuals to access the information. This ensures that only the intended parties can access the data, allowing for secure communication between cloud and edge. Data encryption is a cost-effective and reliable security measure that is essential for any organization that relies on the IoV. The IoV is characterized by the large volume of data that is exchanged between devices in cloud and edge. This necessitates the use of a strong encryption method, such as stream ciphering, which is particularly well-suited to this type of environment. Stream ciphering provides the highest levels of security, making it the ideal choice for securing data transmission in the IoV. Many stream ciphering algorithms use bitwise exclusive or (XOR) to encrypt the data stream, so the core is the generation of a pseudo-random key stream. This paper proves that the probability of the number 1 appearing in the middle part of the Zeckendorf representation is constant, which can be used to generate pseudo-random key stream sequences. The pseudo-random sequence generated by the linear feedback shift register (LSFR) is periodic, and the key sequence will be duplicated. The logistic chaos (LC) sequence is too sensitive to the disturbance of initial value, and its stability is poor. In this paper, our proposed ZPKG (key generator based on Zeckendorf presentation) algorithm solves these two main problems in stream ciphering. The generated key sequence not only has strong randomness, but also is infinitely long, and it is robust to the minor disturbance of the initial value.

Graphical Abstract

Introduction

Internet of Vehicles (IoV) is an emerging technology that enables the connection of vehicles to the Internet, allowing them to communicate with other connected devices. This technology enables vehicles to be monitored and controlled remotely, allowing for advanced safety features such as automatic braking and lane departure warnings. It also allows for improved fuel efficiency and emissions management and the integration of other digital services such as navigation, entertainment, and more. IoV is a rapidly developing technology with the potential to revolutionize the automotive industry, making vehicles safer and more efficient. Big data analytics can be used to gain valuable insights from the data collected, allowing for better decision-making and predictive capabilities. With the ever-increasing sophistication of technology, the sensors can detect a multitude of variables, from the vehicle’s speed and direction to the engine’s temperature and fuel level. Figure 1 gives examples of the data transmitted in the IoV, including advisory speed, flags, vehicle speed, driver input, raw position, lane level position, and so on. The data are often transmitted to cloud servers for Map Matching, Digital Twin Visualization, Human Behavior Modeling, Motion Planning, and Control, creating insights into the vehicle’s performance and making recommendations for better driving and maintenance. By combining the power of IoV and Big Data, businesses can make better decisions and create innovative solutions which can improve the lives of individuals and communities.

A typical model of IoV is comprised of three parties:

Cloud: The Cloud is considered reliable and trustworthy. Periodically, the RSU sends data packets to the Cloud. The data packets contain critical information such as car conditions, real-time traffic conditions, etc. The Cloud records the properties of each car and broadcast traffic statistics.
RSUs: RSUs (Road Side Units) are roadside units in the ETC system that are supported by Dedicated Short Range Communication (DSRC) technology. They interact with OBUs. RSUs are utilized for vehicle identification as well as computerized point deduction. RSUs are often put on the side of the road and unattended express lanes, which are responsible for the management of highways and vehicle yards.
OBUs: OBUs (Short for On Board Units) is a microwave device that communicates with RSUs through Dedicated Short Range Communication (DSRC) technology. The OBUs on the vehicle and the RSU-Road Side Units on the side of the road interact with each other through microwaves in the ETC system. When the vehicle speeds through the RSU, the OBU, and the RSU interact with microwaves, much like our contactless card, but from a distance of a dozen meters and at a higher frequency of 5.8GHz. When it passes, it determines the authenticity, receives the model, computes the rate, and subtracts the toll.

Figure 2 depicts cloud-edge communication scenarios, where OBUs and RSUs act as edge devices and they communicate with the cloud.

Cloud-edge data security is a critical issue in IoV, as this technology introduces a range of new cybersecurity risks. Automated vehicles are connected to the cloud for critical services such as navigation, safety and security, and fleet management. However, this also means that the data and communication generated by these vehicles are vulnerable to cyber-attacks. The data could be intercepted and manipulated, leading to potential safety issues, or the vehicle itself could be taken over by malicious actors. Data encryption is one of the most powerful tools available to address these security concerns, which offers an effective way to protect sensitive information from unauthorized access. By scrambling data and making it unreadable, encryption provides an extra layer of security to help ensure the privacy and integrity of data. The IoV has the feature of big data.

Before the large-scale application of Quantum Cryptography, stream ciphering was the most suitable encryption method for “big data” [1]. For stream ciphers, pseudo-random sequence generation (PRNG) is very important. For normal applications, where good statistical properties and speed are important, consider O ‘Neill’s PCG collection, Vigna’s xoroshiro family, such as xoroshiro128 + (not the Japanese name btw, but “X or, rotate, shift, rotate “), and D. E. Shaw’s Random123 suite (including Philox and the well-named ARS, which is a simplified form of encrypting infinite zero sequences using AES-CTR), though I’m not sure how much checking has been done with Random123 PRNG. To encrypt secure PRNG(CSPRNG) For encryption applications where unpredictability is important, consider using cryptographically secure PRNGS, for example, Bernstein’s ChaCha20, RF C7539. The alternative is Wu’s HC-256 and Jenkins’ ISAAC64. The downside of these algorithms is that they’re too complicated. Linear Shift Register (LSFR) and Logistic chaos (LC) are the two popular PRNGs for stream ciphering. Logistic Chaos is a PRNG algorithm that utilizes chaotic dynamical systems to generate unpredictable numbers. LSFR is a PRNG algorithm that uses linear feedback to generate a sequence of pseudo-random numbers. Both algorithms have been used for various applications, including cryptographic applications, simulation, and game design. Both algorithms are capable of generating statistically random numbers, making them useful for applications that require a certain degree of randomness. However, the pseudo-random sequence generated by the linear feedback shift register (LSFR) is periodic, and the key sequence will be repeated. LC method, which is used most currently, however, has a synchronization problem. That is, it is too sensitive to the disturbance of initial parameters and has poor stability.

Fibonacci numbers have been known to mathematics for some 800 years and it seems rather surprising that this property of theirs did not receive attention until relatively recently. Indeed, nothing appeared in print concerning the Zeckendorf representation until the middle of the 20th century, with the publication of a paper by C.G. Lekkerkerker [2]. Zeckendorf did not publish his account until 1972 (see [3]) although he had proof of his theorem by 1939. Every positive integer can also be uniquely represented as a sum of the different powers of two, and it is natural to compare Zeckendorf representations with binary representations. This is pursued in [4], where there is a discussion of algorithms that, given two positive integers a and b in the Zeckendorf form, produce the Zeckendorf forms for a + b. Zeckendorf’s Theorem points out that natural numbers are the sum of one or more different Fibonacci numbers [5]. In other words, a positive integer n can be expressed as: $N={F}_{k_1}+{F}_{k_2}+{F}_L+{F}_{k_r}$. For example: 17 = F₁ + F₄ + F₇, 20 = F₃ + F₅ + F₇ . If all Fibonacci numbers in the above decomposition are mapped to a bit string composed of 0,1 according to their positions, that is, let the positions k₁, k₂, …, k_r of the sequence be 1, and the rest positions are 0, a new coding system represented by 0,1 can be obtained. This coding method, also known as Zeckendorf-Lekkerkerker coding, is the Ostrowski coding system [6, 7]. In recent years, The research in this field is focused on the discussion of mathematical properties. Among other results, Bugeaud [8] establishes, in a quantitative form, that any sufficiently large integer cannot simultaneously be divisible only by very small primes and have very few digits in its Zeckendorf representation. Idziaszek [9] investigates a relationship between the numeral system using Zeckendorf representations and the golden ratio numeral system. Alecci et al. [10] determine the Zeckendorf representation of the multiplicative inverse of a modulo F_n. Vukusic et al. [11] find an explicit upper bound for y^a, which only depends on the Hamming weight of y concerning the Zeckendorf representation. Other influential work includes [12]. The Zeckendorf representation has many advantages, such as non-uniqueness of coding, strong regularity of probability structure, periodicity of modules, and anti-interference, which may provide a new idea for the breakthrough of data encryption in IoV.

In what follows, based on the probability structure properties of the Zeckendorf representation, we propose a pseudorandom keystream generator (PKG), which can be used to encrypt cloud-edge IoV data. The proposed PKG algorithm addresses the periodicity problem of linear feedback shift registers (LFSRs) as well as the synchronization problem of LC.

The following content is organized as follows: Related work section reviews the literature, The proposed algorithm section proposes our algorithm, Security analysis section carries out security analysis theoretically, Experiments and discussion section conducts experiments and discusses the results, and Conclusion section concludes.

Related work

Modern cryptography comprises symmetric encryption, asymmetric encryption, and hash functions, with symmetric encryption further split into block ciphers and stream ciphers. Block ciphers accept fixed-size plaintext as input, such as 64-bit, 128-bit, 256-bit, and so on. Permutation and diffusion are also used. However, when large volumes of data with high real-time needs are sent, block encryption algorithms such as DES and AES fail to meet the requirements due to their strong correlation and high redundancy [13].

Stream ciphers are lightweight encryption algorithms that are ideal for big data. Stream ciphers are considerably faster to perform than block ciphers. Stream ciphers are classified into two types: those based on bytes and those based on pseudo-random number sequences, with the latter being more extensively employed in practice. RC4 is a byte-based stream ciphering method that uses the PRGA algorithm to create pseudo-random numbers and then uses the KSA encryption technique to accomplish particular encryption operations. RC4 is widely used in SSL/TLS protocol and WEP protocol as an IEEE802.11 WLAN standard [14]. The key of a stream ciphering technique based on a pseudo-random number sequence changes at random and every bit of data is encrypted with every bit of key (typically XOR /XOR operation), making cracking theoretically impossible (one-time encryption principle). Because the XOR technique is simple to implement in hardware, it allows for real-time encryption of high-speed large data. The statistical inspection institution has set stringent random criteria for the pseudo-random number sequence. Berger et al. [15] developed a new pseudo-random number generator that is resistant to common assaults and can increase internal diffusion properties greatly. In recent years, there are many pieces of research on big data encryption. For example, Hosseini [16] examines the fingerprint system, classifies different aspects of it, such as techniques for extracting features and matching procedures, and identifies various vulnerabilities. He then proposes three mitigation strategies, including software (big data and encryption/ decryption) and secure hardware techniques. Zhou et al. [17] present a chaotic secure communication strategy based on the synchronization of multiple complex dynamical networks with double layers. Djamaluddin et al. [18]’s goal is to verify a real-time data transport and analytics environment in terms of encryption and access control. Chen et al. [19] present a massive data encryption technique based on a de-redundancy approach.

This study is not the first to propose research on stream ciphers utilizing Zeckendorf representation. Carry register feedback (FCSR) is critical in the hardware design of a stream cipher. Lin et al. [20] refined the FCRS circuit and developed a homogeneous Zeckendorf representation-based quark hash function that employs FCRS to produce random integers [21]. However, this type of study focuses on the hardware level rather than the software level production of stream cipher key streams; also, these studies have not thoroughly examined the properties of the Zeckendorf representation. The canonical Zeckendorf representation is only used in the random number generator to raise the disorder degree of the sequence; it has not been investigated for the construction of randomness.

The proposed algorithm

Requirements for stream ciphers of big data

RC4 stream cipher technology is a word-based stream ciphering method with variable key length and a large number of permutation operations, making it more appropriate for software implementation. However, one out of every 256 keys in RC4 is a weak key susceptible to state table analysis attacks [22]. Furthermore, cryptanalysis can uncover a significant correlation between the key’s bytes, and if the same key sequence is repeated, the hacker can break the ciphertext. If the first three words of the key are discovered, iteration may be used to retrieve each word of the key used in RC4 [23]. Unlike RC4, which encrypts by byte, the LC and LFSR use a pseudo-random key stream to encrypt. Linear feedback shift registers (LFSRs) are excellent for hardware implementation due to their ease of algebraic analysis and high unpredictability (they can be successfully built by simple logic gates and registers). The primary drawback of LFSR is that it is periodic, which means that it can only be used to encrypt particular length sequences. Second, because its sequence is not a simple monotone sequence, it must be initiated [24]. The most often used stream ciphering technology is chaotic encryption. The difficulty with encryption is that it is excessively sensitive to the initial values of parameters due to its nonlinearity and synchronization [25]. To address the drawbacks of the LC and LFSR, a new pseudo-random key stream generation method with the following features must be investigated: First, the key stream sequence is sufficiently random. Second, the key stream sequence is lengthy enough to encrypt a large quantity of data. Third, the key stream sequence should not be very dependent on the starting value.

(1)
Randomness

XOR is commonly used in stream ciphers. If numerous data are encrypted with the same key string in this situation, the data may be decoded without knowing the key sequence. The following are the reasons: suppose P₁ and P₂ are two strings of plaintext data, K is the key used to encrypt the data. The ciphertext: E₁ = P₁ ⨁ K, E₂ = P₂ ⨁ K, ⨁ denotes XOR operation. Because E₁ ⨁ E₂ = P₁ ⨁ K ⨁ P₂ ⨁ K = P₁ ⨁ P₂ ⨁ (K ⨁ K) = P₁ ⨁ P₂, XOR P₂ on both sides, we obtain E₁ ⨁ E₂ ⨁ P₂ = P₁ ⨁ P₂ ⨁ P₂ = P₁. The ciphertext is cracked.

The Golomb randomness hypothesis [26] defines the basic characteristics that a sequence needs to have if it is considered random enough. Include:
- C1: In a periodic segment, the difference between the numbers of 1 s and 0 s is no more than 1.
- C2: In a periodic segment, if the length of runs accounts for the total number of runs, here it is assumed that there are at least two lengths of runs.
- C3: Autocorrelation function:
$$R\left(\Delta \right)=\sum \limits_{i=1}^L{l}_i{l}_{i+\Delta}=\left\{\begin{array}{l}L/2,\Delta =0\\ {}L/4,0<\Delta <L\end{array}\right.$$
(1)

L denotes the length of the sequence, Δ denotes the interval.
(2)
Infinity

Because the LSFR pseudo-random sequence is periodic, key duplication will occur if the plaintext length exceeds the period of the key sequence. As a result, given a large key space, the key stream sequence must be both random and aperiodic. In theory, this needs an infinitely long key sequence.
(3)
Stability

The LC method is extremely sensitive to the initial value, and it is precisely because of this sensitivity that it produces randomness, which is also the origin of the term “chaotic,” that is, the phenomenon in which the dynamic system appears random but is not random; however, a minor change can result in different encryption results, implying that chaotic systems are highly uncontrollable.

To some extent, the key sequence must be sensitive to the starting value. If it is not sensitive, all sequences will be the same or equivalent, and the intended unpredictability will be lost. The key sequence, however, will be easily broken if the initial value sensitivity is too high.

Algorithm principle

Assuming that there are m ones in the Zeckendorf representation sequence, the coding sequence is marked as: Z_{L, m}, according to Filipponi and Wolfowicz’s discovery in 1987 [27], the number of possible sequences is

$${N}_{L,m}=\left\{\begin{array}{l}\left(\begin{array}{c}L-m+1\\ {}m\end{array}\right),0\le m\le \left\lceil \frac{L+1}{2}\right\rceil \\ {}0,m>\left\lceil \frac{L+1}{2}\right\rceil \end{array}\right.$$

(2)

N_{L, m}(k) indicates the number of all Zeckendorf representation sequences with k-th bit being 1. Filipponi et al. proved that:

$${N}_{L,m}(k)={N}_{L,m}\left(L-k+1\right)$$

(3)

and

$${N}_{L,m}(k)=\sum \limits_{i=0}^{k-1}{\left(-1\right)}^i\left(\begin{array}{c}L-m-i\\ {}m-1-i\end{array}\right){N}_{L,m}(k)$$

(4)

$${\displaystyle \begin{array}{l}\textrm{Among}\ \textrm{them}.\textrm{or}\kern0.5em 1\le k\le \left\lceil \frac{L+1}{2}\right\rceil .\textrm{Or}\\ {}{N}_{L,m}(k)=\frac{1-{\left(-1\right)}^k}{2}\left(\begin{array}{c}L-m-k+1\\ {}m-k\end{array}\right)+\sum \limits_{i=0}^{\left\lceil \frac{k-2}{2}\right\rceil}\left(\begin{array}{c}L-m-2i-1\\ {}m-2i-1\end{array}\right)\end{array}}$$

(5)

Based on the following assumptions [28]

$$\left(\begin{array}{c}a\\ {}-\left|b\right|\end{array}\right)=0$$

(6)

It can be inferred from (4) that i > m − 1.

Theorem 1 If m < k < L − m + 1, N_{L, m}(k) = N_{L, m}(m) is a constant.

Proof From (4), we have

$${\displaystyle \begin{array}{l}{N}_{L,0}(k)=0\\ {}{N}_{L,1}(k)=1\\ {}{N}_{L,2}(k)=\left\{\begin{array}{c}L-2,k\in \left\{1,L\right\}\\ {}L-3,k\notin \left\{1,L\right\}\end{array}\right.\end{array}}$$

(7)

From (5), we have

$${N}_{L,m}(m)={\frac{1-\left(-1\right)}{2}}^m\left(\begin{array}{c}L-m-k+1\\ {}m-k\end{array}\right)+\sum \limits_{i=0}^{\left\lceil \frac{m-2}{2}\right\rceil}\left(\begin{array}{c}L-m-2i-1\\ {}m-2i-1\end{array}\right)$$

(8)

Then,

$${\displaystyle \begin{array}{l}{N}_{2m-1,m}(m)=\left\{\begin{array}{c}1,2\nmid m\\ {}0,2\left|m\right.\end{array}\right.\\ {}{N}_{2m,m}(m)=\left\lceil \frac{m+1}{2}\right\rceil \\ {}{N}_{2m+1,m}(m)=\left\{\begin{array}{c}{\left(m+1\right)}^2/4,2\nmid m\\ {}m\left(m+2\right)/4,2\left|m\right.\end{array}\right.\end{array}}$$

(9)

Therefore,

$${\Pr}_{L,m}(k)={N}_{L,m}(k)/{N}_{L,m}$$

(10)

Similarly, using (7) and (9), we can get

$${\displaystyle \begin{array}{l}\begin{array}{l}{\Pr}_{L,m}(1)={\Pr}_{L,m}(L)=\frac{m}{L-m-1}\\ {}{\Pr}_{L,m}(2)={\Pr}_{L,m}\left(L-1\right)=\frac{m\left(L-2m+1\right)}{\left(L-m-1\right)\left(L-m\right)}\\ {}{\Pr}_{2m-1,m}(m)=\left\{\begin{array}{c}1,2\nmid m\\ {}0,2\left|m\right.\end{array}\right.\end{array}\\ {}{N}_{2m,m}=\left\{\begin{array}{c}1/2,2\nmid m\\ {}m/\left(2m+2\right),2\left|m\right.\end{array}\right.\\ {}{N}_{2m+1,m}=\left\{\begin{array}{c}\left(m+1\right)/\left(2m+4\right),2\nmid m\\ {}m/\left(2m+2\right),2\left|m\right.\end{array}\right.\end{array}}$$

(11)

Algorithm implementation

Based on the characteristics of the Zeckendorf representation structure, a pseudo-random key stream generation algorithm is designed.

Suppose there is a pair of keys (e₁, e₂), which are calculated as follows:

$$\left\{\begin{array}{l}{e}_1=\left({a}_n,{c}_n,{\mu}_n\right)\\ {}{e}_2=\left({a}_m,{b}_m,{\mu}_m\right)\end{array}\right.$$

(12)

where μ_n, μ_m are two prime numbers of the same order of magnitude. There is a pseudo-random sequence of integers

$$\left\{\begin{array}{l}\textbf{N} =\left({N}_1,{N}_2,\cdots, {N}_H\right)\\ {}\textbf{M} =\left({M}_1,{M}_2,\cdots, {M}_H\right)\end{array}\right.$$

(13)

Its initial value satisfies:

$${\displaystyle \begin{array}{l}{N}_0>0,{a}_n,{c}_n<{\mu}_n\\ {}\begin{array}{l}{M}_0>0,{a}_m,{c}_m<{\mu}_m\\ {}{N}_0\ne M{}_0\end{array}\end{array}}$$

(14)

Each of M and Ν is

$$\left\{\begin{array}{l}{N}_{h+1}=\left({a}_n{N}_h+{c}_n\right)\operatorname{mod}{\mu}_n\\ {}{M}_{h+1}=\left({a}_m{M}_h+{c}_m\right)\operatorname{mod}{\mu}_m\end{array}\right.,0\le h\le H-1$$

(15)

μ_n, μ_m are prime numbers that satisfy N_h + 1 ≠ M_h + 1 。 H is determined by the message length. Then

$$\left\{\begin{array}{l}{Z}_L\left({N}_i\right)=\left({n}_1,{n}_2,\cdots, {n}_L\right)\\ {}{Z}_L\left({M}_i\right)=\left({m}_1,{m}_2,\cdots, {m}_L\right)\end{array}\right.$$

(16)

Next, perform OR operation on the middle part (the part with constant probability) to get C_i of length t:

$${C}_i=\left\{{c}_1,{c}_2,\cdots, {c}_t\right\}$$

(17)

where,

$$t=L-2{U}_L+2$$

(18)

and

$${U}_L=\frac{5\left(L+2\right)-8-{\left[5{\left(L+2\right)}^2+4\right]}^{1/2}}{10}+1$$

(19)

$${c}_j={n}_k+{m}_k,j=1,2,\cdots, t,k={U}_L+j-1$$

(20)

Concatenate C_i, we obtain C of length Ht.

$$C=\left\{{C}_1,{C}_2,\cdots, {C}_H\right\}$$

(21)

C is the desired pseudo-random key stream. We call this pseudo-random key stream generation algorithm “ZPKG” (key generator based on Zeckendorf presentation).

Security analysis

Randomness analysis

Let F_r be the maximum Fibonacci number no larger than N_i, then the shortest Zeckendorf sequence is S_L(N_i) with length L = n − 1. We can prove

$$L=\left[{\log}_{\Phi}\sqrt{5}\left({N}_i+\frac{1}{2}\right)\right]-1$$

(22)

where $\Phi =\frac{1+\sqrt{5}}{2}$ denotes the Golden ratio.

$$p(1)={\Pr}_{L,{U}_L}\left({U}_L\right)={N}_{L,{U}_L}\left({U}_L\right)/{N}_{L,{U}_L}$$

(23)

When L goes to infinity, ${\Pr}_{L,{U}_L}\left({U}_L\right)$ converges. In this case, L > 25. Then, the k-th number of Z_L(N_i) and Z_L(M_i) is

$$\Pr (0)={p}^2(0)\approx {\Phi}^2/5\approx 0.524$$

(24)

Among them,

$$p(0)=1-p(1)\approx \left(\Phi +1\right)/\left(\Phi +2\right)\approx 0.724$$

(25)

Similarly,

$$\Pr (1)=1-\Pr (0)=1-{p}^2(0)\approx 0.476$$

(26)

From (24) and (26), it can be known when L > 25 Golomb’s first randomness hypothesis is satisfied. Golomb’s second randomness hypothesis is difficult to prove directly. We can verify whether Pr(00), Pr(01), Pr(10) and Pr(11) are satisfied. As per the definition of Zeckendorf representation, there is no 11 pair. Therefore, we compute

$$p(01)=p(10)=p(1)\approx 1/\left(\Phi +2\right)$$

(27)

and

$$p(00)=1-p(01)-p(10)=1-2p(1)\approx \Phi /\left(\Phi +2\right)$$

(28)

Perform logical OR operation to get all pairs.

$${\displaystyle \begin{array}{l}(00)=(00)+(00);\\ {}(01)=(01)+(00),(00)+(01)\kern0.5em \textrm{or}\kern0.5em (01)+(01);\\ {}(10)=(10)+(00),(00)+(10)\kern0.5em \textrm{or}\kern0.5em (10)+(10);\\ {}(11)=(10)+(01)\kern0.5em \textrm{or}\kern0.5em (01)+(10)\circ \end{array}}$$

From (27) and (28), we have

$${\displaystyle \begin{array}{l}\Pr (00)={p}^2(00)=1/5=0.2\\ {}\Pr (01)=\Pr (10)=2p(00)p(01)+{p}^2(01)\approx \Phi /5\approx 0.324\\ {}\Pr (11)=2{p}^2(01)\approx 2/\left(5{\Phi}^2\right)\approx 0.152\end{array}}$$

(29)

Golomb’s second randomness hypothesis is thus satisfied.

Infinity analysis

When H → + ∞, N and M goes to infinity.

$$\left\{\begin{array}{l}\textbf{N} =\left({N}_1,{N}_2,\cdots, {N}_H\right)\\ {}\textbf{M} =\left({M}_1,{M}_2,\cdots, {M}_H\right)\end{array}\right.,H\to +\infty$$

(30)

In this way, the pseudo-random key stream that is obtained by OR operation with the middle part of the Zeckendorf representation of the integer pair tends to be infinitely long, that is

$$C=\left\{{C}_1,{C}_2,\cdots, {C}_H\right\}\to \infty, \textrm{if}\ H\to +\infty$$

(31)

That is, the sequence length of C is infinite.

Stability analysis

According to the definition of each item of N and M

$$\left\{\begin{array}{l}{N}_{h+1}=\left({a}_n{N}_h+{c}_n\right)\operatorname{mod}{\mu}_n\\ {}{M}_{h+1}=\left({a}_m{M}_h+{c}_m\right)\operatorname{mod}{\mu}_m\end{array}\right.,0\le h\le H-1$$

(32)

Where e₁ = (a_n, c_n, μ_n), e₂ = (a_m, b_m, μ_m) are the initial parameters. We make a small change in the initial parameter, for example, make c_n be c_n = c_n + 1, then

$${N}_{h+1}=\left({a}_n{N}_h+{c}_n+1\right)\operatorname{mod}{\mu}_n\approx {N}_{h+1}+1$$

(33)

If encode it as Zeckendorf representation again, then

$${\displaystyle \begin{array}{c}Z\left({N}_{h+1}\right)=1,0,1,\cdots, {0}_{F_1},0\\ {}Z\left({N}_{h+1}\right)=1,0,1,\cdots, {1}_{F_1},0\end{array}}$$

(34)

The new sequence only changes from 0 to 1 at F₁, while the position of the sequence segment that generates the pseudo-random key stream is: [m, L − m + 1] . That is to say, the generated pseudo-random key stream will not change.

Experiments and discussion

Statistical properties test

Using the FIPS140–2 [29] standard to test the random performance of the output keystream sequence. The four main tests are as follows:

(1)
Monobit test: The number M of “1” in the output 20,000bit stream sequence satisfies 9725 < M < 10275.
(2)
Poker test: Divide 20,000-bit stream sequences into 5000 continuous 4-bit blocks, and the following conditions are required f(i) is the number of decimal values i of 4-bit blocks, where 0 < i < 15, the test value $X=\frac{16}{5000}{\sum}_{i=0}^{15}{f}^2(i)-5000$, passes the standard 2.16 < X < 46.17.
(3)
Run test: The number of continuous “1” or “0” in the 20,000-bit stream sequence is required to meet the standard in Table 1.
(4)
Long run test: It is required that the length of continuous “1” in the 20,000-bit stream sequence cannot exceed 26.

Table 1 Run test pass standard

Full size table

To test the randomness of the keystream sequence generated by the algorithm, each of the four main tests was performed k = 200 times, each time using a different initial key. The results are shown in Figs. 3 and 4.

Scrambling effect

(1)
First check the relevance of the key. The key autocorrelation function obtained by randomly generating a 2048-bit key stream is shown in Fig. 5a. The autocorrelation function of the ciphertext obtained by encrypting the same plaintext “vehicle-related data...traffic-related data” using these keys is shown in Fig. 5b.

It can be seen that the correlation function of the key and the ciphertext has no repetition period, has characteristics similar to white noise, and has good autocorrelation.

(2)
Information entropy is one of the measurement indicators of uncertainty. The ciphertext (2048bit) information entropy is shown in Table 2.

According to information theory, when the probability of occurrence of each symbol of the sequence (in binary terms, 0 or 1) is equal, the information entropy reaches the maximum value. Here, the maximum information entropy of the 2048bit sequence is log2048 = 11. It can be seen that the information entropy of the three algorithms is close to 11, approaching the theoretical upper limit. The information entropy of the ZPKG algorithm is greater than that of the LSFR and LC, indicating that the ciphertext generated by it has a higher degree of disorder and a better scrambling effect.

Table 2 Information entropy between different algorithms

Full size table

Sensitivity test

(1)
Check whether it conforms to the avalanche principle and the mutual correlation characteristics. Randomly change the key by 1 bit, and the obtained 2 KB ciphertext changes by 1.023 KB and the change amount is 49 95%, which shows that the encryption method has a strong avalanche effect. The cross-correlation functions of the obtained two sets of ciphertexts are shown in Fig. 6. A value close to zero indicates that a small change in the key will cause a complete change in the ciphertext.

(2)
The generalized avalanche effect is not only the change of the ciphertext caused by the change of the key but also the change of the ciphertext caused by the change of the bits of the plaintext. Extend the experiment in Fig. 6, flip 1 bit of the plaintext, exchange 2 bits of any plaintext, change 3 bits of the key, change 10 bits of the key, and see the bit-flipping ratio of the ciphertext. All experiments were repeated 10 times, and the average value of the results was taken, as shown in Table 3. It can be seen that the ciphertext bit flip ratio is close to 0.5, which meets the strict avalanche criterion (Strict Avalanche Criterion, SAC).

Table 3 Sensitivity analysis of ZPKG

Full size table

Algorithm performance

The simulation environment of this experiment is: Windows Server 2008 R2 Enterprise operating system, the processor is Intel(R) Xeon(R) CPU E7-4860 v4 @ 2.27GHz, RAM 4.0 GB.

In the experimental simulation process, the simulation generates a key stream from 0 to 1 million bits and observes the time consumption of LSFR, LC, and this scheme (ZPKG). The results are shown in Fig. 7. As the length of the required key stream increases, and the execution time of the algorithm also increases. The time-consuming increase of the LC algorithm with the key stream length is the largest, followed by ZPKG, and LFSR is the smallest. In terms of basic time consuming, the basic time consuming of the ZPKG algorithm is the most, followed by LC, and LFSR is the least. This reflects that the initial preparation time of the ZPKG algorithm is the longest, but as the length of the keystream increases, the time consumption of the ZPKG algorithm increases linearly, and the performance is slightly lower than that of LFSR, but slightly higher than that of LC.

Conclusion

This paper presents a novel stream ciphering algorithm that is suited for cloud-edge communication in IoV. The unpredictability of the keystream is the foundation of stream ciphering. The probability of digit 1 occurring in the center section of the Zeckendorf representation is proven to be constant, which may be utilized to construct a pseudo-random keystream sequence. The pseudo-random sequence created by the linear feedback shift register (LSFR) is periodic and always repeated. The starting value of the sequence generated by chaotic encryption is too sensitive to disturbance, making it easy to be cracked. These two issues are addressed by the proposed algorithm: ZPKG. The keystream generated by ZPKG is not only very random but also endlessly lengthy, with a moderate parameters sensitivity. The experimental results support the above advantages.

Future research directions include:

(1)
The ASIC design of this algorithm. In terms of hardware resource utilization and operating speed, this technique is nearly identical to the linear feedback shift register (LFSR). LFSR has the disadvantage of being periodic, therefore it can only be used for the encryption of particular length sequences. The sequences created by ZPKG, on the other hand, are not only random, but also aperiodic, and can yield limitless keystreams. Therefore, ZPKG might be investigated as a replacement for LFSR as a PRBS (pseudo-random binary sequence) generator, which is often used in telecommunication to produce white noise.
(2)
Quickly generate the Fibonacci numbers. Fibonacci numbers are required for the Zeckendorf encoding process. The computational complexity (o(logn)) of generating Fibonacci numbers is still high, resulting in slow encoding.

Availability of data and materials

The plaintext data used for encryption testing is randomly generated. Relevant experimental results can be requested by email to the corresponding author.

Abbreviations

IoV:: Internet of Vehicles
RSU:: Road Side Unit
OBU:: Short for On Board Unit
DSRC:: Dedicated Short-Range Communication
ZPKG:: Pseudo-random Keystream Generator with Zeckendorf representation
LFSR:: Linear feedback shift register
LC:: Logistic Chaos

References

Zhou H, Xie H, Zhang H, Zhang H (2021) Parallel remote sensing image encryption algorithm based on chaotic system and DNA coding. J Image Graph 26(05):1081–1094
Google Scholar
Lekkerkerker CG (1952) Voorstelling van natuurlijke getallen door een som van getallen van Fibonacci. Simon Stevin 29:190–195
MathSciNet Google Scholar
Zeckendorf E (1972) Représentation des nombres naturels par une somme de nombres de Fibonacci ou de nombres de Lucas. Bull Soc R Sci Liège 41:179–182
MathSciNet MATH Google Scholar
Kimberling C (1998) Edouard Zeckendorf. Fibonacci Quart 36:416–418
MathSciNet MATH Google Scholar
Rosen KH (2010) Elementary number theory and its applications. China Machine Press, Beijing
Brown J Jr (1964) Zeckendorf’s theorem and some applications. Fibonacci Quart 2:162–168
MATH Google Scholar
Lekkerkerker CR (1951) Voorstelling van natuurlijke getallen door een som van getallen van Fibonacci. In: Stichting Mathematisch Centrum. Zuivere Wiskunde, p 30
Google Scholar
Yann Bugeaud; On the Zeckendorf representation of smooth numbers, ARXIV-MATH.NT, 2019
MATH Google Scholar
Idziaszek T (2021) Efficient algorithm for multiplication of numbers in Zeckendorf representation
Google Scholar
Gessica Alecci; Nadir Murru; Carlo Sanna; Zeckendorf representation of multiplicative inverses modulo a Fibonacci number, ARXIV-MATH.NT, 2022
Book Google Scholar
Vukusic I, Ziegler V (2023) On sums of two Fibonacci numbers that are powers of numbers with limited hamming weight. ARXIV-MATH.NT
Google Scholar
Wu L, Cai H, Liu J, Li Z (2021) Enhancing the anti-cryptanalysis ability and avalanche effect with Zeckendorf representation via FPGA implementation. In: 2021 4th international conference on advanced electronic ...
Google Scholar
Wu X, Wang D, Kurths J et al (2016) A novel lossless color image encryption scheme using 2D DWT and 6D hyperchaotic system. Inf Sci 349:137–153
Article Google Scholar
Kim H, Han J, Cho S (2007) An efficient implementation of RC4 cipher for encrypting multimedia files on mobile devices. In: Proceedings of the ACM symposium on applied computing. ACM, Seoul, pp 1171–1175
Berger TP, Minier M, Pousse B (2009) Software oriented stream ciphers based upon FCSRs in diversified mode, international conference on cryptology in India. Springer, New Delhi, pp 119–135
MATH Google Scholar
Hosseini S (2018) Fingerprint vulnerability: a survey. In: 2018 4th international conference on web research (ICWR)
Google Scholar
Zhou L, Tan F (2019) A chaotic secure communication scheme based on synchronization of double-layered and multiple complex networks. Nonlinear Dyn 96:869–883
Article MATH Google Scholar
Djamaluddin B, Ferianto T, Akbar H (2020) End to end data security challenges in real-time drilling data environment - from data transfer to analytics
Book Google Scholar
Chen W, Chen G, Zhao Y, Zhang J (2021) Security vulnerability and encryption technology of computer information technology data under big data environment. J Phys Conf Ser 1800(1):012012
Article Google Scholar
Lin Z (2013) The transformation from the Galois NLFSR to the Fibonacci configuration. In: International conference on emerging intelligent data and web technologies. IEEE, Xi’an, pp 335–339
Google Scholar
Mansouri SS, Dubrova E (2013) An improved hardware implementation of the quark hash function. International workshop on radio frequency identification: security and privacy issues. Springer, Graz, pp 113–127
Chapter Google Scholar
RC4 Encryption Algorithm. Available: https://en.wikipedia.org/wiki/RC4. Accessed 3 Jan 2023
Mantin SA (2001) Weaknesses in the key scheduling algorithm of RC4, Lecture Notes in Computer Science. In: Revised papers from the 8th annual international workshop on selected areas in cryptography, vol 2259, pp 1–24
Google Scholar
Wu C, Kuo CJ (2005) Design of integrated multimedia compression and encryption systems. IEEE Trans Multimed 7(5):828–839
Article Google Scholar
Sreelaja NK, Pai GAV (2012) Stream cipher for binary image encryption using ant colony optimization based key generation. Appl Soft Comput 12:2879–2895
Article Google Scholar
Golomb SW (1967) Shift register sequences: USA[P]. Holden-Day Inc., San Francisco
Google Scholar
Filipponi P, Wolfowicz W (1987) A statistical property of nonadjacent ones binary sequences. Note Recensioni Notizie 36(3):103–106
Google Scholar
Mansour T (2002) Combinatorial identities and inverse binomial coefficients. Adv Appl Math 28:196–202
Article MathSciNet MATH Google Scholar
National Institute of Standards and Technology (2001) FIPS 140–2-2001 security requirements for cryptographic modules. American National Standards Institute, Washington DC
Google Scholar

Download references

Acknowledgments

The authors acknowledge Tianqi Cai for language polishing.

Funding

This work was supported in part by the National Natural Science Foundation of China under Grant 61832014.

Author information

Authors and Affiliations

School of Computer Science, Wuhan University, Wuhan, 430072, China
Yun Wu & Hengjin Cai
School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China
Liangshun Wu

Authors

Yun Wu
View author publications
You can also search for this author in PubMed Google Scholar
Liangshun Wu
View author publications
You can also search for this author in PubMed Google Scholar
Hengjin Cai
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

Yun Wu: data curation, investigation, visualization, and writing. Liangshun Wu: software, validation, methodology, and formal analysis. Hengjin Cai: conceptualization, ideas, funding acquisition, project administration, resources allocation, and supervision. All authors reviewed the manuscript. The author(s) read and approved the final manuscript.

Corresponding author

Correspondence to Hengjin Cai.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Wu, Y., Wu, L. & Cai, H. Cloud-edge data encryption in the internet of vehicles using Zeckendorf representation. J Cloud Comp 12, 39 (2023). https://doi.org/10.1186/s13677-023-00417-7

Download citation

Received: 28 December 2022
Accepted: 01 March 2023
Published: 17 March 2023
DOI: https://doi.org/10.1186/s13677-023-00417-7

Cloud-edge data encryption in the internet of vehicles using Zeckendorf representation

Abstract

Graphical Abstract

Introduction

Related work

The proposed algorithm

Requirements for stream ciphers of big data

Algorithm principle

Algorithm implementation

Security analysis

Randomness analysis

Infinity analysis

Stability analysis

Experiments and discussion

Statistical properties test

Scrambling effect

Sensitivity test

Algorithm performance

Conclusion

Availability of data and materials

Abbreviations

References

Acknowledgments

Funding

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Competing interests

Additional information

Publisher’s Note

Rights and permissions

About this article

Cite this article

Share this article

Keywords