An efficient and secure data sharing scheme for mobile devices in cloud computing

With the development of big data and cloud computing, more and more enterprises prefer to store their data in the cloud and share it among authorized employees efficiently and securely. Many data sharing schemes have been proposed in different fields. However, sharing sensitive data in the cloud still faces challenges such as preserving data privacy and keeping operations lightweight on resource-constrained mobile terminals. Furthermore, most data sharing schemes have no integrity verification mechanism, so users may compute on corrupted data and obtain wrong results. To solve these problems, we propose an efficient and secure data sharing scheme for mobile devices in cloud computing. Firstly, the scheme guarantees security and authorized access of shared sensitive data. Secondly, the scheme realizes efficient integrity verification before users share the data to avoid incorrect computation. Finally, the scheme achieves lightweight operations of mobile terminals on both data owner and data requester sides.


Introduction
With the rapid development of information technology and the Internet of Things (IoT), enterprises generate more and more big data, which needs to be stored and processed efficiently and securely. Cloud computing is a mature storage platform with many advantages, including low cost and scalability [1][2][3]. Therefore, many enterprises and individuals tend to outsource their data to the cloud for storage and for sharing with authorized data requesters. For example, in a cloud-based health information system, patients upload their health information to the cloud to share it with medical experts who diagnose diseases. Similarly, the manager of an enterprise not only wants to store big data in the cloud, but also wants to share the data among authorized employees wherever needed. Outsourcing data for sharing in the cloud not only saves local storage space, but also greatly reduces enterprise costs for software purchase and hardware maintenance [4][5][6]. Although people take advantage of this new technology and service, their concerns about data security arise as well. Security is the most critical issue in the cloud because of the valuable information that data owners share. Cloud providers should address privacy and security issues as a matter of high and urgent priority [7][8][9]. One of the prominent security concerns in data sharing is data privacy. In addition, user terminals are usually resource-constrained mobile devices with small storage space and low processing speed. Therefore, it is essential to propose an efficient and secure data sharing scheme for mobile devices in cloud computing.

Main contributions
We propose a lightweight and secure sensitive data sharing scheme for mobile devices in cloud computing. The main contributions of the paper are as follows.

1) We design an efficient integrity mechanism based on algebraic signature for sensitive data before data sharing.
2) We guarantee privacy preservation of sensitive data in the sharing process and access authorization control for data requesters.
3) We achieve lightweight computation operations on the data owner and data requester sides with mobile devices.

Organization
The rest of the paper is organized as follows. Section II introduces the related works in secure data sharing. Section III describes the architecture and security requirements. Section IV presents the definitions and preliminaries. In Section V, we describe the implementation of the efficient data sharing scheme. We analyze security in Section VI and evaluate the scheme performance in Section VII. Finally, we conclude this paper in Section VIII.

The related works
As more and more sensitive information is shared among enterprise employees, preserving data integrity and ensuring that only authorized users can access the data have become the core security problems. Therefore, research on secure data sharing in the cloud mainly focuses on access control and data integrity [10][11][12][13][14]. At present, data sharing schemes mainly employ access control mechanisms to achieve authorized access. Akl and Taylor [15] proposed the use of cryptography to realize access control in hierarchical structures. As the data owner and the data center are not in the same trusted domain in a cloud storage system, access control schemes employing Attribute-Based Encryption (ABE) [16] were put forward. ABE commonly comes in two forms: Key-Policy ABE (KP-ABE) and Ciphertext-Policy ABE (CP-ABE). KP-ABE uses attributes to describe the encrypted data and builds policies into the user's key [17]. CP-ABE uses attributes to describe the user's credentials, and the user encrypting the data determines a policy on who can decrypt it [18]. Lewko [19] proposed the first unbounded KP-ABE scheme. Waters [20] first put forward a fully expressive CP-ABE scheme in the standard model. Cheung and Newport [21] proposed another CP-ABE scheme and proved its security in the standard model. To ensure data security in smart health, Zhang and Zhen [22] realized fine-grained access control with ciphertext-policy attribute-based encryption (CP-ABE). To achieve privacy and authorized access, many other schemes [23][24][25][26][27][28][29][30][31][32][33] have been proposed in diverse fields and applications.
To verify the integrity of data in the cloud, many integrity verification schemes [26][34][35][36][37][38][39][40][41][42] have been proposed in recent years. Ateniese [34] proposed the first public auditing scheme, which allows any public verifier to check the data integrity. Later, to prove the integrity of dynamic data, Ateniese [26] proposed another scheme based on the symmetric-key provable data possession (PDP) scheme. To support dynamic data operations, Erway et al. [35] proposed a dynamic provable data possession (DPDP) scheme by introducing an authenticated skip list. Later, many auditing schemes were proposed that use an authenticated data structure to support dynamic data updates. Zhu et al. [36] introduced an index-hash table (IHT) for dynamic verification. Yang et al. [37] proposed another authenticated data structure called index table (ITable) to store the abstract information of blocks. Tian [38] proposed a data structure named Dynamic-Hash-Table (DHT). Wang et al. [39] and Liu et al. [40] respectively proposed dynamic public auditing schemes based on the Merkle Hash Tree (MHT). The two schemes can achieve both public verification and dynamic data operations. However, block signatures are generated by users, which incurs heavy computation and communication overhead on the user side. In 2018, Gan [41] proposed an auditing scheme based on algebraic signatures that achieves lower computation and communication costs. The properties of the algebraic signature allow the cloud to return a sum of the selected blocks in the proof instead of the original data file, which saves bandwidth between the cloud and the verifier and makes the algebraic signature well suited to cloud computing [42]. More and more schemes [23][43][44][45][46][47][48][49][50][51][52] on cloud computing security continue to be put forward.

System model
The system model consists of four entities named Key Generation Center (KGC), Data Owner (DO), Cloud Servers (CS) and Data Requester (DR), as depicted in Fig. 1.

Key generation center (KGC)
It is responsible for generating public parameters and master key for the system and issuing private key for other entities.

Data owner (DO)
It is responsible for generating and encrypting the shared data, defining access structures, and dividing encrypted data into blocks.

Cloud servers (CS)
Cloud servers are divided into Cloud Storage Servers (CSS) and Cloud Manage Servers (CMS) according to their roles. CSS is responsible for storing shared data and block tags and for supplying the data integrity proof. To save computation and communication costs on the mobile terminals of DO and DR, CMS is employed to carry out complex computations, including generating algebraic signatures of blocks, verifying the integrity of shared data and computing the intermediate data for encryption and decryption.

Data requester (DR)
It is responsible for downloading and decrypting the shared data for utilization. In the scheme, only an authorized DR is able to download shared data from CSS and decrypt the data.
In our secure data sharing scheme for mobile devices, DO has large sensitive data to share with legitimate DRs. Before sharing, DO encrypts the data with his private key and outsources the data to CSS. If a DR wants to access the data, he must register his identity with KGC and obtain his private key for decryption. To achieve authorized access, only legitimate DRs with correct attributes can download and utilize the shared data. To keep cloud data intact and decrease the computation burden of requesters, CMS helps DR to verify the integrity of data before sharing. Only when the data is undamaged does DR download and decrypt the shared data with his private key.

Security requirement
In the scheme, we suppose CSS and CMS are both semi-trusted. CSS is responsible for storing data and block tags for data sharing. However, once data is corrupted or lost, it might launch a forgery attack or a replace attack for economic reasons. Similarly, CMS is curious about the content of sensitive data, so the data should be kept secret from CMS. We assume KGC is a fully trusted authority that honestly generates private keys for the system and the other entities. Therefore, the following security requirements should be satisfied.

Data confidentiality
The shared data must remain confidential from CSS, CMS and any unauthorized DRs for privacy and security. Any disclosure of shared data harms the interests of the enterprise. Consequently, it is important to ensure the confidentiality of shared data.

Data integrity
The data should remain intact before it is shared with DR. That is, the data must not be damaged or modified in an unauthorized manner during the storage and sharing process.

Authorized access
To achieve authorization, only DR with correct attributes can access shared data stored in CSS.

User revocation
The membership of a DR must be revoked to stop his access to shared data when he leaves the organization. User revocation is therefore required for the security of the data sharing scheme.

Design goals
The data sharing scheme for mobile devices is designed to achieve data privacy preservation, data security and lightweight operations.

Privacy preservation
The scheme should satisfy data privacy during data sharing process. As sensitive data is encrypted by data owner before outsourcing to cloud and only authorized data requesters can access the encrypted data, the shared data is private to CSS, CMS and any unauthorized DRs.

Data security
The scheme should achieve sensitive data security during the whole sharing process. The security requirement can be guaranteed by data confidentiality, data integrity, authorized access and user revocation in the scheme.

Lightweight operations
The scheme should reduce the computation performed by DO and DR for efficiency. In our scheme, CMS is responsible for dividing encrypted data into blocks and computing block tags. Furthermore, when DR wants to access shared data, CMS computes the intermediate decryption data to lessen DR's computation burden.

Definitions and preliminaries
Definitions
1) Discrete Logarithm (DL) Assumption. Suppose $g$ is a generator of a multiplicative cyclic group $G$ with prime order $q$. On input $y \in G$, there does not exist a probabilistic polynomial-time algorithm that outputs a value $x \in \mathbb{Z}_q^*$ such that $g^x = y$ with non-negligible probability.
2) Computational Diffie-Hellman (CDH) Assumption. Suppose $g$ is a generator of a multiplicative cyclic group $G$ with prime order $q$. On input $g^x, g^y \in G$, there does not exist a probabilistic polynomial-time algorithm that outputs $g^{xy} \in G$ with non-negligible probability.
3) Access Structure. Suppose $P = \{P_1, P_2, \ldots, P_n\}$ is a set of parties. A collection $W \subseteq 2^P$ is monotone if, for all $B, C$, $B \in W$ and $B \subseteq C$ imply $C \in W$. An access structure is a collection $W$ of non-empty subsets of $P$, i.e., $W \subseteq 2^P \setminus \{\emptyset\}$. The sets in $W$ are called authorized sets, and the sets not in $W$ are called unauthorized sets.
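The monotonicity condition in the access structure definition can be checked by brute force on a small party set. The following is a minimal sketch; the party names and the example policy are hypothetical and are not taken from the scheme.

```python
from itertools import combinations

def powerset(parties):
    """All subsets of `parties`, as frozensets."""
    items = list(parties)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_monotone(W, parties):
    """True if every superset (within `parties`) of an authorized set is authorized."""
    W = set(W)
    universe = powerset(parties)
    return all(C in W for B in W for C in universe if B <= C)

P = {"P1", "P2", "P3"}
# Hypothetical policy: P1 together with at least one other party.
W = {frozenset({"P1", "P2"}),
     frozenset({"P1", "P3"}),
     frozenset({"P1", "P2", "P3"})}
# Not monotone: the superset {P1, P2, P3} of {P1, P2} is missing.
not_monotone = {frozenset({"P1", "P2"})}
```

The check enumerates all $2^{|P|}$ subsets, so it is only meant for illustrating the definition on toy inputs.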

Preliminaries
1. Linear secret-sharing schemes (LSSS). An LSSS is described by a share-generating matrix $A$ with rows labeled by attributes. Assume $S \in A$ is an authorized set and $I$ is defined as $I = \{i \mid \rho(i) \in S\}$. Then there exist constants …
2. Algebraic signature. The algebraic signature of a file block composed of strings $s_0, s_1, \ldots, s_{n-1}$ is defined as $sig_g(s_0, s_1, \ldots, s_{n-1}) = \sum_{i=0}^{n-1} s_i \cdot g^i$. The algebraic signature has the following two properties: i) the algebraic signature of a combination of file $F_1$ and file … ii) the algebraic signature of a combination of $n$ blocks in file $F$ is equal to the combination of the algebraic signatures of the individual blocks $m_i \in G$, which is described as $\sum_{i=1}^{n} sig_g(m_i) = sig_g\left(\sum_{i=1}^{n} m_i\right)$.
3. XOR-homomorphic function. An XOR-homomorphic function $h$ is a pseudo-random function that can ensure data privacy. Its property is as follows: for any inputs $x, y$, $h(x \oplus y) = h(x) \oplus h(y)$.
4. Bilinear maps. Suppose $G_1, G_2$ are two multiplicative groups with the same large prime order $q$, and $g$ is a generator of $G_1$. A bilinear map is a function $e: G_1 \times G_1 \to G_2$ with the following properties: i) Computability. For all $u, v \in G_1$, an efficient algorithm exists to compute $e(u, v)$. ii) Bilinearity. For all $a, b \in \mathbb{Z}_q$, $e(u^a, v^b) = e(u, v)^{ab}$. iii) Non-degeneracy. $e(g, g) \neq 1$. iv) Security. It is hard to compute the discrete logarithm (DL) in $G_1$.
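The additive property of the algebraic signature follows from its linearity in the symbols $s_i$. The sketch below illustrates this under stated assumptions: arithmetic is done over the integers modulo a prime as a stand-in for the field used by the scheme, "combination" is read as symbol-wise addition, and the modulus `P_MOD` and base `G` are arbitrary illustrative choices.

```python
P_MOD = 2**61 - 1   # assumed prime modulus (stand-in for the scheme's field)
G = 3               # assumed base element g

def alg_sig(symbols, g=G, p=P_MOD):
    """sig_g(s_0, ..., s_{n-1}) = sum_i s_i * g^i  (mod p)."""
    return sum(s * pow(g, i, p) for i, s in enumerate(symbols)) % p

def combine(block_a, block_b, p=P_MOD):
    """Symbol-wise sum of two equal-length blocks (mod p)."""
    return [(a + b) % p for a, b in zip(block_a, block_b)]

m1 = [5, 17, 42, 8]
m2 = [9, 1, 0, 300]

# Sum of the signatures versus signature of the combined block.
lhs = (alg_sig(m1) + alg_sig(m2)) % P_MOD
rhs = alg_sig(combine(m1, m2))
```

Because the two sides agree, a verifier can check one aggregated value instead of one signature per block, which is the bandwidth saving mentioned in the related works.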

Notations
Table 1 describes the main notations in the scheme.

Scheme implementations
In this section, we present the efficient and secure data sharing scheme for mobile devices in cloud computing in detail. We divide the sharing scheme into four phases named initial phase, data processing phase, integrity verification phase and data sharing phase.

Initial phase
This phase consists of three algorithms named ParaSetup, KeyGen and IdReg. Algorithm ParaSetup generates the system parameters before data sharing. Algorithm KeyGen produces the private key with which DR decrypts the ciphertext of shared data. Algorithm IdReg registers DR's identity information in a table used to check the validity of DR. Figure 2 describes the data flow of the phase.
Given the system security parameter $\lambda$, KGC constructs the bilinear map group system $\Theta = (G_1, G_2, q, e)$, where $G_1, G_2$ are multiplicative groups with prime order $q$ and $e$ is a bilinear map $e: G_1 \times G_1 \to G_2$. Suppose $A$ is the attribute universe and $|A|$ is the number of attributes. KGC picks random $\alpha, \beta, s_1, s_2, \ldots, s_{|A|} \in \mathbb{Z}_q^*$ and computes $\theta = e(g, g)^{\alpha}$, $\sigma = g^{\beta}$. Then KGC defines the XOR-homomorphic function $h: G_1 \to G_1$ and the algebraic signature $sig_g(m_i) = m_i \cdot g^i$, where $m_i \in G_1$ and $g$ is a generator of $G_1$. Next, KGC selects three secure hash functions $H_1: \{0,1\}^* \to G_1$, $H_2: G_1 \to \{0,1\}^{len}$ and $H_3: \{0,1\}^* \to \mathbb{Z}_q^*$. KGC publishes the public parameters $PK = (e, g, H_1, H_2, H_3, h, sig_g, \theta, \sigma, f_i = g^{s_i}, 1 \le i \le |A|)$ and keeps the master key $MK = (\alpha, \beta)$ secret.
Before a DR with attribute set $S$ shares the data, he should get the private key to decrypt the ciphertext of the shared data. DR first sends the key generation request $request_{KeyGen}(S)$ to KGC. After receiving the request, KGC computes $Uid = H_3(S)$ as the DR identity and $Upk = g^{Uid}$ as DR's public key. Next, KGC selects $t \in \mathbb{Z}_q$ and $V, V', V_x \in G_1$, $x \in S$, and computes the private key $Usk$ for DR as follows: …

Data processing phase
DO defines the access policy $AS = (M, \rho)$, where $M$ is an $l \times n$ matrix and $\rho$ is the function that maps each row of the matrix to an associated attribute. That is, for $i \in [1, l]$, the value $\rho(i)$ is the attribute associated with row $i$. Suppose the shared data is $F \in \{0, 1\}^{len}$, where $len$ is the maximum length of $F$, and its identity is $Fid$. DO selects a key $SK \in G_1$, a value $s \in \mathbb{Z}_q$ and a random column vector $\vec{y} = (s, y_2, \ldots, y_n)^T \in \mathbb{Z}_q^n$. Next, he encrypts $F$ as $F' = H_2(SK) \oplus F$ and computes $C = SK \cdot \theta^{s}$, $C' = g^{s}$ and $\lambda_i = M_i \vec{y}$ for $1 \le i \le l$.
The next step is run by CMS. To decrease the computation burden of mobile devices on the DO side, CMS helps to compute the intermediate encryption data. It selects $r_i \in \mathbb{Z}_q$, $i \in [1, l]$, and computes … For later integrity verification, CMS stores the intermediate data $C_i$, $E_i$ locally.
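The role of the access matrix $M$ and the random vector $\vec{y} = (s, y_2, \ldots, y_n)^T$ can be illustrated with a toy share computation $\lambda_i = M_i \vec{y}$, the form listed in the performance analysis. The sketch below is illustrative only: the two-attribute AND policy, the attribute labels, the toy modulus `Q` and the reconstruction constants are assumptions, and the recombination step relies on the standard LSSS property that suitable constants $\omega_i$ give $\sum_i \omega_i \lambda_i = s$ for an authorized set.

```python
import secrets

Q = 2**61 - 1  # assumed prime order q (toy size, not a secure parameter)

def make_shares(M, s, q=Q):
    """Return (y, shares) with y = (s, y_2, ..., y_n) and shares[i] = M_i . y mod q."""
    n = len(M[0])
    y = [s % q] + [secrets.randbelow(q) for _ in range(n - 1)]
    shares = [sum(m * v for m, v in zip(row, y)) % q for row in M]
    return y, shares

def recombine(shares, omegas, q=Q):
    """Compute sum_i omega_i * lambda_i mod q."""
    return sum(w * lam for w, lam in zip(omegas, shares)) % q

# Hypothetical policy "attr_A AND attr_B": rows (1, 1) and (0, -1).
M = [[1, 1], [0, -1]]
rho = {0: "attr_A", 1: "attr_B"}   # row index -> attribute label

s = secrets.randbelow(Q)
_, lam = make_shares(M, s)

# Both rows are needed: omega = (1, 1) cancels the random y_2 and leaves s.
recovered = recombine(lam, [1, 1])
```

A requester holding only one of the two attributes sees a single share, which is masked by the random $y_2$ and reveals nothing about $s$ in this toy setting.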

Integrity verification phase
When DR wants to access shared data, he first sends an integrity request to CMS for verifying whether the … Then he sends the proof $P = (DP, TP)$ to CMS.
3) IntegrityVer(P) → (true, false). After getting the proof $P$, CMS computes $L = \sum_{i=1}^{c} sig_g(h(R_i) \oplus H_1(t_i \| v_i))$ and verifies whether the following equation holds.
If the equation holds, the data is intact and CMS outputs true. Otherwise, CMS outputs false.

Data sharing phase
If the shared data is intact, DR downloads and decrypts F ′ . This phase consists of the following four algorithms and Fig. 5 describes the data flow of the phase.

1) InvalidyCheck(S, Uid) → (true, false). It is run by CSS. Before downloading the shared data $F'$, DR sends the request $request_{download}(Fid, S, Uid)$ to CSS. Then CSS scans DRTable to verify whether DR is legitimate. If DR is in DRTable, CSS transfers $\{C, F'\}$ to DR. Otherwise, CSS refuses the request.
2) InterDec($C_i$, $E_i$, $K'$, $K_{\rho(i)}$) → (W). If the shared data is intact, CMS computes the intermediate value for DR to decrease the computation burden on the DR side. Let $I = \{i : \rho(i) \in S\}$, where $S$ is the attribute set of DR. Suppose $\omega_i \in \mathbb{Z}_q$, $i \in I$, are constants that satisfy the equation …
3) … DR downloads $F'$ and computes the following. … Then DR computes $SK = C/CK'$ and $F = H_2(SK) \oplus F'$, and recovers the plaintext of the shared data.

4) Revocation(Uid) → (DRTable). A DR can be revoked after leaving the organization. After the revocation of a DR, CSS deletes the DR's information from DRTable, and the DR cannot download the shared file later.
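The validity check against DRTable and the revocation step can be sketched as a small in-memory registry. This is only an illustration: the table layout (Uid mapped to an attribute set), the class and function names, and the sample identifiers are assumptions, since the scheme does not specify how DRTable is stored.

```python
class DRTable:
    """Toy registry of legitimate data requesters kept by CSS (assumed layout)."""

    def __init__(self):
        self._rows = {}  # Uid -> frozenset of attributes

    def register(self, uid, attributes):
        self._rows[uid] = frozenset(attributes)

    def is_legitimate(self, uid, attributes):
        """Validity check: the Uid is registered with exactly these attributes."""
        return self._rows.get(uid) == frozenset(attributes)

    def revoke(self, uid):
        """Revocation: delete the requester's row; later requests are refused."""
        self._rows.pop(uid, None)

def handle_download(table, store, fid, attributes, uid):
    """Return the stored pair (C, F') for a legitimate DR, else None."""
    if not table.is_legitimate(uid, attributes):
        return None
    return store.get(fid)

table = DRTable()
table.register(1234, {"doctor"})
store = {"file-1": ("C-placeholder", b"F-prime-placeholder")}

before = handle_download(table, store, "file-1", {"doctor"}, 1234)
table.revoke(1234)
after = handle_download(table, store, "file-1", {"doctor"}, 1234)
```

Deleting the row is enough to block future downloads in this sketch; it does not by itself invalidate key material the revoked requester already holds.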

Security analysis
In this section, we analyze the security of the scheme, including correctness, unforgeability and privacy.
Theorem 1. Authorized DR can correctly verify the integrity of the data stored in CSS.
Proof. Theorem 1 can be proved by verifying the correctness of eq. (4). The proof is as follows.
From the proof of eq. (4), DR can verify whether the data stored in CSS is undamaged.
Theorem 2. Authorized DR can correctly recover the shared data if he owns the legal attributes.
Proof. Theorem 2 can be proved by verifying the correctness of eq. (5). The proof is as follows.
Then DR computes $SK = C/CK'$ and $F = H_2(SK) \oplus F'$ to recover the plaintext of the shared data.
Theorem 3. It is computationally infeasible for CSS, CMS and unauthorized DRs to get the plaintext of the shared data in the scheme.
Proof. In the data processing phase, DO encrypts file $F$ to $F'$ with $F' = H_2(SK) \oplus F$, where $SK$ is known only to DO. Therefore, the file is confidential to both CSS and CMS. The confidentiality guarantee depends on the security of the hash function $H_2$. As $H_2$ is a secure one-way hash function, the data is private to CSS and CMS. In the data sharing phase, CSS sends $\{C, F'\}$ to DR, where $C = SK \cdot \theta^{s}$ and $F'$ is the ciphertext of the shared data. CMS computes the intermediate value for DR to decrypt the shared data $F'$ only if DR's attributes satisfy the access structure. Therefore, an unauthorized DR cannot get any information on the sensitive data.
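The masking step $F' = H_2(SK) \oplus F$ and its inverse $F = H_2(SK) \oplus F'$ can be sketched in a few lines. This is a minimal illustration under stated assumptions: $H_2$ is instantiated with SHAKE-256 so that the output length matches the data, and `sk` is a random byte string standing in for an encoding of the group element $SK$; neither choice is specified by the scheme.

```python
import hashlib
import secrets

def h2(sk_bytes: bytes, length: int) -> bytes:
    """Stand-in for H2: G1 -> {0,1}^len, here SHAKE-256 over an encoding of SK."""
    return hashlib.shake_256(sk_bytes).digest(length)

def xor_mask(data: bytes, sk_bytes: bytes) -> bytes:
    """Compute H2(SK) XOR data; the same call encrypts and decrypts."""
    mask = h2(sk_bytes, len(data))
    return bytes(a ^ b for a, b in zip(data, mask))

sk = secrets.token_bytes(32)       # placeholder for an encoded group element SK
F = b"sensitive shared record"

F_prime = xor_mask(F, sk)          # F' = H2(SK) xor F   (DO side)
F_again = xor_mask(F_prime, sk)    # F  = H2(SK) xor F'  (DR side)
```

Both directions cost one hash evaluation and one XOR pass over the data, which matches the Hash and Xor terms counted for the mobile terminals.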
Theorem 4. It is computationally infeasible for CSS to forge an integrity proof that passes the public verification if the XOR-homomorphic function is secure.
Proof. We prove the theorem with the following games. In the games, we suppose the adversary is the party who forges an integrity proof to pass the public verification.
Game 1 is the challenge game. The challenger generates the public-private key pair $(PK, MK)$ and provides $PK$ to the adversary. The adversary is able to interact with the challenger and query some data blocks. Then the challenger computes the corresponding block tags and returns the tags to the adversary. When the challenger launches a challenge, the adversary responds with a data proof and a tag proof.
Game 2 is another challenge game in which the challenger keeps all the tags ever issued as part of the queries. If the challenger detects that the aggregated block tag $TP$ is not equal to $\sum_{i=1}^{c} T_i$, he declares that the game fails.
Game 3 is the same as Game 2 with one difference: the challenger keeps all response sequences to the adversary's queries. Suppose the challenger sends $ch = (i, R_i)$ to the adversary, and the adversary's reply to the query is $P = (DP, TP)$, where $TP = \sum_{i=1}^{c} T_i$. In the scheme, $P$ is the correct proof and the equation $DP = TP \oplus L$ holds. Suppose the adversary's proof is $P' = (DP', TP')$; then the equation $DP' = TP' \oplus L$ also holds. We define $\Delta DP = DP' \oplus DP$ and $\Delta TP = TP' \oplus TP$. Applying the XOR operation to the two verification equations gives $\Delta DP = \Delta TP$. This equation holds with probability $1/q$, which is negligible.
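The XOR step in Game 3 can be written out explicitly. The derivation below assumes, as stated in the game, that the honest proof and the adversary's proof are both checked against the same value $L$, so that $L$ cancels.

```latex
\begin{align*}
DP &= TP \oplus L, \qquad DP' = TP' \oplus L \\
DP' \oplus DP &= (TP' \oplus L) \oplus (TP \oplus L) \\
              &= TP' \oplus TP \oplus (L \oplus L) \\
              &= TP' \oplus TP \\
\Delta DP &= \Delta TP
\end{align*}
```

A forged proof therefore has to hit a data difference $\Delta DP$ that exactly matches the tag difference $\Delta TP$, which is the event the proof bounds by $1/q$.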

Performance evaluation
In this section, we evaluate the computation costs of DO and DR in the scheme and compare them with those of the scheme in [22].

Performance analysis of mobile terminals
To analyze the computation overhead of the scheme, we define the following notations for the corresponding operations: Pair denotes a pairing operation, Hash a hash operation, Exp an exponentiation operation, and Mul a multiplication operation. Xor and Sig respectively denote the XOR and algebraic signature operations of the scheme.

1) Computation overhead of DO side
The computation overhead of DO is mainly in the encryption phase. DO first computes $F' = H_2(K) \oplus F$, $C = K \cdot \theta^{s}$, $C' = g^{s}$ and $\lambda_i = M_i \vec{y}$, $1 \le i \le l$. Therefore, the computation overhead on the DO side is (1 + l)Mul + 2Exp + Hash + Xor.

2) Computation overhead of DR side
The computation overhead of DR is mainly in the data sharing phase. DR downloads $F'$ and computes $CK' = e(C', K)/V = \theta^{s}$. Then DR gets $K = C/CK'$ and recovers the plaintext $F = H_2(K) \oplus F'$. Therefore, the computation overhead on the DR side is Pair + Xor + 2Div. Table 2 illustrates the computation overhead of mobile terminals on both sides.
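The two cost expressions can be tallied programmatically to see how they scale with the number of rows $l$ in the access matrix. The sketch below only encodes the operation counts stated above; the per-operation timings passed to `estimate_ms` are hypothetical placeholders supplied by the caller, not measurements from the experiments.

```python
from collections import Counter

def do_cost(l: int) -> Counter:
    """Operation counts on the DO side: (1 + l)Mul + 2Exp + Hash + Xor."""
    return Counter({"Mul": 1 + l, "Exp": 2, "Hash": 1, "Xor": 1})

def dr_cost() -> Counter:
    """Operation counts on the DR side: Pair + Xor + 2Div, independent of l."""
    return Counter({"Pair": 1, "Xor": 1, "Div": 2})

def estimate_ms(cost: Counter, unit_ms: dict) -> float:
    """Weighted sum of operation counts by caller-supplied per-operation timings."""
    return sum(n * unit_ms[op] for op, n in cost.items())

# Hypothetical unit timings in milliseconds, for illustration only.
example_units = {"Pair": 2.0, "Xor": 0.5, "Div": 1.0}
dr_estimate = estimate_ms(dr_cost(), example_units)
```

Only the Mul term on the DO side grows with $l$, while the DR side stays constant, which is consistent with the constant DR time reported for varying attribute universe sizes.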

Experimental results
We simulate our scheme with version 0.5.14 of the Pairing-Based Cryptography (PBC) library. We compare the computation time of DO and DR with that of scheme [22], using an MNT d159 curve with a 160-bit group order. All experimental results are averages over 20 trials.

1) Computation time of DO in processing phase
The computation time of DO arises mainly in algorithm DataEnc of the data processing phase. We first test the relation between DO's computation time and the size of the attribute universe A, as described in Fig. 6. When the size of the attribute universe A varies from 1 to 100, the computation time of DO increases slowly and is lower than that of Zhang's scheme [22]. Then we test the relation between DO's computation time and the size of the shared data, as described in Fig. 7. From Fig. 7, we can conclude that DO's computation time in our scheme is lower than that of Zhang's scheme when the shared data varies from 1 M to 10 M.

2) Computation time of DR in data sharing phase
In the data sharing phase, we first test the relationship between DR's computation time and the size of the attribute universe A, as described in Fig. 8. Because CMS helps DR to compute the intermediate decryption data, DR's computation time is constant as the size of the attribute universe A increases. We also test the relationship between DR's computation time and the size of the shared data, as described in Fig. 9. As the size of the shared data grows, the computation time of DR increases slowly. From Figs. 8 and 9, we can conclude that DR's computation time in our scheme is less than that in Zhang's scheme.

Conclusion
In this paper, we propose an efficient and secure data sharing scheme for mobile devices. The scheme guarantees security and authorized access of shared sensitive data. Furthermore, the scheme realizes efficient integrity verification before DR shares the data to avoid incorrect computation. Finally, the scheme achieves lightweight operations of mobile terminals on both the DO and DR sides.