Blockchain-cloud privacy-enhanced distributed industrial data trading based on verifiable credentials

Industrial data trading can considerably enhance the economic and social value of abundant data resources. However, traditional data trading models are plagued by critical flaws in fairness, security, privacy, and regulation. To tackle these issues, we first proposed a distributed industrial data trading architecture based on blockchain and cloud for multiple data owners. Subsequently, we implemented distributed identity management through a distributed verifiable credentials scheme that possesses the desirable properties of selective disclosure, multi-show unlinkability, threshold traceability, and public verifiability. Finally, we presented a fair trading mechanism without trusted third parties based on smart contracts, and we employed blockchain and multi-signature to ensure data integrity during data storage and trading. The security and performance analysis shows that our proposal is feasible for sensitive data trading among multiple data owners and provides a useful exploration for future industrial data trading and management.


Introduction
The Industrial Internet has expanded rapidly in recent years and promises to enhance productivity, and abundant industrial data is a paramount driving force fostering the integrated development of the digital economy and industry [1]. Data trading enables efficient utilization of data resources by endowing data with value and bringing benefits to data owners and users [2].
However, a large amount of valuable industrial data has not been collected, or is merely exploited in private storage environments, such as the data on train control, passenger flow, or resource allocation in intelligent transportation systems [3][4][5]. Data trading is still in its infancy, and there are no privacy-enhanced industrial data trading architectures and practical solutions. Traditional data trading, which relies mainly on private deals and third-party data markets, is constrained by critical flaws such as fairness and transparency, security of data and identities, and the difficulty of appeals and evidence collection [2,6,7]. To enhance the value of data, it is urgent to study trusted sharing and trading mechanisms for industrial datasets.
1) The secure management of participants is the premise of trading and of Industrial Internet systems, but traditional identity management schemes suffer from excessive information disclosure and inadequate regulation, and are vulnerable to internal or collusive attacks, which readily trigger concerns about privacy disclosure and data abuse. Credentials represent the qualifications of users, and the lack of efficient general credentials with enhanced privacy and regulation hampers blockchains that support identifier-management smart contracts [1,8]. Therefore, it is critical to design general credential management with selective disclosure and privacy-enhanced tracing for identity authentication and supervision during industrial data trading.
2) Authenticity and data confidentiality are the foundation of fair trading; however, Industrial Internet data stored in the cloud faces serious threats such as unauthorized intrusion, data leakage, and integrity damage. Therefore, it is non-trivial to secure key access so as to protect data from misuse by unauthorized users, and to guarantee integrity so that data remain accurate and reliable [2].
3) Fair and secure data trading schemes ensure that the trading participants can securely trade and obtain data or profits; however, tradings conducted privately or through third parties are prone to rogue trading. In addition, the data owner's control over the data and secondary reselling of the data must be taken into account [2,7].
4) The demand for multiple data owners and users is common in distributed industrial scenarios. The distributed industrial setting requires multiple participants to realize data sharing, data trading, and collaborative computation, so it is paramount and more practical to study data trading schemes for multiple data owners [6].
Cloud computing is deemed an adoptable alternative for data services [2,9,10] in industrial application-oriented scenarios through numerous cloud service models. The blockchain-cloud paradigm increases the confidence of both consumers and data providers by building a more trustworthy cloud ecosystem [11]. Blockchain is an ideal choice for enhancing both the functionality and the security/privacy of the cloud ecosystem in varied manners due to its outstanding decentralization, immutability, traceability, anonymity, and transparency [11,12]. The integration of blockchain technology and cloud computing has great potential for enhancing identity authentication and access control, data security and privacy, and transaction fairness and efficiency in decentralized data sharing and supply chain management applications [10,11,13,14]. In particular, the smart contract in blockchain-enabled applications is a promising driving factor for improving the efficiency and accuracy of data processing by automating payment once predefined conditions are satisfied [12].
Thus, we first designed a distributed data trading framework for multiple data owners based on blockchain and cloud, aimed at identity, data, and trading security. In this framework, the encrypted data is stored on the cloud service platform, and the smart contracts for credential management and data trading are deployed on the blockchain for identity and trading management.
Secondly, we explored privacy-enhanced user identity management with properties including fine granularity, unlinkability, selective disclosure, and threshold traceability based on verifiable credentials, which stem from the multi-message version of the PS multi-signature.
Finally, we designed a distributed data trading scheme for multiple data owners without a trusted third party. Honest data owners are paid for providing the correct decryption key, which ensures that the data user receives the correct industrial dataset; otherwise, the data user can initiate dispute resolution.
The rest of this paper is structured as follows. Related definitions are recalled and compared in the "Related works" section. The "System model" section describes the system architecture, security model, and design objectives. The "Preliminaries" section reviews the relevant background, and the "Proposal" section details the construction. The "Analysis" section analyzes security and performance. Finally, the "Conclusion" section concludes the article.

Related works
Researchers have explored decentralized data sharing and trading architectures based on blockchain and cloud servers. Fan et al. provided a one-to-many data sharing architecture in vehicular networks: the data owner outsources the encrypted data to the cloud server and uses the blockchain as a broadcast channel to publish access policies based on attribute-based encryption [9]. Zhang et al. provided a data security sharing model based on privacy protection for IIoT and used blockchain logging technology to trace and hold accountable illegal access [15]. Dai et al. and Zhang et al. designed data trading ecosystems based on a blockchain-cloud and Software Guard Extensions (SGX) architecture [13,14]. Li et al. realized fair trading without a trusted third party by trading the decryption of ciphertext on the blockchain and eliminating secondary sales on the chain [7]. Liu et al. proposed a decentralized transparent data trading scheme [10]. Koutsos et al. realized a privacy-enhanced decentralized data market based on blockchain, functional encryption, and zero-knowledge proof [16]. Liang et al. proposed a blockchain-based fair and fine-grained data trading scheme with privacy preservation using attribute-based anonymous credentials, an authenticated data structure, and zero-knowledge proof [12].
However, most current studies on data trading schemes focus only on a single data owner [6]. Sensitive industrial data are usually traded only after being approved by multiple departments, yet several challenges remain to be addressed before secure and efficient group data trading is widely applied. Previous works [17,18] have studied group data sharing but failed to take data trading settings into account. Koutsos et al. proposed a decentralized data market scheme for multiple data owners, which focuses only on data computing privacy and overlooks identity privacy [16]. Cao et al. proposed an iterative auction mechanism for a data market with multiple data owners but failed to take trading privacy into account [19].
Cloud servers tend to evade the obligation of actively notifying users after data corruptions, which are caused by network or operation exceptions, cyberattacks, and software and hardware failures [7,10,20]. Additionally, data owners are concerned about losing control over their data when using cloud services. Therefore, data integrity verification during cloud-based data storage and trading is an important security requirement [21][22][23][24]. The common techniques for data integrity verification are based on hash values [7,9,18] or a Third-Party Auditor (TPA) [22]. However, the methods used in the schemes [7,9] face bottlenecks in scalability and public verifiability. Public verifiability means that any entity in the network can check the integrity of data stored in the cloud, and TPA-based technology is capable of providing it [22]. However, a TPA may dishonestly perform the challenge-response protocol and even deceive users by colluding with the cloud. To address these issues, blockchain has been adopted to generate random challenge information against malicious TPAs [12,23,24]. As stated above, schemes offering public verifiability for the data of multiple owners are especially attractive in practical industrial environments. Fortunately, Wang et al. proposed an efficient publicly verifiable multi-owner data integrity verification solution based on multi-signatures [22], which contributes to our goal.
The precondition of secure data trading is privacy-enhanced identity management, especially identity privacy and traceability. The works in [10,14,17,18] support identity privacy, while other schemes do not. The schemes [10,17,18] also provide tracking of malicious users. However, most existing schemes employ ordinary identity authentication technologies that suffer from coarse granularity, privacy disclosure, difficulty of tracing, and a lack of generalization and flexibility in the distributed setting. Verifiable Credentials (VCs) are standardized digital credentials with cryptographic security, privacy protection, and machine readability, and are a promising evolution supporting decentralized identity authentication [25].
As shown in Fig. 1, the credential holder requests a credential from the issuer, who provides trust endorsement for the holder by issuing credentials on relevant attributes. Then, the holder either saves the credential in a local wallet or hosts it on the blockchain. When trying to access a certain service, the credential holder presents the credential to the verifier, who confirms whether the holder has met the requirements by verifying the claims of the credential.

Fig. 1 Verifiable credentials model
Verifiable credentials have captured increasing interest and growing support due to their notable benefits [25]. Li proposed a verifiable credential scheme with selective disclosure based on the Boneh-Lynn-Shacham aggregate signature [26]. Yoon et al. proposed a personal data trading system based on blockchain, which uses verifiable credentials to prove the ownership of data [27]. Fotiou et al. proposed a personal data trading system based on verifiable credentials [28]. Moreover, the methods of [10] essentially support verifiable credentials. Further works explore applications in various fields [25,[29][30][31]. Therefore, we aim to construct fair trading for multiple data owners based on a cloud storage and blockchain architecture, with privacy-enhanced identity management based on verifiable credentials.
To enhance privacy and regulation, verifiable credentials are expected to support multi-show unlinkability, blind issuance, fine-grained attribute privacy, minimized disclosure, and traceability when proving credential ownership. Multi-show unlinkability allows users to present credentials multiple times while preventing verifiers from tracking users by linking successive authentication sessions. Minimized disclosure aims to avoid leaking unnecessary information [25,29]. However, most previous proposals fail to support all of the above requirements. Essentially, the multi-show unlinkability of credentials stems from the re-randomizability of the signature, which allows a signature on a message to be randomized into another, unlinkable signature without knowledge of the secret key [32,33]. Minimized disclosure is realized by means of selective disclosure and zero-knowledge proof technology [12]. Selective disclosure allows the credential holder to present a subset of credential attributes to verifiers, who can still validate the whole credential. A zero-knowledge proof allows the prover to convince the verifier of the authenticity of a claim without revealing any further information about the claim beyond its authenticity.
Fortunately, the emerging Pointcheval-Sanders (PS) signature, which supports re-randomization, efficient proofs of knowledge, and multi-message signatures, is expected to propel the evolution of privacy-enhanced verifiable credentials [32,33]. Yu et al. proposed an anonymous authentication scheme called BASS supporting attribute privacy, selective revocation, credential soundness, and multi-show unlinkability, but the scheme is inefficient and inadequate in the corrupted distributed-authority setting [34]. García-Rodríguez et al. considered distributed credential issuance but overlooked corrupted authorities [31]. Sonnino et al. proposed a scheme called Coconut which supports threshold issuance, selective disclosure, and multiple unlinkable selective attribute revelations [8].
However, unconditional identity privacy is not compatible with supervision by regulators and may enable illegal profits and data abuse in data trading. In schemes with traceability, the issuing authority and the tracing authority are usually single or monolithic and thus assumed to be trusted. However, corrupted issuing authorities may forge credentials, and corrupted tracing authorities can remove the anonymity of credentials. Proposals with multiple issuing and tracing authorities are therefore desirable and practical for distributed industrial applications. It is also necessary to reveal the real identity of illegal users in a privacy-enhanced way. The scheme [35] separates the credential issuance and trace functions, and exposes the user's identity through threshold tracing to prevent the corruption of a single trace authority from harming the privacy of legitimate users. Camenisch et al. tailored such an authentication mechanism for V2V communication, and Liu et al. for the data market [10,35].
The features of the above schemes are presented in Table 1. Following previous excellent works, we ameliorated privacy-enhanced verifiable credentials to the needs of a distributed industrial data trading scheme, which supports distributed blind issuance, fine-grained attributes, threshold traceability, multi-show unlinkability, and selective disclosure in the setting of multiple data owners and regulators.

System model
In this section, we briefly define the system model, security model, and security goals of our scheme (Table 2).
BC: The credential management smart contract (CMSC), the data trading smart contract (DTSC), and the credential revocation registry are deployed on BC. The CMSC and DTSC are in charge of handling and recording the operations of entities in the trading process. In addition, BC plays the role of the public verifier and executes the data integrity verification protocols.
CS: The semi-trusted cloud server is responsible for storing data and providing data integrity proofs to DOs and DUs.
DOs: DOs expect to gain benefits from selling data and perform data trading through CS and DTSC. In addition, DOs are in charge of issuing credentials to data users.
DUs: DUs pay DOs after obtaining the correct data through CS and DTSC. DUs obtain multiple credentials from distributed DOs through CMSC and selectively disclose the aggregated credential to CS for authentication.
RGs: RGs consist of multiple independent regulatory departments and play the roles of the dispute arbitration commission and the tracing authorities. In addition, RGs are responsible for generating global parameters and maintaining the blockchain.

Security model
We assume that RGs, CS, DOs, and DUs are semi-trusted and may deviate from the protocol for benefits. Thus, we introduce the adversaries A_CS, A_DOs, A_DUs, and A_RGs in the model [7,9,23,35]. A_CS is curious to infer sensitive information about cloud users and data. Additionally, CS usually carries out the protocols honestly but may conceal data corruptions caused by network or operation exceptions, cyberattacks, or software and hardware failures. A_DOs may launch attacks such as issuing untraceable illegal credentials to malicious users, uploading forged or wrong ciphertexts to gain benefits, and denying or falsely claiming ownership of the dataset. A_DUs may launch attacks such as stealing data with illegal credentials, refusing payment by forging evidence (keys or signatures), and secondary reselling. A_RGs tries to corrupt the regulators to reveal users' identities.
The security assumptions are as follows: assume that there are secure authenticated broadcast channels between regulators and multiple data owners; assume that BC is a secure distributed infrastructure without vulnerabilities.

Security goals
Our goal is to create a distributed, privacy-enhanced, and supervisable industrial data trading scheme, and we propose the security goals from the perspectives of identity, data, and trading security [7,21,35].

Identity security
Identity security should meet the following attributes:
Anonymity: Anonymity ensures that the identity information of DUs is not disclosed as long as the issuer is honest and the presented credential is not opened.
Threshold traceability: A certain amount of RGs can jointly reveal the real identity of DUs.
Blind issuance: DOs do not obtain the user's secret identity information when issuing credentials.
Multi-show unlinkability: Multi-show allows DUs to present the same credentials more than once, whereas unlinkability means it is impossible to associate the presented credentials even if all authorities conspire.Multi-show unlinkability can prevent the verifier from tracking the user's activities through the authentication service of continuous sessions.
Selective disclosure: DUs show only the subset of attributes required to prove ownership of the credential, whereas the verifier sees only the disclosed attributes and can still verify the entire credential. In other words, the credential verifier obtains only the cryptographic verification result of the attributes instead of their plaintext or ciphertext.

Data security
Data security includes confidentiality, integrity, and data authenticity.

Trading security
Trading security aims to ensure trading fairness and non-repudiation and to resist secondary reselling attacks.
Trade fairness: Fair trading means DU must pay the corresponding fee once it obtains the correct dataset, and DOs gain corresponding benefits from selling their datasets. Meanwhile, DU is allowed to refuse payment and submit a complaint after finding a data problem [7].
Trade non-repudiation: On the one hand, DOs or DUs may deny the trade for some reason. On the other hand, the proposal should resist forged-evidence attacks, in which a participant fabricates keys or signatures to repudiate the trade.

Preliminaries

Multi-signature
Multi-signature allows multiple signers to sign an identical message. Wang et al. proposed a blockless verifiable multi-signature scheme and employed it in a public verification mechanism for the cloud environment. The multi-signature scheme consists of the following algorithms [22]:
MS.Setup(1^λ) → params: inputs a security parameter λ and outputs params ← (p, G, G_T, e, g, H), where G and G_T are groups of prime order p, e: G × G → G_T is a bilinear map, g is a generator of G, and H is a hash-to-group function.
MS.KGen(params) → (vsk_i, vpk_i): let I be the set of n signers; given the global parameters params, each signer I_i ∈ I selects vsk_i = v_i ∈_R Z_p^* and computes vpk_i = g^{v_i}.
MS.Sign(vsk_i, m, mid) → ψ_i: given block m ∈ Z_p^* and its identifier mid, I_i ∈ I computes the local signature ψ_i = [H(mid)g^m]^{v_i} and broadcasts ψ_i.
MS.SAgg({ψ_i}_{i=1}^n) → ψ: after receiving the n local signatures {ψ_i}_{i=1}^n on block m, the designated signer outputs the aggregated signature ψ = ∏_{i=1}^n ψ_i.
MS.Verify(m, mid, ψ, {vpk_i}_{i=1}^n) → (1, 0): given block m ∈ Z_p^* and its identifier mid, the signature ψ, and all the signers' public keys {vpk_i}_{i=1}^n, the verifier computes pk_s = ∏_{i=1}^n vpk_i and outputs 1 if e(ψ, g) = e(H(mid)·g^m, pk_s) and 0 otherwise. Correctness follows from e(ψ, g) = e(∏_{i=1}^n ψ_i, g) = e(∏_{i=1}^n (H(mid)g^m)^{v_i}, g) = e(H(mid)g^m, pk_s).
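To make the aggregation structure concrete, the following toy sketch (a hypothetical stand-in, not the pairing-based construction of [22]) works in a small prime-order subgroup of Z_p^* and checks the aggregated signature directly against the sum of the secret keys, which is precisely the equality that the pairing check e(ψ, g) = e(H(mid)·g^m, pk_s) establishes without revealing those keys. The group parameters and the hash-to-group map are illustrative assumptions.

```python
import hashlib

# Toy prime-order subgroup standing in for the pairing group G.
# p = 2q + 1 is a safe prime; g generates the order-q subgroup.
p, q, g = 2039, 1019, 4

def H(mid: str) -> int:
    """Hash-to-group stand-in: map an identifier to a subgroup element."""
    e = int.from_bytes(hashlib.sha256(mid.encode()).digest(), "big") % (q - 1)
    return pow(g, e + 1, p)  # exponent in [1, q-1] avoids the identity

def keygen(v: int) -> int:
    return pow(g, v, p)      # vpk_i = g^{v_i}

def sign(v_i: int, m: int, mid: str) -> int:
    base = (H(mid) * pow(g, m, p)) % p
    return pow(base, v_i, p)  # psi_i = [H(mid) * g^m]^{v_i}

def aggregate(sigs) -> int:
    out = 1
    for s in sigs:
        out = (out * s) % p   # psi = prod psi_i
    return out

# Three signers jointly sign block m with identifier mid.
secrets = [11, 23, 57]
m, mid = 42, "block-042"
psi = aggregate(sign(v, m, mid) for v in secrets)

# The equality a pairing check would establish without the secret keys:
# psi = [H(mid) * g^m]^{v_1 + ... + v_n}, matching pk_s = prod vpk_i.
expected = pow((H(mid) * pow(g, m, p)) % p, sum(secrets), p)
assert psi == expected
```

Note how the aggregated public key pk_s = ∏ vpk_i corresponds to the summed exponent: aggregation of local signatures and aggregation of public keys commute, which is what makes the blockless verification of [22] possible.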

Multi-message Pointcheval-Sanders multi-signature
Pointcheval-Sanders multi-signature is an appealing privacy-enhanced cryptographic primitive because it supports public key aggregation, signature re-randomization, and efficient zero-knowledge proofs of a signature. PS multi-signature consists of the following algorithms [32,35]: PSM.Setup(1^λ) → params: inputs a security parameter λ and outputs global parameters params ← (p, G, G̃, G_T, e, g, g̃, n, k), where g ∈_R G and g̃ ∈_R G̃ are generators of G and G̃ respectively, e: G × G̃ → G_T is a bilinear map, and n and k are the numbers of signers and messages respectively.

ElGamal encryption
ElGamal encryption is a public key encryption primitive and consists of the following algorithms [36]:
ElG.KGen(G) → (sk, pk): given G, set sk = x ∈_R Z_p^* and pk = g^x, and return the key pair (sk, pk).
ElG.Enc(pk, m) → c: given the public key pk and a message m ∈ Z_p^*, randomly select r ∈_R Z_p^* and return the ciphertext c = (c_1, c_2) = (g^r, pk^r · g^m).
ElG.Dec(sk, c) → g^m: given the private key sk and a ciphertext c, return c_2 / c_1^{sk} = g^m.
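For illustration, a minimal sketch of this "lifted" ElGamal variant (which decrypts to g^m rather than m) over a toy safe-prime group; the parameter sizes are illustrative assumptions only.

```python
import secrets as rnd

# Toy safe-prime group: p = 2q + 1, g generates the order-q subgroup.
p, q, g = 2039, 1019, 4

def keygen():
    sk = rnd.randbelow(q - 1) + 1          # sk = x in Z_q^*
    return sk, pow(g, sk, p)               # pk = g^x

def enc(pk, m):
    r = rnd.randbelow(q - 1) + 1
    return pow(g, r, p), (pow(pk, r, p) * pow(g, m, p)) % p  # (g^r, pk^r * g^m)

def dec(sk, c):
    c1, c2 = c
    return (c2 * pow(c1, -sk, p)) % p      # c2 / c1^sk = g^m

sk, pk = keygen()
assert dec(sk, enc(pk, 7)) == pow(g, 7, p)
```

The lifted form matters later in the VSS recovery, where shares are likewise recovered "in the exponent" as g^s.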

Plaintext checkable encryption
Plaintext checkable encryption (PCE) is a public key encryption primitive that provides a feasible way to check whether c ∈ C is a ciphertext of the plaintext message m ∈ M under the public key ppk ∈ G (where M and C denote the plaintext and ciphertext spaces respectively). PCE consists of the algorithms PCE.KGen, PCE.Enc, PCE.Dec, and PCE.Check [37]. In particular, PCE.Check(m, ppk, c) → (1, 0) first verifies the well-formedness of the ciphertext and returns 0 if e(g, c_4) ≠ e(c_2, c_3); it then checks whether e(c_1/m, c_3) = e(ppk, c_4), outputting 1 if c is an encryption of m and 0 otherwise.

Zero-knowledge proofs
A zero-knowledge proof (ZKP) allows the prover to convince the verifier of the authenticity of a claim without revealing any further information about the claim beyond its authenticity. In general, a non-interactive ZKP consists of the following algorithms:
NIZK.Setup(1^λ) → params: inputs a security parameter λ and outputs global parameters params.
NIZK.Proof{x : R = (s, w)} → π: given a statement s and witness w, the prover constructs a proof π showing, in zero knowledge, that the secret value x satisfies the relation R = (s, w).
NIZK.Verify(s, π) → (1, 0): the verifier checks the proof π against the statement s and outputs 1 if it is valid and 0 otherwise.
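As a concrete instance of this interface, the sketch below proves knowledge of a discrete logarithm x with y = g^x using the classic Schnorr protocol made non-interactive via the Fiat-Shamir heuristic; the toy group parameters are assumptions, not the proof system actually used in the paper.

```python
import hashlib
import secrets as rnd

# Toy safe-prime group as before.
p, q, g = 2039, 1019, 4

def challenge(y, t):
    """Fiat-Shamir: derive the challenge by hashing the transcript."""
    data = f"{g}|{y}|{t}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

def prove(x, y):
    """Prove knowledge of x such that y = g^x, without revealing x."""
    k = rnd.randbelow(q - 1) + 1
    t = pow(g, k, p)                  # commitment
    c = challenge(y, t)
    s = (k + c * x) % q               # response
    return t, s

def verify(y, proof):
    t, s = proof
    c = challenge(y, t)
    return pow(g, s, p) == (t * pow(y, c, p)) % p  # g^s =? t * y^c

x = 123
y = pow(g, x, p)
assert verify(y, prove(x, y))
```

The check works because g^s = g^{k + cx} = t · y^c; the verifier learns nothing about x beyond the fact that the prover knows it.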

Verifiable secret sharing
The (τ, η) VSS scheme allows a dealer to share a secret s among η participants {S_1, …, S_η} such that τ + 1 out of η participants can jointly combine their shares and recover s. To prevent dishonest dealers from cheating when generating shares, VSS supports proofs of encrypted shares and public verifiability. Specifically, the VSS scheme consists of the following algorithms [38]:
VSS.KGen(params, i ∈ [1, η]) → (opk_i, osk_i): each participant S_i runs the ElG.KGen algorithm to obtain a key pair (opk_i, osk_i).
VSS.Share(s) → ((V_ℓ)_{ℓ∈[1,τ]}, (C̃_i, π_{P_i})_{i∈[1,η]}): the dealer distributes shares of the secret s to the participants. Firstly, the dealer randomly selects (p_1, …, p_τ) ∈_R Z_p and computes the polynomial P(χ) ← s + Σ_{ℓ=1}^{τ} p_ℓ χ^ℓ ∈ Z_p[χ] and H_s ← h^s. Secondly, for all i ∈ [1, η], the dealer sets s_i ← P(i) as the share of the secret s and distributes s_i to S_i. Next, for all ℓ ∈ [1, τ], the dealer computes and publishes the verification value V_ℓ ← h^{p_ℓ}. Then, for all i ∈ [1, η], the dealer randomly selects r_i ∈_R Z_p and computes the ciphertext C̃_i := (C̃_{i,0}, C̃_{i,1}) ← (g^{r_i}, f_i^{r_i} g^{s_i}) together with the corresponding proof π_{P_i}. Finally, the dealer broadcasts ((V_ℓ)_{ℓ∈[1,τ]}, (C̃_i, π_{P_i})_{i∈[1,η]}) to all participants.
VSS.Verf(C̃_i, π_{P_i}) → (1_i, 0): each participant S_i checks the correctness of the proof π_{P_i} by zero-knowledge proof verification.
VSS.Recover({C̃_i, osk_i}_{i∈O}) → g^s: let O be the set of τ + 1 participants. Firstly, each S_i ∈ O computes C*_i = C̃_{i,1}/(C̃_{i,0})^{osk_i} = g^{s_i} with its secret key and broadcasts C*_i to the other participants. If all the recovered shares from the other participants S_j, j ∈ O\{i}, satisfy C*_j ≠ ⊥ (namely all the recovered shares are correct), then for all j ∈ O the openers calculate the Lagrange coefficient w_j ← ∏_{ℓ∈O\{j}} ℓ/(ℓ − j) and compute ∏_{j∈O} (C*_j)^{w_j} = g^s; the secret s is thus reconstructed over g.
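The share/verify/recover flow can be sketched as follows. This is a simplified Feldman-style variant under stated assumptions: shares are verified directly against the public values V_ℓ instead of through the ElGamal ciphertexts C̃_i and proofs π_{P_i}, and the secret is recovered as g^s by Lagrange interpolation in the exponent; the group parameters and the second generator h are toy assumptions.

```python
import secrets as rnd

# Toy safe-prime group; h = g^2 serves as an independent-looking generator.
p, q, g, h = 2039, 1019, 4, 16

def share(s, tau, eta):
    """(tau, eta) sharing with a degree-tau polynomial: tau+1 shares recover s."""
    coeffs = [s] + [rnd.randbelow(q) for _ in range(tau)]
    shares = {i: sum(c * pow(i, l, q) for l, c in enumerate(coeffs)) % q
              for i in range(1, eta + 1)}
    V = [pow(h, c, p) for c in coeffs]       # V_0 = h^s, V_l = h^{p_l}
    return shares, V

def verify_share(i, s_i, V):
    """Public check: h^{s_i} must equal prod_l V_l^{i^l}."""
    rhs = 1
    for l, v in enumerate(V):
        rhs = (rhs * pow(v, pow(i, l, q), p)) % p
    return pow(h, s_i, p) == rhs

def recover_exp(shares):
    """Recover g^s from tau+1 shares via Lagrange interpolation in the exponent."""
    out = 1
    for i, s_i in shares.items():
        w = 1
        for j in shares:
            if j != i:
                w = w * j * pow(j - i, -1, q) % q   # Lagrange coefficient at 0
        out = (out * pow(g, s_i * w % q, p)) % p
    return out

s = 777
shares, V = share(s, tau=2, eta=5)
assert all(verify_share(i, si, V) for i, si in shares.items())
subset = {i: shares[i] for i in (1, 3, 5)}           # tau + 1 = 3 shares
assert recover_exp(subset) == pow(g, s, p)
```

In the full scheme, the same interpolation is performed over the decrypted values C*_i = g^{s_i}, which is why recovery naturally yields g^s rather than s itself.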

Overview
We aim to propose a verifiable distributed industrial dataset trading architecture without trusted third parties. Furthermore, we designed a privacy-enhanced verifiable distributed identity management scheme and a verifiable data trading scheme.
In the identity management phase, DU requests verifiable credentials from DOs through CMSC, and DOs employ the PSM multi-signature to issue local credentials for validated DUs. Significantly, we provide privacy-enhanced identity management for DUs by employing the PSM multi-message signature to realize fine-grained attributes, selective attribute disclosure, and threshold traceability [35]. CMSC is in charge of aggregating and distributing credentials to DU. CMSC or DU randomizes and selectively discloses the credentials to CS before data trading. CS verifies the validity of the credentials and provides corresponding services to DUs with unrevoked, validated credentials. Although verifiable credentials protect users' privacy through anonymity and selective disclosure, DUs are not always honest. Therefore, RGs need to trace or even revoke malicious credential holders in a threshold manner. It is worth noting that threshold traceability helps avoid abuse of the RGs' power and further improves the privacy of data users.
The data trading phase aims to secure fair trading of the industrial dataset through the data trading smart contract DTSC. Before trading, DOs encrypt the dataset using symmetric encryption and upload the ciphertext as well as the signature of the dataset to CS. Then CS verifies the signatures and registers the dataset with DTSC. Later, DU retrieves the dataset of interest and submits a data trading request to DTSC. Once DTSC verifies the trading request of DU successfully and CS returns a correct data integrity verification result, DOs send the decryption key, encrypted by the PCE algorithm, to the authorized DU through DTSC. After receiving the PCE ciphertext, DU recovers the AES key and verifies the decryption result with his PCE private key. Next, DU downloads the ciphertext of the dataset and decrypts the original data. Finally, DU pays the fee before the specified time if the decryption result is correct. Otherwise, DU will refuse payment and submit evidence for dispute resolution. The dataset integrity verification stems from a publicly verifiable multi-owner data integrity scheme [22], and blockchain is employed to replace the traditional TPA to mitigate the problem of malicious third parties in public auditing.

Distributed identity management
Distributed identity management includes the stages of initialization, credential request, generation, aggregation, presentation, verification, trace and revocation (Fig. 3).Notations are shown in Table 3.

Initialization
In the initialization stage, public parameters and keys related to the subsequent processes are generated.
(1) RGs run the function CredSetup to generate the public parameters params, and then upload params to CMSC. It is assumed that params are implicit inputs of all other algorithms.
(2) DO I_i ∈ I runs the function IKGen to generate the issuing key pair (isk_i, ipk_i) and initializes the registration list RegList_i. Then, DOs upload ipk := {ipk_i}_{i∈[1,n]} to CMSC.
(3) RG S_j ∈ S runs the function OKGen to generate the trace key pair (opk_j, osk_j) and initializes the trace list TraceList_j. Then, RGs upload opk := {opk_j}_{j∈[1,η]} to CMSC.
(4) CMSC runs the KAgg algorithm to generate the aggregated public key apk and outputs gpk := (ipk, opk, apk).

Credential request
In this stage, U_uid ∈ U with unique identification uid ∈_R Z_p^* runs the function CredReq to send the credential request CredReq_uid to I_i ∈ I for the attribute set Attr through CMSC. Firstly, U_uid generates an additional scalar a′ and a public base h, then computes the user's secret identity H_uid and G_uid as well as the zero-knowledge proof π_uid. Next, U_uid calls the algorithm VSS.Share to compute the share uid_j of the identity uid for each S_j ∈ S and sends the shares to all RGs through CMSC. Finally, U_uid uploads the request to CMSC, including the ciphertext C̃_j of each share uid_j with its corresponding zero-knowledge proof π_{P_j}, and the verification values {V_ℓ}_{ℓ∈[1,τ]} of the secret sharing algorithm.

Credential generation
After receiving CredReq_uid, I_i ∈ I runs the function CredGen to generate the local credential CredGen_{i,uid} for valid users. Specifically, I_i will reject the credential request if there is duplicate registration, an illegal identity, or incorrect share encryption. Otherwise, I_i calls the algorithm PSM.Sign to blindly sign the user's identity uid and attributes Attr. Finally, I_i updates RegList_i and uploads CredGen_{i,uid} to CMSC.

Credential aggregation
After receiving all local credentials from I_i ∈ I, CMSC runs the function CredAgg to generate the aggregated credential CredAgg_uid. Specifically, CMSC first calls the algorithm PSM.Verf to verify whether the local credentials are correct. Next, CMSC runs the algorithm PSM.SAgg to aggregate {CredGen_{i,uid}}_{i∈[1,n]} into the complete credential CredAgg_uid. Then the algorithm PSM.Verf is called to verify the validity of CredAgg_uid. If the validation passes, CredAgg_uid is issued to U_uid, who stores the credential in his local wallet.

Credential presentation
When required to present trading qualifications, U_uid personally presents the credential Cred_{uid,ρ} on a selectively disclosed attribute subset ρ ⊆ Attr. The function CredProve is called to randomize the signature of the credential CredAgg_uid and generate the proof of the signature.

Credential verification
During the trading, CS will check the validity of Cred uid, ρ through function CredVerify , namely, verify the signature and its zero-knowledge proof in Cred uid, ρ .

Threshold trace and user revocation
When RGs receive complaints or detect abnormal credentials, τ + 1 RGs O_i ∈ O will run the function Trace to recover the identity of the U_uid who computed the credential Cred_{uid,ρ}. Firstly, O_i ∈ O runs CredVerify(apk, ρ, Attr, Cred_{uid,ρ}) to verify the validity of Cred_{uid,ρ}. Then, each O_j ∈ O downloads the encrypted identity shares from CMSC, and the tracers jointly run VSS.Recover to reconstruct the identity uid of U_uid.

Trading
The data trading stage mainly includes the data trading request, data trading response, data integrity verification, key verification and decryption, as well as dispute resolution.

1) Data trading request
U_uid retrieves the needed data through the description ProData and submits the smart contract RequestData to DTSC. As shown in Fig. 6, RequestData contains basic data information, trade information, and contract code. In addition, U_uid needs to provide his PCE public key ppk_uid, credentials Cred_{uid,ρ}, and access information AccInfo. The contract code stipulates that the data owners must provide the correct PCE ciphertext c_Ks and data integrity verification results. RequestData guarantees that U_uid will pay the data trade fee Fee_Trd to DOs before the time T_pay if U_uid can correctly decrypt the dataset.

2) Data integrity verification
In the data integrity verification process, the blockchain sends challenge information to CS, which computes evidence based on the challenge and returns it.
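The challenge-response message flow can be illustrated with a plain hash-based stand-in: the chain picks a fresh nonce and CS must fold it into a digest over the stored blocks, so stale evidence cannot be replayed. The paper's actual scheme uses homomorphic multi-owner tags; this sketch only shows the flow, and all names are illustrative.

```python
import hashlib, secrets

blocks = [b"block-%d" % i for i in range(4)]          # ciphertext blocks held by CS

def challenge():
    return secrets.token_bytes(16)                     # nonce issued by the blockchain

def prove(nonce, stored_blocks):                       # evidence computed by CS
    h = hashlib.sha256(nonce)
    for blk in stored_blocks:
        h.update(blk)
    return h.digest()

def verify(nonce, evidence, reference_blocks):         # check performed on-chain
    return evidence == prove(nonce, reference_blocks)

c = challenge()
assert verify(c, prove(c, blocks), blocks)             # honest CS passes
assert not verify(c, prove(c, blocks[:2]), blocks)     # missing blocks fail
```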

3) Data trading response
Before the data trading response, the validity of the dataset and the data user's identity need to be checked. First, DTSC checks whether the same hash index Index of the dataset already exists on the chain and checks the data validity period. Then, DTSC initiates a data integrity verification request and CS returns the verification result. If there is a duplicate index, the dataset has expired, or integrity verification fails, DTSC rejects the request and notifies the DOs. Meanwhile, CMSC checks the revocation registry and calls the function CredVerify to verify the validity of the credential Cred_{uid,ρ}, rejecting the data request of U_uid if the credential is revoked or invalid. DTSC then checks the access control policy and rejects the request if it does not match. After all checks pass, the DOs respond to the request and submit the data sell contract SellData to DTSC. As shown in Fig. 7, SellData provides the correct ciphertext of Ks, i.e., c_Ks ← PCE.Enc(ppk_uid, Ks), and promises that U_uid is allowed to deny the validity of c_Ks and initiate dispute handling before the dispute closing time T_close. Together, the contracts RequestData and SellData ensure fair trading between the DOs and U_uid: honest DOs get paid, and an honest U_uid gets the correct data.
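The sequence of pre-response checks above can be condensed into one validation function. This is a sketch of the decision logic only; the parameter names and rejection messages are assumptions.

```python
# Sketch of the checks DTSC/CMSC run before the DOs may answer a request.
def validate_request(index, now, on_chain_indexes, expiry, integrity_ok,
                     revoked, cred_valid, policy_match):
    if index in on_chain_indexes:        # duplicate hash index -> reject
        return "reject: duplicate index"
    if now > expiry:                     # dataset validity period elapsed
        return "reject: dataset expired"
    if not integrity_ok:                 # CS failed the integrity challenge
        return "reject: integrity verification failed"
    if revoked or not cred_valid:        # revocation registry + CredVerify
        return "reject: invalid or revoked credential"
    if not policy_match:                 # access control policy mismatch
        return "reject: access policy mismatch"
    return "accept"

assert validate_request("0x1", 10, set(), 100, True, False, True, True) == "accept"
assert validate_request("0x1", 10, {"0x1"}, 100, True, False, True, True).startswith("reject")
```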

4) Key verification and decryption
After receiving the PCE ciphertext, U_uid first recovers the AES key through Ks′ ← PCE.Dec(psk_uid, c_Ks) and checks whether Ks′ is the correct decryption of c_Ks (i.e., PCE.Check(Ks′, ppk_uid, c_Ks) ?= 1). Then, U_uid downloads the dataset ciphertext DATA* from CS and decrypts it to obtain the original data (i.e., DATA′ ← DEC(Ks′, DATA*)). Next, U_uid calculates the index value of DATA′ and checks whether it is consistent with the index value stored on the platform (i.e., Hash(DATA′ ‖ did) ?= Hash(DATA ‖ did)). If the above checks pass, U_uid pays Fee_Trd before T_pay. Otherwise, if Ks′ is an incorrect decryption (i.e., PCE.Check(Ks′, ppk_uid, c_Ks) == 0) or DATA* decrypts incorrectly (i.e., DEC(Ks′, DATA*) == ⊥), U_uid refuses to pay within T_pay and initiates a trading dispute before the dispute closing time T_close.
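The buyer-side verification chain can be sketched end to end with stand-ins: PCE.Check is modelled as a hash commitment to Ks carried alongside the ciphertext, and AES by a toy XOR keystream. Real PCE and AES behave differently; everything here is an illustrative assumption, showing only the order of checks.

```python
import hashlib

def keystream(key, n):                    # toy keystream (stand-in for AES)
    out, ctr = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + ctr.to_bytes(4, "big")).digest()
        ctr += 1
    return out[:n]

def enc(key, data):                       # XOR stream cipher: dec == enc
    ks = keystream(key, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))

dec = enc

did = b"dataset-001"
data, Ks = b"industrial dataset", b"session-key"
DATA_star = enc(Ks, data)                            # DATA* stored on CS
index = hashlib.sha256(data + did).hexdigest()       # Hash(DATA ‖ did) on chain
commit = hashlib.sha256(Ks).digest()                 # checkable part of c_Ks (stand-in)

# buyer side: recover Ks' (via PCE.Dec in the real scheme), then check in order
Ks_prime = Ks
assert hashlib.sha256(Ks_prime).digest() == commit                 # PCE.Check == 1
DATA_prime = dec(Ks_prime, DATA_star)                              # DATA' <- DEC(Ks', DATA*)
assert hashlib.sha256(DATA_prime + did).hexdigest() == index       # index check passes
```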

5) Dispute
In the dispute stage, Fee_Trd remains locked until the evidence is confirmed. U_uid invokes the Dispute contract and submits the evidence, including Hash(DATA′ ‖ did), Ks′, and the signature ψ_j. If DTSC verifies Ks′ successfully but the ciphertext fails to decrypt correctly (i.e., PCE.Check(Ks′, ppk_uid, c_Ks) == 1 and Hash(DATA′ ‖ did) ≠ Hash(DATA ‖ did)), and meanwhile the signature verification passes (MS.Verify(m, mid, ψ_j, {vpk_1, …, vpk_n}) = 1), the trading fails and the locked Fee_Trd is not paid to the data owner. If any of these conditions is not satisfied, the trading is deemed successful and the locked Fee_Trd is paid to the data owner (Fig. 8).
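The settlement rule above is a three-way conjunction, which can be stated as a single predicate. The function and message names are illustrative, not the paper's contract code.

```python
# Dispute predicate: the trade fails (fee refunded) only if the submitted key
# passes PCE.Check, the decrypted index mismatches, AND the multi-signature
# verifies; in every other case the locked fee goes to the data owner.
def settle_dispute(pce_check_ok, index_matches, multisig_ok):
    if pce_check_ok and not index_matches and multisig_ok:
        return "trade failed: refund locked Fee_Trd to the buyer"
    return "trade successful: pay locked Fee_Trd to the data owner"

assert settle_dispute(True, False, True).startswith("trade failed")
assert settle_dispute(True, True, True).startswith("trade successful")
assert settle_dispute(False, False, True).startswith("trade successful")
```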

Identity security
In this scheme, user authentication is realized by fine-grained verifiable credentials with selective disclosure, which satisfy anonymity, threshold traceability, and multi-show unlinkability. Our definition follows the security notion of dynamic group signatures in [35] and the security definition of anonymous credentials in [8].

1) Anonymity
Anonymity ensures that U_uid can prove possession of a valid credential without revealing his true identity. In the credential request stage, U_uid generates the secret identity H_uid := h^uid as well as the zero-knowledge proof π_uid. It is infeasible for an adversary to extract uid from H_uid. In the credential generation stage, each I_i ∈ I verifies the proof π_uid, then blindly signs H_uid with isk_i and generates the PS multi-message signature σ_{i,2} = h^{x_i + y_{i,0}·uid + Σ_{j=1}^{k} y_{i,j}·a_j + y_{i,k+1}·a′}. Due to the zero-knowledge proof and blind issuance, I_i cannot learn the true identity uid from π_uid. In the credential presentation stage, U_uid generates (σ_1^r, σ_2^r) and computes the zero-knowledge proof π_ρ. Due to the zero-knowledge proof and the re-randomizability of the PS signature, CS can be convinced of the validity of Cred_{uid,ρ} through CredVerify, but cannot infer any information about uid. Only τ + 1 RGs are capable of recovering the identity of U_uid through the threshold trace operation. Therefore, anonymity is guaranteed, and the identity of U_uid cannot be disclosed as long as each I_i ∈ I is honest and the presented credential is not opened.

2) Threshold traceability
Threshold traceability means that τ + 1 RGs can recover the identity behind a valid credential Cred_{uid,ρ}, while any τ corrupt RGs cannot. Essentially, the trace information originates from the secret sharing of the identity during issuance. Specifically, U_uid sends C̃_j := (C_{j,0}, C_{j,1}) ← (g^{r_j}, f_j^{r_j} · Ỹ′_0^{uid_j}) (i.e., the ciphertext of the share uid_j of his secret identity uid) to each S_j ∈ S, where f_j^{r_j} · Ỹ′_0^{uid_j} is the ElGamal ciphertext of uid_j under the trace public key opk_j = f_j of S_j. C̃_j means that Ỹ′_0^{uid} is secretly shared among all RGs. In the issuance, I_i blindly signs the secret identity H_uid after C̃_j is validated through VSS.Verf, and the aggregated credential contains a signature that embeds the information of Ỹ′_0^{uid}. In the trace stage, τ + 1 RGs decrypt the shares using their trace secret keys osk_j = z_j (i.e., C*_j = C_{j,1}/(C_{j,0}^{z_j}) = Ỹ′_0^{uid_j}) and recover the identity uid of U_uid.
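The share-encrypt-then-interpolate mechanism can be sketched over a toy Schnorr group: the user Shamir-shares uid, lifts each share into the exponent of Ỹ′_0, and ElGamal-encrypts it to RG j's trace key f_j = g^{z_j}; any τ + 1 RGs strip their ElGamal layer and interpolate in the exponent to recover Ỹ′_0^uid. The group parameters are illustrative stand-ins for the paper's pairing-group instantiation.

```python
import random

p, q, g = 10007, 5003, 4          # toy safe prime p = 2q+1; g generates the order-q subgroup

def shamir_share(secret, tau, n):  # degree-tau polynomial: tau+1 shares reconstruct
    coeffs = [secret] + [random.randrange(q) for _ in range(tau)]
    return {j: sum(c * pow(j, e, q) for e, c in enumerate(coeffs)) % q
            for j in range(1, n + 1)}

def lagrange_at_zero(js, j):       # Lagrange coefficient lambda_j, interpolating at 0
    num = den = 1
    for m in js:
        if m != j:
            num = num * (-m) % q
            den = den * (j - m) % q
    return num * pow(den, q - 2, q) % q

n, tau = 5, 3
z = {j: random.randrange(1, q) for j in range(1, n + 1)}   # trace secret keys osk_j
f = {j: pow(g, z[j], p) for j in range(1, n + 1)}          # trace public keys opk_j = f_j

uid = 1234
Y0 = pow(g, 777, p)                                        # stand-in for the element Y'_0
shares = shamir_share(uid, tau, n)
C = {}                                                     # C_j = (C_{j,0}, C_{j,1})
for j, uid_j in shares.items():
    r = random.randrange(1, q)
    C[j] = (pow(g, r, p), pow(f[j], r, p) * pow(Y0, uid_j, p) % p)

subset = [1, 2, 4, 5]                                      # any tau+1 = 4 RGs suffice
Y_uid = 1
for j in subset:
    C0, C1 = C[j]
    lifted = C1 * pow(C0, (q - z[j]) % q, p) % p           # C*_j = Y'_0^{uid_j}
    Y_uid = Y_uid * pow(lifted, lagrange_at_zero(subset, j), p) % p

assert Y_uid == pow(Y0, uid, p)    # matches the registry entry for uid
```

In practice the RGs would look Y_uid up in a registry mapping Ỹ′_0^uid values back to registered identities.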

3) Blind issuance
In the credential generation stage, each I_i ∈ I first verifies the proof π_uid, then blindly signs the user's secret identity H_uid := h^uid with isk_i, and finally generates the signature σ_{i,2} = h^{x_i + y_{i,0}·uid + Σ_{j=1}^{k} y_{i,j}·a_j + y_{i,k+1}·a′}. Due to the zero-knowledge proof and blind issuance, I_i cannot learn the true identity uid from π_uid.

4) Multi-show unlinkability

Multi-show unlinkability mainly stems from the re-randomizability of the PS signature: when no one knows the secret key, a signature on a message can be randomized into another, unlinkable version. As shown in the presented credential Cred_{uid,ρ} = (σ_1^r, σ_2^r, π_ρ) (where r ∈_R Z*_p), the newly presented Cred_{uid,ρ} varies with r and is unlinkable to previous presentations.
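Re-randomization can be illustrated in a plain Schnorr-style group: a pair (σ_1, σ_2) with σ_2 = σ_1^s remains valid under (σ_1^r, σ_2^r) for any fresh r, yet the new pair looks unrelated to the old one. In the real scheme the relation is checked with a pairing rather than by knowing s; the direct check here is a stand-in, and all parameters are toy values.

```python
import random

p, q, g = 10007, 5003, 4                   # toy Schnorr group (p = 2q+1)
s = random.randrange(1, q)                 # signer's secret relation exponent
sigma1 = pow(g, 321, p)
sigma2 = pow(sigma1, s, p)                 # original pair satisfies sigma2 = sigma1^s

r = random.randrange(2, q)                 # fresh randomness per presentation
sigma1_r, sigma2_r = pow(sigma1, r, p), pow(sigma2, r, p)

assert sigma2_r == pow(sigma1_r, s, p)     # still verifies under the same key...
assert (sigma1_r, sigma2_r) != (sigma1, sigma2)   # ...but is a fresh, unlinkable pair
```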

Data security

1) Data confidentiality
The dataset is encrypted by symmetric encryption (i.e., DATA* = {Enc_Ks(m_j)}_{j∈[1,k]}) and the decryption key Ks is encrypted by PCE encryption (i.e., c_Ks ← PCE.Enc(ppk_uid, Ks)). Meanwhile, only DUs who hold an unrevoked valid credential and match the specified access policy are allowed to obtain Ks.

2) Data integrity & public verifiability
We employed DTSC to replace a traditional TPA, which prevents collusion attacks by malicious third parties thanks to the public verifiability of the multi-owner data integrity verification scheme. The proposal therefore inherits the merits of that scheme.

3) Trade non-repudiation

The operations of entities in the trading process are recorded on the blockchain, and the immutability of the blockchain prevents repudiation by participating entities. Moreover, the proposal resists forged-decryption-key and forged-signature attacks during the dispute, thus preventing malicious DUs from skipping out on the bill. First, the decryption key Ks is encrypted by the PCE algorithm, so a forged key K̃s cannot pass verification (i.e., PCE.Check(K̃s, ppk_uid, c_Ks) == 0) and thus cannot be used as dispute evidence. Second, it is impossible to forge a signature as evidence due to the unforgeability of the multi-signature [7, 22].

4) Anti-second-reselling attack

DTSC records the index hash Index := Hash(DATA ‖ did), and datasets with duplicate indexes are deemed illegal. Meanwhile, the RGs are responsible for tracing malicious DUs or DOs, which, together with the immutability and traceability of the blockchain, prevents private off-chain data trading to a certain extent [7].

Performance evaluation
We implemented the off-chain cryptographic operations using the JPBC cryptography library on a computer with a 2.80 GHz processor and 16 GB of memory. The computational cost and execution time of distributed identity management (with k = 1, η = 5, n = 5, τ = 3) are shown in Table 4, where the exponentiation and multiplication operations in G_1 and G_2 are denoted by E_1, E_2, M_1, and M_2, respectively. In the G_T group, the pairing, exponentiation, and multiplication operations are denoted by P, F, and T, respectively. The execution times of each operation for Type F curves in JPBC are T_{E_1} = 1.51 ms, T_{E_2} = 2.78 ms, T_{M_1} = 0.01 ms, T_{M_2} = 0.02 ms, T_P = 54.77 ms, T_F = 12.01 ms, and T_T = 0.15 ms. We omit the cost of hashing. The cost of IKGen and CredGen for each DO is (k+3)E_2 and (k+3)E_1 + (k+2)M_1, respectively, both growing linearly with k. The cost of OKGen and Trace for each RG is E_2 and τE_2 + (τ−1)M_2, respectively. The cost of KAgg and CredAgg for CMSC is (k+2)(nE_2 + (n−1)M_2) and (n+1)(2P + (k+2)E_2 + (k+2)M_2) + nE_1, respectively, both growing linearly with k and n. The cost of the aggregation operation is nE_1, which grows linearly with n, and the verification of each partial signature and of the aggregate signature costs 2P + (k+2)E_2 + (k+2)M_2, which grows linearly with k. The cost of CredReq for a DU is (τ+2)E_1 + 3ηE_2 + ηM_2, which mainly stems from the VSS.Share algorithm. The cost of CredProve for a DU is a constant 1F + 2P + 4E_1, and the cost of CredVerify is 3F + 4P + 4E_1 + (k+2)E_2 + (k+2)M_1, which grows linearly with k. Interestingly, both attribute presentation and verification cost O(1) with respect to the number of issuers, since only one aggregate credential is involved. Moreover, the partial and aggregate credentials each consist of two group elements. Therefore, distributed identity management is computationally efficient and suited for practical applications.
Regarding the PCE algorithm, the ciphertext contains 4 group elements in G, and the cost of key generation, encryption, decryption, and verification is E_1, 2E_1 + 2E_2, 2P + E_1 + M_1, and 4P + E_1 + M_1, with execution times of 1.51 ms, 8.58 ms, 111.06 ms, and 220.60 ms, respectively. In the data integrity verification phase, the cost of signing, evidence generation, and verification is about 2nE, ℓE, and ℓE + 2P, respectively, with execution times of 15.1 ms, 34.16 ms, and 143.7 ms (T_E = 8.54 ms, T_P = 5.29 ms, with n = 5 and ℓ = 4).
The smart contracts need to perform cryptographic operations, including credential key aggregation, credential aggregation, integrity verification, and PCE.Check. Every opcode in the Ethereum Virtual Machine (EVM) specification has an associated gas fee [12]. The gas cost can be estimated from the precompiled contracts that implement elliptic-curve operations and the ate pairing check on the curve alt-bn128: the gas costs of ECAdd, ECMul, and ECPairing are 500 gas, 40,000 gas, and 80,000·k + 100,000 gas, respectively.
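A rough on-chain cost for a pairing-based check follows directly from these precompile prices. The operation counts below (2 pairings, 3 scalar multiplications, 2 additions) are a hypothetical example of a PS-style verification, not a figure from the paper, and the prices are the pre-Istanbul Byzantium values; later EVM forks reduced them.

```python
# Pre-Istanbul precompile gas prices for alt-bn128 (EIP-196/EIP-197).
GAS_ECADD, GAS_ECMUL = 500, 40_000

def gas_pairing(k):                 # k = number of pairings in one pairing check
    return 80_000 * k + 100_000

# Hypothetical example: a credential check needing 2 pairings, 3 muls, 2 adds.
total = gas_pairing(2) + 3 * GAS_ECMUL + 2 * GAS_ECADD
assert total == 381_000
```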
In the data trading stage, large files are stored off-chain and only the hash index is stored on the blockchain, resulting in a total on-chain storage cost of 512 bits, which is very low.

Conclusion
In this article, we presented a distributed industrial data trading architecture for multi-data-owner scenarios based on blockchain and cloud. In this framework, a fair trading mechanism for industrial datasets without a trusted third party is designed based on smart contracts, and a publicly verifiable data integrity scheme for multi-data ownership ensures the integrity of data. Additionally, verifiable credentials with selective disclosure and threshold tracing provide privacy-enhanced authentication and malicious-user tracking. Analysis results show that our proposal achieves the stated security goals.
Data confidentiality: the original data can only be stored and sold in encrypted form, and only authorized users can obtain the decryption key.
Data integrity and public verifiability: any verifier is able to audit the correctness of multi-owner data in the cloud with the public keys of all the owners.
Data authenticity: data cannot be forged (i.e., unforgeability of trading data), and a malicious DO cannot deny or illegally claim the ownership of a dataset (i.e., non-repudiation of data ownership).

Fig. 3 The details of distributed identity management

Table 1
Related works

Table 2
Scheme comparison of verifiable credentials

SellData and Dispute guarantee fair trading between the DOs and the DU. Specifically, RequestData guarantees that U_uid will pay Fee_Trd to the DOs before T_pay if U_uid correctly decrypts DATA*; SellData provides the correct ciphertext c_Ks of the data decryption key Ks and promises that U_uid is allowed to deny the validity of c_Ks and initiate a dispute before the dispute closing time T_close; and Dispute guarantees that U_uid will refuse to pay within T_pay and initiate a trading dispute before T_close when Ks is incorrectly decrypted (i.e., PCE.Check(Ks′, ppk_uid, c_Ks) == 0) or DATA* is incorrectly decrypted (i.e., DEC(Ks′, DATA*) == ⊥).

Table 4
The complexity analysis of distributed identity management