Blockchain-cloud privacy-enhanced distributed industrial data trading based on verifiable credentials

Fang, Junli; Feng, Tao; Guo, Xian; Ma, Rong; Lu, Ye

doi:10.1186/s13677-023-00530-7

Published: 02 February 2024


Journal of Cloud Computing volume 13, Article number: 30 (2024)


Abstract

Industrial data trading can considerably enhance the economic and social value of abundant data resources. However, traditional data trading models are plagued by critical flaws in fairness, security, privacy and regulation. To tackle these issues, we first proposed a distributed industrial data trading architecture based on blockchain and cloud for multiple data owners. Subsequently, we implemented distributed identity management through a distributed verifiable credentials scheme that offers selective disclosure, multi-show unlinkability, threshold traceability, and public verifiability. Finally, we presented a fair trading mechanism without trusted third parties based on smart contracts, and we employed blockchain and multi-signature to ensure data integrity during data storage and trading. The security and performance analysis shows that our proposal is feasible for sensitive data trading among multiple data owners and provides a useful exploration for future industrial data trading and management.

Introduction

The rapid expansion of the Industrial Internet in recent years will enhance productivity, and abundant industrial data is a paramount driving force in fostering the integrated development of the digital economy and industry [1]. Data trading promotes the efficient utilization of data resources by assigning value to data and bringing benefits to data owners and users [2].

However, a large amount of valuable industrial data either has not been collected or is exploited only in private storage environments, such as the data on train control, passenger flow or resource allocation in intelligent transportation systems [3,4,5]. Data trading is still in its infancy, and there are no privacy-enhanced industrial data trading architectures or practical solutions. Traditional data trading, which relies mainly on private deals and third-party data markets, has been constrained by critical flaws in fairness and transparency, in the security of data and identities, and in the difficulty of appeals and evidence collection [2, 6, 7]. To enhance the value of data, it is urgent to study trusted sharing and trading mechanisms for industrial datasets. Four challenges stand out:

1) Secure management of participants is the premise of trading and of Industrial Internet systems. However, traditional identity management schemes suffer from excessive information disclosure and inadequate regulation and are vulnerable to internal or collusion attacks, which easily trigger concerns about privacy disclosure and data abuse. Credentials represent the qualifications of users, and the lack of efficient general credentials with enhanced privacy and regulation will affect blockchains that support identifier-management smart contracts [1, 8]. Therefore, it is critical to design general credential management with selective disclosure and privacy-enhanced tracking for identity authentication and supervision during industrial data trading.
2) Authenticity and data confidentiality are the foundation of fair trading. However, Industrial Internet data stored in the cloud faces serious threats such as unauthorized intrusion, data leakage, and integrity damage. Therefore, it is non-trivial to ensure both secure key access, which protects data from misuse by unauthorized users, and integrity, which guarantees the accuracy and reliability of data [2].
3) Fair and secure data trading schemes ensure that trading participants can securely trade and obtain data or profits. However, data trading based on private deals and third parties is prone to fraudulent trades. In addition, the data owner’s control over the data and the secondary reselling of the data need to be taken into account [2, 7].
4) The demand for multiple data owners and users is common in distributed industrial scenarios. The distributed industrial setting requires multiple participants to carry out data sharing, data trading, and collaborative computation, so it is important and more practical to study industrial data trading schemes for multiple data owners [6].

Cloud computing is regarded as a viable option for data services [2, 9, 10] in industrial application-oriented scenarios through numerous cloud service models. The blockchain-cloud paradigm increases the confidence of both consumers and data providers by building a more trustworthy cloud ecosystem [11]. Blockchain is well suited to enhancing both the functionality and the security/privacy of the cloud ecosystem in various ways owing to its decentralization, immutability, traceability, anonymity, and transparency [11, 12]. The integration of blockchain technology and cloud computing has great potential for enhancing identity authentication and access control, data security and privacy, and transaction fairness and efficiency in decentralized data sharing and supply chain management applications [10, 11, 13, 14]. In particular, the smart contract in blockchain-enabled applications is an expected driving factor for improving the efficiency and accuracy of data processing by automating payment once predefined conditions are satisfied [12].

Thus, we first designed a distributed data trading framework for multiple data owners based on blockchain and cloud aimed at identity, data, and trading security. In this framework, the encrypted data is stored on the cloud service platform, and the smart contract of certificate management and data trading is deployed on the blockchain for identity and trading management.

Secondly, we explored privacy-enhanced user identity management with properties including fine granularity, unlinkability, selective disclosure, and threshold traceability based on the verifiable credentials, which stems from the multi-message version of PS multi-signature.

Finally, we designed a distributed data trading scheme for multiple data owners without a trusted third party. Honest data owners are paid once they provide the correct decryption key, which ensures that the data user receives the correct industrial dataset; otherwise, the data user can initiate dispute resolution.

The rest of this paper is structured as follows. Related schemes are recalled and compared in the “Related works” section. The “System model” section then describes the system architecture, security model, and design objectives. The “Preliminaries” section describes the relevant background knowledge, and the “Proposal” section focuses on the details of the construction. In the “Analysis” section, the security and performance are analyzed. Finally, the “Conclusion” section concludes the article.

Related works

Researchers have explored decentralized data sharing and trading architectures based on blockchain and cloud servers. Fan et al. provided a one-to-many data sharing architecture for vehicular networks. Specifically, the data owner outsources the encrypted data to the cloud server and uses the blockchain as a broadcast channel to publish access policies based on attribute-based encryption [9]. Zhang et al. provided a privacy-preserving secure data sharing model for IIoT and used blockchain logging technology to trace and account for illegal access [15]. Dai et al. and Zhang et al. designed data trading ecosystems based on a blockchain-cloud and Software Guard Extensions (SGX) architecture [13, 14]. Li et al. realized fair trading without a trusted third party by trading the decryption of ciphertext on the blockchain and eliminating secondary sales on the chain [7]. Liu et al. proposed a decentralized transparent data trading scheme [10]. Koutsos et al. realized a privacy-enhanced decentralized data market based on blockchain, functional encryption, and zero-knowledge proof [16]. Liang et al. proposed a blockchain-based fair and fine-grained data trading scheme with privacy preservation using an attribute-based anonymous credential, an authenticated data structure, and zero-knowledge proof [12].

However, most current studies on data trading schemes focus only on a single data owner [6], whereas sensitive industrial data are usually traded only after being approved by multiple departments. Some challenges remain to be addressed before secure and efficient group data trading can be widely applied. Previous works [17, 18] studied group data sharing but did not take data trading settings into account. Koutsos et al. proposed a decentralized data market scheme for multiple data owners, which focused only on data computing privacy and overlooked identity privacy [16]. Cao et al. proposed an iterative auction mechanism for a data market with multiple data owners but did not take trading privacy into account [19].

Cloud servers tend to evade their obligation to actively notify users after data corruption caused by network or operation exceptions, cyberattacks, and software or hardware failures [7, 10, 20]. Additionally, data owners are concerned about losing control over their data when using cloud services. Therefore, data integrity verification during cloud-based data storage and trading is an important security requirement [21,22,23,24]. Common data integrity verification techniques are based on hash values [7, 9, 18] or a Third-Party Auditor (TPA) [22]. However, the methods used in the schemes [7, 9] have bottlenecks in scalability and public verifiability. Public verifiability means that any entity in the network can check the integrity of data stored in the cloud, and TPA-based techniques are capable of providing it [22]. However, a TPA may dishonestly perform the challenge-response protocol and even deceive users by colluding with the cloud. To address these issues, blockchain is adopted to generate random challenge information against a malicious TPA [12, 23, 24]. As stated in the previous paragraph, schemes that provide public verifiability for the data of multiple data owners are especially attractive in practical industrial environments. Fortunately, Wang et al. proposed an efficient publicly verifiable multi-owner data integrity verification solution based on multi-signatures [22], which contributes to our goal.

The precondition of secure data trading is privacy-enhanced identity management, especially identity privacy and traceability. The works in [10, 14, 17, 18] support identity privacy, while other schemes do not. The schemes [10, 17, 18] also provide tracking of malicious users. However, most existing schemes employ ordinary identity authentication technologies, which suffer from coarse granularity, privacy disclosure, poor traceability, and a lack of generality and flexibility in the distributed setting. Verifiable Credentials (VCs) are standardized digital credentials with cryptographic security, privacy protection, and machine-readability and are one of the promising developments supporting decentralized identity authentication [25].

As shown in Fig. 1, the credential holder requests a credential from the issuer, who provides trust endorsement for the holder by issuing credentials on relevant attributes. Then, the holder either saves the credential in the local wallet or hosts it on the blockchain. When trying to access a certain service, the credential holder presents the credential to the verifier who confirms whether the holder has met the requirements by verifying claims of the credential.

Verifiable credentials have attracted increasing interest and growing support due to their notable benefits [25]. Li proposed a verifiable credential scheme with selective disclosure based on the Boneh-Lynn-Shacham aggregate signature [26]. Yoon et al. proposed a personal data trading system based on blockchain, which uses verifiable credentials to prove the ownership of data [27]. Fotiou et al. proposed a personal data trading system based on verifiable credentials [28]. Moreover, the methods of [10] essentially support verifiable credentials. Other works focus on the exploration of various application fields [25, 29,30,31]. Therefore, we aim to construct fair trading for multiple data owners based on a cloud storage and blockchain architecture, together with privacy-enhanced identity management based on verifiable credentials.

To enhance privacy and regulation, verifiable credentials are expected to prove credential ownership while supporting multi-show unlinkability, blind issuance, fine-grained attribute privacy, minimized disclosure, and traceability. Multi-show unlinkability allows users to present credentials multiple times while preventing verifiers from tracking users by linking the authentication sessions. Minimized disclosure aims to avoid leaking unnecessary information [25, 29]. Unfortunately, most previous proposals fail to support all of the above requirements. Essentially, the multi-show unlinkability of credentials stems from the re-randomizability of the signature, which allows a signature on a message to be randomized into another, unlinkable signature on the same message without knowledge of the secret key [32, 33]. Minimized disclosure is realized by means of selective disclosure and zero-knowledge proof technology [12]. Selective disclosure allows the credential holder to present only a subset of the credential attributes to verifiers, who are still able to validate the whole credential. A zero-knowledge proof allows the prover to convince the verifier that a claim is true without revealing any further information about the claim.
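As a worked illustration of re-randomizability, consider the basic single-message PS signature rather than the multi-message variant used later in this paper; the symbols $x$, $y$, $h$, and $r$ below are introduced only for this example. Let $sk=(x,y)$, $pk=(\tilde{X},\tilde{Y})=(\tilde{g}^{x},\tilde{g}^{y})$, and let $\sigma=(\sigma_{1},\sigma_{2})=(h,h^{x+ym})$ be a signature on $m$. For a fresh $r \in_{R} \mathbb{Z}_{p}^{*}$, anyone can compute $\sigma^{\prime}=(\sigma_{1}^{r},\sigma_{2}^{r})$ without the secret key, and $\sigma^{\prime}$ still satisfies the verification equation:

$$\begin{array}{ll}e\left(\sigma_{1}^{r},\tilde{X}\tilde{Y}^{m}\right)&=e\left(h^{r},\tilde{g}^{x+ym}\right)\\&=e\left(h,\tilde{g}\right)^{r(x+ym)}\\&=e\left(h^{r(x+ym)},\tilde{g}\right)\\&=e\left(\sigma_{2}^{r},\tilde{g}\right)\end{array}$$

Because a new $r$ is drawn for every presentation, two showings of the same credential look like independently generated signatures on $m$, which is the intuition behind multi-show unlinkability.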

Fortunately, the emerging Pointcheval-Sanders (PS) signature, which supports re-randomization, efficient proofs of knowledge, and multi-message signatures, is expected to propel the evolution of privacy-enhanced verifiable credentials [32, 33]. Yu et al. proposed an anonymous authentication scheme called BASS that supports attribute privacy, selective revocation, credential soundness, and multi-show unlinkability, but the scheme is inefficient and inadequate in settings with corrupted distributed authorities [34]. García-Rodríguez et al. considered distributed credential issuance but overlooked corrupted authorities [31]. Sonnino et al. proposed a scheme called Coconut, which supports threshold issuance, selective disclosure, and multiple unlinkable selective attribute revelations [8].

However, unconditional identity privacy is incompatible with supervision by the regulator and may encourage illegal profits and data abuse in data trading. In schemes with traceability, the issuing authority and the tracing authority are usually single or monolithic and are thus assumed to be trusted. However, corrupted issuing authorities may forge certificates, and corrupted tracing authorities can remove the anonymity of certificates. Proposals with multiple issuing authorities and tracing authorities are therefore desirable and practical for distributed industrial applications. It is also necessary to reveal the real identity of illegal users in a privacy-enhanced way. The scheme [35] separates the credential issuance and tracing functions and exposes a user’s identity only through threshold tracing, which prevents the corruption of a single tracing authority from harming the privacy of legitimate users. Camenisch et al. fine-tuned an authentication mechanism for V2V and Liu et al. for the data market [10, 35].

The features of the above schemes are presented in Table 1. Building on these previous works, we adapted privacy-enhanced verifiable credentials to the needs of a distributed industrial data trading scheme, which supports distributed blind issuance, fine granularity, threshold traceability, multi-show unlinkability, and selective disclosure in the setting of multiple data owners and regulators.

Table 1 Related works


System model

In this section, we briefly define the system model, security model, and security goals of our scheme (Table 2).

Table 2 Scheme comparison of verifiable credentials


Architecture

The system model is depicted in Fig. 2, which involves five entities: cloud server (CS), blockchain (BC), data owners (DOs), data users (DUs), and regulators (RGs).

BC: The credential management smart contract (CMSC), the data trading smart contract (DTSC), and the credential revocation registry are deployed on BC. The CMSC and DTSC are in charge of handling and recording the operations of entities in the trading process. In addition, BC plays the role of the public verifier and executes data integrity verification protocols.
CS: The semi-trusted cloud server is responsible for storing data and providing data integrity proof to DOs and DUs.
DOs: DOs expect to gain benefits from selling data and perform data trading through CS and DTSC. In addition, DOs are in charge of issuing credentials to data users.
DUs: DUs will pay DOs after obtaining the correct data through CS and DTSC. DUs obtain multiple credentials from distributed DOs through CMSC. DUs selectively disclose the aggregated credential to CS for authentication.
RGs: RGs consist of multiple independent regulatory departments and play the roles of the dispute arbitration commission and the tracing authorities. In addition, RGs are responsible for generating global parameters and maintaining the blockchain.

Security model

We assume that RGs, CS, DOs, and DUs are semi-trusted and may deviate from the protocol for their own benefit. Thus, we introduce the adversaries ${\mathcal{A}}_{{{\text{CS}}}}$, ${\mathcal{A}}_{DOs}$, ${\mathcal{A}}_{DUs}$ and ${\mathcal{A}}_{RGs}$ in the model [7, 9, 23, 35].

${\mathcal{A}}_{{{\text{CS}}}}$ is curious to infer sensitive information about cloud users and data. Additionally, CS usually carries out the protocols honestly but may conceal the data corruptions that are caused by network or operation exceptions, cyberattacks, as well as software or hardware failures.

${\mathcal{A}}_{DOs}$ may launch attacks such as issuing untraceable illegal credentials for malicious users, uploading forged or wrong ciphertext to gain benefits, and denying or claiming ownership of the dataset.

${\mathcal{A}}_{DUs}$ may launch attacks such as stealing data with illegal credentials, refusing payment by forging evidence (keys or signatures), and secondary reselling.

${\mathcal{A}}_{RGs}$ tries to corrupt the regulators to reveal the user’s identity.

The security assumptions are as follows: assume that there are secure authenticated broadcast channels between regulators and multiple data owners; assume that BC is a secure distributed infrastructure without vulnerabilities.

Security goals

Our goal is to create a distributed, privacy-enhanced, and supervisable industrial data trading scheme, and we define the security goals from the perspectives of identity, data, and trading security [7, 21, 35].

Identity security

Identity security should meet the following attributes:

Anonymity: Anonymity ensures that the identity information of DUs is not disclosed as long as the issuer is honest and the presented credential is not opened.
Threshold traceability: A certain amount of RGs can jointly reveal the real identity of DUs.
Blind issuance: DOs do not obtain the user’s secret identity information when issuing credentials.
Multi-show unlinkability: Multi-show allows DUs to present the same credentials more than once, whereas unlinkability means it is impossible to associate the presented credentials even if all authorities conspire. Multi-show unlinkability can prevent the verifier from tracking the user’s activities through the authentication service of continuous sessions.
Selective disclosure: DUs show only the subset of attributes required to prove ownership of the credential, whereas the verifier sees only the disclosed attributes and can still verify the entire credential. In other words, for the undisclosed attributes the credential verifier obtains only the cryptographic verification result instead of the plaintext or ciphertext of the attribute.

Data security

Data security includes confidentiality, integrity, and data authenticity.

Data confidentiality: The original data can only be stored and sold through encryption, and only authorized users can obtain the decryption key.
Data integrity & public verifiability: Any verifier is able to audit the correctness of multi-owner data in the cloud with the public keys of all the owners.
Data authenticity: Data cannot be forged (i.e. unforgeability of trading data), and a malicious DO cannot deny or illegally claim the ownership of a dataset (i.e. Non-repudiation of data ownership).

Trading security

Trading security aims to ensure trading fairness and non-repudiation and to resist the secondary reselling attack.

Trade fairness: Fair trading means that a DU must pay the corresponding fee once it obtains the correct datasets and that the DOs gain the corresponding benefits from selling their datasets. Meanwhile, a DU is allowed to refuse payment and submit a complaint after finding a problem with the data [7].
Trade non-repudiation: On the one hand, DOs or DUs may deny the trade for some reason, which the proposal should prevent. On the other hand, the proposal should resist the forged-evidence attack, namely, it should prevent malicious DUs from rejecting payment by using forged decryption keys or forged signatures of the data ciphertext during a dispute.
Anti-secondary-reselling attack: The secondary reselling attack means that malicious DUs may resell the data to gain benefits. The proposal should prevent the secondary resale of data on the chain to protect the interests of DOs.

Preliminaries

Multi-signature

Multi-signature allows multiple signers to sign for an identical message. Wang et al. proposed a blockless verifiable multi-signature scheme and employed it in the public verification mechanism in the cloud environment. The multi-signature scheme consists of the following algorithms [22]:

$\textbf{MS}\boldsymbol.\textbf{Setup}\boldsymbol(1^\lambda\boldsymbol)\rightarrow params$: input a security parameter $\lambda$, and outputs $params \leftarrow {(}p,{\mathbb{G}},{\mathbb{G}}_{T} ,e,g,{\mathcal{H}}{)}$, where $g \in_{R} {\mathbb{G}}$ is a generator of ${\mathbb{G}}$, $e:{\mathbb{G}} \times {\mathbb{G}} \to {\mathbb{G}}_{T}$ is a bilinear map, ${\mathcal{H}}:\{ 0,1\}^{ * } \to {\mathbb{G}}$ is a hash function.

$\mathbf{MS.KGen}(params) \to (vsk_i, vpk_i)$: let $\mathcal{I} = \{I_i\}_{i \in [1,n]}$ be the set of $n$ signers. Given global parameters $params$, each $I_{i} \in \mathcal{I}$ generates $vsk_{i} = v_{i} \in_{R} \mathbb{Z}_{p}^{*}$ and $vpk_{i} = g^{v_{i}} \in \mathbb{G}$.

$\textbf{MS}\boldsymbol.\mathbf{Sign}\boldsymbol(m\boldsymbol,mid\boldsymbol,vsk_i\boldsymbol)\rightarrow\psi_i$: given block $m \in {\mathbb{Z}}_{p}^{*} \,$ and its identifier $mid$, $I_{i} \in {\mathcal{I}}$ computes local signature respectively $\psi_i=\boldsymbol\lbrack\mathcal H\boldsymbol(mid\boldsymbol)g^m\boldsymbol\rbrack^{v_i}$ and broadcasts $\psi_{i}$.

$\mathbf{MS.SAgg}(\{ vpk_{i} \}_{i = 1}^{n}, \{ \psi_{i} \}_{i = 1}^{n}) \to \psi$: after receiving $n$ local signatures $\{ \psi_{i} \}_{i = 1}^{n}$ on block $m$, the designated signer outputs the aggregated signature $\psi = \prod\nolimits_{i = 1}^{n} \psi_{i}$.

$\mathbf{MS.Verify}(m, mid, \psi, \{ vpk_{i} \}_{i = 1}^{n}) \to (1, 0)$: given block $m \in \mathbb{Z}_{p}^{*}$ and its identifier $mid$, signature $\psi$, and all signers’ public keys $\{ vpk_{i} \}_{i = 1}^{n}$, the verifier computes $pk_{s} = \prod\nolimits_{i = 1}^{n} vpk_{i}$ and outputs 1 if $e(\psi, g) = e(\mathcal{H}(mid) \cdot g^{m}, pk_{s})$, or 0 otherwise. The correctness of the above equation can be proved as

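$$\begin{array}{ll}e\left(\psi ,g\right)& =e\left(\prod\nolimits_{i = 1}^{n} \psi_{i} ,g\right)\\&=e\left(\prod\nolimits_{i = 1}^{n} (\mathcal{H}(mid)g^{m})^{v_{i}} ,g\right)\\&=e\left((\mathcal{H}(mid)g^{m})^{\sum\nolimits_{i = 1}^{n} v_{i}} ,g\right)\\&=e\left(\mathcal{H}(mid)g^{m} ,g^{\sum\nolimits_{i = 1}^{n} v_{i}}\right)\\&=e\left(\mathcal{H}\left(mid\right)g^{m} ,pk_{s}\right)\end{array}$$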
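The algebra of the multi-signature can be checked with a minimal Python sketch. It is not a secure implementation and not the code of [22]: as an assumption made here for illustration, every group element $g^{a}$ is stored as its exponent $a$ modulo a toy prime `P`, so multiplication becomes addition, exponentiation becomes multiplication, and the pairing becomes a product of exponents. All function names are hypothetical.

```python
import hashlib
import secrets

P = 2**61 - 1  # toy prime group order (assumption, chosen only for the demo)

# Toy model: a group element g^a is stored as its exponent a mod P.
# This exposes discrete logs, so it checks the algebra only and is NOT secure.
def g_mul(a, b):
    return (a + b) % P      # g^a * g^b

def g_pow(a, k):
    return (a * k) % P      # (g^a)^k

def pair(a, b):
    return (a * b) % P      # e(g^a, g^b), again stored as an exponent

G = 1                       # the generator g = g^1

def hash_to_group(mid: str) -> int:
    """Models H: {0,1}* -> G as a hash to a nonzero exponent."""
    digest = hashlib.sha256(mid.encode()).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1

def ms_keygen():
    v = secrets.randbelow(P - 1) + 1
    return v, g_pow(G, v)   # (vsk_i, vpk_i = g^{v_i})

def ms_sign(m, mid, vsk):
    # psi_i = [H(mid) * g^m]^{v_i}
    return g_pow(g_mul(hash_to_group(mid), g_pow(G, m)), vsk)

def ms_aggregate(sigs):
    psi = 0                 # identity element g^0
    for s in sigs:
        psi = g_mul(psi, s)
    return psi

def ms_verify(m, mid, psi, vpks):
    pk_s = 0
    for vpk in vpks:
        pk_s = g_mul(pk_s, vpk)
    base = g_mul(hash_to_group(mid), g_pow(G, m))
    return pair(psi, G) == pair(base, pk_s)

if __name__ == "__main__":
    keys = [ms_keygen() for _ in range(3)]
    psi = ms_aggregate([ms_sign(42, "block-7", vsk) for vsk, _ in keys])
    print(ms_verify(42, "block-7", psi, [vpk for _, vpk in keys]))
```

A real deployment would replace the toy arithmetic with a pairing-friendly elliptic-curve library; the control flow of `ms_sign`, `ms_aggregate`, and `ms_verify` would stay the same.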

Multi-message Pointcheval-Sanders multi-signature

Pointcheval-Sanders multi-signature is an interesting privacy-enhanced cryptographic primitive due to supporting public key aggregation, signature re-randomization, and efficient signature zero-knowledge proof. PS multi-signature consists of the following algorithms [32, 35]:

$\textbf{PSM}\boldsymbol.\mathbf{Setup}\boldsymbol(1^\lambda\boldsymbol)\rightarrow params$: input a security parameter $\lambda$, and output global parameters $params \leftarrow \left(p,\mathbb{G},{\tilde{\mathbb{G}}},\mathbb{G}_{T},e,g,\tilde{g},n,k\right)$, where $g \in_{R} {\mathbb{G}}$ and $\tilde{g} \in_{R} {\tilde{\mathbb{G}}}$ are the generators of ${\mathbb{G}}$ and ${\tilde{\mathbb{G}}}$ respectively. $e:{\mathbb{G}} \times {\tilde{\mathbb{G}}} \to {\mathbb{G}}_{T}$ is a bilinear map, $n$ and $k$ are the number of signers and messages respectively.

$\textbf{PSM}\boldsymbol.\mathbf{KGen}\boldsymbol(params\boldsymbol)\rightarrow\boldsymbol(sk\boldsymbol,pk\boldsymbol)$: given $params$, returns key pair $\boldsymbol(sk\boldsymbol,pk\boldsymbol)$. This algorithm randomly selects $(x,y_1,\cdots,y_{k+1}\boldsymbol)\in_R\mathbb{Z}_p^\ast$, computes $\boldsymbol(\widetilde X,{\widetilde Y}_1,\cdots,{\widetilde Y}_{k+1}\boldsymbol)\leftarrow\boldsymbol(\widetilde g^x,\widetilde g^{y_1},\cdots,\widetilde g^{y_{k+1}}\boldsymbol)$, and sets $sk\leftarrow(x,y_1,\cdots,y_{k+1}\boldsymbol)$, $pk\leftarrow\boldsymbol(\widetilde X,{\widetilde Y}_1,\cdots,{\widetilde Y}_{k+1}\boldsymbol)$.

$\mathbf{PSM.KAgg}(pk_1,\cdots,pk_n) \to apk$: given $(pk_{1}, \cdots, pk_{n})$, where $pk_{i} = (\tilde{g}^{x_{i}}, \tilde{g}^{y_{i,1}}, \cdots, \tilde{g}^{y_{i,k+1}})$ is the public key of the $i$-th signer, generates $(t_{1}, \cdots, t_{n}) \leftarrow \mathcal{H}_{1}(pk_{1}, \cdots, pk_{n})$ and then returns $apk := (\tilde{X}^{\prime}, \tilde{Y}_{1}^{\prime}, \cdots, \tilde{Y}_{k+1}^{\prime}) = \prod_{i = 1}^{n} pk_{i}^{t_{i}} = (\tilde{g}^{\sum\nolimits_{i=1}^{n} x_{i} t_{i}}, \tilde{g}^{\sum\nolimits_{i=1}^{n} y_{i,1} t_{i}}, \cdots, \tilde{g}^{\sum\nolimits_{i=1}^{n} y_{i,k+1} t_{i}})$, where the product is taken component-wise.

$\textbf{PSM}\boldsymbol.\mathbf{Sign}\boldsymbol(sk\boldsymbol,\mathbf m=(m_1,\cdots,m_k\boldsymbol)\boldsymbol)\rightarrow\sigma$: given message $\mathbf m=(m_1,\cdots,m_k\boldsymbol)$, private key $sk$, generates $(m^{\prime},h) = {\mathcal{H}}_{0} {(}m_{1} , \cdots ,m_{k} {)} \in {\mathbb{Z}}_{p}^{*} \times {\mathbb{G}}^{*}$, $\sigma_{1} {: = }h, \, \sigma_{2} {: = }h^{{x + \sum_{j = 1}^{k} y_{j} \cdot m_{j} + y_{k + 1} \cdot m^{\prime}}}$ and returns $\sigma : = (m^{\prime},\sigma_{1} ,\sigma_{2} ) \leftarrow (m^{\prime},h,h^{{(x + \sum\nolimits_{j = 1}^{k} {y_{j} \cdot m_{j} } + y_{k + 1} \cdot m^{\prime})}} )$.

$\mathbf{PSM.SAgg}((pk_i)_{i=1}^n, \mathbf{m}=(m_1,\cdots,m_k), (\sigma_i)_{i=1}^n) \to \sigma$: after receiving $n$ local signatures $\sigma_{i} := (m^{\prime}, \sigma_{i,1}, \sigma_{i,2})_{i \in [n]}$ on message $\mathbf{m}=(m_1,\cdots,m_k)$, computes $\sigma_{2} := \prod_{i = 1}^{n} \sigma_{i,2}^{t_{i}} = h^{\xi + \sum_{\ell = 1}^{k} u_{\ell} m_{\ell} + u_{k + 1} m^{\prime}}$, where $\xi := \sum_{i = 1}^{n} x_{i} t_{i}$, $u_{\ell} = \sum_{i = 1}^{n} y_{i,\ell} t_{i}$ for $\ell \in [1,k]$, and $u_{k + 1} = \sum_{i = 1}^{n} y_{i,k+1} t_{i}$, and outputs the aggregate signature $\sigma = (m^{\prime}, \sigma_{1}, \sigma_{2})$ with $\sigma_{1} = h$.

$\mathbf{PSM.Verf}(apk, \sigma, (m_1,\cdots,m_k)) \to 1/0$: given message $\mathbf{m}=(m_1,\cdots,m_k)$, signature $\sigma = (m^{\prime}, \sigma_{1}, \sigma_{2})$, and aggregated public key $apk$, this algorithm checks whether $\sigma_{1} \ne 1_{\mathbb{G}}$ and $e\left(\sigma_{1}, \tilde{X}^{\prime} \cdot \prod\nolimits_{j = 1}^{k} \tilde{Y}_{j}^{\prime\, m_{j}} \cdot \tilde{Y}_{k+1}^{\prime\, m^{\prime}}\right) = e\left(\sigma_{2}, \tilde{g}\right)$. It outputs 1 if both conditions hold and 0 otherwise. The correctness of the above equation can be proved as

$$\begin{array}{ll}e\left(\sigma_{2},\tilde{g}\right)&= e\left(h^{\xi + \sum\nolimits_{j = 1}^{k} u_{j} \cdot m_{j} + u_{k + 1} \cdot m^{\prime}}, \tilde{g}\right)\\&= e\left(h, \tilde{g}^{\xi + \sum\nolimits_{j = 1}^{k} u_{j} \cdot m_{j} + u_{k + 1} \cdot m^{\prime}}\right)\\&= e\left(h, \tilde{g}^{\sum\nolimits_{i = 1}^{n} x_{i} t_{i}} \cdot \tilde{g}^{\sum\nolimits_{i = 1}^{n} \sum\nolimits_{j = 1}^{k} y_{i,j} t_{i} \cdot m_{j}} \cdot \tilde{g}^{\sum\nolimits_{i = 1}^{n} y_{i,k+1} t_{i} \cdot m^{\prime}}\right)\\&= e\left(\sigma_{1}, \tilde{X}^{\prime} \cdot \prod\nolimits_{j = 1}^{k} \tilde{Y}_{j}^{\prime\, m_{j}} \cdot \tilde{Y}_{k+1}^{\prime\, m^{\prime}}\right)\end{array}$$
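The following Python sketch walks through key aggregation, signing, signature aggregation, verification, and re-randomization for the multi-message PS multi-signature. It is an illustration under loud assumptions, not the construction of [32, 35] as implemented by the authors: group elements are stored as exponents modulo a toy prime `P` (so a public key is numerically equal to its secret key and nothing is hidden), the hash functions $\mathcal{H}_0$ and $\mathcal{H}_1$ are modelled with SHA-256, and all function names are hypothetical.

```python
import hashlib
import secrets

P = 2**61 - 1  # toy prime group order (assumption)

# Toy model: g^a, g~^a and e(g, g~)^a are all stored as the exponent a mod P.
# Discrete logs are visible, so this checks the algebra only and is NOT secure.
def g_mul(a, b):
    return (a + b) % P

def g_pow(a, k):
    return (a * k) % P

def pair(a, b):
    return (a * b) % P

def hash_to_scalar(tag, *parts):
    data = "|".join([tag, *map(str, parts)]).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % (P - 1) + 1

def psm_keygen(k):
    sk = [secrets.randbelow(P - 1) + 1 for _ in range(k + 2)]  # x, y_1..y_{k+1}
    pk = [g_pow(1, s) for s in sk]  # g~^x, g~^{y_j}; equals sk in this toy model
    return sk, pk

def psm_coeffs(pks):
    # (t_1, ..., t_n) <- H_1(pk_1, ..., pk_n), one scalar per signer
    flat = [c for pk in pks for c in pk]
    return [hash_to_scalar("H1", i, *flat) for i in range(len(pks))]

def psm_kagg(pks):
    apk = [0] * len(pks[0])
    for pk, t_i in zip(pks, psm_coeffs(pks)):
        apk = [g_mul(a, g_pow(c, t_i)) for a, c in zip(apk, pk)]
    return apk

def psm_sign(sk, msgs):
    m_prime = hash_to_scalar("H0-m", *msgs)   # (m', h) <- H_0(m_1..m_k)
    h = hash_to_scalar("H0-h", *msgs)
    x, ys = sk[0], sk[1:]
    e = (x + sum(y * m for y, m in zip(ys, list(msgs) + [m_prime]))) % P
    return m_prime, h, g_pow(h, e)

def psm_sagg(pks, sigs):
    m_prime, h, _ = sigs[0]
    s2 = 0
    for (_, _, s_i2), t_i in zip(sigs, psm_coeffs(pks)):
        s2 = g_mul(s2, g_pow(s_i2, t_i))
    return m_prime, h, s2

def psm_verify(apk, sig, msgs):
    m_prime, s1, s2 = sig
    if s1 == 0:                                # sigma_1 must not be the identity
        return False
    rhs = apk[0]
    for Y, m in zip(apk[1:], list(msgs) + [m_prime]):
        rhs = g_mul(rhs, g_pow(Y, m))
    return pair(s1, rhs) == pair(s2, 1)

def psm_rerandomize(sig):
    m_prime, s1, s2 = sig
    r = secrets.randbelow(P - 2) + 2           # r in [2, P-1], so r != 1
    return m_prime, g_pow(s1, r), g_pow(s2, r)
```

Calling `psm_rerandomize` on an aggregate signature yields a different pair $(\sigma_1^{r}, \sigma_2^{r})$ that still passes `psm_verify`, which is the property that the credential scheme relies on for multi-show unlinkability.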

ElGamal encryption

ElGamal encryption is a public key encryption primitive and consists of the following algorithms [36]:

$\mathbf{ElG.KGen} (\mathbb{G}) \to (sk,pk)$: given ${\mathbb{G}}$, sets $sk = \, x \in_{R} {\mathbb{Z}}_{p}^{*}$, $pk = g^{x}$ and returns key pair $(sk,pk)$.

$\mathbf{ElG}\boldsymbol.\mathbf{Enc}\boldsymbol(pk\boldsymbol,m\boldsymbol)\rightarrow c$: given public key $pk$, message $m \in {\mathbb{Z}}_{p}^{*}$, randomly selects $\, r \in_{R} {\mathbb{Z}}_{p}^{*}$ and returns corresponding ciphertext $c=\boldsymbol(c_1,c_2\boldsymbol)=\boldsymbol(g^r,pk^rg^m\boldsymbol)$.

$\mathbf{ElG}\boldsymbol.\mathbf{Dec}\boldsymbol(sk\boldsymbol,c\boldsymbol)\rightarrow g^m$: given private key $sk$ and ciphertext $c$, returns $c_{2} /c_{1}^{sk} = g^{m}$.
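A minimal Python sketch of this exponential ElGamal variant is given below. The group is an assumption made for the demo: the order-1019 subgroup of $\mathbb{Z}_{2039}^{*}$ generated by 4, which is far too small for real use. The helper `small_dlog` is hypothetical and only illustrates that decryption returns $g^{m}$, so recovering $m$ itself requires $m$ to come from a small, searchable range.

```python
import secrets

# Toy group (assumption): MOD = 2*Q + 1 is a safe prime and G = 4 generates
# the subgroup of prime order Q. Far too small for real security.
MOD, Q, G = 2039, 1019, 4

def elg_keygen():
    sk = secrets.randbelow(Q - 1) + 1          # sk = x in Z_Q^*
    return sk, pow(G, sk, MOD)                 # pk = g^x

def elg_enc(pk, m):
    r = secrets.randbelow(Q - 1) + 1
    c1 = pow(G, r, MOD)                        # g^r
    c2 = pow(pk, r, MOD) * pow(G, m, MOD) % MOD  # pk^r * g^m
    return c1, c2

def elg_dec(sk, c):
    c1, c2 = c
    mask_inv = pow(pow(c1, sk, MOD), MOD - 2, MOD)  # (c1^sk)^{-1} via Fermat
    return c2 * mask_inv % MOD                 # equals g^m, not m itself

def small_dlog(gm, bound):
    """Brute-force m from g^m when m is known to lie in [0, bound)."""
    acc = 1
    for m in range(bound):
        if acc == gm:
            return m
        acc = acc * G % MOD
    return None

if __name__ == "__main__":
    sk, pk = elg_keygen()
    gm = elg_dec(sk, elg_enc(pk, 42))
    print(gm == pow(G, 42, MOD), small_dlog(gm, 100))
```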

Plaintext checkable encryption

Plaintext checkable encryption (PCE) is a new public key encryption primitive, which provides a feasible way to check whether $c \in {\mathcal{C}}$ is the ciphertext of the plaintext message $m \in {\mathcal{M}}$ under the public key $ppk \in {\mathbb{G}}$ (where ${\mathcal{M}}$ and ${\mathcal{C}}$ respectively represent plaintext space and ciphertext space). PCE consists of the algorithms described below [37]:

$\mathbf{PCE.Setup}(1^\lambda) \to params$: inputs a security parameter $\lambda$ and outputs global parameters $params \leftarrow (p, \mathbb{G}, \tilde{\mathbb{G}}, \mathbb{G}_{T}, e, g, \tilde{g})$, where $g \in_{R} \mathbb{G}$ and $\tilde{g} \in_{R} \tilde{\mathbb{G}}$ are generators of $\mathbb{G}$ and $\tilde{\mathbb{G}}$ respectively, and $e: \mathbb{G} \times \tilde{\mathbb{G}} \to \mathbb{G}_{T}$ is a bilinear map.

$\mathbf{PCE.KGen}(params) \to (psk, ppk)$: given $params$, sets $psk := x \in_{R} \mathbb{Z}_{p}^{*}$ and $ppk := g^{x}$, and returns the key pair $(psk, ppk)$.

$\mathbf{PCE.Enc}(ppk, m) \to c$: given public key $ppk$ and message $m$, randomly selects $\alpha, \beta \in_{R} \mathbb{Z}_{p}^{*}$ and returns the corresponding ciphertext $c = (m g^{\alpha x}, g^{\alpha}, \tilde{g}^{\beta}, \tilde{g}^{\alpha\beta})$, where $g^{\alpha x} = ppk^{\alpha}$.

$\mathbf{PCE.Dec}(psk, c) \to m$: given private key $psk$ and ciphertext $c = (c_{1}, c_{2}, c_{3}, c_{4})$, if $c$ satisfies $e(g, c_{4}) = e(c_{2}, c_{3})$, the algorithm returns the corresponding plaintext $m = c_{1} / c_{2}^{psk}$; otherwise it returns $\bot$.

$\mathbf{PCE.Check}(m, ppk, c) \to (1, 0)$: given public key $ppk$, ciphertext $c = (c_{1}, c_{2}, c_{3}, c_{4})$, and decrypted message $m$, this algorithm first verifies that the ciphertext is well formed and returns 0 if $e(g, c_{4}) \neq e(c_{2}, c_{3})$. It then checks whether $e(c_{1}/m, c_{3}) = e(ppk, c_{4})$ and outputs 1 if the equation holds, meaning that $c$ is an encryption of $m$, and 0 otherwise.
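The correctness of the two pairing checks and of decryption follows directly from bilinearity. As a short worked derivation, for an honestly generated ciphertext $c = (c_{1}, c_{2}, c_{3}, c_{4}) = (m g^{\alpha x}, g^{\alpha}, \tilde{g}^{\beta}, \tilde{g}^{\alpha\beta})$ under $ppk = g^{x}$:

$$\begin{array}{ll}e\left(g, c_{4}\right)&= e\left(g, \tilde{g}^{\alpha\beta}\right) = e\left(g, \tilde{g}\right)^{\alpha\beta} = e\left(g^{\alpha}, \tilde{g}^{\beta}\right) = e\left(c_{2}, c_{3}\right)\\e\left(c_{1}/m, c_{3}\right)&= e\left(g^{\alpha x}, \tilde{g}^{\beta}\right) = e\left(g, \tilde{g}\right)^{\alpha\beta x} = e\left(g^{x}, \tilde{g}^{\alpha\beta}\right) = e\left(ppk, c_{4}\right)\\c_{1}/c_{2}^{psk}&= m g^{\alpha x} / g^{\alpha x} = m\end{array}$$

The second line is what lets anyone holding only $ppk$ test a candidate plaintext against $c$ without the private key.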

Zero-knowledge proofs

A zero-knowledge proof (ZKP) allows a prover to convince a verifier that a claim is true without revealing any information about the claim beyond its truth. In general, it consists of the following algorithms:

$\textbf{NIZK}\boldsymbol.\mathbf{Setup}\boldsymbol(1^\lambda\boldsymbol)\rightarrow params$: inputs a security parameter $\lambda$, and outputs global parameters $params$.

$\mathbf{NIZK.Proof}\{x:R = (s,w)\} \to \pi$: takes $s$ and $w$ as inputs; the prover constructs a proof $\pi$ that the secret value $x$ satisfies the relation $R=(s,w)$ in a zero-knowledge way.

$\textbf{NIZK}\boldsymbol.\mathbf{Verf}(params\boldsymbol,s\boldsymbol,\pi)\rightarrow\boldsymbol(1\boldsymbol,0\boldsymbol)$: The verifier verifies the correctness of the proof $\pi$.
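
As a concrete instance of the prove/verify interface, the sketch below implements a Schnorr proof of knowledge of a discrete logarithm, made non-interactive with the Fiat-Shamir heuristic. It is offered as a familiar example of a NIZK, not as the proof system used in this paper. The toy parameters (`P = 2039`, `Q = 1019`, `G = 4`) and the function names are assumptions chosen for readability and are far too small for real use.

```python
import hashlib
import secrets

# Toy parameters (assumption): P = 2Q + 1 with P, Q prime; G generates the order-Q subgroup.
P, Q, G = 2039, 1019, 4

def challenge(*values):
    """Fiat-Shamir challenge: hash the public transcript into Z_Q."""
    data = "|".join(str(v) for v in values).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q

def nizk_prove(x):
    """Prove knowledge of x such that y = G^x mod P; returns (y, proof)."""
    y = pow(G, x, P)
    r = secrets.randbelow(Q)      # one-time nonce
    t = pow(G, r, P)              # commitment
    c = challenge(G, y, t)
    z = (r + c * x) % Q           # response
    return y, (t, z)

def nizk_verify(y, proof):
    """Accept iff G^z == t * y^c mod P."""
    t, z = proof
    c = challenge(G, y, t)
    return pow(G, z, P) == (t * pow(y, c, P)) % P
```

The verifier learns that the prover knows $x$ but sees only $(t, z)$, which can be simulated without $x$ in the random-oracle model.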

Verifiable secret sharing

The $(\tau,\eta)$ VSS scheme allows a dealer to share the secret $s$ with $\eta$ participants $\{S_{1},\cdots,S_{\eta}\}$ such that $\tau$-out-of-$\eta$ participants can jointly combine their shares and recover $s$. To prevent dishonest dealers from cheating when generating shares, VSS can support proofs of correct share encryption and public verifiability. Specifically, the VSS scheme consists of the following algorithms [38]:

$\textbf{VSS}\boldsymbol.\mathbf{KGen}(params\boldsymbol,i\in{\lbrack1},\eta{\rbrack)}\rightarrow(opk_i,osk_i)$: Each participant $S{(}i{)}$ runs $\mathbf{ElG}\boldsymbol.\mathbf{KGe}\textbf{n}$ algorithm to obtain key pair ${(}opk_{i} ,osk_{i} {)}$, where $osk_{i} \leftarrow z_{i} \in_{R} {\mathbb{Z}}_{p}^{*}$, $opk_{i} \leftarrow \tilde{f}_{i} : = \tilde{g}^{{z_{i} }} \in {\tilde{\mathbb{G}}}^{*}$.

$\mathbf{VSS.Share}(\boldsymbol{opk},s,h,\tilde{g})\rightarrow((V_\ell)_{\ell\in[1,\tau]},(\tilde{C}_i,\pi_{Pi})_{i\in[1,\eta]})$: The dealer distributes the shares of the secret $s$ to the participants $S(i)$. Firstly, the dealer randomly selects $(p_{1},\cdots,p_{\tau}) \in_{R} \mathbb{Z}_{p}$ and computes the polynomial $\mathrm{P}(\chi)\leftarrow s+\sum_{\ell=1}^{\tau} p_\ell\chi^\ell\in\mathbb{Z}_p[\chi]$ and $H_{s} \leftarrow h^{s}$. Secondly, for all $i \in [1,\eta]$, the dealer sets $s_{i} \leftarrow \mathrm{P}(i)$ as the share of the secret $s$ and distributes $s_{i}$ to $S(i)$. Next, for all $\ell \in [1,\tau]$, the dealer computes and publishes the verification value $V_{\ell} \leftarrow h^{p_{\ell}}$. Then, for all $i \in [1,\eta]$, the dealer randomly selects $r_{i} \in_{R} \mathbb{Z}_{p}$ and computes both $\tilde{C}_i:=(\tilde{C}_{i,0},\tilde{C}_{i,1})\leftarrow(\tilde{g}^{r_i},\tilde{f}_i^{r_i}\tilde{g}^{s_i})$ and $\pi_{Pi}\leftarrow\mathbf{NIZK.Prove}\{r_i:\tilde{C}_{i,0}=\tilde{g}^{r_i},\ e(h,\tilde{C}_{i,1}/\tilde{f}_i^{r_i})=e(H_s\prod_{\ell=1}^{\tau} V_\ell^{i^\ell},\tilde{g})\}$. Finally, the dealer broadcasts $((V_\ell)_{\ell\in[1,\tau]},(\tilde{C}_i,\pi_{Pi})_{i\in[1,\eta]})$ to all $S(i)$.

$\mathbf{VSS.Verf}(\tilde{C}_{i},\pi_{Pi}) \to (1_{i},0)$: Each participant $S(i)$ checks the correctness of the share encryption by verifying the zero-knowledge proof $\pi_{Pi}$.

$\mathbf{VSS.Recover}(\{\tilde{C}_{i},opk_{i}\}_{i=1}^{\tau}) \to \tilde{g}^{s}$: Let $\mathcal{O}$ be the set of $\tau$ participants. Firstly, each $S_{i} \in \mathcal{O}$ computes $\tilde{C}_{i}^{*} = \tilde{C}_{i,1}/(\tilde{C}_{i,0}^{z_{i}}) = \tilde{g}^{s_{i}}$ with its secret key and broadcasts $\tilde{C}_{i}^{*}$ to the other participants. If the recovered shares received from the other participants $S(j)_{j \in \mathcal{O}\setminus\{i\}}$ and $\tilde{C}_{i}^{*}$ are all different from $\bot$ (namely, all the recovered shares are correct), then for all $i \in \mathcal{O}$ the openers calculate the Lagrange coefficient $w_{i} \leftarrow \prod_{j \in \mathcal{O}\setminus\{i\}} j/(j-i)$ and then calculate $\prod_{i \in \mathcal{O}} (\tilde{C}_{i}^{*})^{w_{i}} = \tilde{g}^{s}$. The secret $s$ is thus reconstructed in the exponent of $\tilde{g}$.
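
The sharing, share-verification and recovery steps can be illustrated with a small runnable sketch. It is a simplification under stated assumptions: it works in a single toy group (a subgroup of $\mathbb{Z}_P^*$ with `P = 2039`, `Q = 1019`) instead of a pairing group, it checks shares with a Feldman-style equation $h^{s_i} = H_s\prod_{\ell} V_\ell^{i^\ell}$ rather than the pairing-based zero-knowledge proof, and it omits the ElGamal encryption of the shares. All function names are hypothetical. Note that a polynomial of degree `t`, as written in VSS.Share, needs `t + 1` points for interpolation, so the example recovers from `t + 1` shares.

```python
import secrets

P, Q = 2039, 1019   # toy group (assumption): subgroup of prime order Q in Z_P^*
H, G = 9, 4         # two generators of that subgroup, standing in for h and g~

def share(s, t, n):
    """Split s with a degree-t polynomial; return shares and public commitments."""
    coeffs = [s] + [secrets.randbelow(Q) for _ in range(t)]  # s, p_1, ..., p_t
    shares = {i: sum(c * pow(i, l, Q) for l, c in enumerate(coeffs)) % Q
              for i in range(1, n + 1)}
    commits = [pow(H, c, P) for c in coeffs]  # commits[0] = H_s, commits[l] = V_l
    return shares, commits

def verify_share(i, s_i, commits):
    """Check h^{s_i} == H_s * prod_l V_l^{i^l}."""
    rhs = 1
    for l, v in enumerate(commits):
        rhs = rhs * pow(v, pow(i, l, Q), P) % P
    return pow(H, s_i, P) == rhs

def lagrange_at_zero(i, ids):
    """w_i = prod_{j != i} j / (j - i) mod Q."""
    w = 1
    for j in ids:
        if j != i:
            w = w * j * pow((j - i) % Q, -1, Q) % Q
    return w

def recover(exp_shares):
    """exp_shares maps i -> g~^{s_i}; returns g~^s by interpolation in the exponent."""
    ids = list(exp_shares)
    out = 1
    for i, c in exp_shares.items():
        out = out * pow(c, lagrange_at_zero(i, ids), P) % P
    return out
```

For example, with `t = 2` and `n = 5`, any three participants who publish $\tilde{g}^{s_i}$ can reconstruct $\tilde{g}^{s}$, and nobody learns $s$ itself from the reconstruction.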

Proposal

Overview

We propose a verifiable distributed industrial dataset trading architecture without trusted third parties. Within this architecture, we design a privacy-enhanced verifiable distributed identity management scheme and a verifiable data trading scheme.

In the identity management phase, DU requests verifiable credentials from DOs through CMSC, and DOs employ the PSM multi-signature to issue local credentials for the validated DU. Notably, we provide privacy-enhanced identity management for DU by employing the PSM multi-message signature to realize fine-grained attributes, selective attribute disclosure, and threshold traceability [35]. CMSC is in charge of aggregating the credentials and distributing them to DU. CMSC or DU randomizes and selectively discloses the credentials to CS before data trading. CS verifies the validity of the credentials and provides the corresponding services to DUs with unrevoked and validated credentials. Although verifiable credentials protect users’ privacy through anonymity and selective disclosure, DUs are not always honest. Therefore, RGs need to be able to disclose the identity of malicious credential holders, or even revoke them, in a threshold manner. It is worth noting that threshold traceability helps prevent RGs from abusing their power and further improves the privacy of data users.

The data trading phase aims to secure fair trading of the industrial dataset through the trading management smart contract DTSC. Before trading, DOs encrypt the dataset using symmetric encryption and upload the ciphertext as well as the signature of the dataset to CS. Then CS verifies the signatures and registers the datasets to DTSC. Later, DU retrieves the dataset of interest and submits the data trading request to DTSC. Once DTSC verifies the trading request of DU successfully and CS returns the correct data integrity verification result, DOs send the decryption key, encrypted by the PCE algorithm, to the authorized DU through DTSC. After receiving the PCE ciphertext, DU recovers the AES key with his PCE private key and verifies the decryption result. Next, DU downloads the ciphertext of the dataset and decrypts it to obtain the original data. Finally, DU pays the fee before the specified time if the decryption result is correct. Otherwise, DU will refuse payment and submit evidence for dispute resolution. The dataset integrity verification stems from a publicly verifiable multi-owner data scheme [22], and blockchain is employed to replace the traditional TPA to mitigate the problem of malicious third parties in public auditing.

Distributed identity management

Distributed identity management includes the stages of initialization, credential request, generation, aggregation, presentation, verification, trace and revocation (Fig. 3). Notations are shown in Table 3.

Table 3 Notations


Initialization

In the initialization stage, public parameters and keys related to the subsequent processes are generated.

(1) RGs run the function $\mathbf{CredSetup}$ to generate the public parameters $params$ and then upload $params$ to CMSC. It is assumed that $params$ is an implicit input of all other algorithms.
(2) DO $I_{i} \in \mathcal{I}$ runs the function $\mathbf{IKGen}$ to generate the issuing key pair $(isk_i,ipk_i)$ and initializes the registration list ${\boldsymbol{RegList}}_i$. Then, DOs upload $\boldsymbol{ipk}:=\{ipk_i\}_{i\in[1,n]}$ to CMSC.
(3) RG $S_{j} \in \mathcal{S}$ runs the function $\mathbf{OKGen}$ to generate the trace key pair $(opk_{j},osk_{j})$ and initializes the trace list ${\boldsymbol{TraceList}}_j$. Then, RGs upload $\boldsymbol{opk}:=\{opk_j\}_{j\in[1,\eta]}$ to CMSC.
(4) CMSC runs the $\mathbf{KAgg}$ algorithm to generate the aggregated public key $apk$ and outputs $gpk:=(\boldsymbol{ipk},\boldsymbol{opk},apk)$.

Credential request

In this stage, $U_{uid} \in \mathcal{U}$ with unique identification $uid \in_{R} \mathbb{Z}_{p}^{*}$ runs the function $\mathbf{CredReq}$ to send the credential request ${\boldsymbol{CredReq}}_{uid}$ to $I_{i} \in \mathcal{I}$ for the attribute set $\boldsymbol{Attr}$ through CMSC. Firstly, $U_{uid}$ generates an additional scalar $a^{\prime}$ and a public base $h$, then computes the user’s secret identity values $H_{uid}$ and $G_{uid}$ as well as the zero-knowledge proof $\pi_{uid}$. Next, $U_{uid}$ calls the algorithm $\mathbf{VSS.Share}$ to compute the share $uid_{j}$ of the identity $uid$ for each $S_{j} \in \mathcal{S}$ and sends the shares to all RGs through CMSC. Finally, $U_{uid}$ uploads ${\boldsymbol{CredReq}}_{uid} \leftarrow (G_{uid},H_{uid},\pi_{uid},\{\tilde{C}_{j},\pi_{Pj}\}_{j \in [1,\eta]},\{V_{\ell}\}_{\ell \in [1,\tau]},(a^{\prime},h))$ to CMSC, where $\tilde{C}_{j}$ is the ciphertext of the share $uid_{j}$, $\pi_{Pj}$ is the corresponding zero-knowledge proof, and $\{V_{\ell}\}_{\ell \in [1,\tau]}$ are the verification values of the secret sharing algorithm.

Credential generation

After receiving ${\boldsymbol{CredReq}}_{uid}$, $I_{i} \in \mathcal{I}$ runs the function $\mathbf{CredGen}$ to generate the local credential ${\boldsymbol{CredGen}}_{i,uid}$ for valid users. Specifically, $I_{i}$ will reject the credential request if there is a duplicate registration, an illegal identity, or an incorrect share encryption. Otherwise, $I_{i}$ will call the algorithm $\mathbf{PSM.Sign}$ to blindly sign the user’s identity $uid$ and attributes $\boldsymbol{Attr}$. Finally, $I_{i}$ updates ${\boldsymbol{RegList}}_i$ and uploads ${\boldsymbol{CredGen}}_{i,uid}$ to CMSC.

Credential aggregation

After receiving all local credentials from $I_{i} \in \mathcal{I}$, CMSC runs the function $\mathbf{CredAgg}$ to generate the aggregated credential ${\boldsymbol{CredAgg}}_{uid}$. Specifically, CMSC first calls the algorithm $\mathbf{PSM.Verf}$ to verify whether the local credentials are correct. Next, it runs the algorithm $\mathbf{PSM.SAgg}$ to aggregate $\{{\boldsymbol{CredGen}}_{i,uid}\}_{i\in[1,n]}$ into the complete credential ${\boldsymbol{CredAgg}}_{uid}$. Then the algorithm $\mathbf{PSM.Verf}$ is called again to verify the validity of ${\boldsymbol{CredAgg}}_{uid}$. If the validation passes, ${\boldsymbol{CredAgg}}_{uid}$ is issued to $U_{uid}$, who stores the credential in his local wallet.

Credential presentation

When required to present trading qualifications, $U_{uid}$ personally presents the credential ${\boldsymbol{Cred}}_{uid,\,\rho}$ on the selectively disclosed attribute subset $\rho\subseteq\boldsymbol{Attr}$. The function $\mathbf{CredProve}$ is called to randomize the signature of the credential ${\boldsymbol{CredAgg}}_{uid}$ and generate the proof of the signature.

Credential verification

During the trading, CS will check the validity of $\boldsymbol C\boldsymbol r\boldsymbol e{\boldsymbol d}_{uid,\,\rho}$ through function $\textbf{CredVerify}$, namely, verify the signature and its zero-knowledge proof in $\boldsymbol C\boldsymbol r\boldsymbol e{\boldsymbol d}_{uid,\,\rho}$.

Threshold trace and user revocation

When RGs receive complaints or detect abnormal credentials, $\tau + 1$ RGs $\mathcal{O}_{i} \in \mathcal{O}$ will run the function $\mathbf{Trace}$ to recover the identity of the $U_{uid}$ who computed the credential ${\boldsymbol{Cred}}_{uid,\,\rho}$. Firstly, $\mathcal{O}_{i} \in \mathcal{O}$ runs $\mathbf{CredVerify}(apk,\rho,\boldsymbol{Attr},{\boldsymbol{Cred}}_{uid,\,\rho})$ to verify the validity of ${\boldsymbol{Cred}}_{uid,\,\rho}$. Then, $\mathcal{O}_{j} \in \mathcal{O}$ downloads ${\boldsymbol{CredReq}}_{uid} \leftarrow (G_{uid},H_{uid},\pi_{uid},\{V_{\ell}\}_{\ell \in [1,\tau]},\{\tilde{C}_{j},\pi_{Pj}\}_{j \in [1,\eta]})$ from CMSC and updates the trace list ${\boldsymbol{TraceList}}_j[uid]\leftarrow\tilde{C}_j^\ast=\tilde{C}_{j,1}/(\tilde{C}_{j,0}^{z_j})=(\tilde{Y}_0^{\prime})^{uid_j}$, where $uid_{j}$ is the share of $\mathcal{O}_{j} \in \mathcal{O}$ of the secret identity $uid$. Next, $\mathcal{O}_{i} \in \mathcal{O}$ broadcasts his own recovered share, collects the shares of the other RGs $\mathcal{O}_{q} \in \mathcal{O}$, and finally calls the algorithm $\mathbf{VSS.Recover}$ to recover $uid$. If no such identity is found, the algorithm returns $\bot$.

When RGs judge that $U_{uid}$ needs to be revoked, they submit the revocation update information to CMSC, which is in charge of maintaining the revocation registry and revoking the corresponding credentials.

Verifiable data trading

The verifiable data trading scheme mainly includes initialization, dataset encryption and upload, dataset validation and registration, and the trading stage. Furthermore, the trading stage consists of data trading request, data trading response, key verification and decryption, as well as dispute.

Initialization

In this phase, the public parameters $\Gamma := (p,\mathbb{G},\mathbb{G}_{T},e_{1},\hat{g},\mathcal{H}) \leftarrow \mathbf{MS.Setup}(1^{\lambda})$ and $pp := (p,\mathbb{G},\tilde{\mathbb{G}},\mathbb{G}_{T},e_{2},g,\tilde{g}) \leftarrow \mathbf{PCE.Setup}(1^{\lambda})$ are first generated. Then, DO $I_{i} \in \mathcal{I}$ generates the key pair $(vsk_i,vpk_i)\leftarrow\mathbf{MS.KGen}(\Gamma)$ (where $(vsk_{i},vpk_{i}) \leftarrow (v_{i} \in_{R} \mathbb{Z}_{p}^{*},\ g^{v_{i}} \in \mathbb{G})$) for dataset upload and data integrity verification, and $Ks\leftarrow\mathbf{AES.KGen}(1^\lambda)$ for the AES encryption algorithm (the data owners can generate the AES encryption key through negotiation or by selecting a representative). Meanwhile, $U_{uid}$ generates the key pair $(psk,ppk) := (x \in_{R} \mathbb{Z}_{p}^{*},\ g^{x} \in \mathbb{G}) \leftarrow \mathbf{PCE.KGen}(pp)$ for the PCE algorithm.

Dataset upload and registration

In the data upload phase, DOs encrypt and upload the dataset, and submit the data upload smart contract $\mathbf{UploadData}$ to DTSC. Firstly, DOs select the dataset $DATA=\{m_1,\cdots,m_\ell\}$ ($m_j\in\mathbb{Z}_p^\ast$, $j\in[1,\ell]$) and encrypt $DATA$ using the AES encryption algorithm to generate $DATA^\ast=\{\mathbf{Enc}_{Ks}(m_j)\}_{j\in[1,\ell]}$. Then, each $I_{i} \in \mathcal{I}$ runs the function $\psi_i\leftarrow\mathbf{MS.Sign}(m_j,mid_j,vsk_i)$ to generate the local signature $\psi_{i} = \{\psi_{j,i}\}_{j \in [1,\ell]} = \{[\mathcal{H}(mid_{j})\hat{g}^{m_{j}}]^{v_{i}}\}_{j \in [1,\ell]}$ (where $mid_{j}$ is the identifier of block $m_{j}$) and broadcasts it in $\mathcal{I}$. Next, DOs generate the properties of the dataset, including the unique identifier $did$, the index hash $Index:=Hash(DATA\parallel did)$, the dataset description $ProData$ (including file name, file size, keywords, data samples, underlying data model, and other attributes), and the access policy $Policy$. Finally, DOs upload $File=\{DATA^\ast,\psi_i,Index,ProData,Policy,did\}$ to CS and submit the contract $\mathbf{UploadData}$ (as shown in Fig. 4) to DTSC.
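
The structure of the uploaded $File$ and the computation of $Index$ can be sketched as follows. This is an illustrative outline only: SHA-256 is assumed as the hash function (the text does not name one), plain byte concatenation is assumed for $DATA\parallel did$, and the class and field names are hypothetical.

```python
import hashlib
from dataclasses import dataclass

def index_hash(data_blocks, did):
    """Index := Hash(DATA || did), with SHA-256 as an assumed hash function."""
    h = hashlib.sha256()
    for block in data_blocks:       # DATA = {m_1, ..., m_l} as byte strings
        h.update(block)
    h.update(did.encode())
    return h.hexdigest()

@dataclass
class File:
    """Record uploaded to CS: {DATA*, psi_i, Index, ProData, Policy, did}."""
    data_star: list     # AES ciphertext blocks DATA*
    signatures: dict    # local signatures psi_i, keyed by data owner index i
    index: str          # Index = Hash(DATA || did), computed on the plaintext
    pro_data: dict      # description: file name, size, keywords, ...
    policy: str         # access policy
    did: str            # unique dataset identifier
```

Because $Index$ is computed over the plaintext $DATA$, a data user who later decrypts $DATA^\ast$ can recompute the same hash and compare it with the registered value.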

After receiving $File$, CS verifies the local signatures $\{\psi_{j,i}\}_{j\in[1,\ell],\,i\in[1,n]}$ in turn and computes the aggregate signature of each data block $\psi_{j} \leftarrow \mathbf{MS.SAgg}(\{vpk_{i}\}_{i=1}^{n},\{\psi_{j,i}\}_{i=1}^{n})$. Then, CS stores the data and generates the file storage address $Loc_{f}$. Finally, CS submits the data registration smart contract $\mathbf{RegistrData}$ (as shown in Fig. 5) to DTSC. The contract $\mathbf{RegistrData}$ indicates that $File$ has been received by CS.

Trading

The data trading stage mainly includes data trading request, data integrity verification, data trading response, key verification and decryption, as well as dispute.

1) Data trading request

$U_{uid}$ retrieves the needed data through the description $ProData$ and submits the smart contract $\mathbf{RequestData}$ to DTSC. As shown in Fig. 6, $\mathbf{RequestData}$ contains basic data information, trade information and contract code. In addition, $U_{uid}$ needs to provide his PCE public key $ppk_{uid}$, credentials ${\boldsymbol{Cred}}_{uid,\,\rho}$ and access information $AccInfo$. The contract code stipulates that the data owner must provide the correct PCE ciphertext $c_{Ks}$ and data integrity verification results. In return, $\mathbf{RequestData}$ guarantees that $U_{uid}$ will pay the data trade fee $Fee_{Trd}$ to DOs before the time $T_{pay}$ if $U_{uid}$ can correctly decrypt the dataset.
Fig. 6 Trading request smart contract

2) Data integrity verification

In the data integrity verification process, blockchain sends challenge information to CS. CS calculates evidence based on the challenge information and returns the evidence.

In the challenge phase, blockchain generates $\ell$ random numbers $\delta_{1},\cdots,\delta_{\ell} \in_{R} \mathbb{Z}_{p}^{*}$ and sends the challenge request $\text{chal}=(j,\delta_j)_{j\in[1,\ell]}$ to CS. In the evidence generation phase, after receiving the challenge message, CS calculates $\mu = \sum_{j=1}^{\ell} \delta_{j} m_{j} \in \mathbb{Z}_{p}^{*}$ and $\upsilon = \prod_{j=1}^{\ell} \psi_{j}^{\delta_{j}} \in \mathbb{G}$ and returns the verification evidence $(\mu,\upsilon,\{mid_{1},\cdots,mid_{\ell}\})$ to blockchain. In the data verification phase, after receiving the verification evidence $(\mu,\upsilon,\{mid_{1},\cdots,mid_{\ell}\})$ and the public key sequence $\{vpk_{1},\cdots,vpk_{n}\}$ of all data owners, blockchain calculates $pk_{s} = \prod_{i=1}^{n} vpk_{i}$ and verifies whether the equation $e(\upsilon,\hat{g}) = e(\prod_{j=1}^{\ell} \mathcal{H}(mid_{j})^{\delta_{j}} \cdot \hat{g}^{\mu},pk_{s})$ holds. Finally, DTSC sends the data integrity verification result to DOs and DU.
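
To see why the verification equation holds for honestly stored data, the challenge-response flow can be replayed in a toy model. The sketch below is not the authors' implementation and is not secure: group elements are stored as their discrete logarithms modulo a prime, so the pairing becomes a product of exponents. It also makes two assumptions that this section does not state explicitly: that $\mathbf{MS.SAgg}$ multiplies the local signatures, and that the public keys are powers of the same generator $\hat{g}$ used in the signatures. All function names are hypothetical.

```python
import hashlib

# Toy model, NOT secure: an element g^^a is stored as its exponent a mod P.
P = 2**61 - 1  # a prime standing in for the group order p (assumption)

def pairing(a, b):
    return (a * b) % P  # e(g^^a, g^^b) = gT^(a*b)

def hash_to_group(mid):
    """H(mid), stored as an exponent."""
    return int.from_bytes(hashlib.sha256(mid.encode()).digest(), "big") % P

def sign_block(m, mid, v):
    """psi_{j,i} = [H(mid_j) * g^^{m_j}]^{v_i}."""
    return (hash_to_group(mid) + m) * v % P

def aggregate(local_sigs):
    """Assumed MS.SAgg: product of local signatures (sum of exponents)."""
    return sum(local_sigs) % P

def prove(blocks, agg_sigs, chal):
    """CS side: mu = sum delta_j * m_j, upsilon = prod psi_j^{delta_j}."""
    mu = sum(d * blocks[j] for j, d in chal) % P
    upsilon = sum(d * agg_sigs[j] for j, d in chal) % P
    return mu, upsilon

def verify(mu, upsilon, mids, chal, vpks):
    """Chain side: e(upsilon, g^) == e(prod H(mid_j)^{delta_j} * g^^mu, pk_s)."""
    pk_s = sum(vpks) % P  # prod vpk_i
    base = (sum(d * hash_to_group(mids[j]) for j, d in chal) + mu) % P
    return pairing(upsilon, 1) == pairing(base, pk_s)
```

With honest blocks both sides equal $\big(\sum_j \delta_j(\log\mathcal{H}(mid_j)+m_j)\big)\cdot\sum_i v_i$ in the exponent, so the check passes. If CS answers with a modified block, $\mu$ changes while $\upsilon$ does not, and the check fails.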
3) Data trading response

Before the data trading response, the validity of the dataset and of the data user's identity needs to be checked. Firstly, DTSC checks whether the same hash index $Index$ of the dataset already exists on the chain and checks the data validity period. Then, DTSC initiates the data integrity verification request and CS returns the data integrity verification result. If there are duplicate indexes, the dataset has expired, or the data integrity verification fails, DTSC will reject the request and notify DOs. Meanwhile, CMSC will check the revocation registry and call the function $\mathbf{CredVerify}$ to verify the validity of the credentials ${\boldsymbol{Cred}}_{uid,\,\rho}$, and will reject the data request of $U_{uid}$ if the credential is revoked or illegal. Then DTSC will check the access control policy and reject the request if it does not match.

After all checks are completed, DOs will respond to the request and submit the data sell contract $\mathbf{SellData}$ to DTSC. As shown in Fig. 7, $\mathbf{SellData}$ provides the correct ciphertext of $Ks$, i.e. $c_{Ks}\leftarrow\mathbf{PCE.Enc}(ppk_{uid},Ks)$, and meanwhile promises that $U_{uid}$ is allowed to deny the validity of $c_{Ks}$ and initiate dispute handling before the dispute closing time $T_{close}$. The contracts $\mathbf{RequestData}$ and $\mathbf{SellData}$ together ensure fair trading between DOs and $U_{uid}$, that is, honest DOs get paid and an honest $U_{uid}$ gets the correct data.
Fig. 7 Trading response smart contract

4) Key verification and decryption

After receiving the PCE ciphertext, $U_{uid}$ first recovers the AES key $Ks^{\prime}$ through $Ks^{\prime} \leftarrow \mathbf{PCE.Dec}(psk_{uid},c_{Ks})$ and checks whether $Ks^{\prime}$ is the correct decryption of $c_{Ks}$ (i.e. $\mathbf{PCE.Check}(Ks^{\prime},ppk_{uid},c_{Ks}) \overset{?}{=} 1$). Then, $U_{uid}$ downloads the ciphertext of the dataset $DATA^{\ast}$ from CS and decrypts it to obtain the original data (i.e. $DATA^{\prime} \leftarrow \mathbf{DEC}(Ks^{\prime},DATA^{\ast})$). Next, $U_{uid}$ calculates the index value of $DATA^{\prime}$ and checks whether it is consistent with the index value stored on the platform (i.e. $Hash(DATA^{\prime}\parallel did) \overset{?}{=} Hash(DATA\parallel did)$).

If the above checks pass, $U_{uid}$ will pay $Fee_{Trd}$ before $T_{pay}$. Otherwise, if $Ks^{\prime}$ is an incorrect decryption (i.e. $\mathbf{PCE.Check}(Ks^{\prime},ppk_{uid},c_{Ks}) = 0$) or $DATA^{\ast}$ cannot be decrypted correctly (i.e. $\mathbf{DEC}(Ks^{\prime},DATA^{\ast}) = \bot$), $U_{uid}$ will refuse to pay within $T_{pay}$ and initiate a trading dispute before the trading closing time $T_{close}$.
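
The data user's decision at the end of this stage can be summarized in a few lines. The sketch is a hedged illustration of the control flow only: the result of PCE.Check and the decrypted data are passed in as arguments rather than computed, SHA-256 stands in for the unspecified $Hash$, and the function names `index_hash` and `user_decision` are hypothetical.

```python
import hashlib

def index_hash(data: bytes, did: str) -> str:
    """Hash(DATA || did), with SHA-256 as an assumed hash function."""
    return hashlib.sha256(data + did.encode()).hexdigest()

def user_decision(pce_check_result, decrypted, did, registered_index):
    """Return 'pay' or 'dispute' after key verification and decryption.

    pce_check_result: output of PCE.Check(Ks', ppk_uid, c_Ks), 1 or 0
    decrypted:        DATA' as bytes, or None if decryption failed
    registered_index: the Index value stored on the platform
    """
    if pce_check_result != 1:
        return "dispute"                      # Ks' is not a correct decryption of c_Ks
    if decrypted is None:
        return "dispute"                      # DATA* could not be decrypted
    if index_hash(decrypted, did) != registered_index:
        return "dispute"                      # decrypted data does not match Index
    return "pay"                              # pay Fee_Trd before T_pay
```

Only when all three checks succeed does the user pay; any failure leads to a dispute that must be raised before $T_{close}$.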
5) Dispute

In the dispute stage, $Fee_{Trd}$ remains locked until the evidence is confirmed. $U_{uid}$ invokes the $\mathbf{Dispute}$ contract and submits the evidence, including $Hash(DATA^{\prime}\parallel did)$, $Ks^{\prime}$ and the signature $\psi_{j}$. If DTSC verifies $Ks^{\prime}$ successfully but the ciphertext does not decrypt to the registered data (i.e. $\mathbf{PCE.Check}(Ks^{\prime},ppk_{uid},c_{Ks}) = 1$ and $Hash(DATA^{\prime}\parallel did) \ne Hash(DATA\parallel did)$), and meanwhile the signature verification passes ($\mathbf{MS.Verify}(m,mid,\psi_{j},\{vpk_{1},\cdots,vpk_{n}\}) = 1$), the trading fails and the locked $Fee_{Trd}$ will not be paid to the data owner. If any of these conditions is not satisfied, the trading is deemed successful and the locked $Fee_{Trd}$ will be paid to the data owner (Fig. 8).
Fig. 8 Dispute smart contract
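
The adjudication rule of the dispute stage reduces to a single conjunction of three conditions. The sketch below follows the wording of that stage literally and is only an outline of the decision, not the contract code shown in Fig. 8: the verification results are supplied as inputs, and the names `Outcome` and `resolve_dispute` are hypothetical.

```python
from enum import Enum

class Outcome(Enum):
    REFUND_USER = "trading fails; Fee_Trd is not paid to the data owner"
    PAY_OWNER = "trading succeeds; Fee_Trd is paid to the data owner"

def resolve_dispute(pce_check, claimed_index, registered_index, ms_verify):
    """Decide who receives the locked fee.

    pce_check:        PCE.Check(Ks', ppk_uid, c_Ks), 1 or 0
    claimed_index:    Hash(DATA' || did) submitted by the data user
    registered_index: Hash(DATA || did) registered for the dataset
    ms_verify:        MS.Verify(m, mid, psi_j, {vpk_1..vpk_n}), 1 or 0
    """
    user_is_right = (
        pce_check == 1                         # the key was delivered correctly
        and claimed_index != registered_index  # yet the data does not match
        and ms_verify == 1                     # and the signature evidence is valid
    )
    return Outcome.REFUND_USER if user_is_right else Outcome.PAY_OWNER
```

Under this rule the burden of proof lies with the data user: the fee is withheld from the data owners only when all three conditions hold.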

Analysis