CanDiD

AbstractWe present CanDID, a platform for practical, user friendly realization of decentralized identity, the idea of empowering end users with management of their own credentials. While decentralized identity promises to give users greater control over their private data, it burdens users with management of private keys, creating a significant risk of key loss. Existing and proposed approaches also presume the spontaneous availability of a credential-issuance ecosystem, creating a bootstrapping problem. They also omit essential functionality, like resistance to Sybil attacks and the ability to detect misbehaving or sanctioned users while preserving user privacy. CanDID addresses these challenges by issuing credentials in a user-friendly way that draws securely and privately on data from existing, unmodified web service providers. Such legacy compatibility similarly enables CanDID users to leverage their existing online accounts for recovery of lost keys. Using a decentralized committee of nodes, CanDID provides strong confidentiality for user’s keys, real-world identities, and data, yet prevents users from spawning multiple identities and allows identification (and blacklisting) of sanctioned users. We present the CanDID architecture and its technical innovations and report on experiments demonstrating its practical performance.
Year2020
Link to the paperhttps://www.arijuels.com/wp-content/uploads/2020/07/Candid.pdf
Relevance scoreVery relevant
Quality score5
LabelsLegacy compatibleManagement of credentialsSybil resistance insights

TL;DR

CanDID attempts to address four main issues in decentralized identity:

  1. Legacy Compatibility
  1. Sybil resistance
  1. Accountability
  1. Key recovery

In the identity system for Lido, CanDID will play a key role in Sybil-resistance. In other words, CanDID would try to ensure that an operator does not try to act as more than one operator to accrue more rewards. The idea of accountability and key recovery may also be useful but only after some modifications (this is because if we use the system for maintaining accountability and key recovery as is, then the dependency on the CanDID committee is huge.).

Short description

  1. Identity System: CanDID uses services such as DECO or Town Crier to port identities and credentials from web services. These services enable the users to generate trustworthy credentials without issuers generating credentials for them.
    1. Credential Privacy:
      1. Selective disclosure - using zero-knowledge arguments.
      1. Membership privacy - committee members and web service providers cannot obtain the identity of the users as well as the membership of the user (if the user is using CanDID or not).
      1. Pairwise credentials - CanDID enables users to use different credentials for different services and the credential is unlinkable to any other credential for another service.
    1. Sybil Resistance: Supports deduplication of identities using a privacy-preserving MPC method.
    1. Accountability: CanDID can screen users of the system so as to identify the identity of the suspect user. This functionality is privacy-preserving and the committee uses fuzzy matching for finding the identity.
  1. Key Recovery System: Leverage existing web authentication schemes to recover keys. Users may store the keys in their local devices and backup with the CanDID committee, prespecify recovery accounts on the web services, and define the recovery policy. To recover the keys, a user has to prove successful logins based on the recovery policy.

Oracles

An oracle relays and provides assurance that the retrieved data is authentic. It allows users to prove that the origin of the data is authentic. CanDID uses oracles to allow users to pull data from existing systems. CanDID proposes that either DECO or Town Crier can be used for this purpose.

System Overview

Identity System

Goal: To convert already existing legacy data to decentralized identity.

CanDID relies on decentralized oracles such as DECO or Town Crier to port credentials from legacy web services. The CanDID committee acts as the verifier for the ported data.

Source: https://eprint.iacr.org/2020/934.pdf Figure.1. This figure depicts the overview of general workflow of CanDID system. A user ports his legacy data as a pre-credential to the CanDID committee. The committee deduplicates the pre-crdential according to a unique legacy data, and obtains a master credential. Using the master credential and a context information, the committee generates a context-based credential which can be used by the user only in that context. Later on those credentials can be revoked by the committee. To that end, the committee maintains a revocation list.

CanDID pulls data from web services using oracles to create credentials. A committee called the “CanDID Committee” acts as the verifier for the data ported.

There are four common requirements: uniqueness, non-transferability, accountability, and privacy.

  1. Uniqueness and Non-transferability: Uniqueness implies the inclusion of mechanisms to ensure the deduplication of user identities. Non-transferability refers to the integration of preventive measures that prevent users from transferring credentials.

    CanDID achieves uniqueness and non-transferability by making the system Sybil-resistant. Sybil resistance is achieved by deduplicating based on one or more attributes. To achieve deduplication, the CanDID committee maintains a secret-shared table of attributes (attributes such as SSNs or Aadhar numbers which are unique to every person). The steps in joining of a new user:

    1. The user presents one or more pre-credentials asserting various attributes. A pre-credential refers to a credential that has not been deduplicated in CanDID.
    1. After this, the CanDID committee performs a privacy-preserving MPC deduplication protocol to check the table if the attributes presented in the pre-credential exist. Once it’s confirmed that the attributes presented are unique, then the system issues a master credential.

    Making the system Sybil-resistant helps discourage credential transfer.

    • Each user can obtain only one master credential in CanDID, disincentivizing sale or transfer.
    • Other deterrents such as
      • temporary revocation of misused credentials, and
      • revocation of stolen credentials

    can be effective for the same reason.

    CanDID focuses only on truly unique identifiers for deduplication - SSNs, Aadhar number, etc. For the countries that do not have such a unique identifier - they have proposed another method that would prevent the dependency on unique identifiers (discussed later).

    The master credential may not contain all attributes for all use cases. Therefore, a user can add more credentials for this purpose. This credential has to be related to the master credential (because it has to be for some person). To do this, the user has to put some information in the context-based credential that is common with the master credential.

  1. Accountability: CanDID enables the identification of malicious users based on their real-world identities and permits the subsequent listing of such users on a committee-maintained public revocation list. Any verifying party can check this list to see if a user’s credential has been revoked. CanDID supports pulling data from real-world revocation lists as well as handling the case where a user is added newly to the revocation lists.
  1. Privacy: Privacy is one of the major properties of CanDID.
    1. Users’ attributes are hidden from committee nodes.
    1. Attribute-membership privacy: committee members cannot determine if a credential containing a certain attribute exists in the system.

Key Recovery System

Goal: Prevent identity loss

Source: https://eprint.iacr.org/2020/934.pdf Figure 2. This figure depicts the key recovery system in CanDID.

For key recovery - a user provides the key along with a recovery policy. The CanDID committee stores the key in a privacy-preserving manner and releases it when the user meets the criteria mentioned in the recovery policy.

System Model

Three main parties: users, credential issuers, and credential verifiers. CanDID supports the use of DID by relying on a PKI-like architecture where a mapping between DIDs and public keys is stored. The CanDID committee acts as the credential issuer. The CanDID committee stores a joint private key-public key pair. The private key is used for issuing credentials and the public key is used for verifying credentials.

Credential: CanDID uses the W3C specifications for structuring the credential.

A credential contains the following parameters:

User Identifier (pkUpk^U)Pseudonymous identifier of subject of the credential. Also referred to as pseudonym.
Context (ctxctx)Denotes the circumstances where credential is used.
Claims (claimi{claim_i})claimi={ai,vi,Pi}claim_i = \{a_i, v_i, P_i\} where, 1. aia_i represents the attribute that tells what the claim is about. 2. viv_i represents the value of the attribute. 3. PiP_i represents the provider which denotes the legacy web provider. CS=claimiCS={claim_i}
Signature (σ\sigma)Signature by the issuer

If there are kk claims, then CS={claimi}i=1kCS={\{claim_i\}}_{i=1}^k and σ=Sigskc({pk,ctx,CS})\sigma = Sig_{sk^c}(\{pk, ctx, CS\}), where skcsk^c is the secret key generated by the CanDID issuer committee. The representation is very similar to the W3C specification. The only difference is the inclusion of the providers in the claim. Each credential contains a dedupOver\mathrm{dedupOver} property which depicts the attribute used for deduplication. The figure depicts how a credential looks like in CanDID (note that they have used JSON).

Source: https://eprint.iacr.org/2020/934.pdf Figure 3. This figure depicts a master credential that is deduplicated over SSNs. The grey boxes show the commitments to sensitive data.

Security Model

Adversarial Model:

Adversary can corrupt up to tt out of nn committee nodes, where t<n/3t < n/3. The adversary can corrupt any number of users and applications. CanDID committee nodes hold a (t,n)(t, n)-Shamir secret sharing of the private key skcsk^c.

Communication Model:

The communication in the network is asynchronous. However, the DKG needed for (pkc,skc)(pk^c, sk^c) generation requires weak synchrony in the network for liveness.

Security Properties:

CanDID aims to satisfy certain security properties (since these are definitions, they have been put here as is):

  1. Sybil resistance: An adversary cannot obtain credentials associated with a larger number of distinct identities than the number of users which are controlled by the adversary.
  1. Unforgeability: An adversary cannot forge the credentials of the honest user or impersonate them.
  1. Privacy in Credential issuance and key recovery: It is infeasible for an adversary to learn about the attributes from observation of credential issuance and key recovery.
  1. Credential Validity: An adversary can obtain credentials only for real-world identities it controls.
  1. Unlinkability: The entities that deploy applications using CanDID cannot collude and link the respective transactions of any given user. This definition applies only in a weakened adversarial model that rules out malicious committee nodes.
  1. Privacy in Credential verification: An adversary can learn about a user no more than the information presented by the user using their credentials.

Assumptions on users’ legacy credentials: Some of the security properties depend on some assumptions.

Identity System

This system converts legacy data to verifiable credentials. This is done in two steps.

  1. CanDID converts a set of pre-credentials to a master credential using a privacy-preserving, deduplication protocol. A user has only one master credential and the user does not use this credential with applications.
  1. Users create context-based credentials which are created for corresponding applications. Context-based credentials are created in a way that they cannot be linked to extract more information about a user.

From legacy data to pre-credentials

A pre-credential is defined as:

PC=(claim,π),PC = (claim, \pi),

where claim={a,v,P}claim = \{a, v, P\} and π\pi is the proof for the claim which depicts that the vv is indeed the value corresponding to the attribute aa according to the provider PP. Pre-credentials are used for the creation of both master credentials and context-based credentials. CanDID uses Town Crier or DECO for creating pre-credentials:

  1. DECO: Here, π\pi is the signature over the claim made by the CanDID Committee.
    1. The user UU picks at least tt out of nn committee members ({C1,C2,C3,,Ct}\{C_1, C_2, C_3, …, C_t\}) who would verify the claim and proof generated by DECO (tt is the threshold required in the threshold signature scheme used for generating the (pkc,skc)(pk^c, sk^c)).
    1. Each committee member selected by the user verifies the proof generated by DECO and therefore, the committee member is convinced that the claim generated is authentic.
    1. Then, committee member CiC_i generates the partial proof πi=Sigski(claim)\pi_i = Sig_{sk_i}(claim).
    1. UU obtains π\pi by combining {πi}\{\pi_i\}.
  1. Town Crier: Here, a TEE is used to obtain π=SigskTEE(claim)\pi = Sig_{sk_{TEE}}(claim), which is generated only if the claim is authentic.

To prevent replay attacks, users associate their PCPC with the pkpk. (such as PC=(claim,pk,π)PC = (claim, pk, \pi))

Master credential issuance

Deduplication: CanDID committee stores registered users’ attributes in a table which is called the IDTable\mathrm{IDTable}.

  1. Deduplication Process:

    In CanDID, we have to use some attributes using which deduplication can be done. The authors have achieved this by using unique identifiers such as SSNs. This method provides strong Sybil-resistance within a given population. Each committee node stores locally IDTable={PRF(skc,vU)}\mathrm{IDTable} = \{PRF(sk^c, v_U)\}, where vUv_U is a unique identifier and skcsk^c is a secret key distributed across committee members. When a new user wants to join CanDID, the committee members evaluate and check if vˉ=PRF(skc,vU)\bar{v} = PRF(sk^c, v_U) is inside the IDTable\mathrm{IDTable}. If it is not present, then a master credential for UU is created. To ensure that the committee members do not learn about the value vUv_U, PRFPRF is evaluated using MPC (discussed later).

    A limitation of this approach is that it requires users to have unique identifiers which might not be present in many countries. Another way of doing it is by obtaining deduplication parameters using common identifiers such as names and addresses. In addition, this approach poses some threats when implemented. For example, if the system needs multiple pre-credentials from the user for accepting an attribute, then this provides an opportunity for an adversary to compromise multiple accounts of a user.

  1. Protocol Details: α\alpha is the attribute over which deduplication has been done.
    1. System Setup:
      1. nn nodes - {C1,C2,C3,,Cn}\{C_1, C_2, C_3, …, C_n\}
      1. Committee members use a threshold signature scheme TS=(KGen,Sig,Comb,Vf)TS = (KGen, Sig, Comb, Vf)
      1. This scheme generates (pkc,skc)(pk^c, sk^c) where pkc=(pksigc,pkprfc)pk^c = (pk^c_{sig}, pk^c_{prf}) and skc=(sksigc,skprfc)sk^c = (sk^c_{sig}, sk^c_{prf}). Each of the nn nodes knows pksigcpk^c_{sig} and pkprfcpk^c_{prf}. For the secret key, committee member CiC_i holds sksig,icsk^c_{sig, i} and skprf,icsk^c_{prf, i} where these keys form a share of the corresponding secret keys.
      1. Each committee node initializes a local table IDTable:=\mathrm{IDTable} := \empty.
      1. [v][v] denotes a sharing of vv and CiC_i has viv_i such that v=iλiviv = \sum_i{\lambda_iv_i}, where λi\lambda_i depicts the Lagrange’s coefficients.
      1. yf([x])y \leftarrow f([x]) depicts the MPC evaluation of ff over secret-shared input xx.
      1. CanDID uses a secure MPC protocol based on Beaver triples to evaluate PRF([skprfC],.)PRF([sk^C_{prf}], .).
      1. Committee generates secret-shared random blinding factors and commitments {[bi],gbi}\{[b_i], g^{b_i}\}.
      1. They have used MP-SPDZ framework for implementation.
    1. Pre-credential generation: Let vv denote the ideal value for α\alpha for UU. Let claim=(α,Cv)claim = (\alpha, \mathrm{C}_v), where Cv=com(v,r)\mathrm{C}_v = com(v, r) is a commitment to vv with a witness rr.

      UU generates a pre-credential PC=(claim,pkU,πoracle)PC = (claim, pk^U, \pi^{oracle}).

    1. Deduplication: Next step is to evaluate vˉ=PRF(skc,v)\bar{v} = PRF(sk^c, v):
      1. UU sends [v][v] to committee members.
      1. Committee members send shares of a fresh blinding factor ([b],B=gb)([b], B = g^b) to UU from which UU reconstructs bb.
      1. UU blinds vv by: v=b+vv' = b + v and a proof of correct blinding

        πiblind=ZkPoK{b,v,r:v=b+v(gb=B)(com(v,r)=(C)v)}\pi^{blind}_i = Zk-PoK\{b, v, r: v' = b + v \wedge (g^b = B) \wedge (com(v, r) = \mathrm(C)_v)\}

        UU sends (pkU,v,πblind,claim,πoracle)(pk^U, v', \pi^{blind}, claim, \pi^{oracle}) to all committee nodes.

      1. Each CiC_i verifies the received proofs and computes vi=v/nλibiv_i = v'/n\lambda_i - b_i.
      1. Committee nodes execute an MPC protocol to compute vˉ=PRF([skprfc],[v])\bar{v} = PRF([sk^c_{prf}], [v]) and each of them asserts that vˉIDTable\bar{v} \notin \mathrm{IDTable}. CiC_i adds (pkU,vˉ)(pk_U, \bar{v}) to the IDTable\mathrm{IDTable}.
    1. Credential issuance:
      • Each committee node CiC_i evaluates
        m={pkU,master,claim,{dedupOver,{α}}}m=\{pk^U, \mathrm{``master”}, claim, \{\mathrm{``dedupOver”}, \{\alpha\}\}\}

        and generates a partial signature σic=TS.Sig(sksig,ic,m)\sigma^c_i = TS.Sig(sk^c_{sig, i}, m).

      • CiC_i sends EncpkU(σic)\mathrm{Enc}_{pk^U}(\sigma_i^c) to UU.
      • After decrypting tt partial signatures, UU obtains the full signature σc=TS.Comb({σic})\sigma^c = TS.Comb(\{\sigma^c_i\}) and constructs the master credential.
        credmaster={pkU,master",claim,{dedupOver",{α}},σc}cred_{master} = \{pk^U, \mathrm{``master"}, claim, \{\mathrm{``dedupOver"}, \{\alpha\}\}, \sigma^c\}

Context-based credential issuance

Master credentials are not used for making interactions with applications because of the resulting linkability and their limited claims. Therefore, users create context-based credentials.

For this to work efficiently, two more challenges must be addressed: (i) it needs to be ensured that the new claims belong to the user holding the master credential else users can buy stolen accounts to add false claims; (ii) make pairwise DIDs, i.e., make credentials for different contexts independent.

The usual issue of VCs is the applications colluding to figure out users’ usage patterns. The authors have mentioned that CanDID can be extended with suitable anonymous credentials for solving this purpose.

Credential verification

Any party using CanDID can verify user UU’s context-based credential credcred with associated pkpk and associated opened commitments. The verifying party makes three checks:

  1. credcred is properly signed by the committee.
  1. pkpk does not appear in the public revocation list.
  1. Any commitment openings are valid.

Security arguments

  1. Sybil resistance: This follows from the integrity properties of oracle protocols.
  1. Unforgeability: Follows from the unforgeability of signatures.
  1. Privacy in credential issuance: From the privacy of oracle protocols, generating a pre-credential claim=(a,Cv)claim = (a, \mathrm{C}_v) does not leak information about vv. In addition, since the commitment is hiding, and MPC evaluation of vˉ=PRF([skprfc],[v])\bar{v} = PRF([sk_{prf}^c], [v]) guarantees privacy, AA does not learn vv during issuance.
  1. Credential validity: Follows from the integrity of oracle protocols
  1. Unlinkability across applications: The only linkage between the master credential and context-based credentials lies in the Issuedctx\mathrm{Issued}_\mathrm{ctx}. This property holds from the fact that an adversary cannot corrupt the committee members.
  1. Privacy in credential verification: Firstly, unopened commitments do not leak any information due to the hiding property. Secondly, commitments hide the result of a ZKP.

Accountability

CanDID enforces accountability by identifying misbehaving individuals in a privacy-preserving manner. In their implementation, they have used sanction lists for this purpose. There are two related problems corresponding to this:

For both of them, the system must be able to detect potential alternate spellings of names. This is normally done by using fuzzy matching. The challenge in the case of CanDID is to do this in a secure manner.

For achieving this goal, they deployed a fuzzy matching algorithm based on edit distance and c-shingles. Evaluating edit distance requires a dynamic programming approach which has a large constant factor due to the size of the alphabet. Therefore, an approximation of edit distance has been used which is called c-shingles. The c-shingles of a word ww is the set of length cc consecutive substrings of ww (ignoring order, repetition). Let shc(w)sh_c(w) be the set of c-shingles of wCnw \in C^n. We have shc(w)nc+1|sh_c(w)| \leq n - c + 1 from the c-shingles paper. If u=edit(w,w)u = \mathrm{edit}(w, w') is the edit distance between w,wCnw, w' \in C^n, then the distance between shc(w)sh_c(w) and shc(w)sh_c(w') is given by:

dist(shc(w),shc(w)):=shc(w)shc(w)+shc(w)shc(w)(2c1)u\mathrm{dist}(sh_c(w), sh_c(w')) := |sh_c(w)\setminus sh_c(w')| + |sh_c(w')\setminus sh_c(w)| \leq (2c-1)u

c-shingles has been used as a filtering mechanism in this architecture. First, the intersection between c-shingles with every element is computed to obtain a set of matches, and then, the distance is computed on these. It should be noted that given shc(w)sh_c(w) and shc(w)sh_c(w'), the evaluation of edit distance is simply computing the intersection between two sets. Therefore, this process can be sped up by storing the c-shingles of every name in the dataset and the sanctions list. To do this securely, an oblivious sorting network is used to sort the dataset using the shingle distance, and then, the edit distance is computed only on a fixed number of candidates. For implementing this, the parameters chosen are as follows:

Max. length of name = 30

Edit distance threshold (tt) = 3

c-shingles are used for removing the cases which cannot be matched. After setting the value of cc as 2, the number of candidates chosen for calculating the edit distance was 15 (numCandidatesnumCandidates). Next, the edit distance is calculated.

Computation using SNARK and MPC:

  1. For a client input string xx, compute the shc(x)sh_c(x) and provide a SNARK proof for the same.
  1. A boolean list is computed: candidates=[(y<<11)(dist(shc(x),shc(y))<(2c1))]\mathrm{candidates} = [(y < < 1|1)*(\mathrm{dist}(sh_c(x), sh_c(y)) < (2c - 1))]
  1. Bitonic sort is used to sort candidates\mathrm{candidates} using the condition comp(a,b)=a==0?a:bcomp(a, b) = a == 0? a:b. This pushes all the zero values to the back. The ones with the zero value are the ones that cannot have a distance less than tt.
  1. Retrieve the first numCandidatesnumCandidates elements from the Candidates\mathrm{Candidates}. This gives the finalCandidates\mathrm{finalCandidates}.
  1. Then, check if Edit(x,y)<t\mathrm{Edit}(x, y) < t for yfinalCandidatesy \in \mathrm{finalCandidates}.

In the end, a set is returned. If the set is empty, then nothing needs to be done. If that’s not the case, then: (i) the user may not be registered; (ii) the user may be expelled.

This procedure can be implemented as an arithmetic circuit which can then be compiled into either a R1CS for use with a SNARK (for the registration time screening) or as an MPC program (for periodic screening).

Key Recovery System

The standard DID and VC systems require users to store private key securely and reliably. This burdens the users. CanDID has a key-recovery subsystem that solves this issue. It deploys mechanisms similar to that in the identity system.

Users can store their DID keys with CanDID committee which stores users’ keys in a secret sharing manner. Users can employ legacy web authentication schemes to retrieve their backed-up keys. The users can choose their preferred authentication policies for recovery. The committee enforces the policy for key release.

This could have been done by using OAuth but this would bring up a major privacy issue - leakage of information to the CanDID committee members. Instead, CanDID uses privacy-preserving proof of account ownership, just like in the case of the identity system.

Enrollment: This is for backing up of key.

  1. UU picks a random ephemeral identifier pkephUpk_{eph}^U and generates a pre-credential PC=((accountid",CidpU),pkephU,π)PC = ((\mathrm{``account id"}, \mathrm{C}_{id^U_p}), pk^U_{eph}, \pi) containing a commitment to UU’s account identifier associated with the authentication provider (idPUid^U_P). The only difference here is that the PCPC contains a pkephUpk^U_{eph} which prevents the correlation between the two systems.
  1. Pre-credentials are verified by using a verification protocol where the user proves knowledge of skephUsk^U_{eph}.
  1. The committee nodes run MPC to compute private key pidPU=PRF([skepfC],[idPU])\mathrm{pid}^U_P = PRF([sk_{epf}^C], [id_P^U]).
  1. The user then shares the secret shares of skUsk^Uacross the committee.
  1. CiC_i stores (pidpU,skiU)(pid^U_p, sk_i^U).

Recovery:

To retrieve, the enrollment process is replicated to compute pidPUpid^U_P. CiC_i fetches the (pidPU,skiU)(pid^U_P, sk_i^U) and then, shares skisk_i with the user.

Security Arguments:

Results

Pre-credential generation

Source: https://eprint.iacr.org/2020/934.pdf

Master Credential Generation

  1. Proof of name matching across pre-credentials: Over 100 runs, the proof generation took 1.2 seconds, while verification took 0.006 seconds on average.
  1. Proof of non-existence in the sanctions list:
Source: https://eprint.iacr.org/2020/934.pdf
  1. Distributed PRF: 38 ± 1ms of CPU-time across four nodes in MP-SPDZ, as averaged over 10 trials of 10 runs each

Privacy-preserving screening via MPC

Source: https://eprint.iacr.org/2020/934.pdf

References