The Identity Corner

On Identity Claims, Unlinkability, and Selective Disclosure (part 1)

Following up on Ben Laurie’s excellent introduction to selective disclosure, Kim Cameron and Ben have been blogging on the properties of unlinkability and selective disclosure. See here, here, here, and here, as well as this older post of Ben’s. I’m going to jump into this discussion by means of several blog posts of my own. This first post is dedicated to clarifying the notion of unlinkability and the related notion of untraceability.

Consider a user who self-generates several identity claims on different occasions, say “I am 25 years of age”, “I am male”, and “I am a citizen of Canada”. The user’s software packages these assertions into identity claims by means of attribute type/value pairs; for instance, claim 1 is encoded as “age = 25”, claim 2 is “gender = 0”, and claim 3 is “citizenship = 1”. Clearly, relying parties that receive these identity claims cannot trace them to their user’s identity (whether that be represented in the form of a birth name, an SSN, or another identifier) by analyzing the presented claims; self-generated claims are untraceable. Similarly, relying parties cannot decide whether different claims are presented by the same user or by different users; self-generated claims are unlinkable. Note that these two privacy properties (which are different but, as we will see in the next paragraph, complementary) hold “unconditionally”: no amount of computing power will enable relying parties to trace or link by analyzing incoming identity-data flows, not even if relying parties collude (indeed, they may be the same entity).

Now, consider the same self-generated identity claims, but this time their user “self-protects” them by means of a self-generated cryptographic key pair (e.g., a random RSA private key and its corresponding public key). The user digitally signs the identity claims with his private key; for example, claim 1 as presented to a relying party looks like “age = 25; PublicKey = 37AC986B…; Signature = 21A4A5B6…”. Clearly, these self-protected claims are as untraceable as their unprotected cousins in the previous paragraph. Are they unlinkable? Well, that depends:

  • If the user applies the same key pair to all claims, then the public key that is present in the presented messages will be the same; thus, all presented identity claims are linkable. As a result, a relying party that receives all three claims over time knows that it is dealing with a 25-year-old Canadian male. As the user presents more linkable claims over time, this may indirectly lead to traceability; for example, the relying party may be able to infer the user’s birth name once the user presents a linkable identity claim that states the postal code of his home address.
  • If the user applies a different self-generated key pair to each identity claim, the three presented claims are as unlinkable and untraceable as in the example where no cryptographic data was appended. Note that this solution does not force unlinkability and untraceability: in cases where the user should be identified, the user can simply provide a claim that specifies his name: “name=Jon Smith” or “SSN-identifier=945278476”, for instance. Similarly, to make self-generated identity claims linkable, an additional common attribute value can be encoded.
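
The two cases above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not an implementation of any real identity system: the signature itself is not modeled, `new_public_key` is a hypothetical stand-in that returns a random hex string in place of a real RSA public key, and `make_claim` and `link_by_key` are names invented for this example. The sketch shows only what a relying party can do with the “PublicKey” field that accompanies each claim.

```python
import secrets
from collections import defaultdict


def new_public_key() -> str:
    """Hypothetical stand-in for generating a fresh key pair.

    Returns a random hex string that plays the role of the public key;
    no real cryptographic key is created.
    """
    return secrets.token_hex(16)


def make_claim(attribute: str, value, public_key: str) -> dict:
    # The signature is omitted on purpose: only the fields a relying
    # party can compare across presentations are modeled here.
    return {"attribute": attribute, "value": value, "PublicKey": public_key}


def link_by_key(claims) -> list:
    """Merge received claims that carry the same public key into one profile."""
    profiles = defaultdict(dict)
    for claim in claims:
        profiles[claim["PublicKey"]][claim["attribute"]] = claim["value"]
    return list(profiles.values())


assertions = [("age", 25), ("gender", 0), ("citizenship", 1)]

# Case 1: the same key pair is applied to every claim.
shared_key = new_public_key()
same_key_claims = [make_claim(a, v, shared_key) for a, v in assertions]

# Case 2: a different one-time key pair is applied to each claim.
one_time_claims = [make_claim(a, v, new_public_key()) for a, v in assertions]

print(link_by_key(same_key_claims))
print(link_by_key(one_time_claims))
```

With the shared key, the relying party ends up with a single merged profile containing age, gender, and citizenship. With one-time keys, it ends up with three separate one-attribute profiles, which is all it could learn from the unsigned claims as well.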

It is important to note that the privacy properties of unlinkability and untraceability are not about hardcoding anonymity into identity systems: instead, they are about ensuring that identity claims convey nothing more than the attribute information they are supposed to convey.

Principle 1: Identity claims should not convey any linking and tracing information beyond the attribute values that they specify.

As we saw, plain self-generated identity claims meet this principle. So do self-generated claims that are self-signed using random one-time key pairs. (This is not the place to get into what the benefits of self-signing might be; suffice it to say they are minimal.) As we get to identity claims that are generated by third parties (in CardSpace speak: managed identity claims), things become more complicated. I will discuss this case in one of my next blog entries; the next entry will be on the notion of selective disclosure.

June 4, 2007 - Posted by Stefan Brands | General

1 Comment »

  1. [...] Stefan Brands is contributing to the discussion of traceability, linkability and selective disclosure with a series of posts over at identity corner. He is one of the world’s key innovators in the cryptography of unlinkability, so his participation is especially interesting. Consider a user who self-generates several identity claims on different occasions, say “I am 25 years of age”, “I am male”, and “I am a citizen of Canada”. The user’s software packages these assertions into identity claims by means of attribute type/value pairs; for instance, claim 1 is encoded as “age = 25”, claim 2 is “gender = 0”, and claim 3 is “citizenship = 1”. Clearly, relying parties that receive these identity claims cannot trace them to their user’s identity (whether that be represented in the form of a birth name, an SSN, or another identifier) by analyzing the presented claims; self-generated claims are untraceable. Similarly, they cannot decide whether or not different claims are presented by the same or by different users; self-generated claims are unlinkable. [...]

    Pingback by Kim Cameron’s Identity Weblog » Keys, signatures and linkability | June 4, 2007
