2010/01/19

A Web Wide Public Key Infrastructure

In my last post, I talked about Magic Signatures, the evolution of Salmon's Magic Security Pixie Dust into something concrete.  There's an important bit missing, which I'll talk about now:  Public key signatures are fairly useless without a way to discover and use other people's public keys in a secure, reliable way.  We need a distributed, easily deployable public key infrastructure.  Let's build one.

The basic plan for Salmon signature verification is fairly simple:  Use Webfinger discovery on the author of the Salmon to find their Magic Signature public key, if they have one.  To find this, you first get the Webfinger XRD file, then look for a "magickey" link:
<Link rel="http://salmon-protocol.org/ns/magickey" href="https://www.example.org/0B6A58300/publickey" />
Retrieve the magickey resource, parse it into a public key in your favorite library, and use that to verify the signature of the original Salmon you started with.

A few notes:
  • Retrievals of these documents should be done via TLS and certificates checked; if any step in the chain isn't done via checked TLS, the result may be vulnerable to MITM or DNS poisoning attacks.
  • The public key is per-user and is effectively self-signed.  While I anticipate most keys will be generated by large providers, it is possible for users to generate their own public/private keypairs and upload only the public key to the Web.
  • Maintaining data about expired and revoked keys will be needed too, but first things first.

The next question is what format that public key should be.  The obvious choice is an X.509 PKI certificate PEM file, but this turns out to pull in a ton of baggage; you can't even parse one of these using available App Engine libraries at the moment.  This is primarily due to the use of ASN.1 DER encoding, which is usually dealt with via a linked-in C library.  X.509 is also inherently complex and overkill for what we need; it's based on a hierarchical model of CAs which doesn't map well to Salmon's decentralized model.

When you dig down into things, it turns out that an RSA public key itself is fairly simple.  You have two numbers, a modulus (n, for some reason) and an exponent (e).  You need to serialize and deserialize these numbers in a standard way.  Most libraries let you pass in and access these values.  The only catch is that the numbers themselves can be very, very big -- bigger than any native number on any system, so you need something like a bignum library to deal with these numbers.  At the end of the day, though, you're just dealing with a pair of numbers.  You don't need complicated formats for that.

So, here's a very simple way to represent this in a web-safe way:
"RSA." + urlsafe_b64_encode(to_network_bytes(modulus)) + "." + urlsafe_b64_encode(to_network_bytes(exponent))
The output is two base64 encoded strings, separated by a period, and prefixed with the key type:
RSA.mVgY8RN6URBTstndvmUUPb4UZTdwvwmddSKE5z_jvKUEK6yk1u3rrC9yN8k6FilGj9K0eeUPe2hf4Pj-5CmHww==.AQAB
To parse this, you split on the period, urlsafe_b64_decode each piece into bytes, and then turn the bytes into bignums using your library.  (I'm using the PyCrypto bytes_to_long and long_to_bytes functions to play around with this at the moment.)

(There might possibly also be a need for a Subject attached to the key itself, to prevent Joe from pointing at Bob's key and claiming it as his own.  Joe still couldn't sign things as Bob but he could potentially claim Bob's output as his own.  So, add the base64 encoding of Joe's Webfinger ID to the public key.  It's not clear to me if this is needed quite yet.)

You may also want to store and retrieve private keys; it turns out the private keys just need one more value, d, along with n and e.  Append this to the end as an optional parameter.  Thoughts?  Anything missing?  You can also take a look at the in-progress Python code to deal with this format.  It's pretty trivial to write and parse.  I imagine that dealing with parsing additional algorithms will be slightly more complicated, but nothing compared with the complication of actually implementing said algorithms.

(All of this still of course relies on X.509 certificates underlying the SSL connections used to retrieve the user-specific public keys.  That's fine, because those certificates are hidden underneath existing libraries and don't need to be visible to Salmon code at all.)

This is a general purpose, lightweight discovery mechanism for personal signing keys.  It works well for Salmon; once widely deployed, it would be useful for other purposes as well.

(Edited to update thoughts on Subject, which, I think, is not needed.)

7 comments:

  1. The W3 recommends for signatures in XML:
    http://www.w3.org/TR/xmldsig-core/#sec-Overview

    From
    http://www.w3.org/2000/09/xmldsig#
    XSD for RSA key:








    Hope this helps.

    ReplyDelete
  2. Andrew -- Yep, see my prior post about signatures. Unfortunately XML signatures rely on complicated canonicalization rule, and nobody in the feed syndication world has managed to successfully deploy DSig on the Web, even though it's the recommended signature mechanism for Atom and it's mentioned explicitly in the spec.

    ReplyDelete
  3. You can use X.509 in a very easy way which does not require the client to have a CA signed certificate. It works with most browsers, is RESTful:

    http://esw.w3.org/topic/foaf+ssl

    This would allow authentication in a web of trust way, using only well established technologies.

    ReplyDelete
  4. by the way, there is a really cool service I am using to sign in here using OpenId, that in fact uses my foaf+ssl certificate. http://openid4.me/

    (this is the first blogging site that uses OpenId for comment management!)

    ReplyDelete
  5. Henry -- I like foaf+ssl, we walked through your slides at the Sun-hosted unconference back in November :). Salmon does have an additional wrinkle: It wants to allow separation between the signer/IdP and the aggregator, and independent verification of the signatures. So a connection oriented auth isn't sufficient (it could be very useful as a building block). There's also a tooling issue in that things like Google AppEngine don't (yet) support client certs -- something to help drive fixing that would be good though.

    ReplyDelete
  6. Did you look at http://code.google.com/p/keyczar/wiki/RsaPrivateKey?

    ReplyDelete
  7. Adding 'd' is not sufficient for an RSA private key *format*.
    Private key operations, such as signing, are significantly faster (twice as fast I think) if you know the prime factors of the modulus so you can use the Chinese Remainder Theorem. That is too big an advantage to exclude.

    The standard RSA private key format includes 8 numbers: n, e, d, p, q, d mod p-1, d mod q-1, and q^-1 mod p.

    If you really want a minimal format perhaps n.e.[d][.p] would do. If you only have d (but not the primes) use n.e.d; if you have the primes use n.e..p, from which you can calculate all the other numbers. Minor bonus: n.e..p is shorter than n.e.d (only ~75% of the length).

    ReplyDelete

Suspended by the Baby Boss at Twitter

Well!  I'm now suspended from Twitter for stating that Elon's jet was in London recently.  (It was flying in the air to Qatar at the...