
Magic Signatures for Salmon

In writing the spec for Salmon we soon discovered that what we really wanted was S/MIME signatures for the Web.  In other words: given a message, let the sender sign it with a private key, and let receivers verify the signature using the corresponding public key.  Signing and verifying are pretty well understood, but in practice canonicalizing data before signing is hard to get right.  Making sure that the mechanism adopted is really deployable and interoperable, even in restricted environments, is a top priority for Salmon.

I'm calling this the "Magic Signature" mechanism because it's not really Salmon-specific and you can analyze it without thinking much about Salmon at all.

One of the reasons this is hard is the abstraction layers that we have in place in our software.  For example, signature algorithms operate on byte sequences, but a given XML document can be serialized into many different byte sequences.  Even JSON isn't immune to this, though mandating UTF-8 certainly helps.  So, the first thing to make really simple is the serialization format.  Here's the Magic Signature serialization algorithm:
b64_data = urlsafe_b64_enc(utf8text)
In other words, serialize your data however your libraries let you into UTF-8 text, then base64 encode the resulting bytes, using the URL-safe variant of base64.  That's the actual string you sign, and it's nearly impossible to mess up that string as it's 7-bit ASCII, uses no characters known to ever be escaped by anything, and is mostly an uninterpreted blob of text as far as your libraries and transport layers are concerned.  The one caveat is that some transports may need to insert linebreaks/whitespace due to line length limits -- this can be solved by squeezing out all whitespace (which is never part of the data) before signing or validating.
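
As a minimal sketch of that step in Python: `urlsafe_b64_enc` mirrors the pseudocode above, while `squeeze_whitespace` and the sample content string are names made up here for illustration.

```python
import base64

def urlsafe_b64_enc(utf8text: str) -> str:
    """Encode text as UTF-8 bytes, then as URL-safe base64 ('-' and '_' variant)."""
    return base64.urlsafe_b64encode(utf8text.encode("utf-8")).decode("ascii")

def squeeze_whitespace(b64_data: str) -> str:
    """Drop any whitespace a transport may have inserted (hypothetical helper)."""
    return "".join(b64_data.split())

sample = "<content>Salmon swim upstream!</content>"
b64_data = urlsafe_b64_enc(sample)

# Simulate a transport that wraps long lines, then undo the damage.
wrapped = "\n".join(b64_data[i:i + 16] for i in range(0, len(b64_data), 16))
restored = squeeze_whitespace(wrapped)
```

Because whitespace is never part of the base64 alphabet, squeezing it out is lossless: `restored` is the same string that was originally produced.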

Signing is then standard; we'll mandate support for RSA_SHA1, meaning you take the SHA1 hash digest of that base64 data and then sign the hash using an RSA private key:
s = rsa_sign(private_key,sha1_digest(b64_data))
The result is a very big integer, which you convert to network-neutral bytes and then turn into a string with, you guessed it, urlsafe_b64_enc:
sig = urlsafe_b64_enc(to_binary(s))
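
The signing steps above might look roughly like this in Python.  This is a toy sketch, not a real RSA-SHA1 implementation: it uses textbook RSA with no padding and a deliberately tiny key built from two small Mersenne primes, and it assumes "network-neutral bytes" means big-endian byte order.  The key values and the `urlsafe_b64_enc_bytes` helper are made up for illustration; a real deployment would use a vetted crypto library and a full-size key.

```python
import base64
import hashlib

# Toy key for illustration only: far too small to be secure.
P = 2**89 - 1
Q = 2**107 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)

def sha1_digest(b64_data: str) -> bytes:
    """SHA-1 over the ASCII bytes of the base64 string."""
    return hashlib.sha1(b64_data.encode("ascii")).digest()

def rsa_sign(private_key, digest: bytes) -> int:
    """Textbook RSA: digest^d mod n, no padding (an assumption of this sketch)."""
    d, n = private_key
    return pow(int.from_bytes(digest, "big"), d, n)

def to_binary(s: int) -> bytes:
    """Big-endian bytes of minimal length (assumed meaning of 'network-neutral')."""
    return s.to_bytes((s.bit_length() + 7) // 8 or 1, "big")

def urlsafe_b64_enc_bytes(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).decode("ascii")

b64_data = urlsafe_b64_enc_bytes("<content>Salmon swim upstream!</content>".encode("utf-8"))
s = rsa_sign((D, N), sha1_digest(b64_data))
sig = urlsafe_b64_enc_bytes(to_binary(s))
```

Raising `s` to the public exponent `E` modulo `N` recovers the digest as an integer, which is what a verifier would compare against.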
Now for the ugly bit:  Since the whole premise of this is that the receiver is not going to be able to create exactly the same serialization of utf8text that the sender did, you need to help the receiver out by sending it the exact b64_data used to compute the original signature.  Since it's base64 encoded, it's effectively armored not only against vagaries of transport protocols but also software stacks and frameworks.

Since you're sending the base64 data, and it's trivial to base64-decode it, there's no point in sending the original data as well.  So you just send the content, wrapped in its base64 envelope, plus a signature.  Call this a "Magic Envelope":

<?xml version='1.0' encoding='UTF-8'?>
<me:env xmlns:me=''>
  <me:data type='application/atom+xml' encoding='base64'>
And on the receiving side, you base64_decode to get the original content, you calculate the sha1_digest on that base64 data, and verify the signature.  If it works out, you use the resulting data, in this case a Salmon that was hidden inside the magic envelope:

<?xml version="1.0" encoding="utf-8"?><entry xmlns="">
  <thr:in-reply-to ref="," xmlns:thr="">,
  <content>Salmon swim upstream!</content>
  <title>Salmon swim upstream!</title>
<me:provenance xmlns:me=""><me:data encoding="base64" type="application/atom+xml">PD94bWwgdmVyc2lvbj0nMS4wJyBlbmNvZGluZz0nVVRGLTgnPz4KPGVudHJ5IHhtbG5zPSdodHRwOi8vd3d3LnczLm9yZy8yMDA1L0F0b20nPgogIDxpZD50YWc6ZXhhbXBsZS5jb20sMjAwOTpjbXQtMC40NDc3NTcxODwvaWQ-ICAKICA8YXV0aG9yPjxuYW1lPnRlc3RAZXhhbXBsZS5jb208L25hbWU-PHVyaT5hY2N0OmpwYW56ZXJAZ29vZ2xlLmNvbTwvdXJpPjwvYXV0aG9yPgogIDx0aHI6aW4tcmVwbHktdG8geG1sbnM6dGhyPSdodHRwOi8vcHVybC5vcmcvc3luZGljYXRpb24vdGhyZWFkLzEuMCcKICAgICAgcmVmPSd0YWc6YmxvZ2dlci5jb20sMTk5OTpibG9nLTg5MzU5MTM3NDMxMzMxMjczNy5wb3N0LTM4NjE2NjMyNTg1Mzg4NTc5NTQnPnRhZzpibG9nZ2VyLmNvbSwxOTk5OmJsb2ctODkzNTkxMzc0MzEzMzEyNzM3LnBvc3QtMzg2MTY2MzI1ODUzODg1Nzk1NAogIDwvdGhyOmluLXJlcGx5LXRvPgogIDxjb250ZW50PlNhbG1vbiBzd2ltIHVwc3RyZWFtITwvY29udGVudD4KICA8dGl0bGU-U2FsbW9uIHN3aW0gdXBzdHJlYW0hPC90aXRsZT4KICA8dXBkYXRlZD4yMDA5LTEyLTE4VDIwOjA0OjAzWjwvdXBkYXRlZD4KPC9lbnRyeT4KICAgIA==</me:data><me:alg>RSA-SHA1</me:alg><me:sig>EvGSD2vi8qYcveHnb-rrlok07qnCXjn8YSeCDDXlbhILSabgvNsPpbe76up8w63i2fWHvLKJzeGLKfyHg8ZomQ==</me:sig></me:provenance></entry>
Note that the signature and the base64 data are still carried inside a "provenance" element of the salmon for future verification.
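
The receiving side described above can be sketched as follows.  As before, this is a toy under stated assumptions: textbook RSA without padding, a tiny illustrative key, and big-endian signature bytes.  `verify_and_unfold`, `digest_int`, and `squeeze` are hypothetical names, not part of any Salmon library.

```python
import base64
import hashlib

# Toy key pair for illustration only; NOT secure.
P, Q, E = 2**89 - 1, 2**107 - 1, 65537
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))  # needs Python 3.8+

def digest_int(b64_data: str) -> int:
    """SHA-1 of the base64 string, as a big-endian integer."""
    return int.from_bytes(hashlib.sha1(b64_data.encode("ascii")).digest(), "big")

def squeeze(text: str) -> str:
    """Remove whitespace that a transport may have inserted."""
    return "".join(text.split())

def verify_and_unfold(b64_data: str, sig: str, public_key):
    """Return the decoded content if the signature checks out, else None."""
    e, n = public_key
    b64_data, sig = squeeze(b64_data), squeeze(sig)
    s = int.from_bytes(base64.urlsafe_b64decode(sig), "big")
    if pow(s, e, n) != digest_int(b64_data):
        return None
    return base64.urlsafe_b64decode(b64_data).decode("utf-8")

# Sender side, only so there is something to verify.
original = "<content>Salmon swim upstream!</content>"
b64_data = base64.urlsafe_b64encode(original.encode("utf-8")).decode("ascii")
s = pow(digest_int(b64_data), D, N)
sig = base64.urlsafe_b64encode(s.to_bytes((s.bit_length() + 7) // 8, "big")).decode("ascii")

ok = verify_and_unfold(b64_data, sig, (E, N))
forged = base64.urlsafe_b64encode(b"<content>Salmon swim downstream!</content>").decode("ascii")
tampered = verify_and_unfold(forged, sig, (E, N))
```

The signature is checked against the base64 string exactly as received (minus whitespace), so the receiver never has to reproduce the sender's serialization.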

This is all fun to describe, but it's even more fun to play with.  Take a look at http:/ to see this in action.  When you load it, you'll see that it gives you an error -- it will refuse to sign your salmon until you correct the author URI.  This is a feature; the demo checks that the signed-in user matches one of the authors of the salmon, so you need to edit the author/uri field to read "acct:<your email address here>" to make it work.

Next, you'll see the magic envelope appear.  You can verify the signature, which sends a request back to the server and replies Yes or No.  Or, you can unfold the envelope back into an Atom salmon to read the content.  Of course, if you tamper with the salmon first it will neither verify nor unfold properly.

For Salmon-aware processors, there's little reason to use anything but the application/magic-envelope form.  For syndication in general, though, it may be necessary to wrap the envelope in an Atom or RSS entry.
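
For anyone who wants to experiment with the envelope form, here is a rough sketch of building and reading one with Python's standard library.  The element names (`env`, `data`, `alg`, `sig`) follow the examples in this post, but the namespace URI below is a placeholder, and `build_envelope` and `parse_envelope` are hypothetical helpers, as is the dummy signature value.

```python
import base64
import xml.etree.ElementTree as ET

# Placeholder namespace: an assumption made for this sketch only.
ME_NS = "urn:example:magic-envelope"
ET.register_namespace("me", ME_NS)

def build_envelope(b64_data: str, sig: str,
                   content_type: str = "application/atom+xml") -> str:
    """Wrap base64 data and its signature in an me:env element."""
    env = ET.Element(f"{{{ME_NS}}}env")
    data = ET.SubElement(env, f"{{{ME_NS}}}data",
                         {"type": content_type, "encoding": "base64"})
    data.text = b64_data
    ET.SubElement(env, f"{{{ME_NS}}}alg").text = "RSA-SHA1"
    ET.SubElement(env, f"{{{ME_NS}}}sig").text = sig
    return ET.tostring(env, encoding="unicode")

def parse_envelope(xml_text: str):
    """Return (b64_data, alg, sig), squeezing whitespace out of the base64 fields."""
    env = ET.fromstring(xml_text)
    squeeze = lambda t: "".join((t or "").split())
    return (squeeze(env.findtext(f"{{{ME_NS}}}data")),
            env.findtext(f"{{{ME_NS}}}alg"),
            squeeze(env.findtext(f"{{{ME_NS}}}sig")))

b64_data = base64.urlsafe_b64encode(b"<content>Salmon swim upstream!</content>").decode("ascii")
envelope_xml = build_envelope(b64_data, "c2lnbmF0dXJl")  # dummy signature value
parsed = parse_envelope(envelope_xml)
```

Nothing in the envelope depends on how the XML itself is serialized, since only the base64 text is ever signed or verified.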

The source code for all of this is freely available.  If you're interested in all of this, please join the Salmon discussion group.

(Updated 1/19 to include a note about squeezing out whitespace from the b64 encoded data before doing anything important with it, per gffletch's comment.)
