
One site-meta to rule them all

Eran & Mark's site-meta proposal is... interesting. I have a gut feel that it's a bit like democracy: The worst method for whole-site discovery, except for all the others. The killer app for this, IMHO, is the ability to solve things like "lookup metadata for user x on domain y" in a general way. An immediate practical application is translating things like email addresses and Jabber IDs, all of the form (user@domain), to something you can perform discovery on.
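
As a concrete sketch of that translation (entirely hypothetical: the /site-meta path and the user query parameter are placeholders I made up, not anything the proposal specifies), a client might do something like this in Python:

from urllib.parse import quote

def discovery_url(identifier):
    """Map a user@domain identifier to a hypothetical per-domain
    discovery URL. The split-on-@ step is the real point; the URL
    shape is invented for illustration."""
    user, _, domain = identifier.partition("@")
    if not user or not domain:
        raise ValueError("expected an identifier of the form user@domain")
    return "http://%s/site-meta?user=%s" % (domain, quote(user))

print(discovery_url("alice@example.com"))
# -> http://example.com/site-meta?user=alice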

Other than the embarrassment of hacking another hard-coded magic name alongside favicon.ico and robots.txt, I really only have one issue with the proposal: It requires a directory lookup via an XML document. I have nothing against XML, but it seems like overkill for this purpose.

An alternative would be to use a very, very simple text-based format that is NOT very extensible. Fortunately, there's already a proposal for this type of format from Mark, for a Link header:

Link: <http://example.org/ch>; rel="previous";
      title="previous chapter"
Just for simplicity, we can take the same format and embed it in an application/site-meta document. The sample site-meta XML would then transform into something like this (one metadata entry per line):
/robots.txt rel="robots"
/p3p.xml rel="privacy"
http://other.example.net/example rel="http://example.com/rel"
We lose the ability to use namespaces and inline (embedded) metadata in this site-meta document. Or alternatively, we gain the ability to ignore namespaces and we don't need to download inline metadata we don't care about.
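
To make the simplicity concrete, here's a rough sketch of a parser for the format above, based on my own reading of it (a URI followed by attr="value" pairs, one entry per line) rather than any published grammar. Note that a malformed line breaks only itself, not the whole document:

import re

ENTRY = re.compile(r'^(\S+)((?:\s+\w+="[^"]*")+)\s*$')
ATTR = re.compile(r'(\w+)="([^"]*)"')

def parse_site_meta(text):
    """Return a list of (uri, attributes) pairs, skipping lines that
    don't match the entry syntax."""
    entries = []
    for line in text.splitlines():
        m = ENTRY.match(line.strip())
        if not m:
            continue  # tolerate a bad line instead of failing the document
        entries.append((m.group(1), dict(ATTR.findall(m.group(2)))))
    return entries

print(parse_site_meta('/robots.txt rel="robots"\n/p3p.xml rel="privacy"'))
# -> [('/robots.txt', {'rel': 'robots'}), ('/p3p.xml', {'rel': 'privacy'})]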

One closing thought: A persistent problem with XML, unfortunately, is cryptographic signatures, primarily due to the complexity of signing and verifying something whose canonical representation is really an Infoset rather than a text stream. This hypothetical format, or any of a number of formats close to it in configuration space, could solve that problem easily:
/robots.txt rel="robots"
/p3p.xml rel="privacy"
http://other.example.net/example rel="http://example.com/rel"
data:application/x-pkcs7-signature;base64,iVBORA...rkJggg== rel="signature"
...with some simple rules to determine which octets get signed.
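
For instance, one such rule (again just a sketch of mine, assuming a detached PKCS#7 signature carried in the data: URI) could be: the signature covers every octet of the document that precedes the signature line:

import base64

MARKER = b'data:application/x-pkcs7-signature;base64,'

def split_signed(document):
    """Split a signed site-meta document into (signed_octets, signature),
    under the assumed rule that everything before the signature line is
    exactly what gets signed."""
    start = document.find(MARKER)
    if start < 0:
        raise ValueError("no signature entry found")
    line = document[start:].split(b'\n', 1)[0]
    b64 = line[len(MARKER):].split(b' ', 1)[0]  # drop the rel="signature" attr
    return document[:start], base64.b64decode(b64)

doc = b'/robots.txt rel="robots"\n' + MARKER + base64.b64encode(b'sig') + b' rel="signature"\n'
print(split_signed(doc))
# -> (b'/robots.txt rel="robots"\n', b'sig')

Verification then reduces to an ordinary detached-signature check over the first element; no canonicalization step at all.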

Comments

  1. +1, especially since if sites are going to have a single /site-meta file for all their discovery, then as soon as one person fat-fingers invalid syntax for an entry, it will break discovery for the entire site. XML is very brittle. But with a one-line-per-entry syntax, presumably the damage would be limited and easier to ignore.

  2. Hey John,

    I like the democracy comparison -- well said.

    WRT XML - you're not the first person to raise this, and I agree that it adds some risk.

    The intent, however, is to address as many of these use cases as possible, and I often run across people for whom the extra roundtrip is unacceptable (e.g., in P3P, this concern drove the whole compact policy design, which IMO is pretty broken). That breadth of use cases is why XML was chosen.

    However, if there's emerging consensus that a line-oriented format would be better, I'm all for it; the whole point here is to build momentum behind one approach, and make it successful. So, get people to make some noise!

    P.S. I'm re-submitting this comment, because the first time around the OpenID dialog ate the comment text. Grr...

