Is the Atom Publishing Protocol the Answer?

Are Atom and APP the answer to everything?  Easy one: No.

Dare Obasanjo raised a few hackles with a provocative post ("Why GData/APP Fails as a General Purpose Editing Protocol for the Web").  In a followup ("GData isn't a Best Practice Implementation of the Atom Publishing Protocol") he notes that GData != APP.  DeWitt Clinton of Google follows up with a refinement of this equation to GData > APP_t where t < now in "On APP and GData".

I hope this clarifies things for everybody.

There seems to be a complaint that outside of the tiny corner of the Web made up of web pages, news stories, articles, blog posts, comments, lists of links, podcasts, online photo albums, video albums, directory listings, search results, ... Atom doesn't match some data models.  This boils down to two issues: the need to include elements you don't need, and the inability of the Atom format to physically embed hierarchical data.

An atom:entry minimally needs an atom:id, an atom:title, an atom:updated, and either atom:content or an atom:link with rel="alternate".  Also, if it's standalone, it needs an atom:author.  Let's say we did in fact want to embed hierarchical content and we don't really care about title or author as the data is automatically generated.  I might then propose this:

<entry xmlns="http://www.w3.org/2005/Atom">
    <id>tag:a unique key</id>
    <title/>
    <author><name>MegaCorp LLC</name></author>
    <updated>timestamp of last DB change</updated>
    <content type="application/atom+xml">
        <feed> ... it's turtles all the way down! ... </feed>
    </content>
</entry>

Requestors could specify the inline hierarchy depth they want.  Subtrees below that depth can still be linked to via content@src.  And when you get to your leaf nodes, just use whatever content type you desire.
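To make the depth idea concrete, here is a rough Python sketch, not a reference implementation. The node dictionary layout (id, author, updated, href, value, children) and the build_entry helper are assumptions made up for illustration; nothing in Atom or APP defines how a requestor asks for a depth. The nested feed is also left without its own id, title and updated, which a real feed would need.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
ET.register_namespace("", ATOM_NS)


def _el(parent, tag, text=None, **attrib):
    """Create an Atom-namespaced child element under `parent`."""
    child = ET.SubElement(parent, f"{{{ATOM_NS}}}{tag}", attrib)
    if text is not None:
        child.text = text
    return child


def build_entry(node, depth):
    """Render `node` as an atom:entry, inlining children up to `depth` levels.

    `node` is a hypothetical dict with keys id, author, updated and,
    optionally, href, value and children.
    """
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    _el(entry, "id", node["id"])
    _el(entry, "title")  # empty title: the data is machine-generated
    author = _el(entry, "author")
    _el(author, "name", node["author"])
    _el(entry, "updated", node["updated"])

    children = node.get("children", [])
    if not children:
        # Leaf node: any content type will do; plain text is used here.
        _el(entry, "content", node.get("value", ""), type="text")
    elif depth > 0:
        # Inline the subtree as a nested feed inside atom:content.
        content = _el(entry, "content", type="application/atom+xml")
        feed = _el(content, "feed")
        for child in children:
            feed.append(build_entry(child, depth - 1))
    else:
        # Depth budget used up: point at the subtree via content@src.
        _el(entry, "content", type="application/atom+xml", src=node["href"])
    return entry
```

Calling ET.tostring(build_entry(root, 1), encoding="unicode") on a three-level tree would inline the first level of children and leave each grandchild subtree behind a content@src link.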

Alternatively, if you need a completely general graph structure, there's always RDF, which can also be enclosed inside atom:content.

The above is about as minimal as I can think of. It does require a unique ID, so if you can't generate that you're out of luck.  I think unique IDs are kind of general.  It also requires an author, which can be awfully useful in tracking down copyright and provenance issues.  So that's pretty generic too, and small in any case. 

Of course this type of content would be fairly useless in a feed reader, but it would get carried through things like proxies, aggregators, libraries, etc.  And if you also wanted to be feedreader friendly for some reason, the marginal cost of annotating with title and summary is minimal.
