
Social Web Foo: Standards for Public Social Web

Small but useful #swfoo session.  My idea was to try to give public social data formats, protocols, and standards some quality time, since (a) privacy and ACLs introduce many difficult problems that eat up lots of discussion time; and (b) there are many key use cases that are totally public, and might be easily solvable if we remove the distraction of privacy controls. @niall, @dewitt, and @steveganz attended, but per Foo rules, I won't attribute specific quotes.

Examples of such totally public use cases include public blogs, update streams, and feeds, as well as public following/friending relationships.  Following (one way) seems more likely to be public than friending, for social reasons.

Some random notes:  
  • Public content, once published, should be assumed to be "in the wild" everywhere, indefinitely, until the heat death of the universe.
  • PubSubHubbub (prior session) is a great example of a proposed open standard for improving the performance of public social data.
  • Problem:  How does an author prove authorship of data that's "in the wild" or syndicated?  Conversely, how do readers determine authenticity of an authorship claim?
  • Blogger's import/export facility currently "wrings the identity" out of the data, because we don't have any way to detect tampering with the supposed author/post/comment data between export and import.
  • There was a suggestion that signing a subset of fields in an Atom entry (content, title, author, etc., in UTF-8 only) with Google's private key, so that anyone can verify the signature with the matching public key, could provide authorship attestation for that data. That would let us solve the import/export and syndication attribution problems without having to deal with DigSig.
  • Someone gave a great example of a situation where a hosting web site needed attestation from a chain of 3 parties before allowing possibly copyright-infringing content to be uploaded; no standard exists for doing this online.
  • We would like to be able to link to a real-world identity (vouched for), or at least to a profile provided by someone like Google; there are lots of pieces of data that would let Google vouch for the identity of a profile owner, but no standard way to express this publicly.
  • Google, for example, could also provide more general reputation data, which could likewise be public.
  • A public social graph consisting of following relationships is both useful, and potentially honestly mine-able, assuming users opted in with full knowledge that data was public and mine-able; this is very different from private relationships.
  • A public social graph is also potentially a way to determine public reputation; it's possible to game this, but difficult, especially if the relationships are publicly visible on the open web, so that subverting them believably would take months or years of stealth work.
  • Being able to verify past employment, educational credentials, etc. (data that a user chooses to make public and verifiable) would be very useful.
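
The signed-subset idea from the notes above can be sketched roughly as follows. This is only an illustration under stated assumptions, not anything Blogger or Google actually does: the field list, the canonical byte layout, and the function names are all hypothetical, and HMAC from the Python standard library stands in for the public-key signature the suggestion calls for. A real deployment would sign the same canonical bytes with the publisher's private key so that anyone could verify with the public key.

```python
import hashlib
import hmac

# Hypothetical subset of entry fields to cover; the notes mention
# content, title, author, etc.
SIGNED_FIELDS = ("author", "title", "content")


def canonical_bytes(entry: dict) -> bytes:
    """Serialize the covered fields into one deterministic UTF-8 byte string."""
    parts = []
    for name in SIGNED_FIELDS:
        value = entry.get(name, "").encode("utf-8")
        # Length-prefix each value so text cannot be shifted between fields
        # without changing the canonical form.
        prefix = name.encode("utf-8") + b":" + str(len(value)).encode("ascii") + b":"
        parts.append(prefix + value)
    return b"\n".join(parts)


def attest(entry: dict, key: bytes) -> str:
    """Return a hex attestation over the covered fields.

    Stand-in only: HMAC uses a shared secret, not a public/private key pair.
    """
    return hmac.new(key, canonical_bytes(entry), hashlib.sha256).hexdigest()


def verify(entry: dict, key: bytes, attestation: str) -> bool:
    """Check that the covered fields still match the attestation."""
    return hmac.compare_digest(attest(entry, key), attestation)


if __name__ == "__main__":
    demo_key = b"demo-key-not-for-production"  # assumption: illustrative key only
    entry = {"author": "Alice", "title": "Hello", "content": "A public post"}
    tag = attest(entry, demo_key)
    print(verify(entry, demo_key, tag))
    print(verify(dict(entry, author="Mallory"), demo_key, tag))
```

Because only the listed fields feed into the canonical bytes, an importer or syndicator can add or rewrite other metadata without invalidating the attestation, while any change to author, title, or content is detected.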
