
Another use for feed licenses: Splogicide

Doc Searls just changed his blog license to Attribution-NonCommercial-ShareAlike 2.5... in order to clearly deny splogs reblogging rights to his content.  Interesting, though I think there may be some unintended fallout.  But there are some cool applications for this.  What if someone built a tool to make it easy to find such copyright violators (academics use similar tools to find plagiarism)? With an accompanying service to aggregate complaints and, when they reach a sufficiently remunerative level, send attack lawyers after sploggers.
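The plagiarism-detection tools mentioned above generally work by comparing overlapping word sequences ("shingles") between two documents.  A toy sketch of that idea, purely illustrative and not any particular tool's algorithm:

```python
import re

def shingles(text, w=5):
    """Break text into the set of overlapping w-word shingles (lowercased)."""
    words = re.findall(r"[a-z']+", text.lower())
    return {" ".join(words[i:i + w]) for i in range(len(words) - w + 1)}

def similarity(a, b, w=5):
    """Jaccard similarity between two texts' shingle sets: 1.0 = identical."""
    sa, sb = shingles(a, w), shingles(b, w)
    if not (sa or sb):
        return 0.0
    return len(sa & sb) / len(sa | sb)
```

A splog that reblogs a post wholesale would score near 1.0 against the original, which is what makes automated detection plausible.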

Update: The collective intelligence of the blogosphere is a mighty thing.  In a comment below, Doc points at an open source plagiarism detector from UCSB (my alma mater) that already does Internet searches.  Hmmm.... 


  1. Hey, as coincidence has it I'm hanging here at UCSB, where we have a plagiarism-nailing system called PAIRwise, which is open source. Cool, no?

    So I just blogged about it, here:

  2. My blog has been licensed with the CC BY-NC-SA 2.5 for a while now, and sploggers repost my content all the time.  It hasn't hurt my position in SERPs, so I'm not really too worried about it.

    Attack lawyers is the wrong idea.  Search companies should use the CC license RDF along with some kind of authority measure to mechanically identify splogs to remove them from their indexes, etc.  This would make splogs less valuable to create, which might curtail their growth.

    The reason I mention the authority measure: what happens when a splogger reposts your CC licensed content and puts their own CC license on it?  It becomes a Mexican standoff between the two sources ... which would render the technique unusable for search companies.  But then, if we had that authority measure ... we wouldn't need the CC license information, anyway.

    Oh well.

  3. Dossy -- Here's a thought:  Create a registration service that accepts regular weblog pings (like Technorati or Pingomatic).  It just records URLs, licenses, content hashes, and most importantly date/timestamps.  Then anyone can go to the service to ask which URL was "first" -- which ends the Mexican Standoff: You can't copy something from the future.
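    The registration service described above could be sketched minimally as a first-seen index keyed by content hash.  This is a toy illustration of the idea, not a real service; the class and field names are made up:

    ```python
    import hashlib
    import time

    class PingRegistry:
        """Toy first-seen registry: records who pinged a given content first."""

        def __init__(self):
            # content hash -> (timestamp, url, license_url)
            self.first_seen = {}

        def ping(self, url, content, license_url, timestamp=None):
            digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
            ts = time.time() if timestamp is None else timestamp
            # Only the earliest ping for a given hash is retained.
            if digest not in self.first_seen or ts < self.first_seen[digest][0]:
                self.first_seen[digest] = (ts, url, license_url)
            return digest

        def original(self, content):
            """Who was 'first' with this content?  Returns (ts, url, license)."""
            digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
            return self.first_seen.get(digest)
    ```

    A splogger reposting the same content later hashes to the same digest, so the lookup always answers with the earlier ping.
    
    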

  4. John ... now you've traded trust for a race condition.  Imagine the ping service is overloaded, down, or there are network issues -- the original author's ping doesn't make it through.  Later, a splogger picks up the content, goes for the ping, and gets it first.

    Game over.

  5. Dossy -- Remember that there are always server logs and other evidence.  But one could also offer a signing service -- pass every post through the service before posting, which includes its own timestamp and digitally signs the whole thing.  The downside is that if the service is down, you can't post.  There are lots of ways to skin this cat.


