2006/04/11

Code, and other laws... (part 2)

In part 1 I talked about the ideal world where feeds were all clearly licensed. So now I'll turn to the real world, and I'll be very US-centric because this article is quite long enough as it is. You might want to skip to the happy fun summary at the bottom.

Millions of feeds aren't explicitly licensed. Some can't be because their generators don't allow for it. For others, the owner doesn't know or care about licensing. For unlicensed feeds, it's not reasonable to make the default assumption "nothing more than fair use" because there are millions of feeds out there whose owners want their content syndicated as-is (headline feeds with links back to content, for example). On the other hand, if you assume anything more than fair use, you also need to be prepared to handle exceptions. So how to do both of these in a way that minimizes overhead and lets aggregation happen without lawyers while respecting copyright?

My take is that a reasonable default is to assume the Creative Commons Attribution license unless the feed owner has specified otherwise. This means that by default, we'd assume that copying of feed content is allowed as long as attribution is given through an appropriate hyperlink. Then, provide easy ways to let feed owners declare a different license.

If a feed owner is happy with the default, they need to do nothing. My sense is that this covers 98% of unlicensed feeds. For the remainder, a feed owner could go to individual aggregators and tell them explicitly what license they prefer. They can always choose a completely restrictive license that allows only fair use for the general public. Or, they can choose a noncommercial license. My take is that something equivalent to the current Creative Commons license chooser is sufficient.

Of course, what we'd all really prefer is for feed owners to put the licenses in their feeds directly. That way, our AOL proxies and caches would simply pass the information along to clients, which would make appropriate decisions about what to do based on the particular license. If we're dealing with a small number of well understood licenses, this is the easy part.

How should the feed licenses work?  There's a pretty good page with reasonable recommendations at Creative Commons on the subject.  James Snell's Feed License Link Relation works well for Atom and is pretty flexible:
<link rel="license" href="http://creativecommons.org/licenses/by/2.5/"/>
The Creative Commons RSS Module works for RSS 2.0:
<creativeCommons:license>http://www.creativecommons.org/licenses/by-nc/2.5</creativeCommons:license>
Both of these work with CC and other licenses and have been deployed in real implementations. There's an RDF version for RSS 1.0 as well (cc:license).

Finally there's the RSS 2.0 <copyright> element, which is just plain text. But, given that some tools might allow people to put text in this field but not embed the other types of licenses, I think it's reasonable to look for a known license URL in the copyright text as well:
<copyright>The contents of this feed are licensed to the public under http://creativecommons.org/licenses/by-nc-sa/1.0/</copyright>
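As a rough illustration of how a feed consumer might check these three places in order, here is a minimal Python sketch. The function name `find_embedded_license` and the regular expression are my own assumptions, not part of any spec; the two namespace URIs are the published Atom and Creative Commons RSS module namespaces. It takes the first match in document order and does not distinguish feed-level from entry-level licenses.

```python
import re
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
CC_RSS_NS = "http://backend.userland.com/creativeCommonsRssModule"

# Assumed pattern for "a known license URL": Creative Commons license URLs,
# with or without "www." and with or without a trailing slash.
CC_URL = re.compile(
    r"https?://(?:www\.)?creativecommons\.org/licenses/[a-z-]+/\d+\.\d+/?"
)


def find_embedded_license(feed_xml):
    """Return the first license URL embedded in the feed, or None."""
    root = ET.fromstring(feed_xml)

    # 1. Atom: <link rel="license" href="..."/>
    for link in root.iter(f"{{{ATOM_NS}}}link"):
        if link.get("rel") == "license" and link.get("href"):
            return link.get("href")

    # 2. RSS 2.0 with the Creative Commons module: <creativeCommons:license>
    for elem in root.iter(f"{{{CC_RSS_NS}}}license"):
        if elem.text and elem.text.strip():
            return elem.text.strip()

    # 3. RSS 2.0 <copyright>: plain text, so search it for a license URL
    for elem in root.iter("copyright"):
        match = CC_URL.search(elem.text or "")
        if match:
            return match.group(0)

    return None
```

A feed with none of these markers returns None, which is where the fallback described next would come in.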
If a processor can't find any of the above licenses, I'm proposing that AOL feed consumers fall back to a license based on an explicit list that AOL maintains by feed owner request. This would be part of our feed infrastructure. I see this working two ways. First, we would add metadata to feeds which are requested via our feed proxies. For Atom and RSS 2.0, the two output formats we support, this would be a namespaced extension, aol:declared-license:
<aol:declared-license>
      <link rel="license" href="http://creativecommons.org/licenses/by/2.5/"/>
</aol:declared-license>
It would contain a Feed License Link Relation indicating which license the owner specified to AOL. It could potentially contain multiple license links. It could contain other namespaced elements in the future as well, but feed consumers can ignore ones they don't understand.

A client might also want to inquire about a feed's declared license without retrieving the feed itself. For this, we could provide a simple REST API:
GET http://example.aol.com/declared-license/example.org/feed/atom.xml
which returns a simple XML document:
<?xml version="1.0" encoding="utf-8" ?>
<declared-license xmlns="http://example.aol.com/2006/aolfeeds">
    <link rel="license" href="http://creativecommons.org/licenses/by/2.5/"/>
</declared-license>
Note that non-AOL clients could potentially make use of this; you'd just have to believe that AOL is maintaining a good declared license list (the licenses themselves are the ones the feed owners want to provide to the general public, not to AOL specifically). We could even potentially share these lists between feed aggregators. An embedded (original) license would always override any declared license; this would let feed owners easily start embedding their own licenses in the future. (Should we eliminate any declared license as soon as the source feed starts licensing itself? I think so, but our legal team would need to weigh in on that.)
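To make the precedence concrete, here is a small Python sketch of how a client might read a declared-license document and pick the license that applies. The function names are hypothetical, and the default of Attribution 2.5 is my assumption based on the version used in the examples here; the proposal itself only says "Creative Commons Attribution". The parser compares local element names only, so it does not depend on which namespace the link element ends up in.

```python
import xml.etree.ElementTree as ET

# Assumed default: the Attribution license, at the version used in the examples.
DEFAULT_LICENSE = "http://creativecommons.org/licenses/by/2.5/"


def parse_declared_licenses(doc_xml):
    """Return the license URLs listed in a declared-license document."""
    root = ET.fromstring(doc_xml)
    urls = []
    for elem in root.iter():
        # Strip any "{namespace}" prefix and compare the local name only.
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name == "link" and elem.get("rel") == "license":
            href = elem.get("href")
            if href:
                urls.append(href)
    return urls


def effective_licenses(embedded, declared):
    """Apply the precedence: embedded license, then declared, then default.

    `embedded` is a license URL found in the feed itself (or None);
    `declared` is the list of URLs from the declared-license list.
    """
    if embedded:
        return [embedded]
    if declared:
        return list(declared)
    return [DEFAULT_LICENSE]
```

Unknown child elements are simply skipped, which matches the idea that consumers can ignore extensions they don't understand.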

Finally, we'd advertise a variety of ways for feed owners to contact us and declare their licenses. There does need to be some sort of validation step to ensure they really own the feed. As part of the hopefully painless process we'd ask them to pick from one of the existing Creative Commons licenses. If these aren't sufficient we can add other licenses but it's easier all around if people can agree on a small set.

How about a real world example?  Brian Alvey of Weblogs Inc. recently announced support for excerpt feeds, for example Engadget full vs. Engadget headlines.  The full Engadget feed has the copyright statement:
<copyright>Copyright 2006 Weblogs, Inc. The contents of this feed are available for non-commercial use only.</copyright>
Translating into license-speak, we'd get an Attribution-NonCommercial-NoDerivs license for the full feed, meaning no commercial exploitation, links back are required, and editing of the material is not allowed beyond fair use:
<creativeCommons:license>http://creativecommons.org/licenses/by-nc-nd/2.5/</creativeCommons:license>
The excerpt Engadget feed has the copyright statement:
<copyright>Copyright 2006 Blogsmith, LLC. The contents of this headlines and excerpts feed are available for limited commercial distribution. You may repost this feed to your site provided you link back to the original story, do not edit the material, and do not remove this copyright notice.</copyright>
Translating into license-speak, we'd get Attribution-NoDerivs for the excerpt feed, meaning that commercial use is OK but links back are required and the material may not be edited:
<creativeCommons:license>http://creativecommons.org/licenses/by-nd/2.5/</creativeCommons:license>
(I'm assuming here that the restriction on editing applies to the individual articles, not the feed document as a whole, since feed documents are not intended to be kept intact in any case. This minor ambiguity goes away with Atom's Feed License Link Relation.)
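Going the other way, from a license URL back to what a client may do, can be sketched by reading the terms out of the URL itself. This is only an illustration under my own assumptions: the function name and the returned keys are hypothetical, and it relies on the convention that Creative Commons license URLs spell out their terms (by, nc, nd, sa) in the path.

```python
import re

# Assumed URL shape: .../licenses/<terms>/<version>/ with hyphen-separated terms.
CC_LICENSE = re.compile(
    r"^https?://(?:www\.)?creativecommons\.org/licenses/([a-z-]+)/(\d+\.\d+)/?$"
)


def describe_license(url):
    """Return a dict of permissions for a CC license URL, or None if unknown."""
    match = CC_LICENSE.match(url)
    if not match:
        return None
    terms = set(match.group(1).split("-"))
    return {
        "attribution_required": "by" in terms,   # links back are required
        "commercial_use": "nc" not in terms,     # NonCommercial forbids it
        "derivatives": "nd" not in terms,        # NoDerivs forbids editing
        "share_alike": "sa" in terms,
        "version": match.group(2),
    }
```

With the two Engadget translations above, the full feed's by-nc-nd URL would come back with commercial use and derivatives both disallowed, and the excerpt feed's by-nd URL with commercial use allowed but derivatives still disallowed. Anything that is not a recognizable CC URL returns None, and a cautious client would then treat it as fair use only.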

So far, so good.  Having multiple versions does raise the question of how automated processors are supposed to find these feeds.  I think that's going to have to be a followup post.

That's about it.
In summary:
None of this is black or white. I should also mention that I'm completely conflicted here, in that my company both syndicates and aggregates content and I'm directly involved on both sides. I'm coming at this from the viewpoint of someone trying to provide online feed aggregation services where the end users subscribe to the feeds; they're not being selected or screened by editors. In other situations other rules about default licenses might be better. Explicit licenses are definitely best to avoid problems down the road.

Here are some other links I've stumbled across: A basic practical primer on copyright and RSS. One re-aggregator's viewpoint (Palfrey). Producers' viewpoints: Shelley Powers, Om Malik (here and here). Some legal discussion (with Wendy Seltzer, previously of the EFF, weighing in).

(Feedburner already does CC licensing following the methods outlined above, except that they're using the creativeCommons namespace extension for Atom as well as RSS 2.0; consumers should look for either one in Atom feeds.)

Tags: Creative Commons, RSS, Atom, syndication

2 comments:

  1. John,
       This is great; however, I think this raises the question: what are AOL's plans for the content of people's blogs? Yeah, true, AOL (in part, I assume) owns the content of any feeds/blogs posted; however, for the majority of our users who have no clue, this would raise questions and probably lose users...

      As far as "what we do" with it, that is of no consequence; however, this could be perceived as "duping" members into acquiring their content. I would just be cautious as to what we lead our members to follow, namely the concept of owning their journals and the potential profit vs. creating the content and leaving it at that.

  2. Shawn - As on the rest of the Internet, "You own your own words."  See the AIM Terms of Service at http://www.aim.com/tos/tos.adp; specifically, "You or the owner of the Content retain ownership of all right, title and interest in Content that you post to public areas of any AIM Product."

    The TOS do give AOL a license to use the content, but we do not own it, and the owner retains all other copyrights (though IANAL). So this would be relevant for AOL Journals. But I'm actually more concerned about our feed aggregation products and AOL's ability to offer ad-supported services that surface third party content.

