Spam spam spam spam...

Back in June, apparently, the FTC said that a do-not-email list (like the do-not-call list) would not work, and would generate more spam because spammers would use it as a source of new email addresses. Though it's a bit late now, I have to wonder about the latter point.  Why not simply map each address into its MD5 checksum before storing it?

So an address would become "a0b6e8fd2367f5999b6b4e7e1ce9e2d2", which is useless for sending email.  However, spammers could use any of many available tools to check for "hits" on their email lists, so it's still perfectly usable for filtering out email addresses.  Of course it would also tell spammers that they have a 'real' email address on their list, but only if they already had it -- so I don't think that would be giving them much information at all.
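Here's a minimal sketch of how that filtering might look, using Python's standard hashlib module. The function names, the example.com addresses, and the normalization step (trimming whitespace and lowercasing before hashing) are all my own assumptions for illustration; a real registry would have to specify its own rules.

```python
import hashlib


def hash_address(address: str) -> str:
    """Return the MD5 hex digest of an email address.

    Assumption: addresses are trimmed and lowercased before hashing so
    that the registry and the sender compute the same digest.
    """
    normalized = address.strip().lower()
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


def filter_mailing_list(mailing_list, do_not_email_hashes):
    """Keep only the addresses whose hash is NOT on the published list."""
    return [a for a in mailing_list if hash_address(a) not in do_not_email_hashes]


# Hypothetical registry: only the hashes are published, never the addresses.
registry = {hash_address("alice@example.com")}

mailing_list = ["alice@example.com", "bob@example.com"]
print(filter_mailing_list(mailing_list, registry))  # ['bob@example.com']
```

The sender learns something only about addresses already on its own list, which is the point made above.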

I still think the list would be useless because spammers would simply ignore it.  But it wouldn't generate new spam, and it would drive up the cost of spamming b…

The Noosphere Just Got Closer

Of course it'll take several years, but Google's just-announced project to digitize major university library collections means that the print-only "dark matter" of the noosphere is about to be mapped out and made available to anyone with an Internet connection.  Well, at least the parts that have passed into the public domain; the rest will be indexed.

I'm clearly a geek -- my toes are tingling.

The "5th Estate"

Interesting quote, from my point of view, in this article:

Jonathan Miller, Head of AOL in the US, testifies to the popularity of Citizen's Media. He says that 60 - 70 per cent of the time people spend on AOL is devoted to 'audience generated content'.

(Though he's talking mostly about things like message boards and chat rooms, of course, rather than blogs.)

Welcome MSN Spaces!

A surprise to welcome me back from sabbatical: Microsoft released the beta of MSN Spaces (congratulations guys!).  I've been playing with it a bit over the past few days; there's some very cool stuff there, especially the integrations between Microsoft applications.

(I've seen a few comments about the instability of the Spaces service; come on folks, it's a beta.  And they're turning around bug fixes in 48 hours while keeping up with what has got to be a ton of traffic.)

Software Patents Considered Harmful

This post by Paul Vick is, I think, a very honest and representative take on software patents -- and in particular the over-the-top IsNot patent -- from the point of view of an innovator.  I find myself agreeing with him wholeheartedly:

Microsoft has been as much a victim of this as anyone else, and yet we're right there in there with everyone else, playing the game. It's become a Mexican standoff, and there's no good way out at the moment short of a broad consensus to end the game at the legislative level.

And we all know how Mexican standoffs typically end.  Paul, my name is on a couple of patents which I'm not proud of either.  But in the current environment, there really isn't a choice: We're all locked into locally 'least bad' courses, which together work to guarantee the continuation of the downward spiral (and in the long run, make all companies worse off -- other than Nathan Myhrvold's, of course).

Web Services and KISS

Adam Bosworth argues eloquently for the 'worse is better' philosophy of web services in his ISCOC talk and blog entry. I have a lot of sympathy for this point of view.  I'm also skeptical about the benefits of the WS-* paradigm.  The WS-* specifications seem to me to be well designed to sell development tools and enterprise consulting services.

Why Aggregation Matters

Sometimes, I feel like I'm banging my head against a wall trying to describe just why feed syndication and aggregation are important.  In an earlier post, I tried to expand the universe of discourse by throwing out as many possible uses as I could dream up.  Joshua Porter has written a really good article about why aggregation is a big deal, even just considering its impact on web site design: Home Alone? How Content Aggregators Change Navigation and Control of Content

Prediction is Difficult, Especially the Future

My second hat at AOL is development manager for the AOL Polls system. This means I've had the pleasure of watching the conventions and debates in real time while sitting on conference calls watching the performance of our instant polling systems. The systems had some potential issues, but after a lot of work they seem to be just fine now. Anyway: The interesting thing about the instant polling during the debates was how different the results were from the conventional instant phone polls. For example, after the final debate the AOL Instapoll respondents gave the debate win to Kerry by something like 60% to 40%. The ABC News poll was more like 50%/50%. Frankly, I don't believe any of these polls. However, I'll throw this thought out: The online instant polls are taken by a self-selected group of people who are interested in the election and care about making their opinions known. Hmmm... much like the polls being conducted tomorrow.
I'll go out on a limb and make a prediction base…

Random Note: DNA's Dark Matter

Scientific American's The Hidden Genetic Program of Complex Organisms grabbed my attention last week.  This could be the biological equivalent of the discovery of dark matter.  Basically, the 'junk' or intron DNA that forms a majority of our genome may not be junk at all, but rather control code that regulates the expression of other genes.

The programming analogy would be, I think, that the protein-coding parts of the genome would be the firmware or opcodes while the control DNA is the source code that controls when and how the opcodes are executed.  Aside from the sheer coolness of understanding how life actually works, there's a huge potential here for doing useful genetic manipulation.  It's got to be easier to tweak control code than to try to edit firmware... (Free link on same subject: The Unseen Genome.)

Things in Need of a Feed

Syndicated feeds are much bigger than blogs and news stories; they're a platform.  A bunch of use cases, several of which actually exist in some form, others just things I'd like to see:
Blog entries for blogs I'm interested in
Feed of all comments on entries I've authored
News stories matching a custom filter I've set up
Traffic conditions on my customary route(s)
FedEx shipping feed giving status and history for all of my packages
Customer support feed giving status and history for all my issues (any company)
Product safety/recall information for everything I buy
Amazon feed of new books matching my preferences
All new material by a specific author (on any blog or online source)
Feed of new feeds, of various types:
  Just my friends
  Authored by people whose blogs I already subscribe to
  Filtered on personal profile/interests
House for sale listings
Newly discovered prime numbers (okay, a niche audience)
Airport flight status alerts
Movies in my Netflix queue and recommendation…

Niche Markets

Niche markets are where it's at: Chris Anderson's The Long Tail is exactly right. The Internet not only eliminates the overhead of physical space but also, more importantly, reduces the overhead of finding what you want to near-zero. When your computer tracks your preferences and auto-discovers new content that you actually want, it enables new markets that couldn't otherwise exist.

Update 10/11: Joi Ito's take.

Network Protocols and Vectorization

Doing things in parallel is one of the older performance tricks.  Vector SIMD machines -- like the Cray supercomputers -- attack problems that benefit from doing the same thing to lots of different pieces of data simultaneously.  It's just a performance trick, but it drove the design and even the physical shape of those machines because the problems they're trying to tackle -- airflow simulation, weather prediction, nuclear explosion simulation, etc. -- are both important and difficult to scale up.  (More recently, we're seeing massively parallel machines built out of individual commodity PCs; conceptually the same, but limited mostly by network latency/bandwidth.)

So what does this have to do with network protocols?  Just as the problems of doing things like a matrix-vector multiply very, very fast drove the designs of supercomputers, the problems of moving data from one place to another very quickly, on demand, drive the designs of today's network services.  The designs of netw…

Office Space

How important is the physical workspace to knowledge workers generally, and software developers specifically?  Everybody agrees it's important.  Talk to ten people, though, and you'll get nine different opinions about what aspects are important and how much they impact effectiveness.  But there are some classic studies that shed some light on the subject; as far as I can tell from looking around recently, they haven't been refuted.  At the same time, a lot of people in the software industry don't seem to have heard of them.

Take the amount and kind of workspace provided to each knowledge worker.  You can quantify this (number of square feet, open/cubicle/office options).  What effects should you expect from, say, changing the number of square feet per person from 80 to 64?  What would this do to your current project's effort and schedule?

There's no plug-in formula for this, but based on the available data, I'd guesstimate that the effort would expand by up to 30%.  Why?

"Programmer Performan…

Community, social networks, and technology at Supernova 2004

Some afterthoughts from the Supernova conference, specifically about social networks and community, though it's difficult to separate the different topics.

A quick meta-note here: Supernova is itself a social network of people and ideas, specifically about technology -- more akin to a scientific conference than an industry conference.  And it's making a lot of use of various social tools.

Decentralized Work (Thomas Malone) sounds good, but I think there are powerful entrenched stakeholders that can work against or reverse this trend (just because it would be good doesn't mean it will happen).  I'm taking a look at The Future of Work right now; one first inchoate thought is how some of the same themes are treated differently in The Innovator's Solution.

The Network is People - a panel with Christopher Allen, Esther Dyson, Ray Ozzie, and Mena Trott.  Interesting/new thoughts:
Chris Allen on spreadsheet…

Supernova 2004 midterm update

I'm at the Supernova 2004 conference at the moment.  I'm scribbling notes as I go, and plan to go back and cohere the highlights into a post-conference writeup.  First impressions:  Lots of smart and articulate people here, both on the panels and in the 'audience'.  I wish there were more time for audience participation, though there is plenty of time for informal interactions between and after sessions.  The more panel-like sessions are better than the formal presentations.

The Syndication Nation panel had some good points, but it ratholed a bit on standard issues and would have benefited from a longer term/wider vision.  How to pay for content is important, but it's a well-trodden area.  We could just give it a code name, like a chess opening, and save a lot of discussion time...

I am interested in the Autonomic Computing discussion and related topics, if for no other reason than we really need to be able to focus smart people on something other than how to handle and recover …

Atom Proposal: Simple resource posting

On the Atom front, I've just added a proposal to the Wiki: PaceSimpleResourcePosting. The abstract is:

This proposal extends the AtomAPI to allow for a new creation URI, ResourcePostURI, to be used for simple, efficient uploading of resources referenced by a separate Atom entry. It also extends the Atom format to allow a "src" attribute of the content element to point to an external URI as an alternative to providing the content inline.

This proposal is an alternative to PaceObjectModule, PaceDontSyndicate, and PaceResource. It is almost a subset of and is compatible with PaceNonEntryResources, but differs in that it presents a very focused approach to the specific problem of efficiently uploading the parts of a compound document to form a new Atom entry. This proposal does not conflict with WebDAV but does not require that a server support WebDAV.
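To make the format half of the proposal concrete, here's a rough sketch of an entry whose content element points at an external resource via "src" instead of carrying the content inline. It uses Python's standard xml.etree.ElementTree. The namespace URI is the one from the later Atom 1.0 format and is an assumption here, as are the build_entry helper and the example.org URI; the drafts under discussion may well differ. The upload step through ResourcePostURI is deliberately left out, since its details are defined in the proposal itself.

```python
import xml.etree.ElementTree as ET

# Assumption: the Atom 1.0 namespace; the drafts under discussion may differ.
ATOM_NS = "http://www.w3.org/2005/Atom"


def build_entry(title: str, resource_uri: str, media_type: str) -> ET.Element:
    """Build an entry whose content is referenced by URI, not inlined.

    Hypothetical helper for illustration only.
    """
    ET.register_namespace("", ATOM_NS)  # serialize with a default namespace
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = title
    # The content element stays empty; "src" points at the uploaded resource.
    ET.SubElement(
        entry,
        f"{{{ATOM_NS}}}content",
        {"type": media_type, "src": resource_uri},
    )
    return entry


# Hypothetical URI standing in for a resource that was uploaded separately.
entry = build_entry("My cat", "http://example.org/resources/cat.jpg", "image/jpeg")
print(ET.tostring(entry, encoding="unicode"))
```

The resource is uploaded once and the entry just refers to it, so the entry itself stays small.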

Atom: Cat picture use case

To motivate discussion about some of the basic needs for the Atom API, I've documented a use case that I want Atom to support: Posting a Cat Picture. This use case is primarily about simple compound text/picture entries, which I think are going to be very common.  It's complicated enough to be interesting but it's still a basic usage.

The basic idea here is that we really want compound documents that contain both text and pictures without users needing to worry about the grungy details; that (X)HTML already offers a way to organize the top level part of this document; and that Atom should at least provide a way to create such entries in a simple way.

Who am I?

I'm currently a tech lead/manager at Google, working on Blogger engineering.

I'm formerly a system architect and technical manager for web based products at AOL. I last managed development for Journals and Favorites Plus.  I've helped launch Public & Private Groups, Polls, and Journals for AOL.


Around 1991, before the whole Web thing, I began my career at a startup which intended to compete with Intuit's Quicken software on the then-new Windows 3.0 platform.  This was great experience, especially in terms of what not to do[*]. In 1993 I took a semi-break from the software industry to go to graduate school at UC Santa Cruz.  About this time Usenet, ftp, and email started to be augmented by the Web.  I was primarily interested in machine learning, software engineering, and user interfaces rather than hypertext, though, so I ended up writing a thesis on the use of UI usability analysis in software engineering.

Subsequently, I worked for a startup that es…

First Post

The immediate purpose of this blog is to publish thoughts about web technologies, particularly Atom. Of course that suffers from the recursive blogging-about-blogging syndrome, so I'll probably expand it to talk about software in general.

What does the name stand for?  Mostly, it stands for "something not currently indexed by Google".  Hopefully in a little while it will be the only thing you get when you type "Abstractioneer" into Google. Actually it's a contraction of "Abstract Engineering", which is a meme I'm hoping to propagate.  More on that later.