On Magic

We discovered an interesting IE6 feature when we pushed out caching changes earlier this week for Journals.  For posterity, here are the technical details.  Our R8 code started using "conditional GET" caching, meaning that we supported both If-Modified-Since: and If-None-Match: HTTP headers.  The way this works is that, if a client has a version of a page in its cache, it can send one or both of these headers to our servers.  Like this:
If-Modified-Since: Tue, 26 Sep 2006 21:47:18 GMT
If-None-Match: "1159307238000-ow:c=2303"
If-None-Match, which passes an "entity tag" or ETag, is better to use and was designed to replace the If-Modified-Since header.   (If-Modified-Since has granularity only down to a second, and can't  be used to indicate non-time-based changes.)  In our case we actually have two versions of our pages which can be served up, one for viewers and another one for owners.  We really only want to cache the viewers' page.

When our server sees a request like the one above, it first does a quick check (in this case it'll ignore the If-Modified-Since and use the ETag) to see if the client already has the latest version; if it does, it returns a 304 Not Modified result.  The big win is that this can be done very very quickly and efficiently, while building a 200KB web page takes lots of work.  If the client doesn't have the right version, though, the server returns a 200 and sends new headers, like these:
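The decision described above can be sketched in a few lines. This is an illustrative reconstruction, not our actual server code; the handler, `Page` values, and header strings are stand-ins taken from the examples in this post:

```python
# Sketch of the conditional-GET check: prefer the ETag, fall back to
# If-Modified-Since, and return 304 when the client's copy is current.
ETAG = '"1159307238000-c=2303"'
LAST_MODIFIED = "Tue, 26 Sep 2006 21:47:18 GMT"

def handle_get(headers):
    """Return (status, response_headers, body) for a cached page."""
    # If the client sent an ETag, it wins; If-Modified-Since is ignored.
    client_etag = headers.get("If-None-Match")
    if client_etag is not None:
        if client_etag == ETAG:
            return 304, {}, b""          # cheap: no page build needed
    elif headers.get("If-Modified-Since") == LAST_MODIFIED:
        return 304, {}, b""
    # Stale or missing copy: do the expensive render, send fresh validators.
    body = b"<html>...the full 200KB page...</html>"
    return 200, {"ETag": ETAG, "Last-Modified": LAST_MODIFIED}, body
```

Note that the owner's request in the example above (`-ow:c` ETag) fails the ETag comparison, so it falls through to the full 200 response.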

Last-Modified: Tue, 26 Sep 2006 21:47:18 GMT
Etag: "1159307238000-c=2303"

If you're obsessive with details you might notice that the modification date is the same as before, but the ETag has changed (the -ow:c has changed to a -c).  When the second request was made, it sent cookies that told the server that the user was the owner of the blog.  So the page is different and therefore the ETag is different, but the last date modified is the same.  We're expecting browsers and caches to detect the change and refresh the page.
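Reading the ETag values above, the pattern looks like a millisecond timestamp, an optional owner marker, and a change counter. Here's a guess at how such a tag might be built; the exact scheme is inferred from the two example values, not taken from our source:

```python
# Hypothetical ETag constructor matching the values shown in this post:
# <modified-ms>-[ow:]c=<change-count>, quoted as HTTP ETags must be.
def make_etag(modified_ms, change_count, is_owner):
    role = "ow:" if is_owner else ""
    return '"%d-%sc=%d"' % (modified_ms, role, change_count)
```

The point is that the owner flag changes the ETag but not the modification time, which is exactly the case IE6 mishandles.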

This all works fine... except for IE6 (and the AOL client, which uses IE6 under the hood).  IE6 seems to see the Last-Modified: timestamp above and simply stop, ignoring the Etag: header and the fact that we're returning a 200 response with new content.  I've sat and watched the data flow in and out of my Internet connection and verified that IE just drops the 60K or so of content on the floor, as well as the new ETag, and re-uses its old version.  The only way to prevent it is to force a reload using ctrl-Reload, or clearing your Temporary Internet Files.

What this means is that if you change "who you are" by logging in or out, and nothing else changes, you will get a stale, cached version of your own blog's page.  Which is certainly not good.

As of this morning, we're running with caching turned back on, but with a bug fix.  The bug fix is simple: don't send Last-Modified: headers.  So we only send back the Etag:
Etag: "1159307238000-c=2303"
Which forces IE6 to pay attention to it and fixes the problem.  IE7, by the way, works either way; go Microsoft!
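The fix in miniature: the same 200 response with one header dropped, so IE6 has only the ETag to validate against. Header values here are the illustrative ones from above:

```python
# Before the fix: both validators. After: ETag only, so IE6 can't latch
# onto an unchanged Last-Modified value and skip the new content.
ETAG = '"1159307238000-c=2303"'
LAST_MODIFIED = "Tue, 26 Sep 2006 21:47:18 GMT"

def headers_before():
    return {"Last-Modified": LAST_MODIFIED, "ETag": ETAG}

def headers_after():
    return {"ETag": ETAG}  # no Last-Modified
```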

This all means that we're not going to try to enable caching for non-ETag-aware clients and caches.  Since non-ETag-aware seems to pretty much equate to old or buggy, and not having caching is just a minor performance hit, this seems like a reasonable approach in theory.  The question now is: will practice accord with theory?  We really need people to hammer on it over the next few days and give us feedback.  See Stephanie's post, Like Magic, We're Back Where We Began..., and please leave us feedback!
