The Essential Hardness of Programming

Software engineering's preoccupation is the arrangement of bits, as opposed to atoms. One of the properties of bit arrangements is that their marginal manufacturing cost is zero; once you have an arrangement of bits, you can make as many exact copies of that arrangement as necessary, whenever and wherever they're needed. By contrast, an arrangement of atoms such as a bridge has a large marginal manufacturing cost, even if you just want an exact copy. Further, there are few physical limits to bits, while there are sharp physical limits to atoms. The only real limits to bit arrangement are the human brain and economics (how badly people want bits arranged in particular ways).

These are the fundamental reasons why nearly every software engineering project is attempting new design, and is thus hard. In the world of software, design equals bit arrangements, and copying a prior bit arrangement has zero cost. Finding an appropriate bit arrangement used to have substantial cost, but that cost is falling towards zero too. So for a given project, you can assume that competent software engineers have mostly found and copied the relevant patterns of bits where possible, and the remaining work is design.

Think about what the statement above means. This isn't like a civil engineer dealing with slightly differing terrain or traffic loads when adapting an existing design for a new bridge; it's more like a civil engineer being asked to build a bridge out of Jello on Pluto. And the next time, to build atop a moving lava flow on Mercury. In other words, with the easy, mechanical adaptations being taken care of by those ubiquitous bit patterns, the problems that are left for people to work out are the hard, surprising, novel ones. Usually with nonlinear And in software, design really is everything; once you've taken design to a detailed enough level that the implementation is mechanical... we let the machines do it.

Which is why I winced when I read Scott Rosenberg's interview in Salon. He gets it exactly right when he notes that there's always something new in every software project, otherwise there'd be no point in doing it. But he goes off the rails when he says, "...programmers are programmers because they like to code -- given a choice between learning someone else's code and just sitting down and writing their own, they will always do the latter." Jonathan Rentzsch has already skewered this statement better than I could. It is of course true that there are some people who just aren't good at finding prior solutions, or at understanding them once found, and they may contribute to unnecessary re-creation of software, increasing both cost and risk to larger projects. But they're not the norm, and aren't a major cause of the "always something new" phenomenon. The essence of software development is new design.

This is also why attempts to map manufacturing-based activities to software development are at best rough approximations and at worst dangerous distractions. Software development is a knowledge acquisition activity, not a manufacturing activity.
