How to come up with good abstractions

The effective exploitation of his powers of abstraction must be regarded as one of the most vital activities of a competent programmer. -Dijkstra

Abstraction is something that every programmer uses, but few appreciate. In fact, there’s an architecture antipattern that discusses this. You can agree or disagree with the idea that only 1 in 5 programmers understand abstraction, but in my experience few developers truly appreciate abstraction or make good use of it. This is somewhat sad. While abstraction comes naturally to few programmers, it is something that every programmer can learn with some effort.

So with that said, here are my (sometimes conflicting) guidelines for coming up with good abstractions.

  1. Trust your gut. Abstraction is more of an art than a science. If there were an objective, scientific method of coming up with good abstractions, programming would be a lot easier. Therefore, you need to forget everything your professors told you about not trusting your intuition. The goal of any abstraction is to make something easier to understand. Guess which abstractions are easiest to understand? The ones that are the most intuitive. Of these guidelines, this is the most important. When you’re in a situation where two guidelines conflict, go with your first instinct.
  2. All abstractions leak. It’s worth reading Joel Spolsky’s piece on leaky abstractions. This is an important thing to keep in mind. The most beautiful abstractions are the ones that are successful in spite of this. Unix is full of good examples. For instance, is anyone really naive enough to believe that everything can be treated exactly the same as an honest-to-god file that resides on the hard disk? Sure, everything has the same interface as a file. But in all but the most trivial of cases, you’re going to need to know what the underlying thing you’re interfacing with is. For instance, with a socket, you may need to worry about whether the socket is half-closed, totally closed, or open. If it’s a pipe, you need to realize that something on the other end might close the connection. And yet in spite of the leakiness, the “everything is a file” abstraction is probably the most important thing Unix ever did.
  3. Don’t trust black boxes. Whether it’s your abstraction or someone else’s, there’s a real danger of thinking of something as a magical black box that doesn’t need to be understood. This is dangerous because of guideline #2. When something doesn’t work right, you want to be prepared.
  4. …but don’t be too transparent. Of course, there’s a danger of going too far in the opposite direction. A completely transparent abstraction is useless; it’s not really an abstraction. A good abstraction should be translucent. You shouldn’t have to know all of the details all the time, but you should be able to figure them out easily.
  5. Elegance before comprehensiveness. Since we’ve already established in #2 that no abstraction works in all circumstances, you should abandon the idea of an all-encompassing abstraction. It’s better to have a small set of abstractions upon which higher-level abstractions can be built than to try and come up with one all-encompassing layer of abstraction.
  6. Abstraction doesn’t make your code simpler. Many people’s complaint against abstraction is that it makes your code more complex. In fact, they’re often correct. When you have the choice between abstraction and simplicity, always choose simplicity. Unfortunately, simplicity isn’t always an option. Think of it this way: if you become sick, would you rather take medication that will cure the disease or treat the symptoms? Obviously, we’d rather make the disease go away if at all possible. But not all diseases are curable. In that case, the only thing you can do is treat the symptoms. This is essentially what abstractions do. They don’t make complex code simple. They do make complex code easier to wrap your head around, though.
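To make guideline #2 a bit more concrete, here is a minimal sketch of the “everything is a file” abstraction leaking. It assumes a POSIX-like system and a standard CPython interpreter (which ignores SIGPIPE at startup, so the failure surfaces as an exception). The names try_write and demo are made up for this illustration. The same write call works on both a regular file and a pipe, but only the pipe can fail because the other end went away.

```python
import os
import tempfile


def try_write(fd: int, payload: bytes) -> str:
    """Write through the shared file-descriptor interface and report the outcome."""
    try:
        os.write(fd, payload)
        return "ok"
    except BrokenPipeError:
        # The leak: this failure mode only exists for pipes and sockets.
        return "broken pipe"


def demo() -> dict:
    """Run the same write against a regular file and a pipe with no reader."""
    results = {}

    # Case 1: an honest-to-god file on disk.
    fd, path = tempfile.mkstemp()
    try:
        results["regular file"] = try_write(fd, b"hello")
    finally:
        os.close(fd)
        os.remove(path)

    # Case 2: a pipe whose read end has already been closed.
    read_end, write_end = os.pipe()
    os.close(read_end)
    try:
        results["pipe with closed reader"] = try_write(write_end, b"hello")
    finally:
        os.close(write_end)

    return results


if __name__ == "__main__":
    for target, outcome in demo().items():
        print(f"{target}: {outcome}")
```

The interface is identical in both cases, which is what makes the abstraction so useful. But the caller still has to know which kind of thing sits behind the descriptor to handle the failure sensibly, which is the translucency that guidelines #3 and #4 ask for.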

I can come up with nifty guidelines like this all day. But in the end, it’s most important to remember guideline #1: trust your gut. Chances are that you’ll be better off than if you followed a checklist of guidelines.

What are some good abstractions that follow these guidelines? What are some that break them?

The Rumsfeld Hazard

There are known knowns. These are things we know that we know. There are known unknowns. That is to say, there are things that we now know we don’t know. But there are also unknown unknowns. These are things we do not know we don’t know. —Donald Rumsfeld

(I should note something: although this blog post is named after a political figure, I generally prefer to avoid political references here. So don’t construe this as a political message.)

I seem to recall Rumsfeld being very much criticized for the above quote. Now, I’m not saying that the above quote was the appropriate response to the question he was asked. However, there is some truth to what he said. Too often, we think in terms of knowns and unknowns. We treat planning a software project as if it were an obstacle course: overcome this and this obstacle and you’re done!

If this were the case, software development would be a much easier practice. It’s not so much that comparing software development to an obstacle course isn’t an apt analogy. But the obstacle course is more insidious than we think it is. Almost always, the obstacle course looks easy. The problem is that it’s built upon a minefield and you weren’t told about it. Unless you get really lucky, you’re in for a huge surprise.

And don’t think that you can beat this by bringing a mine detector through the second time around. By then, they’ll have replaced the mines with pits of spikes or some other twisted but hidden trap. A software developer has the unenviable job of traversing this obstacle course.

A lot of us will carefully map our plans out, only to have a single mine invalidate the entire plan. That mine is the Rumsfeld Hazard: a problem you didn’t know about, couldn’t have predicted, and thus couldn’t plan for. If there’s one thing that agile development methods help with, it’s dealing with the Rumsfeld Hazard. Sure, you won’t be any less surprised when the mine goes off, but you won’t have spent so much time making plans that are shattered by one unseen event. It’s a beautiful thing when a software team can roll with the punches this way and still plan far enough in advance to have a general direction.