Are you an Engineer or a Scientist?

I just got done reading an excellent article by Roman Snitko. I agree almost 100% with everything he says. However, I felt that the idea could have been developed and executed better. I’m not writing this because I feel that his original piece sucked. Rather, I’m writing this because I like the idea he came up with and want to see it have an impact. So here’s my shot.

(I should say that some of what follows might disagree with Roman. I did say almost 100%.)

The engineer

People tend to associate “engineers” with people who get things done and solve real-world problems. Thus, I feel that Roman’s “dudes” can be thought of as the engineers of the programming community. Engineers are the ones who want to make something and get it into people’s hands as quickly as possible. It just so happens that that product is a piece of software.

You can recognize an engineer because they’re usually saying things like “So what if it’s not perfect? It works!” or “That sounds like a great idea, but I don’t think we have time for it.”

The engineer’s biggest weakness is that they tend not to think much past their current deadlines. Yes, it’s important to meet those deadlines. But you do plan on writing software after meeting those deadlines, don’t you? To make up for this, they must call upon the expertise of the scientist.

The scientist

The interesting thing is that almost every field of engineering has a corresponding field of science. Scientists are people who find better ways of doing things without necessarily considering their practicality. It’s fashionable to talk about how disconnected these types of people are from reality, but the truth of the matter is that engineers couldn’t do the things they do without scientists. Scientists are the ones who want to build great software. It just so happens that that software is also a product people will use.

You can recognize scientists because they’re usually saying things like “So what if it works? It’s an ugly hack!” or “Software schedules should reflect what needs to get done, not the other way around.”

You might have already guessed the scientist’s weakness: they tend to not get along with deadlines very well. To make up for this, they must call upon the abilities of an engineer who can transform their great ideas into reality.

So who’s right?

This way of looking at things is a bit of an oversimplification though. It’s really not that engineers don’t want to make a great piece of software or that scientists don’t want to make a great product. It’s really a matter of priority. Scientists would rather write a great piece of software than a product. Engineers would rather make a product than a great piece of software.

Unfortunately, both sides are a bit disconnected from reality. In the software world, a great product usually is a great piece of software. It’s also unfortunate that someone who can make both things happen is extremely rare if not nonexistent.

Thus, it is ideal for these two sides to be in a state of healthy conflict. They need to recognize that the other side has a point sometimes (as hard as that can be).

So which one are you? And how do you work with the other side?

The Rise of SYDNI, or YAGNI is Only About Problems, Not Solutions

I’ve got a new programming methodology to propose. I call it SYDNI
(Sometimes You Do Need It). It is a response to the problems that I
see with YAGNI. In fairness, I don’t dislike YAGNI. In fact, I agree
with it 100% (well, maybe 95%). But to truly appreciate it, you need
a bit of context.

On YAGNI

I’ve started thinking of YAGNI almost as a recursive way of
thinking. That is to say, I’ve begun to think of YAGNI as
something that uses itself to implement itself. Allow me to explain.

What is YAGNI?

YAGNI stands for “you ain’t gonna need it.” I don’t want to make this post
an in-depth discussion of what YAGNI actually is, so click the
Wikipedia link if you aren’t familiar with YAGNI. The important thing
to take away from reading about YAGNI is that it’s saying that you
shouldn’t implement functionality if you don’t need it.

What YAGNI ISN’T

YAGNI sounds like a pretty straightforward way of thinking. And in a
lot of ways it is. But it’s more nuanced than one may think at first.
The “recursive” element of YAGNI that I speak of above is that YAGNI
(in my opinion) is a very specific solution to a very specific
problem, and that problem is over-engineering.

And YAGNI does its job well (especially in the context of Test Driven
Development). I tend to find myself throwing out a lot less code when
using YAGNI.

A lot of people take YAGNI to mean that the simplest solution is
always the best. That isn’t the case. Or at a very minimum, that
shouldn’t be the case. There’s a key thing about simplicity
that should be understood: it’s defined by the problem, not the
solution. This is key to understanding why YAGNI is so useful. Once
you’ve gotten to the point of choosing a solution, YAGNI is no help to
you. Thus, you have to use YAGNI to choose problems, not solutions.

You’re not in school anymore

In school, things are always so simple. You’re assigned a problem.
And you’re given a grade based on how well you solved that problem.
The real world is more complex.

You see, people too often forget that software developers don’t just
devise solutions to problems; they also choose which problems get
solved. After all, aren’t all feature requests nothing more than
statements of a problem? And isn’t choosing software features a
decision about what problems you will solve?

However, once you’ve chosen a problem to solve, there’s still the
issue of how to solve it.

Sometimes You Do Need It

In solving a problem, YAGNI’s usefulness starts to fade. It does have
some importance. You do have to make sure your solution is solving
the problem you set out to solve. However, beyond that, YAGNI just
doesn’t apply. In fact, it is likely harmful. That’s where SYDNI
comes in. Although SYDNI’s name is something of a jab at YAGNI, the
principle itself isn’t. Instead, SYDNI can be thought of as a
complement to YAGNI. A yin to YAGNI’s yang (alliteration for the
win).

Oftentimes, thoughts may enter your head that start with something
like “we’ll never need…” or “this will never have to…”. This kind
of thinking is helpful when choosing the problem to solve. However,
it’s destructive when choosing a solution. In a couple of years,
there is only one thing that will be certain about the software you’re
writing: it will be different. And it will be different in ways you
couldn’t have predicted or imagined. If you’re using YAGNI
appropriately, you’re choosing the easiest problems to solve.
However, at least a few of these problems come out of left field.

Therefore, I would put SYDNI this way: ideally, a piece of software
will be no more simple or complex than the problem it is trying to
solve. There is danger not only in solutions that are overly
complicated, but also in solutions that are overly simple.

This leads to another conclusion: if SYDNI is followed appropriately,
the complexity of your source code is a direct measure of how complex
the problems it is solving are. The reverse is true as well. The
complexity of the problems you’re solving is a direct measure of how
complex your source code will be.
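
To make the danger of the overly simple concrete, here’s a small
sketch (the scenario is invented purely for illustration): a
hand-rolled parser that solves a simpler problem than the one
actually at hand.

import csv

# Overly simple: this solves "fields never contain commas",
# which may be an easier problem than the one you actually have.
def parse_simple(line):
    return line.split(',')

# Matched to the real problem: fields may be quoted and contain
# the delimiter.
def parse_field_aware(line):
    return next(csv.reader([line]))

print(parse_simple('"Smith, John",42'))       # ['"Smith', ' John"', '42']
print(parse_field_aware('"Smith, John",42'))  # ['Smith, John', '42']

The simple version isn’t wrong because it’s simple; it’s wrong
because it’s simpler than the problem it was asked to solve.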

But I don’t live in an ideal world!

The key weakness in SYDNI is the word “ideally”. Unfortunately, some
problems just don’t have perfectly matched solutions. Therefore, a
key decision to be made is whether it is better to err on the side of
over-engineering or under-engineering a solution. We are now delving
into the realm of many disputes between programmers. Many people
(mis)educated on the arts of YAGNI will say that it is always better
to tend towards under-engineering. If this were true, YAGNI wouldn’t
be as useful to as many people as it has been.

Even more unfortunately, there is no “one size fits all” answer of
whether it is better to over-engineer or under-engineer. It is highly
situational and care must be taken to arrive at the appropriate
solution. If you don’t believe me, consider the following two
questions:

  1. Which life support machine would you rather be hooked up to?
    • A machine whose software developers always did the simplest thing possible
    • A machine whose software developers went out of their way to anticipate possible problems and planned for each of them
  2. Which one-page web app do you feel would be easiest to maintain?
    • An application that is implemented as two or three source files and a few database tables
    • An application with a highly normalized database, highly modular source, and great flexibility

I should hope that the answer to number 1 is obvious. And why it is
the correct answer should also be obvious: if you missed a particular
contingency, people can die. Thus, it makes sense to err on the side
of over-engineering.

But number 2 is a little bit less obvious (and maybe more debatable).
However, I would err on the side of under-engineering. After all, no
matter what changes come up, a one-page web app is still a one-page
web app. The worst case is that the app would be rewritten from
scratch. That’s not to say that you should throw caution to the
wind and ignore normal good practice. Rather, it’s saying that it’s
not really a good idea to stress much over how maintainable that
application is.

Therefore, when deciding on a solution, there are two things that need
to be decided upon beforehand:

1. How complex the problem is.
2. Whether under-engineering is more harmful than over-engineering.

Once you get those two things squared away, it should be easy to get
an idea of how complex the solution should be.

The devil’s in the details

There are two schools of thought in the programming world:

  1. Explicit is better than implicit (configuration over convention).
  2. A developer should only have to program the unconventional aspects of a program (convention over configuration).

We’ll call #1 the Python school and #2 the Ruby school. In fact, I
would argue that this is an issue that’s at the core of whether code
is considered “Pythonic” or “Rubyic” (I doubt the last one is a word).

So which school of thought is right? I personally think they both
are. It doesn’t really take a whole lot to demonstrate that the
Python school of thought isn’t always right. Think about it. Did you
know that the Python runtime has a component (the garbage collector)
that goes around deleting objects from memory totally implicitly?
How unpythonic is that?

The Ruby school of thought takes a bit more work though. After all,
if it’s unconventional, why should you have to configure it? Of
course, the problem here is in defining “conventional”. What’s
conventional to me is likely unconventional to others. And what’s
conventional to others could be unconventional to me.
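
For what it’s worth, the trade-off itself is easy to sketch in a few
lines of Python. This hypothetical mini-ORM is purely illustrative;
all of the names are made up:

class Post(object):
    pass                      # conventional: maps to table "posts"

class Person(object):
    table_name = 'people'     # irregular plural: configured explicitly

def table_for(model_class):
    # Derive the table name by convention, but honor explicit
    # configuration when the convention doesn't fit.
    explicit = getattr(model_class, 'table_name', None)
    if explicit is not None:
        return explicit
    return model_class.__name__.lower() + 's'

print(table_for(Post))      # posts
print(table_for(Person))    # people

The Ruby school makes the common case silent; the Python school would
rather every class spell out its table name. Both are defensible, and
the hard part is deciding what counts as “conventional”.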

I wish I had more advice on how to reconcile these two schools of
thought. The truth is that I struggle with them daily. But I think
having an intuition about this is the dividing line between
“experienced programmer” and “newb”. After all, if programming were
merely about “make everything explicit” or “make everything implicit”,
any idiot could do it.

I think this is also the core skill for writing readable code. You
need to determine what details are relevant to each piece of code.
Whatever the case, you need to make a conscious decision as to what
details shine through and what details you obscure, because if these
choices happen by accident, they’re almost guaranteed to be wrong.

My five rules for writing good code

I just came across a blog post that outlines 5 rules for writing good code.  I agree with them for the most part.  But this is an extremely subjective topic, and opinions vary from person to person.  Therefore, I’d like to write up my own rules for writing good code.

Keep it simple

This is the YAGNI rule.
There are often times when we want to try to solve problems we don’t have.  You must resist this urge.  It’s far easier to make simple code more complex than the other way around.  This is usually more of a challenge than it looks.  It’s a sign of good code that you constantly find yourself saying “any idiot could have put this together”.  The reality is that idiots only write simple code when the problem is easy or when they get lucky.

…but not simplistic

You can call this the SYDNI rule.
Albert Einstein said it best: “Everything should be made as simple as possible, but not simpler.”  I’ve seen too many “simple” hacks that ended up causing more maintenance trouble than writing something more complex would have.  As good a thing as simplicity is, you need to be realistic.  Don’t try to make a simple solution match a complex problem.

Abstraction is your friend

So what do you do in those situations where you need to use a complex solution?  Do you give up all hope and just write some horrible piece of crap?  No, you find a way to make that complexity easier to manage.  This is where abstraction comes in handy.
For instance, Lisp introduced the concept that “it’s all data”. This makes it easier to understand things.  If you want to know what something is, you already know that it’s data of some kind.  What is a function?  Data.  What is a list?  Data.  What is code?  Data.  This has a profound effect upon your ability to understand things.
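
Python isn’t Lisp, but you can get a rough feel for the idea because Python functions are themselves ordinary data. A small illustration of my own (not from any particular codebase):

def celsius_to_fahrenheit(c):
    return c * 9.0 / 5 + 32

def fahrenheit_to_celsius(f):
    return (f - 32) * 5.0 / 9

# Functions are data: store them in a dict and look them up like values.
conversions = {
    ('C', 'F'): celsius_to_fahrenheit,
    ('F', 'C'): fahrenheit_to_celsius,
}

print(conversions[('C', 'F')](100))   # 212.0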

Follow guidelines

Jeff Atwood would call this following the instructions on the paint can.  We tend to sneer at the idea of “best practices”, but the reality is that they’re necessary.  There are a lot of problems out there that people have already dealt with and solved.  Why waste time repeating others’ mistakes instead of learning from them?

…but don’t worship them

As the old cliché tells us, rules were made to be broken.  Some of the best-known design patterns break the guidelines and have some of the worst code smells.  In fact, sometimes the guidelines conflict with each other.  Thus, don’t blindly follow guidelines without knowing their purpose.  Instead, understand the rules, know when to follow them, and recognize when following them is the greater of two evils.

How to come up with good abstractions

The effective exploitation of his powers of abstraction must be regarded as one of the most vital activities of a competent programmer. -Dijkstra

Abstraction is something that every programmer uses, but few appreciate. In fact, there’s an architecture antipattern that discusses this. You can agree or disagree with the idea that 1 in 5 programmers understand abstraction, but it is my experience that few developers truly appreciate abstraction and can make good use of it. This is somewhat sad. While abstraction comes naturally to few programmers, it is something that every programmer can learn with some amount of effort.

So with that said, here are my (sometimes conflicting) guidelines for coming up with good abstractions.

  1. Trust your gut. Abstraction is more of an art than a science. If there were an objective, scientific method of coming up with good abstractions, programming would be a lot easier. Therefore, you need to forget everything your professors told you about not trusting your intuition. The goal of any abstraction is to make something easier to understand. Guess what abstractions are easiest to understand? The ones that are the most intuitive. Of these guidelines, this is the most important. When you’re in a situation where two guidelines conflict, go with your first instinct.
  2. All abstractions leak. It’s worth reading Joel’s piece on leaky abstractions. This is an important thing to keep in mind. The most beautiful abstractions are the ones that are successful in spite of this. Unix is full of good examples. For instance, is anyone really naive enough to believe that everything can be treated exactly the same as an honest-to-god file that resides on the hard disk? Sure, everything has the same interface as a file. But in all but the most trivial of cases, you’re going to need to know what the underlying thing you’re interfacing with is. For instance, with a socket, you may need to worry about whether the socket is half-closed, totally closed, or open. If it’s a pipe, you need to realize that something on the other end might close the connection. And yet in spite of the leakiness, the “everything is a file” abstraction is probably the most important thing Unix ever did.
  3. Don’t trust black boxes. Whether it’s your abstraction or someone else’s, there’s a real danger of thinking of something as a magical black box that doesn’t need to be understood. This is dangerous because of guideline #2. When something doesn’t work right, you want to be prepared.
  4. …but don’t be too transparent. Of course, there’s a danger of going too far in the opposite direction. A completely transparent abstraction is useless; it’s not really an abstraction. A good abstraction should be translucent: you shouldn’t have to know all of the details all the time, but you should be able to figure them out easily (see the sketch after this list).
  5. Elegance before comprehensiveness. Since we’ve already established in #2 that no abstraction works in all circumstances, you should abandon the idea of an all-encompassing abstraction. It’s better to have a small set of abstractions upon which higher-level abstractions can be built than to try and come up with one all-encompassing layer of abstraction.
  6. Abstraction doesn’t make your code simpler. Many people’s complaint against abstraction is that it makes your code more complex. In fact, they’re often correct. When you have the choice between abstraction and simplicity, always choose simplicity. Unfortunately, simplicity isn’t always an option. Think of it this way: if you become sick, would you rather take medication that will cure the disease or treat the symptoms? Obviously, we’d rather make the disease go away if at all possible. But not all diseases are curable. In this case, the only thing you can do is treat the symptoms. This is essentially what abstractions do. They don’t make complex code simple. They do make complex code easier to wrap your head around though.
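
To make guideline #4 concrete, here is a minimal sketch of a translucent abstraction. The class and the retry scenario are invented for illustration and aren’t from any particular library:

import socket

class RetryingConnection(object):
    # Hides reconnect-on-failure details, but deliberately leaves the
    # underlying socket reachable for when the abstraction leaks.
    def __init__(self, host, port):
        self.host, self.port = host, port
        self.raw_socket = socket.create_connection((host, port))

    def send(self, data):
        # The common case: callers never have to think about reconnecting.
        try:
            self.raw_socket.sendall(data)
        except socket.error:
            self._reconnect()
            self.raw_socket.sendall(data)

    def _reconnect(self):
        self.raw_socket = socket.create_connection((self.host, self.port))

Callers normally just use send(), but raw_socket is deliberately left reachable so the details can be dug out when something goes wrong: translucent, neither opaque nor transparent.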

I can come up with nifty guidelines like this all day. But in the end, it’s most important to remember rule #1: trust your gut. Chances are that you’ll be better off than if you followed a checklist of guidelines.

What are some good abstractions that follow these guidelines? What are some that break them?

Pimp my Interactive Interpreter

A couple of things that bother me about Python’s interactive interpreter:

  • Having to import commonly used modules like sys.
  • Not having history stored across interpreter sessions.
  • No tab-completion.

Of course, things like IPython and bpython help, but I generally prefer just a plain old Python interactive interpreter session. Plus, the above three problems are easy to solve without installing any extra packages, but the way to solve them is documented in somewhat obscure locations. The solution?

First, create a file somewhere with the following text (I save mine to ~/.pystartup):

import atexit
import os
import readline
import rlcompleter  # registers readline's Python-aware tab-completer
import sys          # imported here so it's always available in your session

# Keep history across sessions in ~/.pyhist
histfile = os.path.join(os.environ["HOME"], ".pyhist")
try:
    readline.read_history_file(histfile)
except IOError:
    pass

# Enable tab-completion
readline.parse_and_bind('tab: complete')

# Write the history file when the interpreter exits
atexit.register(readline.write_history_file, histfile)

# Tidy up the namespace; sys is deliberately left in place
del os, histfile

Then, you just need to add a line to your .bashrc, .zshrc, or whatever else your shell uses:

export PYTHONSTARTUP=~/.pystartup

…and voilà! Your interactive interpreter has just been pimped.

If you’re on Windows, I’m afraid I have bad news: this probably won’t work for you without Cygwin (as you will need readline).

Choosing sides

Because you don’t need to own the universe. Just see it.  To have the privilege of seeing the whole of time and space… that’s ownership enough. -The Tenth Doctor

Very few quotes have summarized the fundamental differences between the left brain and the right brain as succinctly as the above quote. And yet, even though the difference can be summarized pretty nicely in a few sentences, there are a lot of misconceptions about the differences between the left brain and the right brain. The thing that makes these misconceptions so hard to break is that they’re just like most other forms of pop psychology: just accurate enough to be tricky.

Google would be jealous

Before I can get into those misconceptions (and what reality is), there’s something you need to understand about the brain. We tend to think of the brain as a biological microprocessor, but that’s not quite accurate. You see, a microprocessor has pre-defined subsystems with very well-defined tasks. The brain is much more sophisticated than that. A better analogy is to think of the brain as a biological cluster of computers (or probably more accurately, a biological cloud). Although you do have subclusters that handle specific segments of functionality, there’s no reason why they have to be dedicated to that task.

As it turns out, the brain is amazingly adaptable. In fact, your brain can and does spread tasks out to parts that we don’t generally associate with handling those tasks.

If it sounds like your brain is chaotic, it’s because it is. The different parts can even work at odds with each other sometimes. To deal with this, your ego tends to favor certain parts of the brain more than others. Generally, you favor a specific hemisphere over the other. In fact, you even favor a specific portion of a specific hemisphere. But that’s the subject of another blog post.

The left brain is logical, the right brain is creative

As it turns out, the information I gave in the last section only became clear to neurologists within the last couple of decades. Up until then, they thought the brain had a strict division of labor: the left side handled logic while the right side handled creativity. If you wanted to be more creative, you needed to develop the right brain; if you wanted to be more logical, the left. As a right-brained thinker, I’m living proof that the right brain is every bit as capable of logic as the left brain. It just tends to be a different kind of logic.

As it turns out, the two hemispheres can do mostly the same things. In fact, what we refer to as the “left brain” might not physically even be the left hemisphere (if you’re a lefty, there’s about a 70% chance of this!). The point is, there isn’t any reason why the left brain can’t do the things the right brain does and vice versa. However, these two hemispheres do tend to “specialize”. See, the brain is capable of rewiring itself. This is part of the process of learning. As one hemisphere does one task, it rewires itself so it can do that task better in the future. Thus, you do see differences emerge between the two hemispheres. These differences are developmental rather than biological though.

So what IS the difference?

If you’ve ever taken a personality test, you probably got back a four-letter code ending in J (for judging) or P (for perceiving). Most of the ideas that these tests are based on come from C.G. Jung. Jung said that judgers like to plan ahead of time while perceivers dislike having their future mapped out for them. The technology didn’t exist at the time for Jung to realize this, but he was essentially describing the difference between the left brain and the right brain (albeit with a somewhat primitive understanding).

Let’s go back to the Doctor Who quote I presented at the beginning of this post. In this scene, the Doctor (a perceiver) was talking to the Master (a judger). You see, the Master had a goal: complete domination of the universe. The Doctor never really has any explicit goals that he sets ahead of time. He sort of just bumbles around the universe until he runs into trouble. This illustrates the major differences between the left and right brain, but there are still others.

The forest or the trees?

Left-brained thinkers think very linearly. They tend to divide the topics up into individual parts and consider them one by one before considering the big picture. They may be seen as “not being able to see the forest for the trees”, but that isn’t strictly true. It’s more that they just haven’t moved from considering the parts to considering the whole.

Right-brainers might be seen as being “big-picture” oriented, but this isn’t necessarily true either. Distinguishing parts from the whole is a distinctly left-brained way of thinking. The right brain is probably better thought of as being “whole-picture” oriented. While right-brained people might start from the standpoint of the overall picture, you’ll notice that there are certain details that they just won’t let go of.

Experience vs instruction

Left-brained thinkers tend to do very well in school, as our educational model is very much aligned with how they think. The model of listening to a lecture and reading an assignment fits their mode of thinking well. Left-brainers don’t really need experience to go about their tasks, but they do need instruction. They can follow instructions very well, but they struggle when instructions are poorly defined or when the situation changes in ways those instructions didn’t anticipate.

Right-brained thinkers do need experience. Until they’ve actually tried something once or twice, they have no frame of reference no matter how much instruction they get. In fact, they will usually feel as though instructions are boxing them in and limiting their options. However, (much to the chagrin of their left-brained counterparts) they are good at improvisation. They’re good at reacting quickly when plans become inadequate.

Conclusion

There’s a lot more to be said on this subject, but hopefully this should be a good outline of the differences between the left brain and the right brain.

How Celery, Carrot, and your messaging stack work

If you’re just starting with Celery right now, you’re probably a bit confused. That’s not because celery is doing anything wrong. In fact, celery does a very good job of abstracting out the lower-level stuff so you can focus just on writing tasks. You don’t need to know very much about how any of the messaging systems you’re using work. However, to truly understand celery, you need to know a bit about how it uses messaging and where it fits in your technology stack. This is my attempt to teach you enough about the subject to make everything work.

Messaging

At the very bottom of celery’s technology stack is your messaging system, or Message Oriented Middleware in enterprise-speak. As of this writing, there are a couple of standards out there in this market:

  • AMQP – A binary protocol that focuses on performance and features.
  • STOMP – A text-based format that focuses on simplicity and ease of use.

Of course, there are a lot more players out there than just this. But these are the two protocols that are the most important to celery.

Now, a protocol is totally useless without software that actually implements it. In the case of AMQP, the most popular implementation seems to be RabbitMQ. The most popular STOMP implementation seems to be Apache ActiveMQ.

Carrot

A good analogy that I think most people can wrap their heads around is the SQL database. STOMP and AMQP are like SQL, while RabbitMQ and ActiveMQ are like Oracle and SQL Server. Anyone who has had to write software that works with more than one type of database knows how challenging this can be. Sure, it’s easy to issue SQL commands directly when you just support one type of database, but what happens when you need to support another? One possible solution is to use an ORM. By abstracting out the lower-level stuff, you make your code more portable.

The first thing most ORMs do is provide an abstraction for writing SQL queries. For instance, if I want to write a LIMIT-style query (fetch only the first ten rows) for SQL Server, which has no LIMIT clause, I would do something like this:

SELECT TOP (10) x FROM some_table

Oracle’s query would look something like this:

SELECT x FROM some_table WHERE ROWNUM <= 10

These are different queries, but they are both doing the same basic thing. That’s why SQLAlchemy allows you to write the query like this:

select([some_table.c.x], limit=10)

This is the functionality that carrot provides. Although most messaging systems are fundamentally different in a lot of ways, there are certain operations that every platform has some version of. For example, sending a message in STOMP would look like this:

SEND
destination:/queue/a

hello
^@

AMQP’s version is binary, but would look something like this in text format:

basic.publish "hello" some_exchange a

Since we don’t want to worry too much about these protocols at a low level, carrot provides a Publisher class with a “send” method.
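
In code, that means publishing a message looks the same no matter which broker is underneath. Here is a sketch based on carrot’s documented API (the exchange and routing key names are just examples):

from carrot.connection import BrokerConnection
from carrot.messaging import Publisher

conn = BrokerConnection(hostname='localhost', userid='guest',
                        password='guest', virtual_host='/')

# The same publish call works whether the backend speaks AMQP or STOMP.
publisher = Publisher(connection=conn, exchange='some_exchange',
                      routing_key='a')
publisher.send('hello')
publisher.close()
conn.close()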

Celery

Carrot makes it so that we can forget about a lot of the lower-level stuff, but it doesn’t save us from the fact that we’re still working with a messaging protocol (albeit a higher-level one). Going back to the ORM analogy, we can see the same thing happening: we need a layer of abstraction to make dealing with different implementations of SQL easier, but we don’t want to write SQL. We want to write Python (or whatever your language of choice is).

Thus, ORMs will add another layer of abstraction. Wouldn’t it be nice if we could just treat a database row as a Python object? Or, in the case of task execution, wouldn’t it be nice if we could just treat a task as a Python function? This is where celery comes in. See, we could run tasks like this:

  1. Process A wants to run task “foo.bar”
  2. Process A puts a message in queue saying “run foo.bar”
  3. Process B sees this message and starts on it
  4. When done, Process B replies to Process A with the status.
  5. Process A acknowledges this message and uses the return result.

Rather than having to code all the details of the messaging process, celery allows us to just create a Python function “foo.bar” that will do the above for us. Thus, we can execute tasks asynchronously without requiring that people reading our code know everything about our messaging backend.
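
In code, those five steps collapse into an ordinary-looking function definition and call. This is a sketch for a celery version contemporary with carrot; imagine it living in a module named foo, giving the task the name foo.bar:

from celery.decorators import task

@task()
def bar(x, y):
    return x + y   # this body runs in the worker (Process B)

# Process A: delay() returns immediately; the work happens via the queue.
result = bar.delay(2, 3)
print(result.get())   # blocks until the worker replies with 5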

Hopefully, this gives you a high-level overview of how celery is working behind the scenes. There are a lot of details that I’ve left out, but hopefully this provides you with enough knowledge that you can figure the rest out.

How to Write: A Programmer’s Guide

I love writing. I think I have a talent for writing. But that doesn’t necessarily mean that what I’m writing is good. To become a good writer, you need to put in work and gain experience. Here are some things I’ve learned along the way:

Practice makes perfect – It’s indisputable that great athletes like Michael Jordan and Wayne Gretzky have a lot of talent. But that doesn’t mean they were able to dominate the pros on their first try. Similarly, you might have talent, but it takes practice to unlock that talent. If you don’t have talent, you can still become good at writing. It just takes more time and patience. Blogs are perfect for this. Rest assured of one thing: if your writing sucks, nobody will read your blog. That’s actually a good thing, because it means you can write all the stupid things you want and no one will read them until you get good at writing.

I know you’ve heard it a thousand times before. But it’s true – hard work pays off. If you want to be good, you have to practice, practice, practice. If you don’t love something, then don’t do it. -Ray Bradbury

Write because you want to – How many people have you met that “really needed to start blogging”? Don’t write because you think it’s good for your career or because programmers “need” to have a blog. I write about programming because I love writing and programming. If you love programming but not writing, then chances are your time is better spent writing code.

Love is easy, and I love writing. You can’t resist love. You get an idea, someone says something, and you’re in love. -Ray Bradbury

Just write it down – When you write something the first time, it will suck. Therefore, if it doesn’t suck at first, it hasn’t been written down.

Writing is no trouble: you just jot down ideas as they occur to you. The jotting is simplicity itself—it is the occurring which is difficult. -Stephen Leacock

Edit – In this sense, writing isn’t a whole lot different than coding. When you start coding something, the most important thing is just to start coding and not worry about it being perfect. But no code worth writing is perfect the first time around. You have to test it, break it, and make others read it to see if it’s clear. Writing for humans is no different. You must be as merciless in your writing as you are with your code.

I’m not a very good writer, but I’m an excellent rewriter -James Michener

Tail Recursion in Python using Pysistence

A topic that occasionally comes up in Python development is that of tail recursion. Many functional programmers want to see tail recursion elimination added to the Python language. According to Guido, that ain’t gonna happen. And to be fair, I agree. Tail recursion can be tricky not only for new programmers, but for old timers as well.

However, that doesn’t mean that we need to give up on the concept altogether. This is a problem that is hardly new. Functional languages have been implementing tail recursion in environments hostile to it for a while now using a trampoline approach. Let’s see how the newest version of pysistence implements that algorithm:

def trampoline(func, *args, **kwargs):
    """Calls func with the given arguments.  If func returns 
       another function, it will call that function and repeat
       this process until a non-callable is returned."""
    result = func(*args, **kwargs)
    while callable(result):
        result = result()
    return result
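
Before getting to pysistence’s lists, here’s the trampoline on a plain function (my own example, not from pysistence): a tail-recursive factorial that returns a thunk instead of recursing.

def fact(n, acc=1):
    if n <= 1:
        return acc
    # Return a thunk instead of recursing; the trampoline calls it.
    return lambda: fact(n - 1, acc * n)

>>> trampoline(fact, 10)
3628800

Note that the trampoline stops at the first non-callable result, so this trick only works when the final value isn’t itself a function.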

This makes it much easier to implement things functionally (using pysistence’s functional lists):

from functools import partial

import pysistence

def iter_to_plist(seq):
    seq = iter(seq)
    def inner(accum):
        try:
            # Cons the next element on and bounce back to the trampoline.
            return partial(inner, accum.cons(seq.next()))
        except StopIteration:
            return accum
    return inner(pysistence.make_list())

>>> trampoline(iter_to_plist, xrange(1000))
PList([999, 998, 997, 996, 995, 994, 993, 992, 991, 990, ...])

That was more work than writing this in a language that does tail recursion automagically. But it wasn’t too bad now, was it? Ultimately, I think this approach works for a few reasons:

  1. It’s explicit. The user is well aware of what’s happening because they’re returning a callable.
  2. It makes functional programming more natural. Instead of using true recursion and risking blowing the stack or converting this into a loop and continually reassigning to a variable, you can make the algorithm work without side effects.
  3. It lets Python stay imperative.

For me personally, this is a great set of arguments. I love Python and I love functional programming. The more functional programming I can do in Python, the better. But there are very few things in programming that are free, right? Here are some of the disadvantages:

  1. It’s ugly. I don’t deny this. But this alone is not an argument against it. If you are the kind of person who wants nothing but beautiful code, I’d argue that you’ve chosen the wrong language. Python tends to be beautiful when possible, but ugly when it has to be. Besides that, this is the best you’re going to get without modifying the language itself or adding macros.
  2. It isn’t very performant. Insert your favorite quote about not needing to be blazing fast, just good enough, here. Besides that, it can be optimized in C if need be.
  3. Who needs more ways to do functional programming? I’m not going to join any flamewars on this one. But I will point you to the advantages of functional programming in Python’s own functional programming HOWTO.