Jason in a Nutshell

Jason in a Nutshell

Jason Baker  //  Just another random geek. Visit my homepage at http://jason-baker.com

Follow me on twitter: http://twitter.com/jasonsupdates

Filed under

functional-programming

See all posts on posterous with this tag ยป
Oct 15 / 5:14pm

Persistence is a subset of immutability

Whenever I tell someone I'm working on a persistent data structure 
library for Python, I almost always get the same response: 
"Persistent means that something is stored across process boundaries. 
I think you mean to say immutable data structures."  In fairness, this 
response is understandable.  But it's not quite right.  I'm going to 
address both of the points in the above statement. 


Persistent means that something is stored across process boundaries


Did you that letter can mean a note you send through the mail or a thing you use to make up words?  Did you know that programming can mean writing computer programs or mathematical optimization?  I'm sure if I sat down long enough, I could think of more examples.  The point is that the language we speak is full of ambiguities and whatnot. 

Before we consider the computer science meaning of persistence, let's 
look at the common meaning.  If we look at the word persistent in 
Merriam-Webster, the first two definitions we run into are: 

1 : existing for a long or longer than usual time or continuously: as 
a : retained beyond the usual period  b : continuing without change in 
function or structure 

The meaning Python programmers tend to think of is the definition in 
part a.  That is, a persistent data structure is stored beyond the 
usual period (the process's lifespan).  However, when I talk about a 
persistent data structure, I tend to mean the definition in part b. 

Going with definition b, we can see that lists for example aren't persistent: 

 
>>> a = [1,2,3] 
>>> b = a 
>>> b.append(4) 
>>> a 
[1, 2, 3, 4] 
 

Tuples, however, are a different matter: 

 
>>> a = (1,2,3) 
>>> b = a + (4,) 
>>> b 
(1, 2, 3, 4) 
>>> a 
(1, 2, 3) 
 

Ah, but here's where the second part of this post comes in.

I think you mean to say immutable data structures

In the case of the tuple, you are correct to say that a tuple is 
immutable.  And I'm correct in saying that a tuple is persistent. 
Just like I'd be right to say that I own a car while someone else 
claims that I own an automobile.  However in both cases, they can't 
quite be used interchangeably.  After all, an automobile isn't 
necessarily a car.  It can be a truck or a van or an SUV. 

In the same sense, a persistent data structure is effectively 
immutable.  But an immutable data structure isn't necessarily 
persistent.  Let's consider the definition of both of these words from 
wikipedia: 

Immutable object


In object-oriented and functional programming, an immutable object is 

an object whose state cannot be modified after it is created. 


Persistent data structure


In computing, a persistent data structure is a data structure which 

always preserves the previous version of itself when it is modified; 

such data structures are effectively immutable, as their operations

do not (visibly) update the structure in-place, but instead always

yield a new updated structure. 


These definitions have a lot of common ground, but there is some 
difference between them.  Let's consider a data structure that is 
immutable but not persistent.  Prepare yourself.  This will be 
complex.  Ready? 

 
>>> 1 
1 
 

Has your mind been blown yet? 

In this sense, the number 1 is definitely immutable.  You can't change 
it.  It's always the number 1.  However, it's not persistent.  How do 
you update the number 1?  No matter what you do, it will always be the 
number 1.  Sure, you can get 3 by adding it to 2.  But in a 
mathematical sense, that's not really changing the number 1. 

In this sense, the number 1 is atomic.  It simply doesn't make sense 
to modify it.  Heck, you can't even copy it: 

 
>>> from copy import copy 
>>> a = 1 
>>> a is copy(a) 
True 

Conclusion

With this semantic difference in mind, remember that it's just that: 
a semantic difference.  Call them Bob if you want to.  However, bear 
in mind that when someone uses the word persistence in this way, 
they're not being inaccurate.

 

Loading mentions Retweet
Filed under  //  functional-programming   persistence   programming   python  

Comments (0)