Some cases where mutable data is better

Immutable data is often presented as some pinnacle of good programming, so much so that you should feel guilty whenever you don’t use it. There are many cases where it makes things worse, however, so it really just depends.

Let me give some examples and then put it all back together.

GUI widgets

A tree of GUI widgets, such as the DOM of a web browser, is usually a mutable object where you can add and remove children, install event handlers, or modify properties such as “visible” right in place. This is a highly mutable data structure, and in practice, it works quite well. Operations are cheap and easy to understand, and when you make a change in one place, it gets picked up by the whole system in a very reliable way. You cannot easily go back to prior states of the DOM, but you generally do not want to.

We should not feel bad about all of this. Imagine what an immutable DOM would be like.

  1. When an event handler is invoked, it accepts a DOM as input and returns a new DOM as output. The browser then installs the newly constructed DOM. This extra parameter is information-free given that there is just one instance of the DOM at any time, anyway.

  2. If an event handler wants to add a child node to a parent, it needs to start at the root of the DOM, because that is how you modify an immutable tree in the middle of it. This breaks encapsulation, though, because now the event handler is being handed far more information as an input than it should really be looking at.

  3. If an event handler wants to change one property of a node, for example to make it visible or invisible, it has to make a copy of the node and copy over the values of all the attributes it is not changing. This basic operation is therefore impractical without some kind of meta-programming support. Maybe each node has a withVisible(boolean) method, but how would you even write that method without some kind of support from the language? You need to write or generate code that changes one value and copies over the 20-odd or more other values that are unchanged.

Stateful GUIs are just the better way to do it.

The builder pattern for collections

Linked lists are wonderful data structures that everyone should practice with. I have used them very heavily in SML, Scheme, Emacs Lisp, and Scala, and I would recommend everyone learn to use them so that you know what is possible. It’s really something how the old “cons” instruction from John McCarthy just gives and gives and is still the most common basis of a persistent list type.

I think it is better for a daily collection type, though, to use the builder pattern rather than a cons list, and the builder pattern fundamentally involves state. With the builder pattern, you make a list by starting with a mutable array and adding things to the end of it. Once you are done constructing the list, you “build” it by converting it to an immutable array. The majority of collections work well in this way in daily programming: you build them in one part of your code and then use them as-is for the rest of the life of that collection. Google’s Guava collections work this way, and they are well worth experimenting with if you work in the Java ecosystem.

Here are some reasons it helps to have a buildable array rather than an always-persistent linked list:

  1. Arrays save you the mental burden of modifying your algorithm to work with linked lists. For example, a linked list algorithm often works best if you build the list in reverse and then do an O(n) reversal of the whole list at the end. With arrays, you don’t have to think about that kind of thing and therefore have more mental bandwidth for other problems.

  2. Arrays are more efficient on any machine with a memory hierarchy due to not needing to chase pointers and to suffer all the cache misses that that will cause. “Machines with a memory hierarchy” would include any desktop, laptop, phone, or server machine.

  3. The efficiency arguments are straightforward and bullet-proof. To contrast, linked lists, functional arrays, and the magical Vector type in Scala all have complex implementations that do not always perform well depending on the details.

Similar issues apply for sets and maps. It is a good default plan to build sets using a mutable structure and then convert them to a read-only form at the end.

  1. For tree structures, you can use a b-tree that has 5 or 10 children per node rather than the 2 children per node of a red-black tree. That’s more cache-friendly and efficient. However, this mainly only works well for a mutable b-tree. If you add to an immutable b-tree in this way, you have to copy all the child pointers that you didn’t modify, which eats into your performance advantage from the fewer pointer chases.

  2. For hash structures, it’s just plain more practical. Hash tables are straightforward when implemented on a mutable array but require some gymnastics to implement otherwise. With the builder pattern, you can build the set or map using a mutable array and then freeze it at the end.

Similar things also apply for strings. It is unfortunate that Ruby’s default string type is mutable. It’s much better to build a string with a StringBuilder and then convert it to a read-only form.

Conclusions

What do we make of all of this?

Despite these exceptions, I highly recommend that a software developer learn about persistent data structures and about working with immutable data in general. Unless you do that, you are programming with one hand tied behind your back.

When you face each new problem, however, feel free to consider both mutable and immutable data structures, based on the characteristics of the problem. From the top of my head, I’d suggest the following guidelines.

  1. Try to write functions so that they accept immutable parameters and produce an immutable output. It’s a good default thing to try.

  2. For collections, start with the builder pattern.

  3. If your software has a large tree with many attributes on the nodes, consider a mutable tree. I gave the example of the JavaScript DOM, but another example would be a compiler AST.