Note: This article assumes some introductory Haskell knowledge.
Just as algebra is fundamental to the whole of mathematics, algebraic data types (ADTs) are fundamental to many common functional programming languages. They’re the primitives upon which all of our richer data structures are built, including everything from sets, maps, and queues, to bloom filters and neural networks.
Algebraic data types and mathematical algebra have some similar looking operations. In fact, it’s common in type theory to use the algebraic notation to define algebraic data types, rather than the Haskell-style data type declarations:
| Haskell type | Algebraic notation | Notes |
|---|---|---|
| `data Void` | 0 | It’s impossible to construct a term of this type (needs the `EmptyDataDecls` extension) |
| `data Unit = Unit` | 1 | The type with just one term |
| `data Bool = True \| False` | 1 + 1 | Also known as 2 |
| `data Maybe a = Nothing \| Just a` | 1 + a | |
| `data Either a b = Left a \| Right b` | a + b | Read “+” as “Either” |
| `data (a, b) = (a, b)` | a × b | Read “×” as “And” |
| `a -> b` | bᵃ | More on this in a moment |
In this essay, we’ll explore this coincidence and what that means for us as programmers. We’ll follow the path all the way to calculus.
As a first foray into the equivalence of algebraic data types and the algebra most of us are more familiar with, we’ll start with simple arithmetic. Specifically, we’ll start by counting the number of inhabitants (values) of a given type. Looking back at the examples:
| Type | Inhabitants | Notes |
|---|---|---|
| `Void` | 0 | There are 0 ways to construct a `Void` |
| `Unit` | 1 | Only `Unit` itself |
| `Bool` | 2 | `True` or `False` |
| `Maybe a` | 1 + a | `Nothing`, plus one `Just` per inhabitant of `a` |
| `Either a b` | a + b | Every inhabitant of `a`, and every inhabitant of `b` |
| `(a, b)` | a × b | Every combination of an `a` and a `b` |
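We can check these counts mechanically in GHC. Here’s a minimal sketch using `Enum` and `Bounded` to enumerate inhabitants; `Tri` matches the three-valued type used in the appendix, and `inhabitants` is my own helper name, not from the article.

```haskell
-- Tri has exactly three inhabitants; deriving Enum and Bounded lets us list them.
data Tri = Zelda | Link | Ganon deriving (Show, Eq, Enum, Bounded)

-- All inhabitants of any type with Enum and Bounded instances.
inhabitants :: (Enum a, Bounded a) => [a]
inhabitants = [minBound .. maxBound]

main :: IO ()
main = do
  print (length (inhabitants :: [Bool]))  -- Bool has 2 inhabitants
  print (length (inhabitants :: [Tri]))   -- Tri has 3
  -- Products multiply: (Bool, Tri) has 2 * 3 = 6 inhabitants.
  print (length [ (b, t) | b <- inhabitants :: [Bool], t <- inhabitants :: [Tri] ])
```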
Counting inhabitants is interesting and useful, but the connection runs much deeper! We can use it to gain insight into the nature of our datatypes. Let’s look at an example.
```
data Choice a = LeftChoice a | RightChoice a
```

i.e. `a + a`. We can manipulate this using the familiar rules of algebra to the equivalent `2 × a`, or

```
type Choice' a = (Bool, a)
```
The manipulation was done using a rule we learned in grade school, before we’d ever heard of algebraic data types. What’s more, the rules of algebra were established centuries before algebraic data types were even invented. Yet the result makes sense: our `Choice` data type tags each `a` as either a “left `a`” or a “right `a`,” which is clearly equivalent to `Choice'`’s use of a boolean tag. Isn’t it remarkable that algebraic manipulation agrees with our intuition?
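The equivalence can be witnessed concretely by a pair of inverse functions. This is a sketch; the function names are my own, not from the text.

```haskell
data Choice a = LeftChoice a | RightChoice a deriving (Show, Eq)

type Choice' a = (Bool, a)

-- Map the constructor tag to a boolean...
toChoice' :: Choice a -> Choice' a
toChoice' (LeftChoice x)  = (False, x)
toChoice' (RightChoice x) = (True, x)

-- ...and back. Composing the two in either order is the identity,
-- which is exactly what "equivalent types" means.
fromChoice' :: Choice' a -> Choice a
fromChoice' (False, x) = LeftChoice x
fromChoice' (True, x)  = RightChoice x
```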
Exercise for the reader: Explain why `Bool -> a` is not equivalent to `Either a a`, but is equivalent to `(a, a)`, both algebraically and intuitively.
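As a hint for half of the exercise, here is a sketch of the two directions of the `Bool -> a` and `(a, a)` equivalence (the algebraic reading is a² both ways); the “why not `Either a a`” part is left to you.

```haskell
-- A function out of Bool is determined by its two return values.
toPair :: (Bool -> a) -> (a, a)
toPair f = (f False, f True)

-- Conversely, a pair determines a function out of Bool.
fromPair :: (a, a) -> (Bool -> a)
fromPair (x, _) False = x
fromPair (_, y) True  = y
```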
Discovering I could do this felt very much like learning algebra for the first time – elegant, powerful, exciting. It raises the question: what else can we do? How might we use this new tool? What new powers does it give us? Where does it fail?
We’re going to play fast and loose with notation for a moment. Bear with me.
Let’s start with the familiar
List data type.
```
data List a = Nil | Cons a (List a)

-- or, with Haskell's special syntax
data [a] = [] | a : [a]
```
We can repeatedly expand the definition by expressing it with algebraic notation:

```
L = 1 + a × L
  = 1 + a × (1 + a × L)
  = 1 + a + a² × L
  = 1 + a + a² + a³ + ⋯
```

Remarkably, this makes sense. The last line tells us that a list is either empty, or `[a]`, or `[a, a]`, or `[a, a, a]`, and so on.
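Each aⁿ term of the expansion counts the lists of length n, and we can verify that directly: `replicateM n xs` enumerates every length-n list over `xs`, so there are |xs|ⁿ of them. (The helper name below is mine.)

```haskell
import Control.Monad (replicateM)

-- Count the lists of a fixed length over a given alphabet.
countListsOfLength :: Int -> [a] -> Int
countListsOfLength n xs = length (replicateM n xs)
```

For example, over a 3-element alphabet there are 3² = 9 lists of length 2, matching the a² term.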
Neat trick, but let’s push our luck. Solving `L = 1 + a × L` for `L` gives `L = 1 / (1 − a)`. The Taylor series for `1 / (1 − a)` is `1 + a + a² + a³ + ⋯`, which we can use to bypass the manual expansion.
Let’s pause for a moment to remember that we’re dealing with types. The expression `1 / (1 - a)` contains both a negative and a fractional type, neither of which has a meaning yet. Let’s consider what they could possibly mean. Algebra tells us that `(1 - a) + a = 1`. So this data type combines either a `1 - a` or an `a` to get just `1`. What? The fractional type is just as unintelligible: given a `1 / a` and an `a`, we have… `1`. In either case it seems we get out less than we put in! Some research has been done on deciphering meaning from this mess, but I can’t use negative and fractional types without adopting the additional language semantics they propose. One final note: though we took some morally objectionable steps, using types we have no justification for, we came out fine on the other end.
For fun, let’s try our luck with binary trees.
```
data BinaryTree a = Leaf a | Branch (BinaryTree a) (BinaryTree a)
```
Expansion by hand:

```
T = a + T²
  = a + (a + T²)²
  = ⋯
  = a + a² + 2a³ + 5a⁴ + 14a⁵ + ⋯
```

(By the way, those coefficients are the Catalan numbers.)
This describes a binary tree as being either a single leaf, the tree with two leaves, one of the two trees with three leaves, or one of the five trees with four leaves, and so on. Finally a reward for our diligence! We can answer the question of how many binary trees there are with five leaves (14), or with eleven (16796)!
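Those counts can be checked by brute-force enumeration: every tree with n leaves splits its leaves between the two branches in all possible ways. This is a sketch of my own, not code from the article.

```haskell
data BinaryTree a = Leaf a | Branch (BinaryTree a) (BinaryTree a)

-- Enumerate every binary tree with exactly n leaves (each leaf holding ()).
treesWithLeaves :: Int -> [BinaryTree ()]
treesWithLeaves n
  | n <= 0    = []
  | n == 1    = [Leaf ()]
  | otherwise = [ Branch l r
                | k <- [1 .. n - 1]        -- leaves going to the left branch
                , l <- treesWithLeaves k
                , r <- treesWithLeaves (n - k) ]
```

`length (treesWithLeaves n)` reproduces the Catalan coefficients from the expansion: 1, 1, 2, 5, 14, …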
As you might expect, an alternate definition of binary trees which can be empty produces almost exactly the same result. Try it for yourself!
```
data BinaryTree a = Leaf | Branch a (BinaryTree a) (BinaryTree a)
```
Unfortunately, the `Nat` data structure shows that there are issues we haven’t yet considered.³
```
data Nat = Zero | Succ Nat
```
In algebraic notation, `Nat = 1 + Nat`. This equation is clearly inconsistent, since x = 1 + x is false for all (finite) values of x. Why does this go wrong? Is it valid to just expand to `Nat = 1 + 1 + 1 + 1 + ...` (which is right), even though we can’t isolate `Nat` algebraically (and thus solve for it with a Taylor series)?
Poking holes in things
It’s almost time for calculus on data types, and I want to get there just as desperately as you, dear reader, but we’ll require some motivation first. And so, it’s time for a brief interlude to think about poking holes in things. Yes, holes.
What do I mean by a hole? We want to find the places in our data structure where data could go, and remove it.
How many holes of type `a` can we find in a tuple?
We can make a few observations:
- Data types with no `a`s (constants) have no holes of type `a`.
- The number of holes in a sum is the sum of the number of holes in each side.
- The number of holes in a product is, for each position in the product, the number of ways to make a hole at that position (with the other positions left intact), summed over all positions.
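The product rule can be made concrete: here is a sketch of a function (the name `holes` is mine) that finds every way to poke one `a`-hole in a list, returning the removed element together with its context. A product of n `a`s has exactly n holes.

```haskell
-- Every way to remove one element: the element plus what's left around it.
holes :: [a] -> [(a, [a])]
holes []       = []
holes (x : xs) = (x, xs) : [ (y, x : ys) | (y, ys) <- holes xs ]
```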
And with that we’ve finally made it. Ready for the big reveal? The rules we just saw are mysteriously similar to the familiar rules of differentiation. Differentiation tells us how to poke holes in data types. Read “∂ₐ f” as “the ways to poke a hole of type `a` in `f`”:
| Rule | |
|---|---|
| ∂ₐ c = 0 | There are no holes in a constant |
| ∂ₐ (f + g) = ∂ₐ f + ∂ₐ g | Sums are straightforward |
| ∂ₐ (f × g) = ∂ₐ f × g + f × ∂ₐ g | As are products |
| ∂ₐ (f ∘ g) = (∂f ∘ g) × ∂ₐ g | The chain rule still applies for composition |
Let’s take a moment to marvel at the fact that differentiation, a tool developed by Newton and Leibniz over 300 years ago for physics, has made an unexpected appearance in the most discrete of settings. The correspondence was noticed in 2001 by Conor McBride, in “The Derivative of a Regular Type is its Type of One-Hole Contexts.”
The zipper is a way to “focus” on one element of a data structure, in order to edit it efficiently. The name is meant to evoke “going up and down in the structure … analogous to closing and opening a zipper in a piece of clothing.”
I’ll demonstrate with a list.
```
: -> : -> : -> : -> []
|    |    |    |
1    2    3    4
```
Let’s focus on the 3.

```
[] <- : <- : <- FOCUS -> : -> []
      |    |      |      |
      1    2      3      4
```
We can move left:

```
[] <- : <- FOCUS -> : -> : -> []
      |      |      |    |
      1      2      3    4
```
Or right:

```
[] <- : <- : <- : <- FOCUS -> []
      |    |    |      |
      1    2    3      4
```
With each operation taking constant time.
What did we just accomplish? Normally editing an arbitrary list element is O(n), but each zipper operation is O(1).
Relating this back to holes, we see that our zipper was constructed by poking a hole in the list.
As a reminder, lists look like L = 1 / (1 − a). What results from poking out the `a`?

```
∂ₐ L = ∂ₐ (1 / (1 − a)) = 1 / (1 − a)² = L²
```

Looking back at the diagrams above, we can see that L² is exactly the type of our focused structure, with one list pointing left and one pointing right.
What just happened? Differentiating a data structure gave us the type of its context, which can be used as a zipper. In fact, differentiation can be used to build zippers for arbitrary ADTs – trees, maps, etc.
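A minimal list zipper along these lines looks like the following sketch: two lists (the left one stored in reverse) plus the focused element. The names are my own.

```haskell
-- L² × a: left context (reversed), focus, right context.
data Zipper a = Zipper [a] a [a] deriving (Show, Eq)

-- Enter a non-empty list, focused on its first element.
fromList :: [a] -> Maybe (Zipper a)
fromList []       = Nothing
fromList (x : xs) = Just (Zipper [] x xs)

-- Moving the focus shuffles one element between the two contexts: O(1).
left, right :: Zipper a -> Maybe (Zipper a)
left  (Zipper (l : ls) x rs) = Just (Zipper ls l (x : rs))
left  _                      = Nothing
right (Zipper ls x (r : rs)) = Just (Zipper (x : ls) r rs)
right _                      = Nothing

-- Editing at the focus needs no traversal: also O(1).
modify :: (a -> a) -> Zipper a -> Zipper a
modify f (Zipper ls x rs) = Zipper ls (f x) rs
```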
Can we extend the analogy to integration? Why does differentiation make a surprise appearance here? Unfortunately, I can’t answer either question.
I frankly have no clue how to interpret integration. As for why the rules for differentiation and hole-poking are the same, the best I can do is to point out that our goal is nearly the same in either case. In differentiation, the goal is to see how a function varies as one of its variables varies. In hole-poking, the goal is to see how a data structure changes as we examine one type variable. In either case, we need to see how our data changes as we vary the variable, in each place it occurs.
Algebraic data types are aptly named, for they bear a remarkably strong resemblance to the algebra we’re familiar with. This allows us to count the inhabitants of a type and manipulate data types between equivalent forms. Further, we can extend the analogy as far as calculus, where differentiation helps us build zippers for any data type.
I called our data type manipulations a “coincidence,” but the connection actually rests on solid mathematical ground. Our tricks can be justified by the fact that ADTs form a semiring.
We explored ADTs thoroughly, but only had time to scratch the surface of zippers. The rabbit hole is very deep, but I can send you straight to the bottom. Rather than taking the data structure view, Oleg Kiselyov advocates for zippers as the delimited continuation of a traversal, and has gone so far as to create a zipper filesystem, ZipperFS, and zippers with several holes. This is just one example of the large body of work on zippers.
Here are all the possible functions matching both types, showing what each input is mapped to.

`Tri -> Bool` (2³ = 8 functions):

| # | Zelda | Link | Ganon |
|---|---|---|---|
| 1 | True | True | True |
| 2 | True | True | False |
| 3 | True | False | True |
| 4 | True | False | False |
| 5 | False | True | True |
| 6 | False | True | False |
| 7 | False | False | True |
| 8 | False | False | False |

`Bool -> Tri` (3² = 9 functions):

| # | True | False |
|---|---|---|
| 1 | Zelda | Zelda |
| 2 | Zelda | Link |
| 3 | Zelda | Ganon |
| 4 | Link | Zelda |
| 5 | Link | Link |
| 6 | Link | Ganon |
| 7 | Ganon | Zelda |
| 8 | Ganon | Link |
| 9 | Ganon | Ganon |
This formulation of the natural numbers is known as Peano Arithmetic. It would be frighteningly slow to compute with, but it’s almost always used for formal mathematics, where simplicity trumps efficiency. ↩
- Aki Yamada
- special thanks to Zach Gotsch