An important result in computer science and type theory is that a type system corresponds to a particular logic system.
How does this work? The basic idea is that of the Curry-Howard Correspondence. A type is interpreted as a proposition, and a value is interpreted as a proof of the proposition corresponding to its type. Most standard logical connectives can be derived from this idea: for example, the values of the pair type
(A, B) are pairs of values of types
B, meaning they’re pairs of proofs of
B, which means that
(A, B) represents the logical conjunction “A && B”. Similarly, logical disjunction (“A | | B”) corresponds to what’s called a “tagged union” type: a value (proof) of
Either A B is either a value (proof) of
A or a value (proof) of
This might be a lot to take in, so let’s take a few moments for concrete perspective.
String are propositions – you can think of simple types like these as just stating that “an Int exists” or “a String exists”.
1 is a proof of
"hands" is a proof of
(Int, String) is a simple tuple type, stating that “there exists an Int and there exists a String”.
(1, "hands") is a proof of
(Int, String). Finally, the
Either type is a bit more mysterious if you aren’t familiar with Haskell, but the type
Either a b can contain values of type
a tagged as the “left” side of an
Either or values of type
b tagged as the “right” side of an
Either Int String means “either there exists an Int or there exists a String”, and it can be proved by either
Left 1 or
Right "hands". The tags ensure that you don’t lose any information if the two types are the same:
Either Int Int can be proved by
Left 1 or
Right 1, which can be distinguished from each other by their tags.
It gets interesting when you think about implication. The type of functions that transform values (proofs) of type
A into values (proofs) of type
A -> B. Which means that a function is a proof that whenever there’s a proof of
A, you can get a proof of
B – “if A is true, then B is true.” So functions are proof of implication!
One thing is a bit strange, though. The most straightforward way to represent the proposition “not A” with the type
A -> ⊥, where
⊥ is what’s called an empty type – a type with no values. (This type’s symbol comes from mathematics. It is usually pronounced “bottom.”)
A -> Void can be read as “if we have an A, we can use it to make something that doesn’t exist” – it’s not a huge stretch to conclude from there that “we don’t have an A”, so this sounds like negation to me! But there’s a problem: in classical logic, negation is reversible. You can say “not (not A),” and classical logic says that this is the same as just saying “A.”
And it seems like classical logic is correct – if A isn’t false, it must be true, right? But if we use this
A -> Void definition of negation, our common-sense and classical-logic rule doesn’t hold. A value of type
(A -> Void) -> Void doesn’t actually give you a way to get a value of type
A. Double negation is not the same as a positive assertion! What does this mean?
Well, it signifies a paradigm shift. Type systems don’t correspond to classical logic, but to something called “constructive logic”. The paradigm shift is this: we move from thinking about whether things are actually true, to whether we can prove them, and more importantly we don’t consider existential quantifications to be proven except by an example.
We can’t do proofs by contradiction (“There must exist at least one blue object, because if there were none then we’d have a contradiction”). Instead, if we want to show the existence of an object with some property, we have to find such an object (“There must exist a blue object – in particular, this one”). As long as we keep in mind that our logic models provability rather than truth, all of the weird facts presented in this article should make sense.
Let’s think back to our “not (not A)” example. In constructive logic, a proof of
A -> Void, or “not A,” is a machine that takes an
A and does something impossible – it returns a proof of the proposition “false.” If you ever have an actual proof of false (such as if you ever have both an
A and an
A -> Void to feed it to), that is very bad: in most logics, you can use this impossible proof to prove literally anything else. All statements become equivalent, and they’re all both true and false. If you imagine the logic as its own little universe, where the rules of logic are the laws of physics, then you might say that a proof of false “destroys the world,” like a black hole imploding the entire universe down to a single point.
So, back to
not A. Most of the logics we’re interested in working with are “consistent”, which basically means that it’s impossible to destroy the world.
A -> Void means “If you give me an
A, I will destroy the world” – which means if we’re working in a consistent logic and we’ve proved
A -> Void, we know it’s impossible to prove
A -> Void seems to line up pretty well with what we’d expect
not A to mean.
Finally, back to double negation. If
not A is
a -> Void, then
not (not A)is
(A -> Void) -> Void, or “if you give me a machine that can take an
A and destroy the world, then I’ll destroy the world.” But none of this actually gives you an
A! Maybe, if you cared about some abstract untouchable notion of truth, you’d be certain that this notion applied to
A – but wearing our constructivist hats, we’re not interested in truth right now. We’re interested in
As, and we don’t have one.
The double negation weirdness comes from the fact that the Law of the Excluded Middle – the fact that for any proposition “A”, either “A” is true or “not A” is true – doesn’t hold in constructive logic. And again, this makes perfect sense. When you’re talking about abstract truth, it’s obvious that any proposition is either true or false, but when you’re dealing with provability – can you build, with no information whatsoever, a box that contains either a value of type
A or a machine that destroys the world if you feed it a value of type
A? I can’t either! A lot of the classical rules that seem obvious when you’re thinking about “truth”, no longer make sense when we’re talking about “provability” instead.
Haskell and Hindley-Milner
Hindley-Milner type systems – which I here use to refer broadly to type systems like those used by Haskell and OCaml – correspond to propositional logic.
Propositional logic is pretty boring. It’s basically boolean algebra, with a few variables thrown in.
Of course, we’re still dealing with constructive propositional logic, rather than classical. We still can’t construct
Either a (a -> Void). (Unless we cheat with
undefined; more on that below.)
In Haskell’s type system, a type like
Int corresponds to a proposition like “There exists an integer.” Pretty straightforward, right? The value
2 is obviously proof that there exists an integer – everything checks out.
“Propositional variables” like the “p, q, and r” you often find in discussions of logic are introduced with polymorphism. The Haskell type
a -> a, the type of the identity function, is the uninteresting logical theorem that any proposition implies itself. Something like
(a, (b, c)) -> ((a, b), c) represents the slightly more interesting theorem that logical conjunction is associative (in other words, that whenever
(A && B) && C is true,
A && (B && C) is true too, and vice versa).
Remember, though, that it’s not the types that prove anything – we can certainly have nonsensical types like
a -> b (“every proposition implies every other”) or indeed
Void (“false”). The type is the proposition itself – the proof is a value of that type, like
id x = x or
f (x, (y, z)) = ((x, y), z).
And even though we can’t prove the law of the excluded middle,
Either a (a -> Void), we can do something vaguely similar:
a -> (a -> Void) -> Void, which corresponds to “p implies not (not p)” and has the straightforward implementation
f counterexample machine = machine counterexample. If we have an
a and a machine that will destroy the world if given an
a, and we happen to want to destroy the world – well, we can just feed our
a into the machine!
We can also do some things that are more questionable, which most Haskellers (myself included) prefer to sweep under the rug. For example, the proposition
(a -> a) -> a represents the rather alarming logical proposition that “any statement that implies itself is true.” We’ve already proved that every statement implies itself, since the identity function (
id x = x) is a proof of
a -> a, so if we had a proof of
(a -> a) -> a, that would mean that every statement is true! So we’d like this
(a -> a) -> a proposition to be false in any reasonable logic.
Alas, it was not to be, because Haskell gives us the power of recursion, unrestricted. We can easily implement our type signature, which we might call “the type of circular arguments”, as
fix f = f (fix f). It’s called
fix because this is actually a sometimes-useful function for calculating fixpoints – the point where repeatedly applying a function to a value gives stops changing the result. But the fact that we can write this function means that Haskell’s type system corresponds to a logic that is unsound.
If that’s a little abstract for you, we can showcase the problem with a more straightforward example. The
Void type, defined simply by the empty data declaration
data Void, is supposed to be empty. The entire point of the
Void type is to represent “false”. But
Void isn’t actually empty – we can construct members of it!
explode :: Void; explode = explode is Haskell code that defines an infinite loop, similar to the python
def f(): return f(). However, even though it’s obviously silly, this definition is type-correct –
explode has type
Void, according to the type declaration we provided with
:: – and Haskell allows us to write arbitrary recursive definitions, so there’s no way to rule out things like this from the language. Any language that allows you to write an infinitely-recursive function will have this problem.
This is the reason that a lot of theoretical and research languages disallow general recursion: they want to correspond to a logic that doesn’t let you prove anything through circular reasoning. (For more information on this, the field is called total functional programming.)
Examining Haskell’s type system might have been interesting, but dependent type systems (whose major exemplars are Agda, Coq, and Idris) bring a whole new level of subject matter into the logic. In Haskell, types can’t talk about values, and the corresponding logic systems are reduced to a bunch of abstract “proposition” variables that don’t mean anything on their own. But in a dependent type system, types can talk about values, which means that the logic of the type system can explicitly describe the behavior of constructs in the programming language. For example, take the proposition “If a list’s length is greater than one, then that list has a head and a tail.” In logic, you might render this as “forall lists x, (length(x) > 1) implies there exists hd and tl such that x = hd :: tl”. (Here,
:: is the linked list construction operator.) This can be rendered as the type
(x : List a) -> length x > 1 -> (hd : a, tl : List a, x = hd :: tl). Once you get used to this, it reads pretty much the same way as the logical statement it came from!
(Side note: the single colon here is for type annotations, so
x : List a means that “
x has type
List a, in contrast to the double colon used in Haskell.)
Some notes about how the logic works: dependent types are first-order predicate logic. Function types represent universal quantification: the function type
(x : T) -> P x says “for all proofs x of T, there is a proof of P x”. If we go back to the machine analogy, it’s the type of machines that can take any
T and give you back the
P corresponding to the particular
T you gave it. That’s a bit abstract, so let’s consider a more specific example:
(x : Nat) -> Either (Even x) (Odd x)
Nat is the type of natural numbers;
Odd are functions that, given a natural number, return the proposition (or type) “that number is even” and “that number is odd”, respectively. So
Even 3 is a perfectly valid type with no inhabitants (you can’t prove that 3 is even because it isn’t). So
(x : Nat) -> Either (Even x) (Odd x) is the type of machines that, given any natural number, will determine whether that number is even or odd (and then give you the proof). As a proposition, it represents “for all natural numbers X, either X is even or X is odd” or “every natural number is either even or odd.”
Existential quantification is represented by pair types. The proposition “there exists some Thingy which has property P” is represented by the pair type
(x : Thingy, P x). This is pretty obvious if you think about it – if you want to (constructively) prove that a Thingy exists with a given property, you need to actually provide such a Thingy, and then you need to prove that it has the property. If you want to prove there exists an even number (
(n : Nat, Even n)), you need to give me an even number and then prove that it’s even.
Between existential and universal quantification and being able to actually refer to language constructs from inside the logic of the type system, this gives dependent type systems the ability to represent almost any fact you can imagine about how your program works – and have the typechecker demand that you prove your code works as you say. This is pretty exciting to me!
Exotic type systems
Despite being constructive rather than classical, the logic modeled by dependent type systems is actually pretty familiar. It still deals with the same notion of “proposition,” and it’s easy to confuse constructive logic’s notion of “provability” with classical logic’s notion of “truth.”
There are even more exotic type systems, though, and the logics they generate are strange and alien. A linear type system, for example, places constraints on the number of times a given value can be used. (This might be used for representing things like mutable arrays in functional programming languages: the “modify” operation would return a new array, as it does in traditional FP languages, but the old array is “consumed” and if you try to access it when it no longer exists, your program won’t typecheck.) This gives rise to a logic in which proofs can be “used up” – so that if you use them to reason about one thing, you can no longer use them to reason about another. Another way to think about it is that, just like constructive logic models provability instead of truth, linear logic models preservation of resources1 instead of truth.
Linear logic is just the beginning. There are also substructural type systems, which involve different kinds of constraints about when and where values can be used, and even type systems that involve fractional and negative types (if you look closely, a lot of type-operations turn out to be generalizations of numeric operations; fractional and negative types continue this trend), which involve logical constraints and the flow of information backward through time. (If this sounds a little far-fetched, it’s because I don’t understand fractional and negative types myself… yet.)
Why constructive logic
One final word before we conclude. Type-based logic forces us to use constructive logic rather than classical logic – which involves such strange contortions as denying the law of the excluded middle. Why would we want to do this?
Well, there are a few reasons. First, as I mentioned earlier, constructive logic isn’t actually that strange, as long as you think about it on its own terms. In classical logic, the law of the excluded middle is “For any statement P, either P is true or P is false”, which sounds obvious. But in constructive logic, it would be better to phrase the Law as “For any statement P, either I possess a proof of P or I possess a disproof of P”, which is obviously wrong – I’m not omniscient, and even if I was, we know that there are statements which are neither provable nor disprovable!2
But why would we want to talk about provability instead of truth? There are a couple reasons, most of them falling under the same theme: truth is useless for computation and proofs are not. Constructivism is a stricter standard than classical logic; a constructive proof contains much more information that can be analyzed while a non-constructive classical proof might leave you scratching your head as to what the proof can be used for. In computer science we’re particular fans of constructive logic, because an implementation of some specification amounts to a constructive proof that “there exists a program satisfying that specification”. Generally, if you have a constructive proof of something, that corresponds pretty well to our idea of what a “usable implementation” is, but if you have a proof that is classically valid but not constructively valid, that proof doesn’t tell you anything about how to get a “usable implementation.” So constructive proofs are a step closer to the practical, everyday notion of “can we do this” than classical proofs are.
It’s also worth noting that constructive logic is a better model for how knowledge works. A proposition might be true or not, but we never know that it’s true unless we have a proof. The law of the excluded middle makes sense from the point of view of an outside observer, overseeing the platonic Realm of Mathematics… but from the inside, where we can’t see the crystalline truth, it is a lot less useful to be told that a particular statement is “either true or false.”
Hoare, C. A. R., 1985. Communicating Sequential Processes. Prentice-Hall International. ISBN 0-13-153271-5 (via Wikipedia) ↩
Famous examples of this include the Axiom of Choice (if you don’t assume it) and the Continuum Hypothesis. Also, Goedel’s famous result gives a procedure for constructing such a statement for any sufficiently powerful proof system. ↩