Category Archives: Good Math

Interpreting Lambda Calculus using Closed Cartesian Categories

Today I’m going to show you the basic idea behind the equivalency of closed cartesian categories and typed lambda calculus. I’ll do that by showing you how the λ-theory of any simply typed lambda calculus can be mapped onto a CCC.

First, let’s define the term “lambda theory”. In the simply typed lambda calculus, we always have a set of base types – the types of simple atomic values that can appear in lambda expressions. A lambda theory is a simply typed lambda calculus, plus a set of additional rules that define equivalences over the base types.

So, for example, if one of the base types of a lambda calculus was the natural numbers, the lambda theory would need to include rules to define equality over the natural numbers:

  1. x = y if x=0 and y=0; and
  2. x = y if x=s(x’) and y=s(y’) and x’ = y’

So. Suppose we have a lambda-theory L. We can construct a corresponding category C(L). The objects in C(L) are the types in L. The arrows in C(L) correspond to families of expressions in L; an arrow
f : A \rightarrow B corresponds to the set of expressions of type B that contain a single free variable of type A.

The semantics of the lambda-theory can be defined by a functor; in particular, a cartesian closed functor F that maps from C(L) to the closed cartesian category of Sets. (It’s worth noting that this is completely equivalent to the normal Kripke semantics for lambda calculus; but when you get into more complex lambda calculi, like Hindley-Milner variants, this categorical formulation is much simpler.)

We describe how we build the category for the lambda theory in terms of a CCC using something called an interpretation function. It’s really just a notation that allows us to describe the translation recursively. The interpretation function is written using brackets: [A] is the categorical interpretation of the type A from lambda calculus.

So, first, we define an object for each type in L. We need to include a special
type, which we call unit. The idea behind unit is that we need to be able to talk about “functions” that either don’t take any real paramaters, or functions that don’t return anything. Unit is a type which contains exactly one atomic value. Since there’s only one possible value for unit, and unit doesn’t have any extractable sub-values, conceptually, it doesn’t ever need to be passed around. So it’s a “value” that never needs to get passed – perfect for a content-free placeholder.

Anyway, here we go with the base rules:

  • \forall A \in \mbox{basetypes}(L), [A] = A_C \in C(L)
  • [\mbox{unit}] = 1_C

Next, we need to define the typing rules for complex types:

  • [ A \times B] == [A] \times [B]
  • [A \rightarrow B] = [B]^{[A]}

Now for the really interesting part. We need to look at type derivations – that is, the type inference rules of the lambda calculus – to show how to do the correspondences between more complicated expressions. Just like we did in lambda calculus, the type derivations are done with a context Gamma, containing a set of type judgements. Each type judgement assigns a type to a lambda term. There are two translation rules for contexts:

  • [ \emptyset ] = 1_C
  • [\Gamma, x: A] = [\Gamma] \times [A]

We also need to describe what to do with the values of the primitive types:

  • For each value v : A, there is an arrow v : 1 \rightarrow A_C.

And now the rest of the rules. Each of these is of the form [\Gamma :- x : A] = \mbox{arrow}, where we’re saying that \Gamma entails the type judgement x : A. What it means is the object corresponding to the type information covering a type inference for an expression corresponds to the arrow in C(L).

  • Unit evaluation:  [ \Gamma :- \mbox{unit}: \mbox{Unit}] = !: [\Gamma] \rightarrow [\mbox{Unit}]. (A unit expression is a special arrow “!” to the unit object.)
  • Simple Typed Expressions: [\Gamma :- a: A_C] = a \circ ! : [\Gamma] \rightarrow [A_C]. (A simple value expression is an arrow composing with ! to form an arrow from Γ to the type object of Cs type.)
  • Free Variables:  [\Gamma x: A :- x : A] = \pi_2 : ([\Gamma] \times [A]) \rightarrow [A] (A term which is a free variable of type A is an arrow from the product of Γ and the type object A to A; That is, an unknown value of type A is some arrow whose start point will be inferred by the continued interpretation of gamma, and which ends at A. So this is going to be an arrow from either unit or a parameter type to A – which is a statement that this expression evaluates to a value of type A.)
  • Inferred typed expressions: , where (If the type rules of Γ plus the judgement x : A gives us x, then the term x is an arrow starting from the product of the interpretation of the full type context with A), and ending at . This is almost the same as the previous rule: it says that this will evaluate to an arrow for an expression that results in type A.)
  • Function Abstraction: [\Gamma :- \lambda x:A . M : A \rightarrow B] = \mbox{curry}([\Gamma, x:A :- M:B]) : [\Gamma] \rightarrow B^{[A]}. (A function maps to an arrow from the type context to an exponential [B]^{[A]}, which is a function from A to B.)
  • Function application: , , . (function evaluation takes the eval arrow from the categorical exponent, and uses it to evaluate out the function.)

There are also two projection rules for the decomposing categorical products, but they’re basically more of the same, and this is already dense enough.

The intuition behind this is:

  • arrows between types are families of values. A particular value is a particular arrow from unit to a type object.
  • the categorical exponent in a CC is exactly the same thing as a function type in λ-calculus; and an arrow to an exponent is the same thing as a function value. Evaluating the function is using the categorical exponent’s eval arrow to “decompose” the exponent, and produce an arrow to the function’s result type; that arrow is the value that the function evaluates to.
  • And the semantics – called functorial semantics – maps from the objects in this category, C(L) to the category of Sets; function values to function arrows; type objects to sets; values to value objects and arrows. (For example, the natural number type would be an object in C(L), and the set of natural numbers in the sets category would be the target of the functor.)

Aside from the fact that this is actually a very clean way of defining the semantics of a not-so-simply typed lambda calculus, it’s also very useful in practice. There is a way of executing lambda calculus expressions as programs that is based on this, called the Categorical Abstract Machine. The best performing lambda-calculus based programming language (and my personal all-time-favorite programming language), Objective-CAML had its first implementation based on the CAM. (CAML stands for categorical abstract machine language.).

From this, you can see how the CCCs and λ-calculus are related. It turns out that that relation is not just cool, but downright useful. Concepts from category theory – like monads, pullbacks, and functors are really useful things in programming languages! In some later posts, I’ll talk a bit about that. My current favorite programming language, Scala, is one of the languages where there’s a very active stream of work in applying categorical ideas to real-world programming problems.

Programs As Proofs: Models and Types in the Lambda Calculus

Lambda calculus started off with the simple, untyped lambda calculus that we’ve been talking about so far. But one of the great open questions about lambda calculus was: was it sound? Did it have a valid model?

Church found that it was easy to produce some strange and non-sensical expressions using the simple lambda calculus. In order to try to work around those problems, and end up with a consistent system, Church introduced the concept of types, producing the simply typed lambda calculus. Once types hit the scene, things really went wild; the type systems for lambda calculi have never stopped developing: people are still finding new things to do by extending the LC type system today! Most lambda calculus based programming languages are based on the Hindley-Milner lambda calculus, which is a simplification of one of the standard sophisticated typed lambda calculi called SystemF. There’s even a Lambda Cube which can categorize the different type abstractions for lambda calculus (but alas, despite its name, it’s not related to the time cube.) Once people really started to understand types, they realized that the untyped lambda calculus was really just a pathologically simple instance of the simply typed lambda calculus: a typed LC with only one base type.

The semantics of lambda calculus are easiest to talk about in a typed version. For now, I’ll talk about the simplest typed LC, known as the simply typed lambda calculus. One of the really amazing things about this, which I’ll show, is that a simply typed lambda calculus is completely semantically equivalent to an intuitionistic propositional logic: each type in the program is a proposition in the logic; each β reduction corresponds to an inference step; and each complete function corresponds to a proof! Look below for how.

Types

The main thing that typed lambda calculus adds to the mix is a concept called base types. In a typed lambda calculus, you have some universe of atomic values which you can manipulate; those values are partitioned into the *base types*. Base types are usually named by single lower-case greek letters: So, for example, we could have a type “σ”, which consists of the set of natural numbers; a type “τ” which corresponds to boolean true/false values; and a type “γ” which corresponds to strings.

Once we have basic types, then we can talk about the type of a function. A function maps from a value of one type (the type of parameter) to a value of a second type (the type of the return value). For a function that takes a parameter of type “γ”, and returns a value of type “δ”, we write its type as “γ → δ”. “→” is called the _function type constructor_; it associates to the right, so “γ → δ → ε” parses as “γ → (δ → ε)”

To apply types to the lambda calculus, we do a couple of things. First, we need a syntax update so that we can include type information in lambda terms. And second, we need to add a set of rules to show what it means for a typed program to be valid.

The syntax part is easy. We add a “:” to the notation; the colon has an expression or variable binding on its left, and a type specification on its right. It asserts that whatever is on the left side of the colon has the type specified on the right side. A few examples:

\lambda x: \nu . x + 3
This asserts that the parameter, x has type nu, which we’ll use as the type name for the natural numbers. (In case it’s hard to tell, that’s a greek letter “nu” for natural.) There is no assertion of the type of the result of the function; but since we know that “+” is a function with type \nu  \rightarrow \nu \rightarrow \nu, we can infer that the result type of this function will be \nu.
(\lambda x . x + 3): \nu \rightarrow \nu
This is the same as the previous, but with the type declaration moved out, so that it asserts the type for the lambda expression as a whole. This time we can infer that x : \nu because the function type is \nu \rightarrow \nu, which means that the function parameter has type \nu.
\lambda x: \nu, y:\delta . \text{if}\hspace{1ex} y\hspace{1ex} \text{then}\hspace{1ex} x * x \hspace{1ex}\text{else} \hspace{1ex} x
This is a two parameter function; the first parameter has type ν, and the second has type δ. We can infer the return type, which is ν. So the type of the full function is ν → δ → ν. This may seem surprising at first; but remember that lambda calculus really works in terms of single parameter functions; multi-parameter functions are a shorthand for currying. So really, this function is: λ x : ν . (λ y : δ . if y then x * x else x); the inner lambda is type δ → ν; the outer lambda is type ν → (δ → ν).

To talk about whether a program is valid with respect to types (aka well-typed), we need to introduce a set of rules for type inference. Then we can verify that the program is type-consistent.

In type inference, we talked about judgements. When we can infer the type of an expression using an inference rule, we call that inference a type judgement. Type inference and judgements allow us to reason about types in a lambda expression; and if any part of an expression winds up with an inconsistent type judgement, then the expression is invalid. (When Church started doing typed LC, one of the motivations was to distinguish between values representing “atoms”, and values representing “predicates”; he was trying to avoid the Godel-esque paradoxes, by using types to ensure that predicates couldn’t operate on predicates.)

Type judgements are usually written in a sequent-based notation, which looks like a fraction where the numerator consists of statements that we know to be true; and the denominator is what we can infer from the numerator. In the numerator, we normally have statements using a context, which is a set of type judgements that we already know;it’s usually written as an uppercase greek letter. If a type context includes the judgement that x : \alpha, I’ll write that as \Gamma :- x : \alpha.

Rule 1: Type Identity

\frac{\mbox{}}{x : \alpha :- x: \alpha}

This is the simplest rule: if we have no other information except a declaration of the type of a variable, then we know that that variable has the type it was declared with.

Rule 2: Type Invariance

\frac{ \Gamma :- x:\alpha, x != y}{\Gamma + y:\beta :- x:\alpha}

This rule is a statement of non-interference. If we know that x:\alpha, then inferring a type judgement about any other term cannot change our type judgement for x.

Rule 3: Function Type Inference

\frac{\Gamma + x:\alpha :- y:\beta}{\Gamma :- (\lambda x:\alpha. y):\alpha \rightarrow \beta}

This statement allows us to infer function types given parameter types. Ff we know the type of the parameter to a function is \alpha; and if, with our knowledge of the parameter type, we know that the type of term that makes up the body of the function is \beta, then we know that the type of the function is \alpha \rightarrow \beta.

Rule 4: Function Application Inference

\frac{\Gamma :- x: \alpha \rightarrow \beta, \Gamma :- y:\alpha}{\Gamma :- (x y): \beta}

This one is easy: if we know that we have a function that takes a parameter of type \alpha and returns a value of type \beta, then if we apply that function to a value of type \alpha, we’ll get a value of type \beta.

These four rules are it. If we can take a lambda expression, and come up with a consistent set of type judgements for every term in the expression, then the expression is well-typed. If not, then the expression is invalid.

So let’s try taking a look at a simple lambda calculus expression, and seeing how inference works on it.

\lambda x y . y x

Without any type declarations or parameters, we don’t know the exact type. But we do know that “x” has some type; we’ll call that “α”; and we know that “y” is a function that will be applied with “x” as a parameter, so it’s got parameter type α, but its result type is unknown. So using type variables, we can say “x:α,y:α→β”. We can figure out what “α” and “β” are by looking at a complete expression. So, let’s work out the typing of it with x=”3″, and y=”λ a:ν.a*a”. We’ll assume that our type context already includes “*” as a function of type “ν→ν→ν”, where ν is the type of natural numbers.

  • “λ x y . y x) 3 (λ a:ν . a * a)”: Since 3 is a literal integer, we know its type: 3:ν.
  • By rule 4, we can infer that the type of the expression “a*a” where “a:ν” is “ν”, and *:ν→ν→ν so therefore, by rule 3 the lambda expression has type “ν→ν”. So with type labelling, our expression is now: “(λ x y . y x) (3:ν) (λ a:ν.(a*a):ν) : ν→ν”.
  • So – now, we know that the parameter “x” of the first lambda must be “ν”; and “y” must be “ν→ν”; so by rule 4, we know that the type of the application expression “y x” must be “ν”; and then by rule 3, the lambda has type: “ν→(ν→ν)→ν”.
  • So, for this one, both α and β end up being “ν”, the type of natural numbers.

So, now we have a simply typed lambda calculus. The reason that it’s simply typed is because the type treatment here is minimal: the only way of building new types is through the unavoidable \rightarrow constructor. Other typed lambda calculi include the ability to define parametric types, which are types expressed as functions ranging over types.

Programs are Proofs

Here’s where it gets really fun. Think about the types in the simple typed language calculus. Anything which can be formed from the following grammar is a lambda calculus type:

  • type ::= primitive | function | ( type )
  • primitive ::= α | β | δ | …
  • function ::= type→type

The catch with that grammar is that you can create type expressions which, while they are valid type definitions, you can’t write a single, complete, closed expression which will actually have that type. (A closed expression is one with no free variables.) When there is an expression that has a type, we say that the expression inhabits the type; and that the type is an inhabited type. If there is no expression that can inhabit a type, we say it’s uninhabitable. Any expression which either can’t be typed using the inference rules, or which is typed to an uninhabitable type is a type error.

So what’s the difference between inhabitable type, and an uninhabitable type?

The answer comes from something called the Curry-Howard isomorphism. For a typed lambda calculus, there is a corresponding intuitionistic logic. A type expression is inhabitable if and only if the type is a provable theorem in the corresponding logic.

The type inference rules in lambda calculus are, in fact, the same as logical inference rules in intuitionistic logic. A type \alpha \rightarrow \beta can be seen as either a statement that this is a function that maps from a value of type \alpha to a value of type \beta, or as a logical statement that if we’re given a fact alpha \alpha, we could use that to infer the truth of a fact \beta.

If there’s a logical inference chain from an axiom (a given type assignment) to an inferred statement, then the inferred statement is an inhabitable type. If we have a type \alpha \rightarrow \alpha, then given a inhabited type \alpha, we know that \alpha \rightarrow \alpha is inhabitable, because if \alpha is a fact, then \alpha \rightarrow \alpha is also a fact.

On the other hand, think of a different case \alpha \rightarrow \beta. That’s not a theorem, unless there’s some other context that proves it. As a function type, that’s the type of a function which, without including any context of any kind, can take a parameter of type α, and return a value of a different type β. You can’t do that – there’s got to be some context which provides a value of type β – and to access the context, there’s got to be something to allow the function to access its context: a free variable. Same thing in the logic and the lambda calculus: you need some kind of context to establish “α→β” as a theorem (in the logic) or as an inhabitable type (in the lambda calculus).

It gets better. If there is a function whose type is a theorem in the corresponding intuitionistic logic, then the program that has that type is a proof of the theorem. Each beta reduction is equivalent to an inference step in the logic. This is what programming languages geeks like me mean when we say “the program is the proof”: a well-typed program is, literally, a proof its well-typed-ness.

To connect back to the discussion about models: the intuitionistic logic corresponding to the lambda calculus and intuitionistic logic are, in a deep sense, just different reflections of the same thing. We know that intuitionistic logic has a valid model. And that, in turn, means that lambda calculus is valid as well. When we show that something is true using the lambda calculus, we can trust that it’s not an artifact of an inconsistent system.

Models and Why They Matter

As I said in the last post, Church came up with λ-calculus, which looks like it’s a great formal model of computation. But – there was a problem. Church struggled to find a model. What’s a model, and why would that matter? That’s the point of this post. To get a quick sense of what a model is, and why it matters?

A model is basically a mapping from the symbols of a logical system to some set off objects, such that all statements that you can prove in the logical system will be true about the corresponding objects. Note that when I say object here, I don’t necessarily mean real-world physical objects – they’re just something that we can work with, which is well-defined and consistent.

Why does it matter? Because the whole point of a system like λ-calculus is because we want to use it for reasoning. When you have a logical system like λ-calculus, you’ve built this system with its rules for a reason – because you want to use it as a tool for understanding something. The model provides you with a way of saying that the conclusions you derive using the system are meaningful. If the model isn’t correct, if it contains any kind of inconsistency, then your system is completely meaningless: it can be used to derive anything.

So the search for a model for λ-calculus is really important. If there’s a valid model for it, then it’s wonderful. If there isn’t, then we’re just wasting our time looking for one.

So, now, let’s take a quick look at a simple model, to see how a problem can creep in. I’m going to build a logic for talking about the natural numbers – that is, integers greater than or equal to zero. Then I’ll show you how invalid results can be inferred using it; and finally show you how it fails by using the model.

One quick thing, to make the notation easier to read: I’m going to use a simple notion of types. A type is a set of atoms for which some particular one-parameter predicate is true. For example, if P(x) is true, I’ll say that x is a member of type P. In a quantifier, I’ll say things like forall x in P: mbox{em foo} to mean forall x : P(x) Rightarrow mbox{em foo}. Used this way, we can say that P is a type predicate.

How do we define natural numbers using logic?

First, we need an infinite set of atoms, each of which represents one number. We pick one of them, and call it zero. To represent the fact that they’re natural numbers, we define a predicate {cal N}(x), which is true if and only if x is one of the atoms that represents a natural number.

Now, we need to start using predicates to define the fundamental properties of numbers. The most important property of natural numbers is that they are a sequence. We define that idea using a predicate, mbox{em succ}(x,y), where mbox{em succ}(x,y) is true if and only if x = y + 1. To use that to define the ordering of the naturals, we can say: forall x in {cal N}: exists y: mbox{em succ}(x, y).

Or in english: every natural number has a successor – you can always add one to a natural number and get another natural number.

We can also define predecessor similarly, with two statements:

  1. forall x in {cal N}: exists y in {cal N}: mbox{em pred}(x, y).
  2. forall x,y in {cal N}: mbox{em pred}(y,x) Leftrightarrow mbox{em succ}(x,y)

So every number has a predecessor, and every number has a successor, and x is the predecessor of y if y is the successor of x.

To be able to define things like addition and subtraction, we can use successor. Let’s define addition using a predicate Sum(x,y,z) which means “z = x + y”.

  1. forrall x,y in {cal N}: exists z in {cal N} : Sum(x,y,z)
  2. forall x,y in {cal N}: Sum(x, 0, x)
  3. forall x,y,z in {cal N}: exists a,b in {cal N}: Sum(a,b,z) land mbox{em succ}(a,x) land mbox{em succ}(y,b) Rightarrow Sum(x, y, z)

Again, in english: for any two natural numbers, there is a natural number that it their sum; x + 0 always = x; and for any natural number, x + y = z is true if (x + 1) + (y – 1) = z.

Once we have addition, subtraction is easy: forall x,y,z in {cal N} : diff(x,y,z) Leftrightarrow sum(z,y,x)

That’s: x-y=z if and only if x=y+z.

We can also define greater than using addition:

  • . But we’ve violated that – we have both forall x in {cal N}: 0 le x, and