Notes

Turning Lambda Calculus Into a Minimal Programming Language

Lambda calculus is Turing-complete, but stuck in the Turing tarpit: “Everything is possible, but nothing of interest is easy.” What would it take to turn lambda calculus into a language that retains the minimalism of the calculus, but can be used for practical programs, such as a parser for a simple language or a static site generator?

Let's add a few constructs that might be trivial, but show how the lambda calculus can be evolved into something that feels quite similar to modern languages. Enriching a calculus step by step allows us to think about and sharpen some of the concepts we use in programming, which are hard to see from within the uniform view of a Turing tarpit, where everything looks like a function. So, let's dive into the tarpit and slowly crawl back out again!

(You can see the results of this experiment on GitHub.)

Pure Lambda Calculus

Lambda calculus uses only three constructs. A term t can be:

- a variable, such as “x”
- an abstraction (an anonymous function), such as “(λx.y)”
- an application of one term to another, such as “(f x)”

A bit more formally, as a BNF grammar:

t = x                  // variable
  | "(λ" var "." t ")" // abstraction
  | "(" t t ")"        // application

A Bit of Syntax and Sugar

Let's turn the calculus into something that can be used for practical programs. The first change is purely cosmetic, by using a more familiar notation for abstractions and applications: Instead of “(λx.y)”, let's use the JavaScript-inspired syntax “x => y”. Similarly, instead of “(f x)”, let's write “f(x)”, leading to the following grammar:

t = var         // variable
  | var "=>" t  // abstraction
  | t "(" t ")" // application

Without parentheses around abstractions, applications of abstractions can become ambiguous: Is “x => y(z)” meant to be a function of “x” that returns “y(z)” or the function “x => y” applied to the argument “z”? To resolve this ambiguity, let's stick with the precedence used in most programming languages and interpret “x => y(z)” as a function that returns “y(z)” and require parentheses for the other case, “(x => y)(z)”.

While we're at it, let's add a little bit of syntactic sugar. Instead of writing “f(x)(y)(z)”, we will write “f(x, y, z)”. This is merely sugar: there are no multi-argument functions in our extended calculus (because currying is sufficient); we just use a more familiar function call syntax.

We will also add let expressions of the form “let ... = ... ...” as sugar, so that...

let y = f(x)
g(y, y)

...is desugared as...

(y => g(y, y))(f(x))

This allows us to stick to a more familiar flow of reading, where the definitions are introduced before they are used. (Notice how “f(x)” and “g(y, y)” are only separated by whitespace. To keep everything indentation-insensitive, the parentheses of application arguments need to start immediately after the caller, without any space between them, so that expressions like “let x = f(x => x)   (...)” can be distinguished from “let x = f   (x => x)(...)”.)

We will treat all syntactic sugar as a two-way transformation, meaning that it will not only be desugared into expressions during parsing, but returned expressions are also automatically “re-sugared”, so that for example “f(x)(y)” is always printed as “f(x, y)” and “(x => y)(z)” is always printed as “let x = z” followed by “y”. Always re-sugaring expressions keeps pretty-printing simple and ensures that we only adopt a tiny bit of syntactic sugar which is guaranteed to be generally applicable, not just in a case-by-case fashion.
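The two directions of the multi-argument sugar can be sketched as a pair of inverse transformations. A hypothetical Python sketch, representing an application as a nested tuple ("app", function, argument):

```python
# Desugar a call with an argument list, f(a, b, c),
# into nested single-argument applications: ((f a) b) c.
def desugar_call(func, args):
    term = func
    for arg in args:
        term = ("app", term, arg)
    return term

# Re-sugar nested applications back into a function and argument list,
# so that ((f a) b) c always prints as f(a, b, c).
def resugar_call(term):
    args = []
    while isinstance(term, tuple) and term[0] == "app":
        _, term, arg = term
        args.append(arg)
    return term, list(reversed(args))

nested = desugar_call("f", ["a", "b", "c"])
# nested == ("app", ("app", ("app", "f", "a"), "b"), "c")
assert resugar_call(nested) == ("f", ["a", "b", "c"])
```

Because the two functions are exact inverses, the printed form never drifts from the parsed form, which is what makes treating sugar as a two-way transformation safe.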

Data Structures

Data structures and values such as true, false, numbers and lists can be encoded in lambda calculus by using functions. Church encoding is probably the most well-known example of such an encoding. The boolean value true is encoded as “x => y => x” (in other words as the function returning the first of its two arguments), false is encoded as “x => y => y” (the function returning the second of its two arguments). Instead of requiring an if-then-else construct, these values are already a sort of if-then-else, because they are directly applied to the then/else values: Instead of writing “if true then p else q” we can just write “(x => y => x)(p, q)” and get back “p”.

The encoding for numbers is slightly more complicated and the details don't matter here, but 0 becomes “f => x => x”, 1 becomes “f => x => f(x)”, 2 becomes “f => x => f(f(x))” and so on. Notice how false and 0 are encoded as the same function in lambda calculus, in both cases returning the second argument.
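These encodings can be tried out in any language with first-class functions. A hypothetical Python sketch (the decoding helper to_int is an assumption of this sketch, not part of the encoding itself):

```python
# Church booleans: true selects the first argument, false the second.
true  = lambda x: lambda y: x
false = lambda x: lambda y: y

# "if true then p else q" is just application: true(p)(q).
assert true("p")("q") == "p"
assert false("p")("q") == "q"

# Church numerals: n is the function applying f n times to x.
zero = lambda f: lambda x: x
one  = lambda f: lambda x: f(x)
two  = lambda f: lambda x: f(f(x))

# Decode a numeral by counting applications of a successor function.
def to_int(n):
    return n(lambda k: k + 1)(0)

assert to_int(two) == 2

# false and zero behave identically: both return their second argument.
assert false("p")("q") == zero("p")("q")
```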

Such an encoding is quite elegant, but problematic if we want to use our calculus for actual programming: every value is just another function, so different values can share the same encoding (as false and 0 do), evaluation results are hard to read and debug, and the encodings are inefficient.

Let's instead add two constructs to our calculus, one to construct data and one to break data apart again. To construct data, we use “tags” (also called “symbols” or “atoms” in Lisp / Elixir / Ruby), which are distinguished from variables by using uppercase names. “Foo” is a tag that just stands for itself and is not resolved to anything else. To construct more complex data structures such as lists, we use the existing application construct and apply values to a tag (using the existing sugar for applying multiple arguments at once), for example “Cons(X, Nil)”.

To access and destructure data, we use an “if ... is ... ... else ...” conditional construct, which compares a value with a pattern and returns the then-branch if the pattern matches, or the else-branch if it does not. A pattern can either be a simple tag, such as “Nil”, or a tag applied to one or more distinct variables, such as “Cons(x, y)” (but “Cons(x, x)” would be invalid because the variables are not distinct). Here's an example:

let list = Cons(Foo, Cons(Bar, Nil))
if list is Cons(x, xs)
  xs
else
  Nil

Since “list” matches the pattern “Cons(x, xs)”, the expression will return the then-branch, with x bound to “Foo” and xs bound to “Cons(Bar, Nil)”, thus returning “Cons(Bar, Nil)”.
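The matching step itself is simple: compare tags, check that the number of arguments matches the number of pattern variables, and bind the variables. A hypothetical Python sketch, representing a tag application as a tag name plus an argument list:

```python
# A value like Cons(Foo, Nil) is represented as
# ("Cons", [("Foo", []), ("Nil", [])]).
def match(value, pat_tag, pat_vars):
    """Return a variable binding if value matches the pattern, else None."""
    tag, args = value
    if tag != pat_tag or len(args) != len(pat_vars):
        return None
    return dict(zip(pat_vars, args))

nil = ("Nil", [])
lst = ("Cons", [("Foo", []), ("Cons", [("Bar", []), nil])])

bindings = match(lst, "Cons", ["x", "xs"])
assert bindings == {"x": ("Foo", []), "xs": ("Cons", [("Bar", []), nil])}
assert match(nil, "Cons", ["x", "xs"]) is None
```

Requiring the pattern variables to be distinct is what lets the binding be a plain dictionary: no unification is ever needed.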

This combination of tags and conditionals allows us to stick to the familiar style of programming with algebraic data types and pattern matching, but is much more minimal: We simply construct data structures (without needing to declare data types first), and patterns are kept simple, consisting of a tag and a list of variables (instead of allowing full pattern matching with nested data structures and unification).

Let's update the grammar with our new constructs:

t = ...
  | tag                        // tag (starts with uppercase)
  | "if" t "is" pat t "else" t // conditional

pat = tag
    | pat "(" var ")" // application with distinct (!) vars

Explicit Recursion

While we take it for granted that function definitions in modern languages can refer to themselves and thus recursively call the function being defined, lambda calculus does not support direct (named) recursion and instead relies on fixed-point combinators such as:

let fix = f => (x => f(x(x)))(x => f(x(x)))
fix(foo) // == foo(fix(foo)) == foo(foo(fix(foo))) == ...
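Under eager evaluation, the combinator above would expand x(x) forever before ever calling f, so demonstrating it in a host language like Python requires the eta-expanded variant (the Z combinator). A hypothetical sketch:

```python
# The Z combinator: like fix above, but x(x) is wrapped in a lambda
# so it is only expanded when the recursive call actually happens.
fix = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

# fix(foo) behaves like foo(fix(foo)): here foo defines factorial,
# receiving "itself" as the argument named self.
fact = fix(lambda self: lambda n: 1 if n == 0 else n * self(n - 1))
assert fact(5) == 120
```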

This is quite elegant, but not very practical. Instead, we would like to be able to refer to the function being defined by name, without having to invoke a combinator. Let's define a construct “var ~> t” that works like an abstraction, but instead of binding a variable to a single argument it will bind the variable to the expression itself:

r ~> Cons(Foo, r)
// == Cons(Foo, r ~> Cons(Foo, r))
// == Cons(Foo, Cons(Foo, r ~> Cons(Foo, r)))
// == Cons(Foo, Cons(Foo, Cons(Foo, r ~> Cons(Foo, r))))
// == ...

Notice how we used recursion to define an infinite value, because that is the simplest case, but we can just as well use it to define a recursive function such as map:

map ~> f => xs =>
  if xs is Cons(x, xs)
    Cons(f(x), map(f, xs))
  else
    xs

It is often useful to define a recursive function and then bind it with the same name in a let expression, but it is annoying to have to type the same name twice. Let's add a bit of sugar for such a case, so that...

loop r = x
y

...will be desugared as...

let r = r ~> x
y

Putting it all together, here's how we can define and apply our map function:

loop map = f => xs =>
  if xs is Cons(x, xs)
    Cons(f(x), map(f, xs))
  else
    xs

map(x => Foo(x), Cons(Bar, Cons(Baz, Nil)))
// == Cons(Foo(Bar), Cons(Foo(Baz), Nil))

Let's update the grammar with our recursion construct:

t = ...
  | var "~>" t // recursion

Putting it Together

This brings the total number of constructs up to 6: The original 3 lambda calculus constructs (variables, abstractions, applications), 2 more for data structures and 1 for recursion:

t = var                        // variable
  | var "=>" t                 // abstraction
  | t "(" t ")"                // application
  | tag                        // tag
  | "if" t "is" pat t "else" t // conditional
  | var "~>" t                 // recursion
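To make the semantics of the six constructs concrete, here is a hypothetical Python sketch of an environment-based evaluator (the actual implementation is in Rust; the tuple representation and the eager evaluation strategy are assumptions of this sketch):

```python
# Terms as tuples: ("var", x) | ("abs", x, body) | ("app", f, a) | ("tag", name)
#                | ("if", scrutinee, ptag, pvars, then, els) | ("rec", x, body)
def eval_term(t, env):
    kind = t[0]
    if kind == "var":
        return env[t[1]]
    if kind == "abs":                        # x => body captures its environment
        return ("closure", t[1], t[2], env)
    if kind == "tag":                        # a tag just stands for itself
        return ("data", t[1], [])
    if kind == "app":
        f, a = eval_term(t[1], env), eval_term(t[2], env)
        if f[0] == "data":                   # applying values to a tag builds data
            return ("data", f[1], f[2] + [a])
        _, param, body, cenv = f             # otherwise f is a closure
        return eval_term(body, {**cenv, param: a})
    if kind == "if":
        _, scrut, ptag, pvars, then, els = t
        _, tag, args = eval_term(scrut, env)  # assumes the scrutinee is data
        if tag == ptag and len(args) == len(pvars):
            return eval_term(then, {**env, **dict(zip(pvars, args))})
        return eval_term(els, env)
    if kind == "rec":                        # x ~> body: patch the environment
        shared = dict(env)                   # after evaluating, so closures made
        value = eval_term(t[2], shared)      # inside body can refer to x (this
        shared[t[1]] = value                 # eager sketch supports recursive
        return value                         # functions, not infinite data)

# loop len = xs => if xs is Cons(x, rest) then S(len(rest)) else Z
length = ("rec", "len", ("abs", "xs",
    ("if", ("var", "xs"), "Cons", ["x", "rest"],
        ("app", ("tag", "S"), ("app", ("var", "len"), ("var", "rest"))),
        ("tag", "Z"))))

# Cons(A, Cons(B, Nil)) has length S(S(Z))
lst = ("app", ("app", ("tag", "Cons"), ("tag", "A")),
       ("app", ("app", ("tag", "Cons"), ("tag", "B")), ("tag", "Nil")))
result = eval_term(("app", length, lst), {})
assert result == ("data", "S", [("data", "S", [("data", "Z", [])])])
```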

This is again ignoring ambiguity and precedence issues introduced by missing parentheses. We can resolve any ambiguity by observing that three constructs can end with a term t (abstractions, recursions and conditionals) and that only these constructs need to be wrapped in parentheses when they appear as the left term of an application:

t = t'  // might need to be wrapped in parentheses
  | t'' // never need parentheses

t' = var "=>" t
   | var "~>" t
   | "if" t "is" pat t "else" t

t'' = var
    | tag
    | "(" t' ")(" t ")"
    | t'' "(" t ")"

The Rust code for the parser, pretty-printer and evaluator is on GitHub.

A Separation of Concepts

Why go through all the trouble of adding three new constructs and a bit of sugar to the lambda calculus? Isn't the end result pretty trivial, having exactly the same expressive power as lambda calculus? Yes, but that's the point! All of the above extensions are mathematically trivial and have been proposed before (the resulting calculus is quite similar to Plotkin's PCF, for example), but enriching a calculus in such a way can clarify some of the concepts we use in everyday programming.

Viewed from within the Turing tarpit of the lambda calculus, there is no difference between functions and data; they are the same concept. This is mathematically elegant, but philosophically poor, because we lose a conceptual distinction that is quite useful in programming. (There might be some echoes of Wittgenstein's remarks on the difference between mathematics and philosophy here.)

Same for abstraction and recursion: From the perspective of the lambda calculus there is only a single concept, but in everyday programming it is extremely useful to distinguish between recursive and non-recursive functions. For example, we know that non-recursive functions that only call other non-recursive functions must eventually terminate (in the absence of unrestricted loops).

By splitting abstraction and recursion into distinct concepts, we can start to treat them differently: Perhaps we want to use call-by-value evaluation for non-recursive functions, but evaluate recursive data and functions lazily? (This is similar to how Call-by-push-value distinguishes values from computations.)

Going Further

What else can be added to the calculus without sacrificing its minimalism?