The Most Beautiful Program Ever Written

A Lisp interpreter written in Lisp

6 minute read

It’s my 5th time trying to follow this video and things finally start to click in my head. So I think if I write down what is talked in the video, then I can just read my blog post for another 15 times instead of going through the video, that would be convenient.

OK, here is a gist of it. This guy is writing an Scheme interpreter in Scheme.

Let’s start from setting up the environment first so we can type along afterwards.

Environment Setup

The programming language used in this talk is a dialect of Scheme Chez Scheme, you can download and install it from here.

There is a pattern matching library used in the talk pmatch, you can find the source code from here.

Scheme Basics

The talk begins with a brief introduction to the Scheme language.

The REPL

If you have never worked with Lisp, REPL stands for Read Eval Print and Loop, it is an interactive programming environment that allows you to execute the program as you type it in. Now we fire up the REPL by just type scheme in the terminal. Or if you are using Emacs as I am, you can just M-x run-scheme.

List and Symbol

Lisp is all about list processing, let’s look at some simple programs.

Let’s explain them a bit.

First, 5 will evaluate to 5, that’s simple.

A list can be created by using (list ...), it can also be nested.

#t stands for true and #f stands for false.

quote anything will give you back a symbol of that thing. Symbol is kind of like a variable in other language, it might be bound to some value or it might not.

' is a short hand for quote.

null? is a function, it returns true if its arg is null or false otherwise.

cons will attach a value to the beginning of a list.

if is like if in other languages.

cond is like switch in other languages.

define is like assigning to a variable.

lambda is a way to define functions, e.g. (lambda (x) x) means that this is a function that returns what it takes in. In other words, it is the identity function.

Pattern Matching

How to use pmatch is best explained through an example.

Which is basically doing:

OK, now we know everything to write a interpreter!

Write an interpreter

What do we want to first interpret? How about a number like 5?

The interpreter that will interpret only numbers can be written as follows:

As we can see, it is trying to pattern match the given expression to ´n´ where n is number, and if it finds a match, it return n itself.

What do we want to interpret next? How about add1?

add1 is a function in scheme that adds 1 to its argument.

You can see that we use the add1 in Scheme to implement the add1 in our language. Pay attention that recursive eval-expr there, so that we can handle nested expressions like (add1 (add1 6)).

Next we will do multiplication.

OK, let’s add some more.

OK, now the boring part has ended. Let’s begin the fun part.

The core of this interpreter consists of the following 3 parts.

Application, lambda and variable lookup

Let’s now add application.

,rator stands for operator and ,rand stands for operand. This is rather straightforward, we first eval the operator expression and then eval the operand expression and apply the former to the latter.

Now let’s try to add lambda expression. This is the most difficult part so don’t expect to understand it in one go. And even you think you understand it, you probably don’t.

Let me present it to you first, then we can explain it.

This is the final version, now let’s explain everything.

I find it’s more important to understand what is happening on a higher level, that will make reading the code much easier.

So we are trying to interpret lambda expression, but what is the outcome? Yes, a procedure that we can call later, to be precise a Scheme procedure. So how do we create a Scheme procedure? You guessed it, lambda! So, the lambda in our language will be interpreted to the lambda in Scheme.

Let’s take an example, say we want to interpret the expression (lambda (x) (* x 2)), what is x? It doesn’t really mean anything alone but an indicator that will be used in combination with the body (* x 2). Then what is the value of x? We don’t know yet, as said it is just an indicator, a symbol, there is no value assigned to it yet. But when the value will be known? When we are calling it of course.

So based on what we now know, lambda expression can be interpreted as a Scheme procedure like this:

(lambda (arg)
  (eval-expr body))

What is arg here? It will be the argument with which this lambda will get called. There is a problem here. When we eval the body, which in this case looks like (eval-expr '(* x 2)), how do we connect the arg 5 to x? In other words, at run time when the Scheme lambda gets called with 5? how do the interpreter know that now the x should be evaluated to 5? The answer is to introduce a environment env. env is a procedure that takes one argument. So when we trying to look up a symbol in the environment, we just call it with the symbol: (env x).

So now back to the problem, when evaluate the body we just need to provide the right environment, so that when we look up x later, it will yield 5, which is the arg.

Now with all these knowledge, you can read the code and see if you can understand it better.

Verify everything works

We will test our interpreter with a program that calculates factorial numbers.

(define factorial
  ((lambda (!)
     (lambda (n)
       ((! !) n)))
   (lambda (!)
     (lambda (n)
       (if (zero? n)
           1
           (* n ((! !) (sub1 n))))))))

And tada, it all works!

Congratulations, now you understand the most beautiful program ever written!

Well, if not, then just go back and re-read this blog post or re-watch the video until everything becomes clear.

Warning, you may wake up during night or miss bus stop if you keep thinking about all the nested lambdas. Other than that, you will be fine.

comments powered by Disqus