Naming things

While our little language is developing apace, we are still lacking basic features. One is the ability to give names to values and reuse them, which is commonly known as variables. We’ll stay clear of that term for the moment however, as it hints strongly at things that can change and we do not want to go there quite yet. Instead, since we’re binding a name to a value, we’ll call these bindings.

For the same reason, I will not be using the common var or val keywords to declare bindings (var feels mutable, val immutable, and we’re very pointedly not making a decision either way). Instead, we’ll use let, a very good term that I wish more languages used.

Substitution rules

Here, then, is how your typical let statement goes:

let x = 1
x

You should instinctively feel that this evaluates to 1, which is perfectly correct. Do note, however, that there are two distinct parts to this:

This is important, as it tells us we’ll need to add two variants to our AST.

Here’s a slightly less obvious example:

x
let x = 1

This should feel uncomfortable: x is being used before being defined, surely that can’t be right!

Well, technically, it could, depending on how we decide bindings work. But the important point is: there must be some sort of rule to tell us when it’s legal to use a binding.

We’ll call the parts of a program in which a binding can be used its scope, and say that a binding is in scope to mean it can be used. Note that we’re not (yet) defining how scope works, merely that it must exist: at the very least, a binding must exist to be used.

This is why bindings are actually known as local bindings: they are not valid everywhere, but only locally, in their scope.

In order to make things clearer, we’ll make scope an explicit part of our syntax by introducing the in part of a let statement:

let [BINDING] in [SCOPE]

For example:

let x = 1 in x

This syntax removes any ambiguity as to where x comes from. It also shows us something a little interesting: how substitution works for let statements.

Let’s run through it step by step:

TargetKnowledgeAction
let x = 1 in xN/ABind x to 1
Substitute let statement with x
xx = 1Substitute x with 1
1 N/A

Substitution works by storing whatever knowledge the let part of the statement gives us, and then evaluating the in part. We’re starting to have quite a good model for how local bindings work. But there are still a few stones left unturned.

What do you think the following evaluates to?

let x = 1 in
  x + (let x = 2 in x)

You should in theory feel quite confident it evaluates to 3: the left-hand side operator of + is clearly bound to 1, and the right-hand side declares an entirely different x, bound to 2. Still, this is interesting: there is a notion of a local binding overriding another. This is typically known as shadowing, and can feel a lot like we’ve introduced reassignment to the language - after all, the value x is bound to changes, doesn’t it?

Consider the following program.

let x = 1 in
 (let x = 2 in x) + x

You hopefully think that this should evaluate to 3 - it’s essentially the same program as before, except we’ve swapped the addition’s operands. Addition is commutative, so this really shouldn’t change anything and we should get the same result as before.

But what this means, then, is that we have not in fact bound x to a new value: had we done that, this program would evaluate to 4, since we’re evaluating from left to right and binding x to 2 happens before x is ever used. No, what we have done is made it legal to declare a new x, which masks the old one locally, but we’ll get the latter back as soon as the former goes out of scope.

What we’ve just done is decided on a very precise kind of scope management: static scope, also known as lexical scope. The scope of a local binding can be understood statically, by looking at the code with no need to run it.

Of course, if we need the static qualifier, it means that there must be such a thing as dynamic scope. There really shouldn’t but, unfortunately, there is. With dynamic scoping, a binding’s value depends on the latest let statement that was executed, regardless of its position in the program. With dynamic scoping, our previous example would evaluate to 4.

This is generally considered to be a bad idea - if anything, it makes addition non-commutative! Some languages do this, but most languages choose static scoping and so will we.

To summarize all this, we’ve come to the conclusion that local bindings:

We now merely have to implement this!

Updating the AST

The first task is to update the AST with our new syntax elements. We’ve seen that we’ll need two new variants, one for introduction and one for elimination.

Let’s start with the simplest one, elimination. That’s merely stating here’s a name, substitute it with whatever value it’s bound to, so we only need to store the binding’s name:

  case Var(name: String)

Elimination is a little trickier. Recall how a let statement looks:

let [NAME] = [VALUE] in [BODY]

Where:

[NAME] is clearly a String and [BODY] an Expr, but what about [VALUE]? Well, we’ll definitely want to support something like this:

let x = 1 + 2 in x * 2

We must allow [VALUE] to be a complex expression - it must be an Expr, too:

  case Let(name: String, value: Expr, body: Expr)

This gives us our complete AST:

enum Expr:
  case Num(value: Int)
  case Bool(value: Boolean)
  case Add(lhs: Expr, rhs: Expr)
  case Cond(pred: Expr, thenBranch: Expr, elseBranch: Expr)
  case Let(name: String, value: Expr, body: Expr)
  case Var(name: String)

Interpreting local bindings

We’re clearly going to need to do a little work here. As we’ve seen when running substitution, the let part of our bindings is put in some sort of knowledge base, to be reused in the in part.

This knowledge base is typically called an environment. It really is just a mapping of names to the corresponding values, and could be a simple Map[String, Value].

We’ll do something a little more interesting and usable, however: it’s clear we’ll need to be able to query (binding elimination) and update (binding introduction) that environment, so we’ll create a bespoke type with dedicated methods:

case class Env(map: Map[String, Value]):
  def lookup(name: String) =
    map.getOrElse(name, sys.error(s"Unbound variable: $name"))

  def bind(name: String, value: Value) =
    Env(map + (name -> value))

Env wraps our map, and provides:

Since we now know that interpretation is done within the scope of a set of bindings (our environment), we also know that this must be passed around, which requires a little refactoring. We’ll need to update the existing code by passing an Env everywhere interpret will be called:

def add(lhs: Expr, rhs: Expr, env: Env) =
  (interpret(lhs, env), interpret(rhs, env)) match
    case (Value.Num(lhs), Value.Num(rhs)) => Value.Num(lhs + rhs)
    case _                                => sys.error("Type error in Add")

def cond(pred: Expr, t: Expr, e: Expr, env: Env) =
  interpret(pred, env) match
    case Value.Bool(true)  => interpret(t, env)
    case Value.Bool(false) => interpret(e, env)
    case _                 => sys.error("Type error in Cond")

def interpret(expr: Expr, env: Env): Value = expr matcha
  case Num(value)       => Value.Num(value)
  case Bool(value)      => Value.Bool(value)
  case Add(lhs, rhs)    => add(lhs, rhs, env)
  case Cond(pred, t, e) => cond(pred, t, e, env)

And having laid all this groundwork, we can now tackle the more interesting problem of interpreting Var and Let.

Let’s start with Var. That’s almost trivial: we need to lookup the value a name is bound to in our environment, which is exactly what lookup does:

def interpret(expr: Expr, env: Env): Value = expr matcha
  case Num(value)       => Value.Num(value)
  case Bool(value)      => Value.Bool(value)
  case Add(lhs, rhs)    => add(lhs, rhs, env)
  case Cond(pred, t, e) => cond(pred, t, e, env)
  case Var(name)        => env.lookup(name)

Let is slightly trickier, but perfectly manageable. Given a name, a value and a body, we need to:

This is a little complex, so we’ll extract it to a dedicated method:

def let(name: String, value: Expr, body: Expr, env: Env) =
  val actualValue = interpret(value, env)
  val newEnv      = env.bind(name, actualValue)

  interpret(body, newEnv)

Which gives us a final implementation of interpret:

def interpret(expr: Expr, env: Env): Value = expr match
  case Num(value)             => Value.Num(value)
  case Bool(value)            => Value.Bool(value)
  case Add(lhs, rhs)          => add(lhs, rhs, env)
  case Cond(pred, t, e)       => cond(pred, t, e, env)
  case Var(name)              => env.lookup(name)
  case Let(name, value, body) => let(name, value, body, env)

Testing our implementation

Now that we’ve written something that feels it should work, let’s take it out for a spin.

We’ll use the example we took to talk about static scoping:

let x = 1 in
 (let x = 2 in x) + x

This yields this (relatively noisy) AST:

val expr = Let(
  name  = "x",
  value = Num(1),
  body  = Add(
    Let(
      name  = "x",
      value = Num(2),
      body  = Var("x")
    ),
    Var("x")
  )
)

Interpreting that is simple enough, although we do need to pass some environment to interpret. At least for the moment, we’ll merely work with the empty environment, for which we’ll create a helper:

object Env:
  val empty: Env = Env(Map.empty)

Armed with all these tools, interpreting our expr is entirely straightforward:

interpret(expr, Env.empty)
// val res: Value = Num(3)

If you remember, if we implemented static scoping, this should evaluate to 3, or 4 if we made a mistake an implement dynamic scoping. And, fortunately, it evaluates to 3, as it should.

Where to go from there?

We’ve just finished adding local bindings to our language. Inquisitive readers might have realised that these feel very much like functions, except the parameter’s value is fixed - if you have, well done, you’ve correctly guessed what our next step should be: take what we’ve learned with local bindings and attempt to generalise it to support functions.