Blog

Before going further, let’s take a shot at understanding why the two interpreters behave differently on the same syntax. First, we will examine a case where they produce the same result:

```
bind n = 1 inv
bind f = (lambda x in x + n) in
(f 1)
```

First, the immediate substitution interpreter defined by `evals`

:

```
bind n = 1 in
bind f = (lambda x in x + n) in
(f 1)
==
bind f = (lambda x in x + 1) in
(f 1)
== ((lambda x in x + 1) 1)
== 1 + 1
== 2
```

Each step represents one recursive call to `evals`

. The first step immediately substitutes `1`

for `n`

throughout the remainder of the expression. The second does the same for `f`

. Finally, the application evaluates substituting `1`

for `x`

in `x+1`

. The resulting value is `2`

.

Now let’s look at the defered substitution interpreter implemented by `evalM`

:

```
bind n = 1 in [(n,1)]
bind f = (lambda x in x + n) in [(f,(lambda...)),(n,1)]
(f 1) [(f,(lambda...)),(n,1)]
== ((lambda x in x + n) 1)
== x + n [(x,1),(f,(lambda...)),(n,1)]
== 1 + 1
== 2
```

The first `bind`

adds a binding from `n`

to `1`

while the second adds a binding from `f`

to the lambda expression. The term `(f 1)`

is evaluated in the context of the resulting environment. `f`

is first replaced by the lambda, then $\beta$-reduction adds a binding of `1`

to `x`

in. Now the term can be evaluated. `x`

is replaced by `1`

and `n`

by `1`

and we’re done. The result is again `2`

.

What happens in the second example using nested binds? Again, let’s look at the immediate substitution interpreter:

```
bind n = 1 in
bind f = (lambda x in x + n) in
bind n = 2 in
(f 1)
==
bind f = (lambda x in x + 1) in
bind n = 2 in
(f 1)
==
bind n = 2 in
((lambda x in x + 1) 1)
== ((lambda x in x + 1) 1)
== 1 + 1
== 2
```

Immediate substitution results in `2`

again following roughly the same steps as before. The interesting step to pay attention to is the second substitution of `2`

for `n`

. The first substitution replace `n`

in the `lambda`

, thus no `n`

is present for the inner `bind`

to replace.

Now compare to the deferred substitution evaluator:

```
bind n = 1 in [(n,1)]
bind f = (lambda x in x + n) in [(f,(lambda ...)),(n,1)]
bind n = 2 in [(n,2),(f,(lambda ...)),(n,1)]
(f 1)
== ((lambda x in x + n) 1)
== x + n [(x,1),(n,2),(f,(lambda ...)),(n,1)]
== 1 + n
== 1 + 2
== 3
```

The result is `3`

, but why? The second `bind`

is key to understanding what happens. `bind`

adds bindings to the environment when deferring substition while `bind`

substitutes immediately when using direct substitution. By the time deferred substitution occurs, `(n,2)`

has been added to the environment where it shadows the original `(n,1)`

binding. Thus, when `x+n`

is evaluated, we get `1+2`

rather than `1+1`

.

Where the two interpreters differ is in their implementation of *scope*. The immediate substitution interpreter implements *static scoping* while the deferred substitution interpreter implements *dynamic scoping*. In dynamic scoping, the environment used when evaluating a function is the environment where the function is evaluated. In the example above, `f`

is evaluated in the environment where it is called - `n`

is bound to `2`

and `f`

to the lambda. Thus, the deferred substitution interpreter implements dynamic scoping.

Clearly dynamic scoping differs from our reference implementation. However, is that a bad thing? Most programming language researchers would say yes. Here’s a good example why that’s the case:

```
bind n = 1 in [(n,1)]
bind f = (lambda x in x + n) in [(f,(lambda...)),(n,1)]
bind n = (f 1) in [(n,2),(f,(lambda...)),(n,1)]
bind n = (f 1) in [(n,3),(n,2),(f,(lambda...)),(n,1)]
(f 1)
== 4
```

In this expression, every time `f 1`

is evaluated, the result is different. Every time `f 1`

is evaluted, the current environment must be known to determine the result. Same function, same arguments, different result. This is a debugging nightmare and generally, dynamic scoping is to be avoided.

Assuming the immediate substitution interpreter implements a reference interpreter, we need the substituting interpreter to produce the same values. Specifically, we need `f`

to use the environment where it is defined rather than where it is called. Unfortunately, it’s gone when `f`

is evaluated.

The approach we’ll use is to keep a copy of the environment where a function is defined. In effect, we’ll put a copy of the environment in the lambda value when the lambad is defined. This structure is called a *closure*. To implement closures we’re going to need to change several things in our interpreter.

To understand how a closure will work, lets consider the problematic `bind`

evaluation from earlier in the chapter. Recall that when `x + n`

evaluates, the environment used is the dynamic environment where `n`

is bound to `2`

:

```
bind n = 1 in [(n,1)]
bind f = (lambda x in x + n) in [(f,(lambda ...)),(n,1)]
bind n = 2 in [(n,2),(f,(lambda ...)),(n,1)]
(f 1)
== ((lambda x in x + n) 1)
== x + n [(x,1),(n,2),(f,(lambda ...)),(n,1)]
== 1 + n
== 1 + 2
== 3
```

However, immediate substitution uses the static binding of `n`

to `1`

where `f`

is defined. We call the environment `[(n,1)]`

where the `f`

\ is defined the *static scope* of `f`

. In contrast, the environment:

```
[(x,1),(n,2),(f,(lambda ...)),(n,1)]
```

where `f`

is called is the *dynamic scope* of `f`

. In general, the term *static* refers to definition time while the term *dynamic* refers to run time. If we could use the static scope of `f`

when evaluation occurs, all instances of `(f 1)`

generate the same value and the evaluation result matches the immediate substitution result. What we want is the following trace where the static scope gets used for evaluation:

```
bind n = 1 in [(n,1)]
bind f = (lambda x in x + n) in [(f,(lambda ...)),(n,1)]
bind n = 2 in [(n,2),(f,(lambda ...)),(n,1)]
(f 1)
== ((lambda x in x + n) 1)
== x + n [(n,1)]
== 1 + n
== 1 + 1
== 2
```

Having information from the `lambda`

definition plus the static environment in one data structure is a closure. Psuedo-Haskell for the closure would be:

```
data FBAE where
...
Closure :: String -> FBAE -> Env -> FBAE
...
```

As we shall see, implementing the closure is a simple extension of what we already do. Remember that for now a `lambda`

is a value. What if somehow we tuck the static environment into the `lambda`

definition?

All our interpreters thus far have taken an AST as input and produced an AST of the same type as output. For example, `evalM`

for the `FBAE`

interpreter has the signature:

```
evalM :: FBAE -> (Maybe FBAE)
```

Although correct, `evalM`

’s type is not specific. There are many elements of `FBAE`

that should not be returned by eval. In fact, there are far more `FBAE`

constructs that should not be return than should.

What we would like is for `evalM`

to return a *value*. As we’ve said before, a value is a term that cannot be evaluated further. Let’s go further and define a value type that interpreters must return:

```
evalM :: FBAE -> (Maybe FBAEVal)
```

where the definition of `FBAEVal`

might have the form:

```
data FBAEVal where
NumV :: Int -> FBAEVal
LambdaV :: String -> FBAE -> FBAEVal
deriving (Show,Eq)
```

Now the interpreter must return a value and that value is distiguishable from traditional AST elements.

We’ve defined static and dynamic scoping and said that for static scoping the static enviornment must be included in a function value. We’ve defined the concept of a value and defined a type for values. Just put the concepts together to define a value that includes closures:

```
data FBAEVal where
NumV :: Int -> FBAEVal
ClosureV :: String -> FBAE -> Env -> FBAEVal
deriving (Show,Eq)
```

The `NumV`

definition is for number values. `NumV`

is structually the same as `Num`

, but is an element of the value type rather than the AST type. The `ClosureV`

definition modifies `Lambda`

adding an `Env`

element. This new element represents the static environment necessary to implement static scoping. `ClosureV`

adds an environment value to a lambda value without making an addition to the AST definition.

Weaving this new definition througout the `FBAE`

interpreter is a matter of tweaking existing code. First, the environment should contain values and not AST elements. This change is simple requiring only a change to the `Env`

type definition:

```
type Env = [(String,FBAEVal)]
```

Walking through the interpreter reveals additional changes. The signature is updated to reflect return of an `FBAEVal`

:

```
evalM :: Env -> FBAE -> (Maybe FBAEVal)
```

Evaluating a number now results in a `NumV`

value:

```
evalM env (Num x) = return (NumV x)
```

In `Plus`

, `Minus`

and `Bind`

, evaluating subterms results in values rather than AST elements. Changes are minimal thanks to pattern matching:

```
evalM env (Num x) = return (NumV x)
evalM env (Plus l r) = do { (NumV l') <- (evalM env l);
(NumV r') <- (evalM env r);
return (NumV (l'+r'))}
evalM env (Minus l r) = do { (NumV l') <- (evalM env l);
(NumV r') <- (evalM env r);
return (NumV (l'-r'))}
evalM env (Bind i v b) = do { v' <- evalM env v;
evalM ((i,v'):env) b }
```

When evaluting a `Lambda`

we capture the static environment. In the case for `Lambda`

, the environment passed into the interpreter is bound to `env`

. Creating the closure value takes `env`

and saves its value in the closure value:

```
evalM env (Lambda i b) = return (ClosureV i b env)
```

Using the closure value is surprisingly simple. In the evaluation of `App`

in the dynamically scoped interpreter, the environment for evaluating the function body is the dynamic environment updated with the lambda argument. Specifically:

```
((i,a'):env)
```

Instead of using the dynamic environment, we use the environment from the closure being applied. This value is found bound to `e`

in the expression. The new environment for evaluating the function body becomes `((i,a'):e)`

and is used as follows:

```
evalM env (App f a) = do { (ClosureV i b e) <- (evalM env f);
a' <- (evalM env a);
evalM ((i,a'):e) b }
```

Other than updates to the environment, the evaluation of `App`

is completely unchanged.

Finally, the remaining cases are largely unchanged. The `If`

construct must be updated to use values, but otherwise is identical to the dynamically scoped iterpreter.

```
evalM env (Id id) = lookup id env
evalM env (If c t e) = do { (NumV c') <- (evalM env c);
if c'==0 then (evalM env t) else (evalM env e) }
```

Putting everything together gives us the following interpreter:

```
evalM :: Env -> FBAE -> Maybe FBAEVal
evalM env (Num x) = return (NumV x)
evalM env (Plus l r) = do { (NumV l') <- (evalM env l);
(NumV r') <- (evalM env r);
return (NumV (l'+r'))}
evalM env (Minus l r) = do { (NumV l') <- (evalM env l);
(NumV r') <- (evalM env r);
return (NumV (l'-r'))}
evalM env (Bind i v b) = do { v' <- evalM env v;
evalM ((i,v'):env) b }
evalM env (Lambda i b) = return (ClosureV i b env)
evalM env (App f a) = do { (ClosureV i b e) <- (evalM env f);
a' <- (evalM env a);
evalM ((i,a'):e) b }
evalM env (Id id) = lookup id env
evalM env (If c t e) = do { (NumV c') <- (evalM env c);
if c'==0 then (evalM env t) else (evalM env e) }
```

To make sure we’ve implemented our statically scoped interpreter correctly, we can go back to the test case that caused our problems to begin with:

```
bind n = 1 in
bind f = (lambda x in x + n) in
bind n = 2 in
(f 1)
```

Interpreting this expression with our new inerpreter gives precisely the result hoped for.

- static - definition or compile time
- dynamic - run-time
- static scoping - environment may be uniquely determined at definition time
- dynamic scoping - enviornment is not known until run-time.