Blog

For those of us in the functional language community, functions are the essence of computation. From Church’s Lambda Calculus forward, the use of functions to encapsulate and apply computation has been an essential part of computer science.

As an example of where we want to go, consider a Haskell function `inc`

that simply increments the value of its argument and returns the result:

```
inc x = x + 1
```

Applying `inc`

to an actual parameter value substitutes that parameter with the value in the function body:

```
inc 3
== 3 + 1
== 4
```

Like mathematics, the argument to `inc`

can be a complex expression:

```
inc ((5+1) - 3)
== inc 6 - 3
== inc 3
== 3 + 1
== 4
```

or alternatively:

```
inc ((5+1) - 3)
== ((5+1) - 3) + 1
== 3 + 1
== 4
```

The common concept here is substitution of an identifier by an expression in the body of the function. This will form the basis of function application.

Before we go further, let’s establish a few definitions. In the definition of `inc x`

we refer to `x`

as a *formal parameter*. When we apply `inc`

to an expression as in `inc ((5+1) - 3)`

we refer to `((5+1) - 3)`

as an *actual parameter* or an *argument*. We refer to the function applied to an expression such as `inc ((5+1) - 3)`

as an *application* of `inc`

. Finally, we refer to the expression that defines `inc`

, `x+1`

as the *body* of inc and the *scope* of `x`

as the body of `inc`

.

Given all these nice definitions we can describe the application of `inc`

to any actual parameter as substituting its formal parameter, `x`

, with the actual parameter from the body of `inc`

. We can be a just little more formal and general. _The application of any function to an actual parameter is the substitution of its formal parameter with its actual parameter in its body. Looking back to `inc 3`

in light of this definition, applying `inc`

to `3`

is substituting `x`

with `3`

in `x+1`

.

Having defined informally what functions do, lets talk briefly about how they are defined before going into a formal definition of their use.

*First-order functions* are functions that have a special representation that is not accessible to the programer. They cannot take other functions as arguments or return functions, thus the name first-order. Some languages with first-order functions do not even provide a syntax for defining functions, but most allow function definition of some type.

If the definition for `inc`

above were done in a first order language, the resulting definition would add a function named `inc`

to the environment that can then be called. Specifically:

```
inc x = x + 1
(inc 10)
```

defines `inc`

and calls he resulting function on `10`

. Because `inc`

is first order, it is not possible to define functions like `map`

that would apply `inc`

to every element of a list. No function can be used as an argument to another. Furthermore, no function can return a function as its return value. While this may seem an odd thing to do right not, it is a powerful concept that allows such things as currying and partial evaluation that will come later.

*Higher-order functions* take other functions as arguments and may return functions as values. Our old friend the `map`

function is a great example of a higher-order function:

```
map :: (a -> b) -> [a] -> [b]
```

The first argument to `map`

is a function mapping elements of type `a`

to type `b`

. Similarly, function application may result in a function. Currying the `map`

function is a good example of this:

```
map inc
```

In this case, the application of `map`

to the function `inc`

results in a new function of type that adds `1`

to every element of an input list.

*First-class functions* are values treated like any other value in a language. In addition to being applied to actual parameters, they can be created statically, created at run-time, assigned to variables, passed as parameters, and returned as values. In many languages they can even be reflected upon and examined.

Not stopping at higher-order functions, Haskell has first-class functions as does Scheme and ML. A function value is created using a lambda or other special form. In Haskell the following produces a function value that adds `1`

to an input value:

```
(\x -> x+1)
```

The backslash is used to resemble the lambda from lambda calculus and is easy to remember for that reason. We can bind a function value to a name just like any other value:

```
inc = (\x -> x+1)
```

`inc`

is an identifier bound to the function `(\x -> x+1)`

and behaves just like any other identifier. Function values such as these are frequently referred to as *anonymous functions* because they do not have explicit names. In the Haskell expression:

```
map inc l
```

the value for `inc`

is found just like the value for `l`

.
The use of function values in programming languages dates back to Lisp and is finding its way into many mainstream languages. It is an elegant solution to defining functions that requires little syntax and few semantic extensions.

Our next language will implement first-class functions as an extension of `BAE`

. The concrete syntax for `FBAE`

is a simple extension of `BAE`

to include function definitions and applications. Once again we are dropping Booleans for the time being:

The third and fourth lines introduce the new syntax. Functions are defined with the $\llambda$ keyword following by an argument and function body. The value of our `inc`

function in `FBAE`

becomes:

```
lambda x in x+1
```

and we can give it the name `inc`

using a `bind`

:

```
bind inc = (lambda x in x+1) in ...
```

where the ellipsis represents the remainder of the program using `inc`

. Applying inc is done using function application as in:

```
bind inc = lambda x in x+1 in
inc 1
```

Function application uses the common syntax $(f\; t)$ represent the function $f$ applied to the term $t$. Parenthesis are frequently ommitted when the application is unambiguous without them.

We can now use concrete syntaxc to define a function, give it a name, and apply it to an actual parameter.

Additions to abstract syntax include new fields for `Lambda`

and `App`

. The definitions follow immediately from what we defined in the concrete syntax. The new abstract syntax for `FBAE`

becomes:

```
data FBAE where
Num :: Int -> FBAE
Plus :: FBAE -> FBAE -> FBAE
Minus :: FBAE -> FBAE -> FBAE
Bind :: String -> FBAE -> FBAE -> FBAE
Lambda :: String -> FBAE -> FBAE
App :: FBAE -> FBAE -> FBAE
Id :: String -> FBAE
```

The most basic definition for functions and their application to arguments comes from Church and the Lambda Calculus. Hopefully you will have the opportunity to study Lambda Calculus later, but for now we will only borrow some basic concepts for our work.

A hint for our direction is the similarity between the concrete syntax for `bind`

and `lambda`

:

```
bind x = 5 in x
lambda x in x
```

`bind`

defines a new identifier, binds it to a value, and uses the new identifier in the body of the `bind`

. `lambda`

is quite similar in that it defines a new identifier and uses the new identifier in the body of the `lambda`

. However, it does not bind the identifier to a particular value. That binding is deferred until the `lambda`

is applied. `bind`

and `lambda`

are cousins in that both define new identifiers. `bind`

and apply are similarly cousins in that both bind values to identifiers. Regardless of the relationship, we should be able to use the tools for defining `bind`

to similarly defined apply.

The simplest definition for function application is called $\beta$-reduction and uses substitution to replace an identifier with its value. In the $\beta$-reduction rule below, $i$ is the identifier defined by the $\mathsf{lambda}$, $b$ is the body of the $\llambda$ and $a$ is the argument the $\mathsf{lambda}$ is applied to. If $a$ evaluates to some value $v$ then then a function applied to $a$ results in substituting $v$ for the function’s formal parameter. The rule for this is:

\[\frac{a\eval v}{((\llambda i\; b) \; a) \rightarrow [i\mapsto v]b}[\beta-reduction]\]The result of the application is $[i\mapsto v]b$, or replacing $i$ with $a$’s value in $b$. This is exactly how we expect functions to behave. When applied, their formal parameter is replaced with an actual parameter and the result becomes the function application’s value.

Let’s think about how to implement evaluating `App`

and `Lambda`

using the substitution operator defined for `bind`

. We established that `bind`

both defines and replaces an identifier while function definition defines an identifier and application replaces it. Let’s review the substitution performed for `bind`

in the `evalS`

function:

```
evalS (Bind i v b) = do { v' <- evalS v ;
(evalS (subst i v' b)) }
```

The `subst`

function gets is identifier from the `bind`

syntax as well as the value it is substituting, `v`

. The body, `b`

, is also found in the `bind`

expression. Thus, the `subst`

application is simple. We can understand the evaluation of `App`

by looking for the arguments to `subst`

in the `lambda`

and its actual parameter.

Before evaluating `App`

, lets evaluate both its arguments. Specifically, `f`

to get a lambda value and `a`

to get a value for the actual parameter:

```
evalS (App f a) = do { (Lambda i b) <- (evalS f) ;
v <- (evalS a) ;
evalS (subst i v b) }
```

Now we have the pieces needed to apply `subst`

. `i`

is the identifier defined by the `Lambda`

. The argument to the substitution is provided by the `App`

as `v`

. We evaluate it first before passing it into the substitution. Finally, the body of the function, `b`

, is the expression substituted into. The result is the following application of `subst`

:

```
evalS (subst i v b)
```

and embedding the substitution into the case for `App`

gives us the following:

```
evalS (App f a) = do { (Lambda i b) <- (evalS f) ;
v <- (evalS a) ;
evalS (subst i v b) }
```

And there we have it. Substitute the argument in for the identifier in the body of the function.

What about the `Lambda`

case for `evalS`

? As it turns out, `lambda`

s are values. Just like `True`

or `(Num 0)`

, `(Lambda x (Plus (Id x) (Num 1)))`

is a value and is not evaluated further. Thus, the evaluation of `Lambda`

is trivial:

```
evalS (Lambda i b) = return (Lambda i b)
```

Putting the whole thing together with evaluation of remaining terms gives us a substituting interpreter for `FBAE`

:

```
evalS :: FBAE -> (Maybe FBAE)
evalS (Num x) = (Just (Num x))
evalS (Plus l r) = do { (Num l') <- (evalS l) ;
(Num r') <- (evalS r) ;
return (Num (l' + r')) }
evalS (Minus l r) = do { (Num l') <- (evalS l) ;
(Num r') <- (evalS r) ;
return (Num (l' - r')) }
evalS (Bind i v b) = do { v' <- evalS v ;
(evalS (subst i v' b)) }
evalS (Lambda i b) = return (Lambda i b)
evalS (App f a) = do { (Lambda i b) <- (evalS f) ;
a' <- (evalS a) ;
evalS (subst i a' b) }
evalS (If c t e) = do { (Num c') <- (evalS c) ;
if c'==0 then (evalS t) else (evalS e) }
evalS (Id id) = Nothing
```

The new interpreter for functions has the same problem as our original interpreter for `bind`

. Substitutions are performed immediately throughout an expression. Thankfully, we can use an environment again to solve the same problem. Recall how we used the environment variable to pass a list of defined variables and their values to implement `bind`

:

```
evalM env (Bind i v b) = do { v' <- evalM env v ;
evalM ((i,v'):env) b }
```

We can do a similar thing with function application. In the same way we replaced the call to `subst`

with an environment update in `bind`

, we can literally do the same thing with function application:

```
evalM env (App f a) = do { (Lambda i b) <- (evalM env f) ;
a' <- (evalM env a) ;
evalM ((i,a'):env) b }
```

The identifier and its value are added to the environment before evaluating the function body. Whenever the identifier appears, the value is found in the environment and used. Virtually the same implementation as bind.

The complete code for the interpreter implementing an environment is:

```
type Env = [(String,FBAE)]
evalM :: Env -> FBAE -> (Maybe FBAE)
evalM env (Num x) = return (Num x)
evalM env (Plus l r) = do { (Num l') <- (evalM env l) ;
(Num r') <- (evalM env r) ;
return (Num (l'+r')) }
evalM env (Minus l r) = do { (Num l') <- (evalM env l) ;
(Num r') <- (evalM env r) ;
return (Num (l'-r')) }
evalM env (Bind i v b) = do { v' <- evalM env v ;
evalM ((i,v'):env) b }
evalM env (Lambda i b) = return (Lambda i b)
evalM env (App f a) = do { (Lambda i b) <- (evalM env f) ;
a' <- (evalM env a) ;
evalM ((i,a'):env) b }
evalM env (Id id) = do { v <- (lookup id env) ;
return v }
evalM env (If c t e) = do { (Num c') <- (evalM env c) ;
if c'==0 then (evalM env t) else (evalM env e) }
```

We now have two interpreters for what we hope is the same untyped system with functions. We can use QuickCheck to help our testing, but we have immediate problems that can easily be seen.

We have introduced a new kind of value to our interpreter. Clearly when we added Booleans to our first interpreter there were new values to deal with. It’s less obvious how we have new values in this language, but remember that `lambda`

creates a function value. `lambda`

expressions are values just like `1`

or `true`

. What this means is expression like this one:

```
bind f = (lambda x in x+1) in
f f
```

will crash. `f`

can take any argument, but its expression `x+1`

will only operate on numbers.

Let’s consider the following code snippet:

```
bind f = (lambda x in x + n) in
f 1
```

The `bind`

defines `f`

whose value is the function that returns its input plus `n`

. This looks quite a bit like `inc`

, but instead adds `n`

to the argument rather than `1`

. Where does `n`

come from? In this case, nowhere. Both `evalS`

and `eval`

throw an error when they cannot find the value of `n`

. Let’s now add an `n`

with `f`

in its scope:

```
bind n = 1 in
bind f = (lambda x in x + n) in
f 1
```

The outer `bind`

gives `n`

the value `1`

while the inner `bind`

gives `f`

the lambda value that returns its input argument plus `n`

. We just defined `f`

in the context of `n`

. When we evaluate the new `bind`

bith `evalS`

and `eval`

give us the value `2`

as expected. So far so good.

Lets take this one step further by defining multiple, nested values of `n`

by making the body of `f`

a `bind`

expression like this:

```
bind n = 1 in
bind f = (lambda x in x + n) in
bind n = 2 in
f 1
```

What do you expect to get when evaluating this expression? `evalS`

gives us the value `2`

. However, `eval`

gives us the value `3`

. What’s happening here? Whatever it is, its bad because our two interpreters for the same language are now giving different results with `evalS`

and `eval`

. Same language using deferred substitution rather than immediate substitution giving different values.

We have successfully implemented an untyped function system by adding to our original `BAE`

interpreter. We demonstrated that we can implement direct substitution and deferred substitution just as we did with `bind`

expressions. However, we’ve run into two problems associated with new function values and an odd difference in the way `evalS`

and `eval`

process expressions. As you might guess, we will deal with the new function values by predicting failure using types. We will deal with the difference between immediate and deferred substitution by looking at *static* and *dynamic* scoping.

- Write an
`eval`

function for a language with only first-class functions using direct substitutions and the`subst`

operation defined for`bind`

- Modify your
`eval`

function for a language with only first-class functions to defer substitution using an environment that contains both functions and values.