Programming Languages in Haskell

Support for EECS 662 at KU

Index
Blog

Typing Storage

Let’s start thinking about type inference with mutable state by going back to type inference for the BFAE

In evalM from the previous chapter we included types in lambda defintions but igored them after creating the AST. Let’s go back and think about types and how we might statically predict the type of an FBAE expression.

First let’s take care of locations. We added a location value to FBAE and now much have a type for those locations. The simplest resonse is to add a new type, TLoc, to the set of type values. The resulting data type now has the following form:

data TFBAE where
  TNum :: TFBAE
  TBool :: TFBAE
  (:->:) :: TFBAE -> TFBAE -> TFBAE
  TLoc :: TFBAE
  deriving (Show,Eq)

As seen previously, existing type rules do not change. Arithmetic operations still require natural numbers, function application requires its first argument to be a fucntion and the actual parameter type to match the function’s domain type, and Boolean operations require Boolean operands. For now let’s skip those operations and look at typing rules for the new storage operators and sequence. First let’s look at new:

typeofM cont (New t) = do { (typeofM cont t) ;
                            return TLoc }

Recall that new always returns a fresh location. Because all locations are of type TLoc, new t is always of type TLoc regardless of input. The type of t is still checked to ensure t is well-typed, but it is always discarded and TLoc returned.

set l v expects l to be a location and v be an expression defining the stored value. Recall that set returns the value after updating location l in memory. If l is a location, then the type of set l v is the type of v:

typeofM cont (Set l v) = do { TLoc <- (typeofM cont l) ;
															(typeofM cont v) }

Once again we don’t care about the type of l as long as it is TLoc. Pattern matching in the bind causes a failure if (typeofM cont l) is not a location.

So far so good.

deref l expects l to be a location and returns the thing stored at location l. Unfortunately, this presents a problem. How do we know the type of the value at location l? For example, in (deref l) + 1 the first term must be a number. All we know is that l is a location. Nothing more. The type of the value at l can only be known by performing the deref operation at run-time or by keeping track of what is stored at l each time set l occurs.

Maybe we can cheat and simply return TLoc to indicate a location is involved:

typeofM cont (Deref l) = do { TLoc <- (typeofM cont l) ;
                              return TLoc}

This will cause the (deref l) + 1 to fail type inference regardless of what is stored in l because + always expects a number and not a location. Defaulting to a number or a boolean type causes the same error.

Lets try another tactic by introducing a new type, TTop with concrete syntax Top. This new type will be returned whenever a deref operation should return a type:

typeofM cont (Deref l) = do { TLoc <- (typeofM cont l) ;
                              return TTop }

TTop will have the special property that all other types are compatible with TTop. Specifically, TTop can replace TNum, TBool or T:->:T in any typechecking situation. Now our (deref l)+1 is of type TNum as it should be, but it is always of type TNum regardless of what is stored at l.

What have we done? First, our type inference system will no longer predict all static typing errors. In that respect, we’ve broken our type system. However, we still catch errors that are not associated with the contents of storage.

What we’ve described is very much like what languages like C do. If you know C you should recall that malloc returns a pointer to a collection of bytes that are referenced by a pointer. When dereferenced, the result is caste to the needed type. Said differently, the set of bytes allocated by malloc can be anything we caste them to. You store a number and retrieve a function. You store a function and retrieve a Boolean. What I stored in memory becomes whatever you tell it to be. While there are very good reasons to do this in system programming, for other tasks many errors are missed by bad castes or worse no explicit castes.


typeofM cont (Seq l r) = do { (typeofM cont l) ;
                              (typeofM cont r) }
data TFBAE where
  TNum :: TFBAE
  TBool :: TFBAE
  (:->:) :: TFBAE -> TFBAE -> TFBAE
  TLoc :: TFBAE -> TFBAE
  deriving (Show,Eq)
typeofM :: Cont -> FBAE -> (Maybe TFBAE)
typeofM cont (Num x) = return TNum
typeofM cont (Boolean b) = return TBool
typeofM cont (Plus l r) = do { TNum <- (typeofM cont l) ;
                               TNum <- (typeofM cont r) ;
                               return TNum }
typeofM cont (Minus l r) = do { TNum <- (typeofM cont l) ;
                                TNum <- (typeofM cont r) ;
                                return TNum }
typeofM cont (Mult l r) = do { TNum <- (typeofM cont l) ;
                               TNum <- (typeofM cont r) ;
                               return TNum }
typeofM cont (Div l r) = do { TNum <- (typeofM cont l) ;
                              TNum <- (typeofM cont r) ;
                              return TNum }
typeofM cont (Bind i v b) = do { v' <- typeofM cont v ;
                                 typeofM ((i,v'):cont) b }
typeofM cont (Id id) = (lookup id cont)
typeofM cont (Lambda x t b) = do { tyB <- typeofM ((x,t):cont) b ;
                                   return (t :->: tyB) }
typeofM cont (App x y) = do { tyXd :->: tyXr <- typeofM cont x ;
                              tyY <- typeofM cont y ;
                              if tyXd==tyY
                              then return tyXr
                              else Nothing }
typeofM cont (And l r) = do { TBool <- (typeofM cont l) ;
                              TBool <- (typeofM cont r) ;
                              return TBool }
typeofM cont (Or l r) = do { TBool <- (typeofM cont l) ;
                             TBool <- (typeofM cont r);
                             return TBool }
typeofM cont (Leq l r) = do { TNum <- (typeofM cont l) ;
                              TNum <- (typeofM cont r) ;
                              return TBool }
typeofM cont (IsZero v) = do { TNum <- (typeofM cont v) ;
                               return TBool }
typeofM cont (If c t e) = do { TBool <- (typeofM cont c) ;
                               t' <- (typeofM cont t) ;
                               e' <- (typeofM cont e) ;
                               if t' == e'
                               then return t'
                               else Nothing }
typeofM cont (New t) = do { t' <- (typeofM cont t) ;
                            return (TLoc t') }
typeofM cont (Set l v) = do { (TLoc l') <- (typeofM cont l) ;
                              v' <- (typeofM cont v) ;
                              if l'==v'
                              then return v'
                              else Nothing }
typeofM cont (Deref l) = do { (TLoc l') <- (typeofM cont l) ;
                              return l'}
typeofM cont (Seq l r) = do { (typeofM cont l) ;
                              (typeofM cont r) }

Discussion

Definitions

Exercises