thev.net

Representing Simply Typed Lambda Calculus

We give an implementation of simply typed lambda calulus (STLC) that encodes its type as a Haskell type using GADTs, as well as a type-directed normalization procedure, which can be seen as an encoding of the proof that STLC is strongly normalizing.

Note: The bulk of this code was written around 2008, and it made heavy use of Functional Depencencies, which has somewhat lost its popularity as people move onto Type Families. I only did enough to clean it up and make sure the code compiles with GHC 7.8.3. With the new GHC extensions such as type-level naturals, the code sure can be simplified quite a bit. Any comments and criticisms are welcome. In particular, if anybody can help remove the use of UndecidableInstances, I’d love to learn about it.

> {-# LANGUAGE UndecidableInstances, NoMonomorphismRestriction,
>     GADTs, MultiParamTypeClasses, FunctionalDependencies,
>     FlexibleContexts, FlexibleInstances, ScopedTypeVariables
> #-}

Introduction

We would like to have a datatype to represent both the structure and the type of a lambda expression, so that we can have the evaluation function as something like:

eval :: Exp v -> v

With this, a Haskell compiler will do the type checking on such lambda terms for free. For example, one can only construct lambda terms with valid types, subject to Haskell’s typing rules.

Representing Lambda

Composing lambda terms this way requires computation at the type level, and also an environment that captures the types of all free variables in the term, without which, the type of a resulting term can’t be decided.

For this purpose, we’ll use an environment at type level, and variables will be represented as de Bruijn indices into the environment, both at the type and the term level. This is by representing natural numbers at the type level

> data Z = Z
> data S a = S a

The environment is basically a stack of types represented as nested tuples, and type class Index indicates the constraint: the n-th type in env is a.

> class Index env n a | env n -> a
> instance Index (a, env) Z a
> instance Index env n b => Index (a, env) (S n) b

Instead of defining a single datatype for lambda expressions, we make it fully parameterized, so that extensions may be added in the future without modifying the core program. This requires separate definitions for variable, abstraction and application.

The datatype for variables has to capture the environment, the index, and also the type of this variable

> data Var env n v where
>   VZ :: Index env Z v => Var env Z v
>   VS :: Index env n v => Var env n v -> Var (a, env) (S n) v

Lambda abstraction is defined as moving the first element in the environment out as an argument:

> data Abs env t v where
>   Abs :: g (a, env) h v -> Abs env (g (a, env) h v) (a -> v)

Similar to Var, Abs also has 3 type parameters: the environment env, the value type v, and the type h of the term it abstracts from, which is important at preserving the type and structure of all sub-terms.

Application is defined as applying a term of value type a->b to another term of value type a, The types of both of them are also passed along.

> data App env t v where
>   App :: g env h (a -> b) -> g' env h' a -> 
>          App env (g env h (a -> b), g' env h' a) b

We can use some shorthands for Abs and App, and for variables:

> (%) = App
> l   = Abs
> v0  = VZ
> v1  = VS v0
> v2  = VS v1
> v3  = VS v2

The identity function

> i = Abs v0

Church numerals

> zero = l $ l v0
> one  = l $ l $ v1 % v0
> suc  = l $ l $ l $ v1 % (v2 % v1 % v0)
> one' = suc % zero

Evaluation (to value)

Evaluating the lambda expression given its environment can be easily defined under a Eval class:

> class Eval g env t v where
>    eval :: env -> g env t v -> v 
>    eval _ _ = undefined

> instance Eval Var (a, env) Z a where
>    eval (a, e) VZ = a
> 
> instance (Index env n b, Eval Var env n b) => 
>          Eval Var (a, env) (S n) b where
>    eval (a, e) (VS n) = eval e n

> instance Eval g (a, env) h v => 
>          Eval Abs env (g (a, env) h v) (a -> v) where
>    eval env (Abs e) = \x -> eval (x, env) e

> instance (Eval g env h (a->b), 
>           Eval g' env h' a) => 
>          Eval App env (g env h (a -> b), g' env h' a) b where
>    eval env (App f e) = (eval env f) (eval env e)

We can have a function that evaluates church numerals:

> evalN n = eval (succ, (0, ())) $ n % v0 % v1

We can also make lambda expressions instances of Eq and Show:

> instance Eq (Var env n a) where
>    (==) _ _ = True
> 
> instance Eq (g (a, env) h v) => 
>          Eq (Abs env (g (a, env) h v) (a -> v)) where
>    (==) (Abs e) (Abs e') = e == e'
> 
> instance (Eq (g env h (a -> b)),
>           Eq (g' env h' a)) =>
>           Eq (App env (g env h (a -> b), g' env h' a) b) where
>    (==) (App f e) (App f' e') = (f == f') && (e == e')

> instance Show' (Var env h v) => Show (Var env h v) where
>    show = show''
> instance Show' (Abs env h v) => Show (Abs env h v) where
>    show = show''
> instance Show' (App env h v) => Show (App env h v) where 
>    show = show''

> show'' = show' (map (:[]) ['a' .. 'z']) []

> class Show' a where
>    show' :: [String] -> [String] -> a -> String
> 
> instance Show' (Var env Z a) where
>    show' (v:nvs) [] _ = v		-- free variable
>    show' nvs (v:vs) VZ = v
> 
> instance (Show' (Var env n b)) =>
>          Show' (Var (a, env) (S n) b) where
>    show' (v:nvs) [] _ = v		-- free variable
>    show' nvs (v:vs) (VS n) = show' nvs vs n
> 
> instance Show' (g (a, env) h v) =>
>          Show' (Abs env (g (a, env) h v) (a -> v)) where
>    show' (v:nvs) vs (Abs e) = "(\\" ++ v ++ "." ++
>                                 show' nvs (v:vs) e ++ ")"
> 
> instance (Show' (g env h (a->b)), 
>           Show' (g' env h' a)) => 
>          Show' (App env (g env h (a -> b), g' env h' a) b) where
>    show' nvs vs (App f e) = "(" ++ show' nvs vs f ++ " " ++ 
>                                    show' nvs vs e ++ ")"

Reduction (to term)

All the above are quite straight forward. Reduction at term level, however, is a difficult problem because the result type needs to be computed.

Subtitution requires moving a term to under an abstraction, and due to the de Bruijn index representation, we have to adjust the level of all free variables in the term to still refer to their old positions in the environment. Here is a nest function that nests all free variables in a term by one level:

> nest' :: Nest Z g env h v (a, env) h' => 
>          g env h v -> g (a, env) h' v 
> nest' = nest Z

The Nest class takes a number m that represents the level of outter abstractions when we traverse down the term tree, so that variables with less level than m will stay the same because they are not free in the original term.

> class Nest m g env h v env' h' | m g env h v env' -> h' where
>    nest :: m -> g env h v -> g env' h' v

Recursing down the variable structure and deciding whether to add an extra VS is the most challenging part of it because the environment has to be carefully re-arranged.

> instance Index env Z v => 
>          Nest Z Var env Z v (a, env) (S Z) where
>    nest Z VZ = VS VZ
> 
> instance Nest (S m) Var env Z v (v, env') Z where
>    nest (S m) VZ = VZ
> 
> instance (Index env' n' v,
>           Nest Z Var env n v env' n') =>
>          Nest Z Var (b, env) (S n) v (a, env') (S n') where
>    nest Z (VS n) = VS (nest Z n)
> 
> instance (Index env' n' v,
>           Nest m Var env n v env' n') =>
>          Nest (S m) Var (b, env) (S n) v (b, env') (S n') where
>    nest (S m) (VS n) = VS (nest m n)

For abstraction and application, the traversal is easy.

> instance Nest (S m) g (b,env) h v (b, env') h' =>
>          Nest m Abs env (g (b, env) h v) (b -> v) 
>               env' (g (b, env') h' v) where
>    nest m (Abs e) = Abs (nest (S m) e)
> 
> instance (Nest m g env g' (a -> b) env' g'',
>           Nest m h env h' a env' h'') =>
>          Nest m App env (g env g' (a -> b), h env h' a) b
>               env' (g env' g'' (a -> b), h env' h'' a) where
>    nest m (App f e) = App (nest m f) (nest m e)

Here are some tests, n1 and nʹ should be of the same type.

> n = Abs (App VZ (VS VZ))
> n1 = Abs (App VZ (VS (VS VZ)))

> n' ::  (Index (a', env) (S Z) a, Index env Z a) =>
>     Abs (a', env)
>         (App (a -> b, (a', env))
>              (Var (a -> b, (a', env)) Z (a -> b), 
>               Var (a -> b, (a', env)) (S (S Z)) a)
>              b)
>         ((a -> b) -> b)
> n' = nest' n

Armed with the nest function, we will then tackle the substitution. Because a substitution of

subst :: g env h a -> g' (a, env) h' v -> g'' env' h'' v

should effectively remove all references to the top a from all environments in the subterms, we need a type level function that removes n-th element from an environment.

> class RemoveNth n env env' | n env -> env'
> instance RemoveNth Z (a, env) env
> instance RemoveNth n env env' => RemoveNth (S n) (a, env) (a, env')

The most tricky part is that we also need a way to re-construct variables after adjusting the environment. This is achieved by tracking the environment in another variable, and later add it back.

> class VarAdd m n d | m n -> d where
>    varadd :: m -> n -> d

> instance VarAdd () (Var env' n b) 
>                 (Var env' n b) where
>    varadd () n = n

> instance VarAdd (Var (a, env) Z a) (Var env' n b) 
>                 (Var env' n b) where
>    varadd VZ n = n

> instance (Index env' n b,
>           VarAdd (Var env m a') (Var (a, env') (S n) b) w) => 
>          VarAdd (Var (a, env) (S m) a') (Var env' n b) w where
>    varadd (VS m) n = varadd m ((VS n) :: Var (a, env') (S n) b)

The substitution class uses a number m to track the level of abstractions during the traversal, and substitutes only the free variables that refer to current environment stacktop in expression e by s, while preserving both local variables and free variables that refers beyond the environment stacktop.

> class Subst m s e w | m s e -> w where
>    subst :: m -> s -> e -> w

Again, the most challenging part of it is to handle variables, and because we need to reconstruct the variable’s environment using VarAdd, we have to pass an temporary variable along.

This is the base case where an actual substitution happens:

> instance Subst Z (g env h v, d) (Var (v, env') Z v) (g env h v) where 
>    subst m (s, _) v = s

These are the two other base cases that with no substitution, and we have to reconstruct the variable. One for free variables where we need to reduce one level, and remove the a from environment stack:

> instance(Index env n b,
>          VarAdd d (Var env n b) d') =>
>          Subst Z (s, d) (Var (a, env) (S n) b) d' where
>    subst m (_, d) (VS n) = varadd d n

One for local variables where there is no modification of the variable level:

> instance (RemoveNth (S m) (a, env) (a, env'),
>           VarAdd d (Var (a, env') Z a) d') =>
>          Subst (S m) (s, d) (Var (a, env) Z a) d' where
>    subst _ (_, d) VZ = varadd d (VZ :: Var (a, env') Z a)

Below is the general reduction case:

> instance (Index env' h d, 
>           Subst m (s, Var (a, env') (S h) d) (Var env n v) w) =>
>          Subst (S m) (s, Var env' h d) (Var (a, env) (S n) v) w where
>    subst (S m) (s, d) (VS n) = subst m (s, (VS d) :: Var (a, env') (S h) d) n

Traversing Abstraction requires incrementing m, nesting of the term to be substituted, and resetting the temprary variable to Z for VarAdd purpose mentioned above.

> instance (Nest Z g1 env1 h1 v1 (a, env1) h1',
>           Subst (S m) (g1 (a, env1) h1' v1, Var (a, ()) Z a) 
>                 (g (a, env) h v) (g2 (a, env2) h2 v)) =>
>          Subst m (g1 env1 h1 v1, f) (Abs env (g (a, env) h v) (a -> v))
>                  (Abs env2 (g2 (a, env2) h2 v) (a -> v)) where
>    subst m (s, f) (Abs e) = Abs (subst (S m) 
>                                        (nest' s :: g1 (a, env1) h1' v1,
>                                         VZ :: Var (a, ()) Z a) e)

Traversing application is straight forward:

> instance (Subst m s (g env h (a -> b)) (g1 env1 h1 (a -> b)),
>           Subst m s (g' env h' a) (g2 env1 h2 a)) => 
>          Subst m s (App env (g env h (a -> b), g' env h' a) b)
>                (App env1 (g1 env1 h1 (a -> b), g2 env1 h2 a) b) where
>    subst m s (App f e) = App (subst m s f) (subst m s e)

As an example, to reduce a Beta redex:

> beta :: Subst Z (g' env h' a, ()) (g (a, env) h b) (g1 env h1 b) =>
>         App env (Abs env (g (a, env) h b) (a -> b), g' env h' a) b -> 
>         g1 env h1 b
> beta (App (Abs e) e') = subst Z (e', ()) e         
> 
> one'' = beta one'
> two = beta (suc % one)

Because simply typed lambda calculus is strongly normalising, reduction order doesn’t matter. Here is a method to reduce all existing beta redices in a term (and may produce new beta redices as a result).

> class Reduce a b | a -> b where
>    reduce :: a -> b

> instance Reduce (Var env n v) (Var env n v) where
>    reduce = id

> instance Reduce (g (a, env) h v) (g' (a, env) h' v) =>
>          Reduce (Abs env (g (a, env) h v) (a -> v))
>               (Abs env (g' (a, env) h' v) (a -> v)) where
>    reduce (Abs e) = Abs (reduce e)

Application f e has three cases, depending on the structure of function f

v e
(\v . e') e
(e' e'') e

First case: reduce e

> instance Reduce (g env h a) (g' env h' a) =>
>          Reduce (App env (Var env n (a -> b), g env h a) b)
>               (App env (Var env n (a -> b), g' env h' a) b) where
>    reduce (App f e) = App f (reduce e)

Second case: reduce e and then reduce this redex

> instance (Reduce (g' env h' a) (g1 env h1 a),
>           Reduce (g (a, env) h b) (g2 (a, env) h2 b), 
>           Subst Z (g1 env h1 a, ()) (g2 (a, env) h2 b) (g3 env h3 b)) =>
>          Reduce (App env (Abs env (g (a, env) h b) (a -> b), g' env h' a) b)
>               (g3 env h3 b) where
>    reduce (App (Abs f) e) = beta (App (Abs (reduce f)) (reduce e))

Third case: reduce both f and e

> instance (Reduce (App env h1 (a -> b)) (g1' env h1' (a -> b)), 
>           Reduce (g2 env h2 a) (g2' env h2' a)) =>
>          Reduce (App env (App env h1 (a -> b), g2 env h2 a) b)
>               (App env (g1' env h1' (a -> b), g2' env h2' a) b) where
>    reduce (App f e) = App (reduce f) (reduce e)

To reduce a term to normal form, we have to first define the depth of a redex.

> class Max a b c | a b -> c where
>    maxN :: a -> b -> c
> 
> instance Max Z Z Z where maxN _ _ = Z
> instance Max (S a) Z (S a) where maxN a _ = a
> instance Max Z (S b) (S b) where maxN _ b = b
> instance Max a b c => Max (S a) (S b) (S c) where 
>    maxN (S a) (S b) = S (maxN a b)

> class RedexDepth a d | a -> d where
>    redexDepth :: a -> d
> 
> instance RedexDepth (Var env h v) Z where
>    redexDepth _ = Z
> 
> instance RedexDepth (g (a,env) h b) d => 
>          RedexDepth (Abs env (g (a,env) h b) (a -> b)) d where
>    redexDepth (Abs e) = redexDepth e

> instance (RedexDepth (g1 env h1 (a -> b)) d1, 
>           RedexDepth (g2 env h2 a) d2, 
>           Max d1 d2 d) =>
>          RedexDepth (App env (g1 env h1 (a -> b), g2 env h2 a) b) (S d) where
>    redexDepth (App f e) = S (maxN (redexDepth f) (redexDepth e))

Then we define normalization to be repeatedly reducing the term for RedexDepth times.

> class Normalize n a b | n a -> b where
>    normalize :: n -> a -> b
> 
> instance Normalize Z a a where
>    normalize _ = id
> 
> instance (Reduce a b, Normalize n b c) => Normalize (S n) a c where
>    normalize (S n) x = normalize n (reduce x)
> 
> norm :: (RedexDepth a d, Normalize d a b) => a -> b
> norm x = normalize (redexDepth x) x

Examples

Here are some example lambda expressions.

> s = l $ l $ l $ v2 % v0 % (v1 % v0)
> k = l $ l $ v1
> i' = s % k % k

> two' = s % (s % (k % s) % k) % i

A test to demonstrate that Reduce really works for the 3rd case

> t = l $ l $ (i % (v1 % v0)) % (i % zero)

More Church Numeral fun

> true  = l $ l $ v1
> false = l $ l $ v0
> plus  = l $ l $ l $ l $ v3 % v1 % (v2 % v1 % v0)
> mult  = l $ l $ l $ v2 % (v1 % v0)
> pre   = l $ l $ l $ v2 % (l $ l $ v0 % (v1 % v3)) % (l $ v1) % i
> iszero =l $ v0 % (l $ false) % true

Some list fun

> cons = l $ l $ l $ v0 % v2 % v1
> car  = l $ v0 % true
> cdr  = l $ v0 % false