7.5. Generalised Algebraic Data Types (GADTs)

7.5. Generalised Algebraic Data Types (GADTs)
Prev	Chapter 7. GHC Language Features	Next

Generalised Algebraic Data Types generalise ordinary algebraic data types by allowing you to give the type signatures of constructors explicitly. For example:

  data Term a where
      Lit    :: Int -> Term Int
      Succ   :: Term Int -> Term Int
      IsZero :: Term Int -> Term Bool	
      If     :: Term Bool -> Term a -> Term a -> Term a
      Pair   :: Term a -> Term b -> Term (a,b)

Notice that the return type of the constructors is not always Term a, as is the case with ordinary vanilla data types. Now we can write a well-typed eval function for these Terms:

  eval :: Term a -> a
  eval (Lit i) 	    = i
  eval (Succ t)     = 1 + eval t
  eval (IsZero t)   = eval t == 0
  eval (If b e1 e2) = if eval b then eval e1 else eval e2
  eval (Pair e1 e2) = (eval e1, eval e2)

These and many other examples are given in papers by Hongwei Xi, and Tim Sheard. There is a longer introduction on the wiki, and Ralf Hinze's Fun with phantom types also has a number of examples. Note that papers may use different notation to that implemented in GHC.

The rest of this section outlines the extensions to GHC that support GADTs. It is far from comprehensive, but the design closely follows that described in the paper Simple unification-based type inference for GADTs, which appeared in ICFP 2006.

Data type declarations have a 'where' form, as exemplified above. The type signature of each constructor is independent, and is implicitly universally quantified as usual. Unlike a normal Haskell data type declaration, the type variable(s) in the "data Term a where" header have no scope. Indeed, one can write a kind signature instead:
```
  data Term :: * -> * where ...
```
or even a mixture of the two:
```
  data Foo a :: (* -> *) -> * where ...
```
The type variables (if given) may be explicitly kinded, so we could also write the header for Foo like this:
```
  data Foo a (b :: * -> *) where ...
```
There are no restrictions on the type of the data constructor, except that the result type must begin with the type constructor being defined. For example, in the Term data type above, the type of each constructor must end with ... -> Term ....

You can use record syntax on a GADT-style data type declaration:

  data Term a where
      Lit    { val  :: Int }      :: Term Int
      Succ   { num  :: Term Int } :: Term Int
      Pred   { num  :: Term Int } :: Term Int
      IsZero { arg  :: Term Int } :: Term Bool	
      Pair   { arg1 :: Term a
             , arg2 :: Term b
             }                    :: Term (a,b)
      If     { cnd  :: Term Bool
             , tru  :: Term a
             , fls  :: Term a
             }                    :: Term a

For every constructor that has a field f, (a) the type of field f must be the same; and (b) the result type of the constructor must be the same; both modulo alpha conversion. Hence, in our example, we cannot merge the num and arg fields above into a single name. Although their field types are both Term Int, their selector functions actually have different types:

  num :: Term Int -> Term Int
  arg :: Term Bool -> Term Int

At the moment, record updates are not yet possible with GADT, so support is limited to record construction, selection and pattern matching:

  someTerm :: Term Bool
  someTerm = IsZero { arg = Succ { num = Lit { val = 0 } } }

  eval :: Term a -> a
  eval Lit    { val = i } = i
  eval Succ   { num = t } = eval t + 1
  eval Pred   { num = t } = eval t - 1
  eval IsZero { arg = t } = eval t == 0
  eval Pair   { arg1 = t1, arg2 = t2 } = (eval t1, eval t2)
  eval t@If{} = if eval (cnd t) then eval (tru t) else eval (fls t)

You can use strictness annotations, in the obvious places in the constructor type:

  data Term a where
      Lit    :: !Int -> Term Int
      If     :: Term Bool -> !(Term a) -> !(Term a) -> Term a
      Pair   :: Term a -> Term b -> Term (a,b)

You can use a deriving clause on a GADT-style data type declaration, but only if the data type could also have been declared in Haskell-98 syntax. For example, these two declarations are equivalent
```
  data Maybe1 a where {
      Nothing1 :: Maybe1 a ;
      Just1    :: a -> Maybe1 a
    } deriving( Eq, Ord )

  data Maybe2 a = Nothing2 | Just2 a 
       deriving( Eq, Ord )
```
This simply allows you to declare a vanilla Haskell-98 data type using the where form without losing the deriving clause.
Pattern matching causes type refinement. For example, in the right hand side of the equation
```
  eval :: Term a -> a
  eval (Lit i) =  ...
```
the type a is refined to Int. (That's the whole point!) A precise specification of the type rules is beyond what this user manual aspires to, but there is a paper about the ideas: "Wobbly types: practical type inference for generalised algebraic data types", on Simon PJ's home page.
The general principle is this: type refinement is only carried out based on user-supplied type annotations. So if no type signature is supplied for eval, no type refinement happens, and lots of obscure error messages will occur. However, the refinement is quite general. For example, if we had:
```
  eval :: Term a -> a -> a
  eval (Lit i) j =  i+j
```
the pattern match causes the type a to be refined to Int (because of the type of the constructor Lit, and that refinement also applies to the type of j, and the result type of the case expression. Hence the addition i+j is legal.

Notice that GADTs generalise existential types. For example, these two declarations are equivalent:

  data T a = forall b. MkT b (b->a)
  data T' a where { MKT :: b -> (b->a) -> T' a }

Prev	Up	Next
7.4. Type system extensions	Home	7.6. Template Haskell