With the `-fglasgow-exts`

flag, GHC lets you declare
a data type with no constructors. For example:

data S -- S :: * data T a -- T :: * -> *

Syntactically, the declaration lacks the "= constrs" part. The
type can be parameterised over types of any kind, but if the kind is
not `*`

then an explicit kind annotation must be used
(see Section 7.8.4, “Explicitly-kinded quantification”).

Such data types have only one value, namely bottom. Nevertheless, they can be useful when defining "phantom types".

Haskell allows datatypes to be given contexts, e.g.

data Eq a => Set a = NilSet | ConsSet a (Set a)

give constructors with types:

NilSet :: Set a ConsSet :: Eq a => a -> Set a -> Set a

In GHC this feature is an extension called
`DatatypeContexts`

, and on by default.

GHC allows type constructors, classes, and type variables to be operators, and to be written infix, very much like expressions. More specifically:

A type constructor or class can be an operator, beginning with a colon; e.g.

`:*:`

. The lexical syntax is the same as that for data constructors.Data type and type-synonym declarations can be written infix, parenthesised if you want further arguments. E.g.

data a :*: b = Foo a b type a :+: b = Either a b class a :=: b where ... data (a :**: b) x = Baz a b x type (a :++: b) y = Either (a,b) y

Types, and class constraints, can be written infix. For example

x :: Int :*: Bool f :: (a :=: b) => a -> b

A type variable can be an (unqualified) operator e.g.

`+`

. The lexical syntax is the same as that for variable operators, excluding "(.)", "(!)", and "(*)". In a binding position, the operator must be parenthesised. For example:type T (+) = Int + Int f :: T Either f = Left 3 liftA2 :: Arrow (~>) => (a -> b -> c) -> (e ~> a) -> (e ~> b) -> (e ~> c) liftA2 = ...

Back-quotes work as for expressions, both for type constructors and type variables; e.g.

`Int `Either` Bool`

, or`Int `a` Bool`

. Similarly, parentheses work the same; e.g.`(:*:) Int Bool`

.Fixities may be declared for type constructors, or classes, just as for data constructors. However, one cannot distinguish between the two in a fixity declaration; a fixity declaration sets the fixity for a data constructor and the corresponding type constructor. For example:

infixl 7 T, :*:

sets the fixity for both type constructor

`T`

and data constructor`T`

, and similarly for`:*:`

.`Int `a` Bool`

.Function arrow is

`infixr`

with fixity 0. (This might change; I'm not sure what it should be.)

Type synonyms are like macros at the type level, but Haskell 98 imposes many rules
on individual synonym declarations.
With the `-XLiberalTypeSynonyms`

extension,
GHC does validity checking on types *only after expanding type synonyms*.
That means that GHC can be very much more liberal about type synonyms than Haskell 98.

You can write a

`forall`

(including overloading) in a type synonym, thus:type Discard a = forall b. Show b => a -> b -> (a, String) f :: Discard a f x y = (x, show y) g :: Discard Int -> (Int,String) -- A rank-2 type g f = f 3 True

If you also use

`-XUnboxedTuples`

, you can write an unboxed tuple in a type synonym:type Pr = (# Int, Int #) h :: Int -> Pr h x = (# x, x #)

You can apply a type synonym to a forall type:

type Foo a = a -> a -> Bool f :: Foo (forall b. b->b)

After expanding the synonym,

`f`

has the legal (in GHC) type:f :: (forall b. b->b) -> (forall b. b->b) -> Bool

You can apply a type synonym to a partially applied type synonym:

type Generic i o = forall x. i x -> o x type Id x = x foo :: Generic Id []

After expanding the synonym,

`foo`

has the legal (in GHC) type:foo :: forall x. x -> [x]

GHC currently does kind checking before expanding synonyms (though even that could be changed.)

After expanding type synonyms, GHC does validity checking on types, looking for the following mal-formedness which isn't detected simply by kind checking:

Type constructor applied to a type involving for-alls.

Unboxed tuple on left of an arrow.

Partially-applied type synonym.

So, for example, this will be rejected:

type Pr = (# Int, Int #) h :: Pr -> Int h x = ...

because GHC does not allow unboxed tuples on the left of a function arrow.

The idea of using existential quantification in data type declarations
was suggested by Perry, and implemented in Hope+ (Nigel Perry, *The Implementation
of Practical Functional Programming Languages*, PhD Thesis, University of
London, 1991). It was later formalised by Laufer and Odersky
(*Polymorphic type inference and abstract data types*,
TOPLAS, 16(5), pp1411-1430, 1994).
It's been in Lennart
Augustsson's **hbc** Haskell compiler for several years, and
proved very useful. Here's the idea. Consider the declaration:

data Foo = forall a. MkFoo a (a -> Bool) | Nil

The data type `Foo`

has two constructors with types:

MkFoo :: forall a. a -> (a -> Bool) -> Foo Nil :: Foo

Notice that the type variable `a`

in the type of `MkFoo`

does not appear in the data type itself, which is plain `Foo`

.
For example, the following expression is fine:

[MkFoo 3 even, MkFoo 'c' isUpper] :: [Foo]

Here, `(MkFoo 3 even)`

packages an integer with a function
`even`

that maps an integer to `Bool`

; and ```
MkFoo 'c'
isUpper
```

packages a character with a compatible function. These
two things are each of type `Foo`

and can be put in a list.

What can we do with a value of type `Foo`

?. In particular,
what happens when we pattern-match on `MkFoo`

?

f (MkFoo val fn) = ???

Since all we know about `val`

and `fn`

is that they
are compatible, the only (useful) thing we can do with them is to
apply `fn`

to `val`

to get a boolean. For example:

f :: Foo -> Bool f (MkFoo val fn) = fn val

What this allows us to do is to package heterogeneous values together with a bunch of functions that manipulate them, and then treat that collection of packages in a uniform manner. You can express quite a bit of object-oriented-like programming this way.

What has this to do with *existential* quantification?
Simply that `MkFoo`

has the (nearly) isomorphic type

MkFoo :: (exists a . (a, a -> Bool)) -> Foo

But Haskell programmers can safely think of the ordinary
*universally* quantified type given above, thereby avoiding
adding a new existential quantification construct.

An easy extension is to allow arbitrary contexts before the constructor. For example:

data Baz = forall a. Eq a => Baz1 a a | forall b. Show b => Baz2 b (b -> b)

The two constructors have the types you'd expect:

Baz1 :: forall a. Eq a => a -> a -> Baz Baz2 :: forall b. Show b => b -> (b -> b) -> Baz

But when pattern matching on `Baz1`

the matched values can be compared
for equality, and when pattern matching on `Baz2`

the first matched
value can be converted to a string (as well as applying the function to it).
So this program is legal:

f :: Baz -> String f (Baz1 p q) | p == q = "Yes" | otherwise = "No" f (Baz2 v fn) = show (fn v)

Operationally, in a dictionary-passing implementation, the
constructors `Baz1`

and `Baz2`

must store the
dictionaries for `Eq`

and `Show`

respectively, and
extract it on pattern matching.

GHC allows existentials to be used with records syntax as well. For example:

data Counter a = forall self. NewCounter { _this :: self , _inc :: self -> self , _display :: self -> IO () , tag :: a }

Here `tag`

is a public field, with a well-typed selector
function `tag :: Counter a -> a`

. The `self`

type is hidden from the outside; any attempt to apply `_this`

,
`_inc`

or `_display`

as functions will raise a
compile-time error. In other words, *GHC defines a record selector function
only for fields whose type does not mention the existentially-quantified variables*.
(This example used an underscore in the fields for which record selectors
will not be defined, but that is only programming style; GHC ignores them.)

To make use of these hidden fields, we need to create some helper functions:

inc :: Counter a -> Counter a inc (NewCounter x i d t) = NewCounter { _this = i x, _inc = i, _display = d, tag = t } display :: Counter a -> IO () display NewCounter{ _this = x, _display = d } = d x

Now we can define counters with different underlying implementations:

counterA :: Counter String counterA = NewCounter { _this = 0, _inc = (1+), _display = print, tag = "A" } counterB :: Counter String counterB = NewCounter { _this = "", _inc = ('#':), _display = putStrLn, tag = "B" } main = do display (inc counterA) -- prints "1" display (inc (inc counterB)) -- prints "##"

Record update syntax is supported for existentials (and GADTs):

setTag :: Counter a -> a -> Counter a setTag obj t = obj{ tag = t }

The rule for record update is this: *
the types of the updated fields may
mention only the universally-quantified type variables
of the data constructor. For GADTs, the field may mention only types
that appear as a simple type-variable argument in the constructor's result
type*. For example:

data T a b where { T1 { f1::a, f2::b, f3::(b,c) } :: T a b } -- c is existential upd1 t x = t { f1=x } -- OK: upd1 :: T a b -> a' -> T a' b upd2 t x = t { f3=x } -- BAD (f3's type mentions c, which is -- existentially quantified) data G a b where { G1 { g1::a, g2::c } :: G a [c] } upd3 g x = g { g1=x } -- OK: upd3 :: G a b -> c -> G c b upd4 g x = g { g2=x } -- BAD (f2's type mentions c, which is not a simple -- type-variable argument in G1's result type)

There are several restrictions on the ways in which existentially-quantified constructors can be use.

When pattern matching, each pattern match introduces a new, distinct, type for each existential type variable. These types cannot be unified with any other type, nor can they escape from the scope of the pattern match. For example, these fragments are incorrect:

f1 (MkFoo a f) = a

Here, the type bound by

`MkFoo`

"escapes", because`a`

is the result of`f1`

. One way to see why this is wrong is to ask what type`f1`

has:f1 :: Foo -> a -- Weird!

What is this "

`a`

" in the result type? Clearly we don't mean this:f1 :: forall a. Foo -> a -- Wrong!

The original program is just plain wrong. Here's another sort of error

f2 (Baz1 a b) (Baz1 p q) = a==q

It's ok to say

`a==b`

or`p==q`

, but`a==q`

is wrong because it equates the two distinct types arising from the two`Baz1`

constructors.You can't pattern-match on an existentially quantified constructor in a

`let`

or`where`

group of bindings. So this is illegal:f3 x = a==b where { Baz1 a b = x }

Instead, use a

`case`

expression:f3 x = case x of Baz1 a b -> a==b

In general, you can only pattern-match on an existentially-quantified constructor in a

`case`

expression or in the patterns of a function definition. The reason for this restriction is really an implementation one. Type-checking binding groups is already a nightmare without existentials complicating the picture. Also an existential pattern binding at the top level of a module doesn't make sense, because it's not clear how to prevent the existentially-quantified type "escaping". So for now, there's a simple-to-state restriction. We'll see how annoying it is.You can't use existential quantification for

`newtype`

declarations. So this is illegal:newtype T = forall a. Ord a => MkT a

Reason: a value of type

`T`

must be represented as a pair of a dictionary for`Ord t`

and a value of type`t`

. That contradicts the idea that`newtype`

should have no concrete representation. You can get just the same efficiency and effect by using`data`

instead of`newtype`

. If there is no overloading involved, then there is more of a case for allowing an existentially-quantified`newtype`

, because the`data`

version does carry an implementation cost, but single-field existentially quantified constructors aren't much use. So the simple restriction (no existential stuff on`newtype`

) stands, unless there are convincing reasons to change it.You can't use

`deriving`

to define instances of a data type with existentially quantified data constructors. Reason: in most cases it would not make sense. For example:;data T = forall a. MkT [a] deriving( Eq )

To derive

`Eq`

in the standard way we would need to have equality between the single component of two`MkT`

constructors:instance Eq T where (MkT a) == (MkT b) = ???

But

`a`

and`b`

have distinct types, and so can't be compared. It's just about possible to imagine examples in which the derived instance would make sense, but it seems altogether simpler simply to prohibit such declarations. Define your own instances!

GHC allows you to declare an algebraic data type by giving the type signatures of constructors explicitly. For example:

data Maybe a where Nothing :: Maybe a Just :: a -> Maybe a

The form is called a "GADT-style declaration" because Generalised Algebraic Data Types, described in Section 7.4.7, “Generalised Algebraic Data Types (GADTs)”, can only be declared using this form.

Notice that GADT-style syntax generalises existential types (Section 7.4.5, “Existentially quantified data constructors ”). For example, these two declarations are equivalent:

data Foo = forall a. MkFoo a (a -> Bool) data Foo' where { MKFoo :: a -> (a->Bool) -> Foo' }

Any data type that can be declared in standard Haskell-98 syntax can also be declared using GADT-style syntax. The choice is largely stylistic, but GADT-style declarations differ in one important respect: they treat class constraints on the data constructors differently. Specifically, if the constructor is given a type-class context, that context is made available by pattern matching. For example:

data Set a where MkSet :: Eq a => [a] -> Set a makeSet :: Eq a => [a] -> Set a makeSet xs = MkSet (nub xs) insert :: a -> Set a -> Set a insert a (MkSet as) | a `elem` as = MkSet as | otherwise = MkSet (a:as)

A use of `MkSet`

as a constructor (e.g. in the definition of `makeSet`

)
gives rise to a `(Eq a)`

constraint, as you would expect. The new feature is that pattern-matching on `MkSet`

(as in the definition of `insert`

) makes *available* an `(Eq a)`

context. In implementation terms, the `MkSet`

constructor has a hidden field that stores
the `(Eq a)`

dictionary that is passed to `MkSet`

; so
when pattern-matching that dictionary becomes available for the right-hand side of the match.
In the example, the equality dictionary is used to satisfy the equality constraint
generated by the call to `elem`

, so that the type of
`insert`

itself has no `Eq`

constraint.

For example, one possible application is to reify dictionaries:

data NumInst a where MkNumInst :: Num a => NumInst a intInst :: NumInst Int intInst = MkNumInst plus :: NumInst a -> a -> a -> a plus MkNumInst p q = p + q

Here, a value of type `NumInst a`

is equivalent
to an explicit `(Num a)`

dictionary.

All this applies to constructors declared using the syntax of Section 7.4.5.2, “Existentials and type classes”.
For example, the `NumInst`

data type above could equivalently be declared
like this:

data NumInst a = Num a => MkNumInst (NumInst a)

Notice that, unlike the situation when declaring an existential, there is
no `forall`

, because the `Num`

constrains the
data type's universally quantified type variable `a`

.
A constructor may have both universal and existential type variables: for example,
the following two declarations are equivalent:

data T1 a = forall b. (Num a, Eq b) => MkT1 a b data T2 a where MkT2 :: (Num a, Eq b) => a -> b -> T2 a

All this behaviour contrasts with Haskell 98's peculiar treatment of contexts on a data type declaration (Section 4.2.1 of the Haskell 98 Report). In Haskell 98 the definition

data Eq a => Set' a = MkSet' [a]

gives `MkSet'`

the same type as `MkSet`

above. But instead of
*making available* an `(Eq a)`

constraint, pattern-matching
on `MkSet'`

*requires* an `(Eq a)`

constraint!
GHC faithfully implements this behaviour, odd though it is. But for GADT-style declarations,
GHC's behaviour is much more useful, as well as much more intuitive.

The rest of this section gives further details about GADT-style data type declarations.

The result type of each data constructor must begin with the type constructor being defined. If the result type of all constructors has the form

`T a1 ... an`

, where`a1 ... an`

are distinct type variables, then the data type is*ordinary*; otherwise is a*generalised*data type (Section 7.4.7, “Generalised Algebraic Data Types (GADTs)”).As with other type signatures, you can give a single signature for several data constructors. In this example we give a single signature for

`T1`

and`T2`

:data T a where T1,T2 :: a -> T a T3 :: T a

The type signature of each constructor is independent, and is implicitly universally quantified as usual. In particular, the type variable(s) in the "

`data T a where`

" header have no scope, and different constructors may have different universally-quantified type variables:data T a where -- The 'a' has no scope T1,T2 :: b -> T b -- Means forall b. b -> T b T3 :: T a -- Means forall a. T a

A constructor signature may mention type class constraints, which can differ for different constructors. For example, this is fine:

data T a where T1 :: Eq b => b -> b -> T b T2 :: (Show c, Ix c) => c -> [c] -> T c

When patten matching, these constraints are made available to discharge constraints in the body of the match. For example:

f :: T a -> String f (T1 x y) | x==y = "yes" | otherwise = "no" f (T2 a b) = show a

Note that

`f`

is not overloaded; the`Eq`

constraint arising from the use of`==`

is discharged by the pattern match on`T1`

and similarly the`Show`

constraint arising from the use of`show`

.Unlike a Haskell-98-style data type declaration, the type variable(s) in the "

`data Set a where`

" header have no scope. Indeed, one can write a kind signature instead:data Set :: * -> * where ...

or even a mixture of the two:

data Bar a :: (* -> *) -> * where ...

The type variables (if given) may be explicitly kinded, so we could also write the header for

`Foo`

like this:data Bar a (b :: * -> *) where ...

You can use strictness annotations, in the obvious places in the constructor type:

data Term a where Lit :: !Int -> Term Int If :: Term Bool -> !(Term a) -> !(Term a) -> Term a Pair :: Term a -> Term b -> Term (a,b)

You can use a

`deriving`

clause on a GADT-style data type declaration. For example, these two declarations are equivalentdata Maybe1 a where { Nothing1 :: Maybe1 a ; Just1 :: a -> Maybe1 a } deriving( Eq, Ord ) data Maybe2 a = Nothing2 | Just2 a deriving( Eq, Ord )

The type signature may have quantified type variables that do not appear in the result type:

data Foo where MkFoo :: a -> (a->Bool) -> Foo Nil :: Foo

Here the type variable

`a`

does not appear in the result type of either constructor. Although it is universally quantified in the type of the constructor, such a type variable is often called "existential". Indeed, the above declaration declares precisely the same type as the`data Foo`

in Section 7.4.5, “Existentially quantified data constructors ”.The type may contain a class context too, of course:

data Showable where MkShowable :: Show a => a -> Showable

You can use record syntax on a GADT-style data type declaration:

data Person where Adult :: { name :: String, children :: [Person] } -> Person Child :: Show a => { name :: !String, funny :: a } -> Person

As usual, for every constructor that has a field

`f`

, the type of field`f`

must be the same (modulo alpha conversion). The`Child`

constructor above shows that the signature may have a context, existentially-quantified variables, and strictness annotations, just as in the non-record case. (NB: the "type" that follows the double-colon is not really a type, because of the record syntax and strictness annotations. A "type" of this form can appear only in a constructor signature.)Record updates are allowed with GADT-style declarations, only fields that have the following property: the type of the field mentions no existential type variables.

As in the case of existentials declared using the Haskell-98-like record syntax (Section 7.4.5.3, “Record Constructors”), record-selector functions are generated only for those fields that have well-typed selectors. Here is the example of that section, in GADT-style syntax:

data Counter a where NewCounter { _this :: self , _inc :: self -> self , _display :: self -> IO () , tag :: a } :: Counter a

As before, only one selector function is generated here, that for

`tag`

. Nevertheless, you can still use all the field names in pattern matching and record construction.

Generalised Algebraic Data Types generalise ordinary algebraic data types by allowing constructors to have richer return types. Here is an example:

data Term a where Lit :: Int -> Term Int Succ :: Term Int -> Term Int IsZero :: Term Int -> Term Bool If :: Term Bool -> Term a -> Term a -> Term a Pair :: Term a -> Term b -> Term (a,b)

Notice that the return type of the constructors is not always `Term a`

, as is the
case with ordinary data types. This generality allows us to
write a well-typed `eval`

function
for these `Terms`

:

eval :: Term a -> a eval (Lit i) = i eval (Succ t) = 1 + eval t eval (IsZero t) = eval t == 0 eval (If b e1 e2) = if eval b then eval e1 else eval e2 eval (Pair e1 e2) = (eval e1, eval e2)

The key point about GADTs is that *pattern matching causes type refinement*.
For example, in the right hand side of the equation

eval :: Term a -> a eval (Lit i) = ...

the type `a`

is refined to `Int`

. That's the whole point!
A precise specification of the type rules is beyond what this user manual aspires to,
but the design closely follows that described in
the paper Simple
unification-based type inference for GADTs,
(ICFP 2006).
The general principle is this: *type refinement is only carried out
based on user-supplied type annotations*.
So if no type signature is supplied for `eval`

, no type refinement happens,
and lots of obscure error messages will
occur. However, the refinement is quite general. For example, if we had:

eval :: Term a -> a -> a eval (Lit i) j = i+j

the pattern match causes the type `a`

to be refined to `Int`

(because of the type
of the constructor `Lit`

), and that refinement also applies to the type of `j`

, and
the result type of the `case`

expression. Hence the addition `i+j`

is legal.

These and many other examples are given in papers by Hongwei Xi, and Tim Sheard. There is a longer introduction on the wiki, and Ralf Hinze's Fun with phantom types also has a number of examples. Note that papers may use different notation to that implemented in GHC.

The rest of this section outlines the extensions to GHC that support GADTs. The extension is enabled with
`-XGADTs`

. The `-XGADTs`

flag also sets `-XRelaxedPolyRec`

.

A GADT can only be declared using GADT-style syntax (Section 7.4.6, “Declaring data types with explicit constructor signatures”); the old Haskell-98 syntax for data declarations always declares an ordinary data type. The result type of each constructor must begin with the type constructor being defined, but for a GADT the arguments to the type constructor can be arbitrary monotypes. For example, in the

`Term`

data type above, the type of each constructor must end with`Term ty`

, but the`ty`

need not be a type variable (e.g. the`Lit`

constructor).It is permitted to declare an ordinary algebraic data type using GADT-style syntax. What makes a GADT into a GADT is not the syntax, but rather the presence of data constructors whose result type is not just

`T a b`

.You cannot use a

`deriving`

clause for a GADT; only for an ordinary data type.As mentioned in Section 7.4.6, “Declaring data types with explicit constructor signatures”, record syntax is supported. For example:

data Term a where Lit { val :: Int } :: Term Int Succ { num :: Term Int } :: Term Int Pred { num :: Term Int } :: Term Int IsZero { arg :: Term Int } :: Term Bool Pair { arg1 :: Term a , arg2 :: Term b } :: Term (a,b) If { cnd :: Term Bool , tru :: Term a , fls :: Term a } :: Term a

However, for GADTs there is the following additional constraint: every constructor that has a field

`f`

must have the same result type (modulo alpha conversion) Hence, in the above example, we cannot merge the`num`

and`arg`

fields above into a single name. Although their field types are both`Term Int`

, their selector functions actually have different types:num :: Term Int -> Term Int arg :: Term Bool -> Term Int

When pattern-matching against data constructors drawn from a GADT, for example in a

`case`

expression, the following rules apply:The type of the scrutinee must be rigid.

The type of the entire

`case`

expression must be rigid.The type of any free variable mentioned in any of the

`case`

alternatives must be rigid.

A type is "rigid" if it is completely known to the compiler at its binding site. The easiest way to ensure that a variable a rigid type is to give it a type signature. For more precise details see Simple unification-based type inference for GADTs . The criteria implemented by GHC are given in the Appendix.