This section, and the next one, documents GHC's type-class extensions. There's lots of background in the paper Type classes: exploring the design space (Simon Peyton Jones, Mark Jones, Erik Meijer).

All the extensions are enabled by the `-fglasgow-exts`

flag.

Multi-parameter type classes are permitted. For example:

class Collection c a where union :: c a -> c a -> c a ...etc.

There are no restrictions on the context in a class declaration (which introduces superclasses), except that the class hierarchy must be acyclic. So these class declarations are OK:

class Functor (m k) => FiniteMap m k where ... class (Monad m, Monad (t m)) => Transform t m where lift :: m a -> (t m) a

As in Haskell 98, The class hierarchy must be acyclic. However, the definition of "acyclic" involves only the superclass relationships. For example, this is OK:

class C a where { op :: D b => a -> b -> b } class C a => D a where { ... }

Here, `C`

is a superclass of `D`

, but it's OK for a
class operation `op`

of `C`

to mention `D`

. (It
would not be OK for `D`

to be a superclass of `C`

.)

Haskell 98 prohibits class method types to mention constraints on the class type variable, thus:

class Seq s a where fromList :: [a] -> s a elem :: Eq a => a -> s a -> Bool

The type of `elem`

is illegal in Haskell 98, because it
contains the constraint `Eq a`

, constrains only the
class type variable (in this case `a`

).
GHC lifts this restriction (flag `-XConstrainedClassMethods`

).

Functional dependencies are implemented as described by Mark Jones in “Type Classes with Functional Dependencies”, Mark P. Jones, In Proceedings of the 9th European Symposium on Programming, ESOP 2000, Berlin, Germany, March 2000, Springer-Verlag LNCS 1782, .

Functional dependencies are introduced by a vertical bar in the syntax of a class declaration; e.g.

class (Monad m) => MonadState s m | m -> s where ... class Foo a b c | a b -> c where ...

There should be more documentation, but there isn't (yet). Yell if you need it.

In a class declaration, all of the class type variables must be reachable (in the sense mentioned in Section 7.8.1, “Type signatures”) from the free variables of each method type. For example:

class Coll s a where empty :: s insert :: s -> a -> s

is not OK, because the type of `empty`

doesn't mention
`a`

. Functional dependencies can make the type variable
reachable:

class Coll s a | s -> a where empty :: s insert :: s -> a -> s

Alternatively `Coll`

might be rewritten

class Coll s a where empty :: s a insert :: s a -> a -> s a

which makes the connection between the type of a collection of
`a`

's (namely `(s a)`

) and the element type `a`

.
Occasionally this really doesn't work, in which case you can split the
class like this:

class CollE s where empty :: s class CollE s => Coll s a where insert :: s -> a -> s

The following description of the motivation and use of functional dependencies is taken from the Hugs user manual, reproduced here (with minor changes) by kind permission of Mark Jones.

Consider the following class, intended as part of a library for collection types:

class Collects e ce where empty :: ce insert :: e -> ce -> ce member :: e -> ce -> Bool

The type variable e used here represents the element type, while ce is the type of the container itself. Within this framework, we might want to define instances of this class for lists or characteristic functions (both of which can be used to represent collections of any equality type), bit sets (which can be used to represent collections of characters), or hash tables (which can be used to represent any collection whose elements have a hash function). Omitting standard implementation details, this would lead to the following declarations:

instance Eq e => Collects e [e] where ... instance Eq e => Collects e (e -> Bool) where ... instance Collects Char BitSet where ... instance (Hashable e, Collects a ce) => Collects e (Array Int ce) where ...

All this looks quite promising; we have a class and a range of interesting implementations. Unfortunately, there are some serious problems with the class declaration. First, the empty function has an ambiguous type:

empty :: Collects e ce => ce

By "ambiguous" we mean that there is a type variable e that appears on the left
of the `=>`

symbol, but not on the right. The problem with
this is that, according to the theoretical foundations of Haskell overloading,
we cannot guarantee a well-defined semantics for any term with an ambiguous
type.

We can sidestep this specific problem by removing the empty member from the class declaration. However, although the remaining members, insert and member, do not have ambiguous types, we still run into problems when we try to use them. For example, consider the following two functions:

f x y = insert x . insert y g = f True 'a'

for which GHC infers the following types:

f :: (Collects a c, Collects b c) => a -> b -> c -> c g :: (Collects Bool c, Collects Char c) => c -> c

Notice that the type for f allows the two parameters x and y to be assigned different types, even though it attempts to insert each of the two values, one after the other, into the same collection. If we're trying to model collections that contain only one type of value, then this is clearly an inaccurate type. Worse still, the definition for g is accepted, without causing a type error. As a result, the error in this code will not be flagged at the point where it appears. Instead, it will show up only when we try to use g, which might even be in a different module.

Faced with the problems described above, some Haskell programmers might be tempted to use something like the following version of the class declaration:

class Collects e c where empty :: c e insert :: e -> c e -> c e member :: e -> c e -> Bool

The key difference here is that we abstract over the type constructor c that is
used to form the collection type c e, and not over that collection type itself,
represented by ce in the original class declaration. This avoids the immediate
problems that we mentioned above: empty has type ```
Collects e c => c
e
```

, which is not ambiguous.

The function f from the previous section has a more accurate type:

f :: (Collects e c) => e -> e -> c e -> c e

The function g from the previous section is now rejected with a type error as
we would hope because the type of f does not allow the two arguments to have
different types.
This, then, is an example of a multiple parameter class that does actually work
quite well in practice, without ambiguity problems.
There is, however, a catch. This version of the Collects class is nowhere near
as general as the original class seemed to be: only one of the four instances
for `Collects`

given above can be used with this version of Collects because only one of
them---the instance for lists---has a collection type that can be written in
the form c e, for some type constructor c, and element type e.

To get a more useful version of the Collects class, Hugs provides a mechanism that allows programmers to specify dependencies between the parameters of a multiple parameter class (For readers with an interest in theoretical foundations and previous work: The use of dependency information can be seen both as a generalization of the proposal for `parametric type classes' that was put forward by Chen, Hudak, and Odersky, or as a special case of Mark Jones's later framework for "improvement" of qualified types. The underlying ideas are also discussed in a more theoretical and abstract setting in a manuscript [implparam], where they are identified as one point in a general design space for systems of implicit parameterization.). To start with an abstract example, consider a declaration such as:

class C a b where ...

which tells us simply that C can be thought of as a binary relation on types (or type constructors, depending on the kinds of a and b). Extra clauses can be included in the definition of classes to add information about dependencies between parameters, as in the following examples:

class D a b | a -> b where ... class E a b | a -> b, b -> a where ...

The notation `a -> b`

used here between the | and where
symbols --- not to be
confused with a function type --- indicates that the a parameter uniquely
determines the b parameter, and might be read as "a determines b." Thus D is
not just a relation, but actually a (partial) function. Similarly, from the two
dependencies that are included in the definition of E, we can see that E
represents a (partial) one-one mapping between types.

More generally, dependencies take the form `x1 ... xn -> y1 ... ym`

,
where x1, ..., xn, and y1, ..., yn are type variables with n>0 and
m>=0, meaning that the y parameters are uniquely determined by the x
parameters. Spaces can be used as separators if more than one variable appears
on any single side of a dependency, as in `t -> a b`

. Note that a class may be
annotated with multiple dependencies using commas as separators, as in the
definition of E above. Some dependencies that we can write in this notation are
redundant, and will be rejected because they don't serve any useful
purpose, and may instead indicate an error in the program. Examples of
dependencies like this include `a -> a `

,
`a -> a a `

,
`a -> `

, etc. There can also be
some redundancy if multiple dependencies are given, as in
`a->b`

,
`b->c `

, `a->c `

, and
in which some subset implies the remaining dependencies. Examples like this are
not treated as errors. Note that dependencies appear only in class
declarations, and not in any other part of the language. In particular, the
syntax for instance declarations, class constraints, and types is completely
unchanged.

By including dependencies in a class declaration, we provide a mechanism for the programmer to specify each multiple parameter class more precisely. The compiler, on the other hand, is responsible for ensuring that the set of instances that are in scope at any given point in the program is consistent with any declared dependencies. For example, the following pair of instance declarations cannot appear together in the same scope because they violate the dependency for D, even though either one on its own would be acceptable:

instance D Bool Int where ... instance D Bool Char where ...

Note also that the following declaration is not allowed, even by itself:

instance D [a] b where ...

The problem here is that this instance would allow one particular choice of [a] to be associated with more than one choice for b, which contradicts the dependency specified in the definition of D. More generally, this means that, in any instance of the form:

instance D t s where ...

for some particular types t and s, the only variables that can appear in s are the ones that appear in t, and hence, if the type t is known, then s will be uniquely determined.

The benefit of including dependency information is that it allows us to define
more general multiple parameter classes, without ambiguity problems, and with
the benefit of more accurate types. To illustrate this, we return to the
collection class example, and annotate the original definition of `Collects`

with a simple dependency:

class Collects e ce | ce -> e where empty :: ce insert :: e -> ce -> ce member :: e -> ce -> Bool

The dependency `ce -> e`

here specifies that the type e of elements is uniquely
determined by the type of the collection ce. Note that both parameters of
Collects are of kind *; there are no constructor classes here. Note too that
all of the instances of Collects that we gave earlier can be used
together with this new definition.

What about the ambiguity problems that we encountered with the original definition? The empty function still has type Collects e ce => ce, but it is no longer necessary to regard that as an ambiguous type: Although the variable e does not appear on the right of the => symbol, the dependency for class Collects tells us that it is uniquely determined by ce, which does appear on the right of the => symbol. Hence the context in which empty is used can still give enough information to determine types for both ce and e, without ambiguity. More generally, we need only regard a type as ambiguous if it contains a variable on the left of the => that is not uniquely determined (either directly or indirectly) by the variables on the right.

Dependencies also help to produce more accurate types for user defined functions, and hence to provide earlier detection of errors, and less cluttered types for programmers to work with. Recall the previous definition for a function f:

f x y = insert x y = insert x . insert y

for which we originally obtained a type:

f :: (Collects a c, Collects b c) => a -> b -> c -> c

Given the dependency information that we have for Collects, however, we can deduce that a and b must be equal because they both appear as the second parameter in a Collects constraint with the same first parameter c. Hence we can infer a shorter and more accurate type for f:

f :: (Collects a c) => a -> a -> c -> c

In a similar way, the earlier definition of g will now be flagged as a type error.

Although we have given only a few examples here, it should be clear that the addition of dependency information can help to make multiple parameter classes more useful in practice, avoiding ambiguity problems, and allowing more general sets of instance declarations.

An instance declaration has the form

instance (`assertion`

_{1}, ...,`assertion`

_{n}) =>`class`

`type`

_{1}...`type`

_{m}where ...

The part before the "`=>`

" is the
*context*, while the part after the
"`=>`

" is the *head* of the instance declaration.

In Haskell 98 the head of an instance declaration
must be of the form `C (T a1 ... an)`

, where
`C`

is the class, `T`

is a type constructor,
and the `a1 ... an`

are distinct type variables.
Furthermore, the assertions in the context of the instance declaration
must be of the form `C a`

where `a`

is a type variable that occurs in the head.

The `-XFlexibleInstances`

flag loosens these restrictions
considerably. Firstly, multi-parameter type classes are permitted. Secondly,
the context and head of the instance declaration can each consist of arbitrary
(well-kinded) assertions `(C t1 ... tn)`

subject only to the
following rules:

The Paterson Conditions: for each assertion in the context

No type variable has more occurrences in the assertion than in the head

The assertion has fewer constructors and variables (taken together and counting repetitions) than the head

The Coverage Condition. For each functional dependency,

`tvs`

_{left}`->`

`tvs`

_{right}, of the class, every type variable in S(`tvs`

_{right}) must appear in S(`tvs`

_{left}), where S is the substitution mapping each type variable in the class declaration to the corresponding type in the instance declaration.

These restrictions ensure that context reduction terminates: each reduction
step makes the problem smaller by at least one
constructor. Both the Paterson Conditions and the Coverage Condition are lifted
if you give the `-XUndecidableInstances`

flag (Section 7.6.3.2, “Undecidable instances”).
You can find lots of background material about the reason for these
restrictions in the paper
Understanding functional dependencies via Constraint Handling Rules.

For example, these are OK:

instance C Int [a] -- Multiple parameters instance Eq (S [a]) -- Structured type in head -- Repeated type variable in head instance C4 a a => C4 [a] [a] instance Stateful (ST s) (MutVar s) -- Head can consist of type variables only instance C a instance (Eq a, Show b) => C2 a b -- Non-type variables in context instance Show (s a) => Show (Sized s a) instance C2 Int a => C3 Bool [a] instance C2 Int a => C3 [a] b

But these are not:

-- Context assertion no smaller than head instance C a => C a where ... -- (C b b) has more more occurrences of b than the head instance C b b => Foo [b] where ...

The same restrictions apply to instances generated by
`deriving`

clauses. Thus the following is accepted:

data MinHeap h a = H a (h a) deriving (Show)

because the derived instance

instance (Show a, Show (h a)) => Show (MinHeap h a)

conforms to the above rules.

A useful idiom permitted by the above rules is as follows. If one allows overlapping instance declarations then it's quite convenient to have a "default instance" declaration that applies if something more specific does not:

instance C a where op = ... -- Default

Sometimes even the rules of Section 7.6.3.1, “Relaxed rules for instance declarations” are too onerous. For example, sometimes you might want to use the following to get the effect of a "class synonym":

class (C1 a, C2 a, C3 a) => C a where { } instance (C1 a, C2 a, C3 a) => C a where { }

This allows you to write shorter signatures:

f :: C a => ...

instead of

f :: (C1 a, C2 a, C3 a) => ...

The restrictions on functional dependencies (Section 7.6.2, “Functional dependencies ”) are particularly troublesome. It is tempting to introduce type variables in the context that do not appear in the head, something that is excluded by the normal rules. For example:

class HasConverter a b | a -> b where convert :: a -> b data Foo a = MkFoo a instance (HasConverter a b,Show b) => Show (Foo a) where show (MkFoo value) = show (convert value)

This is dangerous territory, however. Here, for example, is a program that would make the typechecker loop:

class D a class F a b | a->b instance F [a] [[a]] instance (D c, F a c) => D [a] -- 'c' is not mentioned in the head

Similarly, it can be tempting to lift the coverage condition:

class Mul a b c | a b -> c where (.*.) :: a -> b -> c instance Mul Int Int Int where (.*.) = (*) instance Mul Int Float Float where x .*. y = fromIntegral x * y instance Mul a b c => Mul a [b] [c] where x .*. v = map (x.*.) v

The third instance declaration does not obey the coverage condition; and indeed the (somewhat strange) definition:

f = \ b x y -> if b then x .*. [y] else y

makes instance inference go into a loop, because it requires the constraint
`(Mul a [b] b)`

.

Nevertheless, GHC allows you to experiment with more liberal rules. If you use
the experimental flag `-XUndecidableInstances`

,
both the Paterson Conditions and the Coverage Condition
(described in Section 7.6.3.1, “Relaxed rules for instance declarations”) are lifted. Termination is ensured by having a
fixed-depth recursion stack. If you exceed the stack depth you get a
sort of backtrace, and the opportunity to increase the stack depth
with `-fcontext-stack=`

*N*.

In general, *GHC requires that that it be unambiguous which instance
declaration
should be used to resolve a type-class constraint*. This behaviour
can be modified by two flags: `-XOverlappingInstances`

and `-XIncoherentInstances`

, as this section discusses. Both these
flags are dynamic flags, and can be set on a per-module basis, using
an `OPTIONS_GHC`

pragma if desired (Section 4.1.2, “Command line options in source files”).

When GHC tries to resolve, say, the constraint `C Int Bool`

,
it tries to match every instance declaration against the
constraint,
by instantiating the head of the instance declaration. For example, consider
these declarations:

instance context1 => C Int a where ... -- (A) instance context2 => C a Bool where ... -- (B) instance context3 => C Int [a] where ... -- (C) instance context4 => C Int [Int] where ... -- (D)

The instances (A) and (B) match the constraint `C Int Bool`

,
but (C) and (D) do not. When matching, GHC takes
no account of the context of the instance declaration
(`context1`

etc).
GHC's default behaviour is that *exactly one instance must match the
constraint it is trying to resolve*.
It is fine for there to be a *potential* of overlap (by
including both declarations (A) and (B), say); an error is only reported if a
particular constraint matches more than one.

The `-XOverlappingInstances`

flag instructs GHC to allow
more than one instance to match, provided there is a most specific one. For
example, the constraint `C Int [Int]`

matches instances (A),
(C) and (D), but the last is more specific, and hence is chosen. If there is no
most-specific match, the program is rejected.

However, GHC is conservative about committing to an overlapping instance. For example:

f :: [b] -> [b] f x = ...

Suppose that from the RHS of `f`

we get the constraint
`C Int [b]`

. But
GHC does not commit to instance (C), because in a particular
call of `f`

, `b`

might be instantiate
to `Int`

, in which case instance (D) would be more specific still.
So GHC rejects the program.
(If you add the flag `-XIncoherentInstances`

,
GHC will instead pick (C), without complaining about
the problem of subsequent instantiations.)

Notice that we gave a type signature to `f`

, so GHC had to
*check* that `f`

has the specified type.
Suppose instead we do not give a type signature, asking GHC to *infer*
it instead. In this case, GHC will refrain from
simplifying the constraint `C Int [b]`

(for the same reason
as before) but, rather than rejecting the program, it will infer the type

f :: C Int [b] => [b] -> [b]

That postpones the question of which instance to pick to the
call site for `f`

by which time more is known about the type `b`

.
You can write this type signature yourself if you use the
`-XFlexibleContexts`

flag.

Exactly the same situation can arise in instance declarations themselves. Suppose we have

class Foo a where f :: a -> a instance Foo [b] where f x = ...

and, as before, the constraint `C Int [b]`

arises from `f`

's
right hand side. GHC will reject the instance, complaining as before that it does not know how to resolve
the constraint `C Int [b]`

, because it matches more than one instance
declaration. The solution is to postpone the choice by adding the constraint to the context
of the instance declaration, thus:

instance C Int [b] => Foo [b] where f x = ...

(You need `-XFlexibleInstances`

to do this.)

The willingness to be overlapped or incoherent is a property of
the *instance declaration* itself, controlled by the
presence or otherwise of the `-XOverlappingInstances`

and `-XIncoherentInstances`

flags when that module is
being defined. Neither flag is required in a module that imports and uses the
instance declaration. Specifically, during the lookup process:

An instance declaration is ignored during the lookup process if (a) a more specific match is found, and (b) the instance declaration was compiled with

`-XOverlappingInstances`

. The flag setting for the more-specific instance does not matter.Suppose an instance declaration does not match the constraint being looked up, but does unify with it, so that it might match when the constraint is further instantiated. Usually GHC will regard this as a reason for not committing to some other constraint. But if the instance declaration was compiled with

`-XIncoherentInstances`

, GHC will skip the "does-it-unify?" check for that declaration.

These rules make it possible for a library author to design a library that relies on overlapping instances without the library client having to know.

If an instance declaration is compiled without
`-XOverlappingInstances`

,
then that instance can never be overlapped. This could perhaps be
inconvenient. Perhaps the rule should instead say that the
*overlapping* instance declaration should be compiled in
this way, rather than the *overlapped* one. Perhaps overlap
at a usage site should be permitted regardless of how the instance declarations
are compiled, if the `-XOverlappingInstances`

flag is
used at the usage site. (Mind you, the exact usage site can occasionally be
hard to pin down.) We are interested to receive feedback on these points.

The `-XIncoherentInstances`

flag implies the
`-XOverlappingInstances`

flag, but not vice versa.

*Unlike Haskell 98, instance heads may use type
synonyms*. (The instance "head" is the bit after the "=>" in an instance decl.)
As always, using a type synonym is just shorthand for
writing the RHS of the type synonym definition. For example:

type Point = (Int,Int) instance C Point where ... instance C [Point] where ...

is legal. However, if you added

instance C (Int,Int) where ...

as well, then the compiler will complain about the overlapping (actually, identical) instance declarations. As always, type synonyms must be fully applied. You cannot, for example, write:

type P a = [[a]] instance Monad P where ...

This design decision is independent of all the others, and easily reversed, but it makes sense to me.

GHC supports *overloaded string literals*. Normally a
string literal has type `String`

, but with overloaded string
literals enabled (with `-XOverloadedStrings`

)
a string literal has type `(IsString a) => a`

.

This means that the usual string syntax can be used, e.g., for packed strings and other variations of string like types. String literals behave very much like integer literals, i.e., they can be used in both expressions and patterns. If used in a pattern the literal with be replaced by an equality test, in the same way as an integer literal is.

The class `IsString`

is defined as:

class IsString a where fromString :: String -> a

The only predefined instance is the obvious one to make strings work as usual:

instance IsString [Char] where fromString cs = cs

The class `IsString`

is not in scope by default. If you want to mention
it explicitly (for example, to give an instance declaration for it), you can import it
from module `GHC.Exts`

.

Haskell's defaulting mechanism is extended to cover string literals, when `-XOverloadedStrings`

is specified.
Specifically:

Each type in a default declaration must be an instance of

`Num`

*or*of`IsString`

.The standard defaulting rule (Haskell Report, Section 4.3.4) is extended thus: defaulting applies when all the unresolved constraints involve standard classes

*or*`IsString`

; and at least one is a numeric class*or*`IsString`

.

A small example:

module Main where import GHC.Exts( IsString(..) ) newtype MyString = MyString String deriving (Eq, Show) instance IsString MyString where fromString = MyString greet :: MyString -> MyString greet "hello" = "world" greet other = other main = do print $ greet "hello" print $ greet "fool"

Note that deriving `Eq`

is necessary for the pattern matching
to work since it gets translated into an equality comparison.