Unlike Haskell 98, constraints in types do not have to be of the form (class type-variable) or (class (type-variable type-variable ...)). Thus, these type signatures are perfectly OK
g :: Eq [a] => ... g :: Ord (T a ()) => ...
GHC imposes the following restrictions on the constraints in a type signature. Consider the type:
forall tv1..tvn (c1, ...,cn) => type
(Here, we write the "foralls" explicitly, although the Haskell source language omits them; in Haskell 98, all the free type variables of an explicit source-language type signature are universally quantified, except for the class type variables in a class declaration. However, in GHC, you can give the foralls if you want. See Section 8.7.4, “Arbitrary-rank polymorphism ”).
Each universally quantified type variable
tvi
must be reachable from type
.
A type variable a
is "reachable" if it it appears
in the same constraint as either a type variable free in in
type
, or another reachable type variable.
A value with a type that does not obey
this reachability restriction cannot be used without introducing
ambiguity; that is why the type is rejected.
Here, for example, is an illegal type:
forall a. Eq a => Int
When a value with this type was used, the constraint Eq tv
would be introduced where tv
is a fresh type variable, and
(in the dictionary-translation implementation) the value would be
applied to a dictionary for Eq tv
. The difficulty is that we
can never know which instance of Eq
to use because we never
get any more information about tv
.
Note
that the reachability condition is weaker than saying that a
is
functionally dependent on a type variable free in
type
(see Section 8.6.2, “Functional dependencies
”). The reason for this is there
might be a "hidden" dependency, in a superclass perhaps. So
"reachable" is a conservative approximation to "functionally dependent".
For example, consider:
class C a b | a -> b where ... class C a b => D a b where ... f :: forall a b. D a b => a -> a
This is fine, because in fact a
does functionally determine b
but that is not immediately apparent from f
's type.
Every constraint ci
must mention at least one of the
universally quantified type variables tvi
.
For example, this type is OK because C a b
mentions the
universally quantified type variable b
:
forall a. C a b => burble
The next type is illegal because the constraint Eq b
does not
mention a
:
forall a. Eq b => burble
The reason for this restriction is milder than the other one. The excluded types are never useful or necessary (because the offending context doesn't need to be witnessed at this point; it can be floated out). Furthermore, floating them out increases sharing. Lastly, excluding them is a conservative choice; it leaves a patch of territory free in case we need it later.
Implicit parameters are implemented as described in "Implicit parameters: dynamic scoping with static types", J Lewis, MB Shields, E Meijer, J Launchbury, 27th ACM Symposium on Principles of Programming Languages (POPL'00), Boston, Jan 2000.
(Most of the following, still rather incomplete, documentation is due to Jeff Lewis.)
Implicit parameter support is enabled with the option
-XImplicitParams
.
A variable is called dynamically bound when it is bound by the calling context of a function and statically bound when bound by the callee's context. In Haskell, all variables are statically bound. Dynamic binding of variables is a notion that goes back to Lisp, but was later discarded in more modern incarnations, such as Scheme. Dynamic binding can be very confusing in an untyped language, and unfortunately, typed languages, in particular Hindley-Milner typed languages like Haskell, only support static scoping of variables.
However, by a simple extension to the type class system of Haskell, we
can support dynamic binding. Basically, we express the use of a
dynamically bound variable as a constraint on the type. These
constraints lead to types of the form (?x::t') => t
, which says "this
function uses a dynamically-bound variable ?x
of type t'
". For
example, the following expresses the type of a sort function,
implicitly parameterized by a comparison function named cmp
.
sort :: (?cmp :: a -> a -> Bool) => [a] -> [a]
The dynamic binding constraints are just a new form of predicate in the type class system.
An implicit parameter occurs in an expression using the special form ?x
,
where x
is
any valid identifier (e.g. ord ?x
is a valid expression).
Use of this construct also introduces a new
dynamic-binding constraint in the type of the expression.
For example, the following definition
shows how we can define an implicitly parameterized sort function in
terms of an explicitly parameterized sortBy
function:
sortBy :: (a -> a -> Bool) -> [a] -> [a] sort :: (?cmp :: a -> a -> Bool) => [a] -> [a] sort = sortBy ?cmp
Dynamic binding constraints behave just like other type class
constraints in that they are automatically propagated. Thus, when a
function is used, its implicit parameters are inherited by the
function that called it. For example, our sort
function might be used
to pick out the least value in a list:
least :: (?cmp :: a -> a -> Bool) => [a] -> a least xs = head (sort xs)
Without lifting a finger, the ?cmp
parameter is
propagated to become a parameter of least
as well. With explicit
parameters, the default is that parameters must always be explicit
propagated. With implicit parameters, the default is to always
propagate them.
An implicit-parameter type constraint differs from other type class constraints in the
following way: All uses of a particular implicit parameter must have
the same type. This means that the type of (?x, ?x)
is (?x::a) => (a,a)
, and not
(?x::a, ?x::b) => (a, b)
, as would be the case for type
class constraints.
You can't have an implicit parameter in the context of a class or instance declaration. For example, both these declarations are illegal:
class (?x::Int) => C a where ... instance (?x::a) => Foo [a] where ...
Reason: exactly which implicit parameter you pick up depends on exactly where you invoke a function. But the ``invocation'' of instance declarations is done behind the scenes by the compiler, so it's hard to figure out exactly where it is done. Easiest thing is to outlaw the offending types.
Implicit-parameter constraints do not cause ambiguity. For example, consider:
f :: (?x :: [a]) => Int -> Int f n = n + length ?x g :: (Read a, Show a) => String -> String g s = show (read s)
Here, g
has an ambiguous type, and is rejected, but f
is fine. The binding for ?x
at f
's call site is
quite unambiguous, and fixes the type a
.
An implicit parameter is bound using the standard
let
or where
binding forms.
For example, we define the min
function by binding
cmp
.
min :: [a] -> a min = let ?cmp = (<=) in least
A group of implicit-parameter bindings may occur anywhere a normal group of Haskell
bindings can occur, except at top level. That is, they can occur in a let
(including in a list comprehension, or do-notation, or pattern guards),
or a where
clause.
Note the following points:
An implicit-parameter binding group must be a collection of simple bindings to implicit-style variables (no function-style bindings, and no type signatures); these bindings are neither polymorphic or recursive.
You may not mix implicit-parameter bindings with ordinary bindings in a
single let
expression; use two nested let
s instead.
(In the case of where
you are stuck, since you can't nest where
clauses.)
You may put multiple implicit-parameter bindings in a
single binding group; but they are not treated
as a mutually recursive group (as ordinary let
bindings are).
Instead they are treated as a non-recursive group, simultaneously binding all the implicit
parameter. The bindings are not nested, and may be re-ordered without changing
the meaning of the program.
For example, consider:
f t = let { ?x = t; ?y = ?x+(1::Int) } in ?x + ?y
The use of ?x
in the binding for ?y
does not "see"
the binding for ?x
, so the type of f
is
f :: (?x::Int) => Int -> Int
Consider these two definitions:
len1 :: [a] -> Int len1 xs = let ?acc = 0 in len_acc1 xs len_acc1 [] = ?acc len_acc1 (x:xs) = let ?acc = ?acc + (1::Int) in len_acc1 xs ------------ len2 :: [a] -> Int len2 xs = let ?acc = 0 in len_acc2 xs len_acc2 :: (?acc :: Int) => [a] -> Int len_acc2 [] = ?acc len_acc2 (x:xs) = let ?acc = ?acc + (1::Int) in len_acc2 xs
The only difference between the two groups is that in the second group
len_acc
is given a type signature.
In the former case, len_acc1
is monomorphic in its own
right-hand side, so the implicit parameter ?acc
is not
passed to the recursive call. In the latter case, because len_acc2
has a type signature, the recursive call is made to the
polymorphic version, which takes ?acc
as an implicit parameter. So we get the following results in GHCi:
Prog> len1 "hello" 0 Prog> len2 "hello" 5
Adding a type signature dramatically changes the result! This is a rather counter-intuitive phenomenon, worth watching out for.
GHC applies the dreaded Monomorphism Restriction (section 4.5.5 of the Haskell Report) to implicit parameters. For example, consider:
f :: Int -> Int f v = let ?x = 0 in let y = ?x + v in let ?x = 5 in y
Since the binding for y
falls under the Monomorphism
Restriction it is not generalised, so the type of y
is
simply Int
, not (?x::Int) => Int
.
Hence, (f 9)
returns result 9
.
If you add a type signature for y
, then y
will get type (?x::Int) => Int
, so the occurrence of
y
in the body of the let
will see the
inner binding of ?x
, so (f 9)
will return
14
.
Haskell infers the kind of each type variable. Sometimes it is nice to be able to give the kind explicitly as (machine-checked) documentation, just as it is nice to give a type signature for a function. On some occasions, it is essential to do so. For example, in his paper "Restricted Data Types in Haskell" (Haskell Workshop 1999) John Hughes had to define the data type:
data Set cxt a = Set [a] | Unused (cxt a -> ())
The only use for the Unused
constructor was to force the correct
kind for the type variable cxt
.
GHC now instead allows you to specify the kind of a type variable directly, wherever
a type variable is explicitly bound, with the flag -XKindSignatures
.
This flag enables kind signatures in the following places:
data
declarations:
data Set (cxt :: * -> *) a = Set [a]
type
declarations:
type T (f :: * -> *) = f Int
class
declarations:
class (Eq a) => C (f :: * -> *) a where ...
forall
's in type signatures:
f :: forall (cxt :: * -> *). Set cxt Int
The parentheses are required. Some of the spaces are required too, to
separate the lexemes. If you write (f::*->*)
you
will get a parse error, because "::*->*
" is a
single lexeme in Haskell.
As part of the same extension, you can put kind annotations in types as well. Thus:
f :: (Int :: *) -> Int g :: forall a. a -> (a :: *)
The syntax is
atype ::= '(' ctype '::' kind ')
The parentheses are required.
Haskell type signatures are implicitly quantified. The new keyword forall
allows us to say exactly what this means. For example:
g :: b -> b
means this:
g :: forall b. (b -> b)
The two are treated identically.
However, GHC's type system supports arbitrary-rank explicit universal quantification in types. For example, all the following types are legal:
f1 :: forall a b. a -> b -> a g1 :: forall a b. (Ord a, Eq b) => a -> b -> a f2 :: (forall a. a->a) -> Int -> Int g2 :: (forall a. Eq a => [a] -> a -> Bool) -> Int -> Int f3 :: ((forall a. a->a) -> Int) -> Bool -> Bool f4 :: Int -> (forall a. a -> a)
Here, f1
and g1
are rank-1 types, and
can be written in standard Haskell (e.g. f1 :: a->b->a
).
The forall
makes explicit the universal quantification that
is implicitly added by Haskell.
The functions f2
and g2
have rank-2 types;
the forall
is on the left of a function arrow. As g2
shows, the polymorphic type on the left of the function arrow can be overloaded.
The function f3
has a rank-3 type;
it has rank-2 types on the left of a function arrow.
GHC has three flags to control higher-rank types:
-XPolymorphicComponents
: data constructors (only) can have polymorphic argment types.
-XRank2Types
: any function (including data constructors) can have a rank-2 type.
-XRankNTypes
: any function (including data constructors) can have an arbitrary-rank type.
That is, you can nest forall
s
arbitrarily deep in function arrows.
In particular, a forall-type (also called a "type scheme"),
including an operational type class context, is legal:
On the left or right (see f4
, for example)
of a function arrow
As the argument of a constructor, or type of a field, in a data type declaration. For
example, any of the f1,f2,f3,g1,g2
above would be valid
field type signatures.
As the type of an implicit parameter
In a pattern type signature (see Section 8.7.6, “Lexically scoped type variables ”)
Of course forall
becomes a keyword; you can't use forall
as
a type variable any more!
In a data
or newtype
declaration one can quantify
the types of the constructor arguments. Here are several examples:
data T a = T1 (forall b. b -> b -> b) a data MonadT m = MkMonad { return :: forall a. a -> m a, bind :: forall a b. m a -> (a -> m b) -> m b } newtype Swizzle = MkSwizzle (Ord a => [a] -> [a])
The constructors have rank-2 types:
T1 :: forall a. (forall b. b -> b -> b) -> a -> T a MkMonad :: forall m. (forall a. a -> m a) -> (forall a b. m a -> (a -> m b) -> m b) -> MonadT m MkSwizzle :: (Ord a => [a] -> [a]) -> Swizzle
Notice that you don't need to use a forall
if there's an
explicit context. For example in the first argument of the
constructor MkSwizzle
, an implicit "forall a.
" is
prefixed to the argument type. The implicit forall
quantifies all type variables that are not already in scope, and are
mentioned in the type quantified over.
As for type signatures, implicit quantification happens for non-overloaded types too. So if you write this:
data T a = MkT (Either a b) (b -> b)
it's just as if you had written this:
data T a = MkT (forall b. Either a b) (forall b. b -> b)
That is, since the type variable b
isn't in scope, it's
implicitly universally quantified. (Arguably, it would be better
to require explicit quantification on constructor arguments
where that is what is wanted. Feedback welcomed.)
You construct values of types T1, MonadT, Swizzle
by applying
the constructor to suitable values, just as usual. For example,
a1 :: T Int a1 = T1 (\xy->x) 3 a2, a3 :: Swizzle a2 = MkSwizzle sort a3 = MkSwizzle reverse a4 :: MonadT Maybe a4 = let r x = Just x b m k = case m of Just y -> k y Nothing -> Nothing in MkMonad r b mkTs :: (forall b. b -> b -> b) -> a -> [T a] mkTs f x y = [T1 f x, T1 f y]
The type of the argument can, as usual, be more general than the type
required, as (MkSwizzle reverse)
shows. (reverse
does not need the Ord
constraint.)
When you use pattern matching, the bound variables may now have polymorphic types. For example:
f :: T a -> a -> (a, Char) f (T1 w k) x = (w k x, w 'c' 'd') g :: (Ord a, Ord b) => Swizzle -> [a] -> (a -> b) -> [b] g (MkSwizzle s) xs f = s (map f (s xs)) h :: MonadT m -> [m a] -> m [a] h m [] = return m [] h m (x:xs) = bind m x $ \y -> bind m (h m xs) $ \ys -> return m (y:ys)
In the function h
we use the record selectors return
and bind
to extract the polymorphic bind and return functions
from the MonadT
data structure, rather than using pattern
matching.
In general, type inference for arbitrary-rank types is undecidable. GHC uses an algorithm proposed by Odersky and Laufer ("Putting type annotations to work", POPL'96) to get a decidable algorithm by requiring some help from the programmer. We do not yet have a formal specification of "some help" but the rule is this:
For a lambda-bound or case-bound variable, x, either the programmer provides an explicit polymorphic type for x, or GHC's type inference will assume that x's type has no foralls in it.
What does it mean to "provide" an explicit type for x? You can do that by giving a type signature for x directly, using a pattern type signature (Section 8.7.6, “Lexically scoped type variables ”), thus:
\ f :: (forall a. a->a) -> (f True, f 'c')
Alternatively, you can give a type signature to the enclosing context, which GHC can "push down" to find the type for the variable:
(\ f -> (f True, f 'c')) :: (forall a. a->a) -> (Bool,Char)
Here the type signature on the expression can be pushed inwards to give a type signature for f. Similarly, and more commonly, one can give a type signature for the function itself:
h :: (forall a. a->a) -> (Bool,Char) h f = (f True, f 'c')
You don't need to give a type signature if the lambda bound variable is a constructor argument. Here is an example we saw earlier:
f :: T a -> a -> (a, Char) f (T1 w k) x = (w k x, w 'c' 'd')
Here we do not need to give a type signature to w
, because
it is an argument of constructor T1
and that tells GHC all
it needs to know.
GHC performs implicit quantification as follows. At the top level (only) of
user-written types, if and only if there is no explicit forall
,
GHC finds all the type variables mentioned in the type that are not already
in scope, and universally quantifies them. For example, the following pairs are
equivalent:
f :: a -> a f :: forall a. a -> a g (x::a) = let h :: a -> b -> b h x y = y in ... g (x::a) = let h :: forall b. a -> b -> b h x y = y in ...
Notice that GHC does not find the innermost possible quantification point. For example:
f :: (a -> a) -> Int -- MEANS f :: forall a. (a -> a) -> Int -- NOT f :: (forall a. a -> a) -> Int g :: (Ord a => a -> a) -> Int -- MEANS the illegal type g :: forall a. (Ord a => a -> a) -> Int -- NOT g :: (forall a. Ord a => a -> a) -> Int
The latter produces an illegal type, which you might think is silly, but at least the rule is simple. If you want the latter type, you can write your for-alls explicitly. Indeed, doing so is strongly advised for rank-2 types.
GHC supports impredicative polymorphism. This means that you can call a polymorphic function at a polymorphic type, and parameterise data structures over polymorphic types. For example:
f :: Maybe (forall a. [a] -> [a]) -> Maybe ([Int], [Char]) f (Just g) = Just (g [3], g "hello") f Nothing = Nothing
Notice here that the Maybe
type is parameterised by the
polymorphic type (forall a. [a] ->
[a])
.
The technical details of this extension are described in the paper Boxy types: type inference for higher-rank types and impredicativity, which appeared at ICFP 2006.
GHC supports lexically scoped type variables, without which some type signatures are simply impossible to write. For example:
f :: forall a. [a] -> [a] f xs = ys ++ ys where ys :: [a] ys = reverse xs
The type signature for f
brings the type variable a
into scope; it scopes over
the entire definition of f
.
In particular, it is in scope at the type signature for ys
.
In Haskell 98 it is not possible to declare
a type for ys
; a major benefit of scoped type variables is that
it becomes possible to do so.
Lexically-scoped type variables are enabled by
-fglasgow-exts
.
Note: GHC 6.6 contains substantial changes to the way that scoped type variables work, compared to earlier releases. Read this section carefully!
The design follows the following principles
A scoped type variable stands for a type variable, and not for a type. (This is a change from GHC's earlier design.)
Furthermore, distinct lexical type variables stand for distinct type variables. This means that every programmer-written type signature (including one that contains free scoped type variables) denotes a rigid type; that is, the type is fully known to the type checker, and no inference is involved.
Lexical type variables may be alpha-renamed freely, without changing the program.
A lexically scoped type variable can be bound by:
A declaration type signature (Section 8.7.6.2, “Declaration type signatures”)
An expression type signature (Section 8.7.6.3, “Expression type signatures”)
A pattern type signature (Section 8.7.6.4, “Pattern type signatures”)
Class and instance declarations (Section 8.7.6.5, “Class and instance declarations”)
In Haskell, a programmer-written type signature is implicitly quantified over
its free type variables (Section
4.1.2
of the Haskel Report).
Lexically scoped type variables affect this implicit quantification rules
as follows: any type variable that is in scope is not universally
quantified. For example, if type variable a
is in scope,
then
(e :: a -> a) means (e :: a -> a) (e :: b -> b) means (e :: forall b. b->b) (e :: a -> b) means (e :: forall b. a->b)
A declaration type signature that has explicit
quantification (using forall
) brings into scope the
explicitly-quantified
type variables, in the definition of the named function(s). For example:
f :: forall a. [a] -> [a] f (x:xs) = xs ++ [ x :: a ]
The "forall a
" brings "a
" into scope in
the definition of "f
".
This only happens if the quantification in f
's type
signature is explicit. For example:
g :: [a] -> [a] g (x:xs) = xs ++ [ x :: a ]
This program will be rejected, because "a
" does not scope
over the definition of "f
", so "x::a
"
means "x::forall a. a
" by Haskell's usual implicit
quantification rules.
An expression type signature that has explicit
quantification (using forall
) brings into scope the
explicitly-quantified
type variables, in the annotated expression. For example:
f = runST ( (op >>= \(x :: STRef s Int) -> g x) :: forall s. ST s Bool )
Here, the type signature forall a. ST s Bool
brings the
type variable s
into scope, in the annotated expression
(op >>= \(x :: STRef s Int) -> g x)
.
A type signature may occur in any pattern; this is a pattern type signature. For example:
-- f and g assume that 'a' is already in scope f = \(x::Int, y::a) -> x g (x::a) = x h ((x,y) :: (Int,Bool)) = (y,x)
In the case where all the type variables in the pattern type signature are already in scope (i.e. bound by the enclosing context), matters are simple: the signature simply constrains the type of the pattern in the obvious way.
Unlike expression and declaration type signatures, pattern type signatures are not implictly generalised. The pattern in a patterm binding may only mention type variables that are already in scope. For example:
f :: forall a. [a] -> (Int, [a]) f xs = (n, zs) where (ys::[a], n) = (reverse xs, length xs) -- OK zs::[a] = xs ++ ys -- OK Just (v::b) = ... -- Not OK; b is not in scope
Here, the pattern signatures for ys
and zs
are fine, but the one for v
is not because b
is
not in scope.
However, in all patterns other than pattern bindings, a pattern type signature may mention a type variable that is not in scope; in this case, the signature brings that type variable into scope. This is particularly important for existential data constructors. For example:
data T = forall a. MkT [a] k :: T -> T k (MkT [t::a]) = MkT t3 where t3::[a] = [t,t,t]
Here, the pattern type signature (t::a)
mentions a lexical type
variable that is not already in scope. Indeed, it cannot already be in scope,
because it is bound by the pattern match. GHC's rule is that in this situation
(and only then), a pattern type signature can mention a type variable that is
not already in scope; the effect is to bring it into scope, standing for the
existentially-bound type variable.
When a pattern type signature binds a type variable in this way, GHC insists that the type variable is bound to a rigid, or fully-known, type variable. This means that any user-written type signature always stands for a completely known type.
If all this seems a little odd, we think so too. But we must have some way to bring such type variables into scope, else we could not name existentially-bound type variables in subsequent type signatures.
This is (now) the only situation in which a pattern type
signature is allowed to mention a lexical variable that is not already in
scope.
For example, both f
and g
would be
illegal if a
was not already in scope.
The Haskell Report specifies that a group of bindings (at top level, or in a
let
or where
) should be sorted into
strongly-connected components, and then type-checked in dependency order
(Haskell
Report, Section 4.5.1).
As each group is type-checked, any binders of the group that
have
an explicit type signature are put in the type environment with the specified
polymorphic type,
and all others are monomorphic until the group is generalised
(Haskell Report, Section 4.5.2).
Following a suggestion of Mark Jones, in his paper
Typing Haskell in
Haskell,
GHC implements a more general scheme. If -XRelaxedPolyRec
is
specified:
the dependency analysis ignores references to variables that have an explicit
type signature.
As a result of this refined dependency analysis, the dependency groups are smaller, and more bindings will
typecheck. For example, consider:
f :: Eq a => a -> Bool f x = (x == x) || g True || g "Yes" g y = (y <= y) || f True
This is rejected by Haskell 98, but under Jones's scheme the definition for
g
is typechecked first, separately from that for
f
,
because the reference to f
in g
's right
hand side is ignored by the dependency analysis. Then g
's
type is generalised, to get
g :: Ord a => a -> Bool
Now, the definition for f
is typechecked, with this type for
g
in the type environment.
The same refined dependency analysis also allows the type signatures of
mutually-recursive functions to have different contexts, something that is illegal in
Haskell 98 (Section 4.5.2, last sentence). With
-XRelaxedPolyRec
GHC only insists that the type signatures of a refined group have identical
type signatures; in practice this means that only variables bound by the same
pattern binding must have the same context. For example, this is fine:
f :: Eq a => a -> Bool f x = (x == x) || g True g :: Ord a => a -> Bool g y = (y <= y) || f True
GHC supports the definition of type families indexed by types. They may be seen as an extension of Haskell 98's class-based overloading of values to types. When type families are declared in classes, they are also known as associated types.
There are two forms of type families: data families and type synonym families. Currently, only the former are fully implemented, while we are still working on the latter. As a result, the specification of the language extension is also still to some degree in flux. Hence, a more detailed description of the language extension and its use is currently available from the Haskell wiki page on type families. The material will be moved to this user's guide when it has stabilised.
Type families are enabled by the flag -XTypeFamilies
.