With the -fglasgow-exts
flag, GHC lets you declare
a data type with no constructors. For example:
data S      -- S :: *
data T a    -- T :: * -> *
Syntactically, the declaration lacks the "= constrs" part. The
type can be parameterised over types of any kind, but if the kind is
not *
then an explicit kind annotation must be used
(see Section 7.4.8, “Explicitly-kinded quantification”).
Such data types have only one value, namely bottom. Nevertheless, they can be useful when defining "phantom types".
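For instance (a small illustrative sketch; the names below are hypothetical and not from this section), an empty data type can act as a compile-time tag that never exists at run time:

data Validated      -- no constructors
data Unvalidated    -- no constructors

newtype FormData a = FormData String

-- The phantom parameter records, in the type only, whether the data
-- has been checked; the run-time representation is just a String.
validate :: FormData Unvalidated -> Maybe (FormData Validated)
validate (FormData s) = if null s then Nothing else Just (FormData s)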
GHC allows type constructors, classes, and type variables to be operators, and to be written infix, very much like expressions. More specifically:
A type constructor or class can be an operator, beginning with a colon; e.g. :*:. The lexical syntax is the same as that for data constructors.
Data type and type-synonym declarations can be written infix, parenthesised if you want further arguments. E.g.
data a :*: b = Foo a b
type a :+: b = Either a b
class a :=: b where ...

data (a :**: b) x = Baz a b x
type (a :++: b) y = Either (a,b) y
Types, and class constraints, can be written infix. For example
x :: Int :*: Bool
f :: (a :=: b) => a -> b
A type variable can be an (unqualified) operator, e.g. +.
The lexical syntax is the same as that for variable operators, excluding "(.)",
"(!)", and "(*)". In a binding position, the operator must be
parenthesised. For example:
type T (+) = Int + Int
f :: T Either
f = Left 3

liftA2 :: Arrow (~>)
       => (a -> b -> c) -> (e ~> a) -> (e ~> b) -> (e ~> c)
liftA2 = ...
Back-quotes work as for expressions, both for type constructors and type variables; e.g. Int `Either` Bool, or Int `a` Bool. Similarly, parentheses work the same; e.g. (:*:) Int Bool.
Fixities may be declared for type constructors, or classes, just as for data constructors. However, one cannot distinguish between the two in a fixity declaration; a fixity declaration sets the fixity for a data constructor and the corresponding type constructor. For example:
infixl 7 T, :*:
sets the fixity for both type constructor T and data constructor T, and similarly for :*:.
Function arrow is infixr
with fixity 0. (This might change; I'm not sure what it should be.)
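For example, combining the infixl 7 declaration for :*: above with the arrow's fixity 0, the following sketch (reusing the Foo constructor of :*: from the earlier declaration) parses as (Int :*: Bool) -> Bool:

g :: Int :*: Bool -> Bool
g (Foo _ b) = b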
Type synonyms are like macros at the type level, and GHC does validity checking on types only after expanding type synonyms. That means that GHC can be very much more liberal about type synonyms than Haskell 98:
You can write a forall
(including overloading)
in a type synonym, thus:
type Discard a = forall b. Show b => a -> b -> (a, String)

f :: Discard a
f x y = (x, show y)

g :: Discard Int -> (Int,String)    -- A rank-2 type
g f = f 3 True
You can write an unboxed tuple in a type synonym:
type Pr = (# Int, Int #)

h :: Int -> Pr
h x = (# x, x #)
You can apply a type synonym to a forall type:
type Foo a = a -> a -> Bool

f :: Foo (forall b. b->b)
After expanding the synonym, f
has the legal (in GHC) type:
f :: (forall b. b->b) -> (forall b. b->b) -> Bool
You can apply a type synonym to a partially applied type synonym:
type Generic i o = forall x. i x -> o x
type Id x = x

foo :: Generic Id []
After expanding the synonym, foo
has the legal (in GHC) type:
foo :: forall x. x -> [x]
GHC currently does kind checking before expanding synonyms (though even that could be changed.)
After expanding type synonyms, GHC does validity checking on types, looking for the following mal-formedness which isn't detected simply by kind checking:
Type constructor applied to a type involving for-alls.
Unboxed tuple on left of an arrow.
Partially-applied type synonym.
So, for example, this will be rejected:
type Pr = (# Int, Int #)

h :: Pr -> Int
h x = ...
because GHC does not allow unboxed tuples on the left of a function arrow.
The idea of using existential quantification in data type declarations was suggested by Perry, and implemented in Hope+ (Nigel Perry, The Implementation of Practical Functional Programming Languages, PhD Thesis, University of London, 1991). It was later formalised by Laufer and Odersky (Polymorphic type inference and abstract data types, TOPLAS, 16(5), pp1411-1430, 1994). It's been in Lennart Augustsson's hbc Haskell compiler for several years, and proved very useful. Here's the idea. Consider the declaration:
data Foo = forall a. MkFoo a (a -> Bool) | Nil
The data type Foo
has two constructors with types:
MkFoo :: forall a. a -> (a -> Bool) -> Foo
Nil   :: Foo
Notice that the type variable a in the type of MkFoo does not appear in the data type itself, which is plain Foo.
For example, the following expression is fine:
[MkFoo 3 even, MkFoo 'c' isUpper] :: [Foo]
Here, (MkFoo 3 even) packages an integer with a function even that maps an integer to Bool; and MkFoo 'c' isUpper packages a character with a compatible function. These two things are each of type Foo and can be put in a list.
What can we do with a value of type Foo? In particular, what happens when we pattern-match on MkFoo?
f (MkFoo val fn) = ???
Since all we know about val and fn is that they are compatible, the only (useful) thing we can do with them is to apply fn to val to get a boolean. For example:
f :: Foo -> Bool
f (MkFoo val fn) = fn val
What this allows us to do is to package heterogeneous values together with a bunch of functions that manipulate them, and then treat that collection of packages in a uniform manner. You can express quite a bit of object-oriented-like programming this way.
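For instance, reusing the Foo type and the function f defined just above (isUpper comes from Data.Char), the whole heterogeneous list can be processed uniformly:

results :: [Bool]
results = map f [MkFoo 3 even, MkFoo 'c' isUpper]
-- evaluates to [False, False]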
What has this to do with existential quantification?
Simply that MkFoo
has the (nearly) isomorphic type
MkFoo :: (exists a . (a, a -> Bool)) -> Foo
But Haskell programmers can safely think of the ordinary universally quantified type given above, thereby avoiding adding a new existential quantification construct.
An easy extension (implemented in hbc) is to allow arbitrary contexts before the constructor. For example:
data Baz = forall a. Eq a => Baz1 a a
         | forall b. Show b => Baz2 b (b -> b)
The two constructors have the types you'd expect:
Baz1 :: forall a. Eq a => a -> a -> Baz
Baz2 :: forall b. Show b => b -> (b -> b) -> Baz
But when pattern matching on Baz1
the matched values can be compared
for equality, and when pattern matching on Baz2
the first matched
value can be converted to a string (as well as applying the function to it).
So this program is legal:
f :: Baz -> String
f (Baz1 p q) | p == q    = "Yes"
             | otherwise = "No"
f (Baz2 v fn)            = show (fn v)
Operationally, in a dictionary-passing implementation, the constructors Baz1 and Baz2 must store the dictionaries for Eq and Show respectively, and extract them on pattern matching.
Notice the way that the syntax fits smoothly with that used for universal quantification earlier.
There are several restrictions on the ways in which existentially-quantified constructors can be used.
When pattern matching, each pattern match introduces a new, distinct, type for each existential type variable. These types cannot be unified with any other type, nor can they escape from the scope of the pattern match. For example, these fragments are incorrect:
f1 (MkFoo a f) = a
Here, the type bound by MkFoo "escapes", because a is the result of f1. One way to see why this is wrong is to ask what type f1 has:
f1 :: Foo -> a -- Weird!
What is this "a" in the result type? Clearly we don't mean this:
f1 :: forall a. Foo -> a -- Wrong!
The original program is just plain wrong. Here's another sort of error:
f2 (Baz1 a b) (Baz1 p q) = a==q
It's ok to say a==b or p==q, but a==q is wrong because it equates the two distinct types arising from the two Baz1 constructors.
You can't pattern-match on an existentially quantified
constructor in a let
or where
group of
bindings. So this is illegal:
f3 x = a==b where { Baz1 a b = x }
Instead, use a case
expression:
f3 x = case x of Baz1 a b -> a==b
In general, you can only pattern-match
on an existentially-quantified constructor in a case
expression or
in the patterns of a function definition.
The reason for this restriction is really an implementation one.
Type-checking binding groups is already a nightmare without
existentials complicating the picture. Also an existential pattern
binding at the top level of a module doesn't make sense, because it's
not clear how to prevent the existentially-quantified type "escaping".
So for now, there's a simple-to-state restriction. We'll see how
annoying it is.
You can't use existential quantification for newtype
declarations. So this is illegal:
newtype T = forall a. Ord a => MkT a
Reason: a value of type T must be represented as a pair of a dictionary for Ord t and a value of type t. That contradicts the idea that newtype should have no concrete representation.
You can get just the same efficiency and effect by using data instead of newtype. If there is no overloading involved, then there is more of a case for allowing an existentially-quantified newtype, because the data version does carry an implementation cost, but single-field existentially quantified constructors aren't much use. So the simple restriction (no existential stuff on newtype) stands, unless there are convincing reasons to change it.
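For example, the data counterpart of the rejected declaration above is accepted:

data T = forall a. Ord a => MkT a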
You can't use deriving
to define instances of a
data type with existentially quantified data constructors.
Reason: in most cases it would not make sense. For example:
data T = forall a. MkT [a] deriving( Eq )
To derive Eq
in the standard way we would need to have equality
between the single component of two MkT
constructors:
instance Eq T where
  (MkT a) == (MkT b) = ???
But a
and b
have distinct types, and so can't be compared.
It's just about possible to imagine examples in which the derived instance
would make sense, but it seems altogether simpler simply to prohibit such
declarations. Define your own instances!
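As a purely illustrative sketch (this particular instance is one arbitrary choice, not a canonical one), a hand-written instance that does not inspect the existentially-typed elements themselves is fine:

data T = forall a. MkT [a]

instance Eq T where
  MkT xs == MkT ys = length xs == length ys   -- compare only the lengths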
This section documents GHC's implementation of multi-parameter type classes. There's lots of background in the paper Type classes: exploring the design space (Simon Peyton Jones, Mark Jones, Erik Meijer).
There are the following constraints on class declarations:
Multi-parameter type classes are permitted. For example:
class Collection c a where
  union :: c a -> c a -> c a
  ...etc.
The class hierarchy must be acyclic. However, the definition of "acyclic" involves only the superclass relationships. For example, this is OK:
class C a where { op :: D b => a -> b -> b }
class C a => D a where { ... }
Here, C is a superclass of D, but it's OK for a class operation op of C to mention D. (It would not be OK for D to be a superclass of C.)
There are no restrictions on the context in a class declaration (which introduces superclasses), except that the class hierarchy must be acyclic. So these class declarations are OK:
class Functor (m k) => FiniteMap m k where ...

class (Monad m, Monad (t m)) => Transform t m where
  lift :: m a -> (t m) a
All of the class type variables must be reachable (in the sense mentioned in Section 7.4.3, “Type signatures”) from the free variables of each method type. For example:
class Coll s a where
  empty  :: s
  insert :: s -> a -> s
is not OK, because the type of empty doesn't mention a. This rule is a consequence of Rule 1(a), above, for types, and has the same motivation.
Sometimes, offending class declarations exhibit misunderstandings. For
example, Coll
might be rewritten
class Coll s a where
  empty  :: s a
  insert :: s a -> a -> s a
which makes the connection between the type of a collection of a's (namely (s a)) and the element type a.
Occasionally this really doesn't work, in which case you can split the
class like this:
class CollE s where
  empty :: s

class CollE s => Coll s a where
  insert :: s -> a -> s
Haskell 98 prohibits class method types from mentioning constraints on the class type variable, thus:
class Seq s a where
  fromList :: [a] -> s a
  elem     :: Eq a => a -> s a -> Bool
The type of elem is illegal in Haskell 98, because it contains the constraint Eq a, which constrains only the class type variable (in this case a).
With the -fglasgow-exts flag, GHC lifts this restriction.
Unlike Haskell 98, constraints in types do not have to be of the form (class type-variable) or (class (type-variable type-variable ...)). Thus, these type signatures are perfectly OK
g :: Eq [a] => ...
g :: Ord (T a ()) => ...
GHC imposes the following restrictions on the constraints in a type signature. Consider the type:
forall tv1..tvn (c1, ...,cn) => type
(Here, we write the "foralls" explicitly, although the Haskell source language omits them; in Haskell 98, all the free type variables of an explicit source-language type signature are universally quantified, except for the class type variables in a class declaration. However, in GHC, you can give the foralls if you want. See Section 7.4.9, “Arbitrary-rank polymorphism ”).
Each universally quantified type variable tvi must be reachable from type.
A type variable a is "reachable" if it appears in the same constraint as either a type variable free in type, or another reachable type variable.
A value with a type that does not obey
this reachability restriction cannot be used without introducing
ambiguity; that is why the type is rejected.
Here, for example, is an illegal type:
forall a. Eq a => Int
When a value with this type was used, the constraint Eq tv would be introduced where tv is a fresh type variable, and (in the dictionary-translation implementation) the value would be applied to a dictionary for Eq tv. The difficulty is that we can never know which instance of Eq to use because we never get any more information about tv.
Note that the reachability condition is weaker than saying that a is functionally dependent on a type variable free in type (see Section 7.4.7, “Functional dependencies”). The reason for this is that there might be a "hidden" dependency, in a superclass perhaps. So "reachable" is a conservative approximation to "functionally dependent".
For example, consider:
class C a b | a -> b where ...
class C a b => D a b where ...

f :: forall a b. D a b => a -> a
This is fine, because in fact a does functionally determine b, but that is not immediately apparent from f's type.
Every constraint ci must mention at least one of the universally quantified type variables tvi.
For example, this type is OK because C a b mentions the universally quantified type variable b:
forall a. C a b => burble
The next type is illegal because the constraint Eq b does not mention a:
forall a. Eq b => burble
The reason for this restriction is milder than the other one. The excluded types are never useful or necessary (because the offending context doesn't need to be witnessed at this point; it can be floated out). Furthermore, floating them out increases sharing. Lastly, excluding them is a conservative choice; it leaves a patch of territory free in case we need it later.
It is often convenient to use generalised type synonyms (see Section 7.4.1.3, “Liberalised type synonyms”) at the right hand end of an arrow, thus:
type Discard a = forall b. a -> b -> a

g :: Int -> Discard Int
g x y z = x+y
Simply expanding the type synonym would give
g :: Int -> (forall b. Int -> b -> Int)
but GHC "hoists" the forall
to give the isomorphic type
g :: forall b. Int -> Int -> b -> Int
In general, the rule is this: to determine the type specified by any explicit user-written type (e.g. in a type signature), GHC expands type synonyms and then repeatedly performs the transformation:
type1 -> forall a1..an. context2 => type2
  ==>
forall a1..an. context2 => type1 -> type2
(In fact, GHC tries to retain as much synonym information as possible for use in
error messages, but that is a usability issue.) This rule applies, of course, whether
or not the forall
comes from a synonym. For example, here is another
valid way to write g
's type signature:
g :: Int -> Int -> forall b. b -> Int
When doing this hoisting operation, GHC eliminates duplicate constraints. For example:
type Foo a = (?x::Int) => Bool -> a

g :: Foo (Foo Int)
means
g :: (?x::Int) => Bool -> Bool -> Int
In general, GHC requires that it be unambiguous which instance declaration should be used to resolve a type-class constraint. This behaviour can be modified by two flags: -fallow-overlapping-instances and -fallow-incoherent-instances, as this section discusses.
When GHC tries to resolve, say, the constraint C Int Bool, it tries to match every instance declaration against the constraint, by instantiating the head of the instance declaration. For example, consider these declarations:
instance context1 => C Int a     where ...  -- (A)
instance context2 => C a   Bool  where ...  -- (B)
instance context3 => C Int [a]   where ...  -- (C)
instance context4 => C Int [Int] where ...  -- (D)
The instances (A) and (B) match the constraint C Int Bool, but (C) and (D) do not. When matching, GHC takes no account of the context of the instance declaration (context1 etc).
GHC's default behaviour is that exactly one instance must match the
constraint it is trying to resolve.
It is fine for there to be a potential of overlap (by
including both declarations (A) and (B), say); an error is only reported if a
particular constraint matches more than one.
The -fallow-overlapping-instances
flag instructs GHC to allow
more than one instance to match, provided there is a most specific one. For
example, the constraint C Int [Int]
matches instances (A),
(C) and (D), but the last is more specific, and hence is chosen. If there is no
most-specific match, the program is rejected.
However, GHC is conservative about committing to an overlapping instance. For example:
f :: [b] -> [b]
f x = ...
Suppose that from the RHS of f we get the constraint C Int [b]. But GHC does not commit to instance (C), because in a particular call of f, b might be instantiated to Int, in which case instance (D) would be more specific still.
So GHC rejects the program. If you add the flag -fallow-incoherent-instances, GHC will instead pick (C), without complaining about the problem of subsequent instantiations.
Because overlaps are checked and reported lazily, as described above, you need the -fallow-overlapping-instances flag in the module that calls the overloaded function, rather than in the module that defines it.
Unlike Haskell 98, instance heads may use type synonyms. (The instance "head" is the bit after the "=>" in an instance decl.) As always, using a type synonym is just shorthand for writing the RHS of the type synonym definition. For example:
type Point = (Int,Int)
instance C Point   where ...
instance C [Point] where ...
is legal. However, if you added
instance C (Int,Int) where ...
as well, then the compiler will complain about the overlapping (actually, identical) instance declarations. As always, type synonyms must be fully applied. You cannot, for example, write:
type P a = [[a]]
instance Monad P where ...
This design decision is independent of all the others, and easily reversed, but it makes sense to me.
An instance declaration must normally obey the following rules:
At least one of the types in the head of an instance declaration must not be a type variable. For example, these are OK:
instance C Int a      where ...
instance D (Int, Int) where ...
instance E [[a]]      where ...
but this is not:
instance F a where ...
Note that instance heads may contain repeated type variables. For example, this is OK:
instance Stateful (ST s) (MutVar s) where ...
All of the types in the context of an instance declaration must be type variables. Thus
instance C a b => Eq (a,b) where ...
is OK, but
instance C Int b => Foo b where ...
is not OK.
These restrictions ensure that context reduction terminates: each reduction step removes one type constructor. For example, the following would make the type checker loop if it wasn't excluded:
instance C a => C a where ...
There are two situations in which the rule is a bit of a pain. First, if one allows overlapping instance declarations then it's quite convenient to have a "default instance" declaration that applies if something more specific does not:
instance C a where
  op = ...     -- Default
Second, sometimes you might want to use the following to get the effect of a "class synonym":
class    (C1 a, C2 a, C3 a) => C a where { }
instance (C1 a, C2 a, C3 a) => C a where { }
This allows you to write shorter signatures:
f :: C a => ...
instead of
f :: (C1 a, C2 a, C3 a) => ...
Voluminous correspondence on the Haskell mailing list has convinced me
that it's worth experimenting with more liberal rules. If you use
the experimental flag -fallow-undecidable-instances
, you can use arbitrary
types in both an instance context and instance head. Termination is ensured by having a
fixed-depth recursion stack. If you exceed the stack depth you get a
sort of backtrace, and the opportunity to increase the stack depth
with -fcontext-stack
N.
I'm on the lookout for a less brutal solution: a simple rule that preserves decidability while allowing these interesting idioms.
Implicit parameters are implemented as described in "Implicit parameters: dynamic scoping with static types", J Lewis, MB Shields, E Meijer, J Launchbury, 27th ACM Symposium on Principles of Programming Languages (POPL'00), Boston, Jan 2000.
(Most of the following, still rather incomplete, documentation is due to Jeff Lewis.)
Implicit parameter support is enabled with the option
-fimplicit-params
.
A variable is called dynamically bound when it is bound by the calling context of a function and statically bound when bound by the callee's context. In Haskell, all variables are statically bound. Dynamic binding of variables is a notion that goes back to Lisp, but was later discarded in more modern incarnations, such as Scheme. Dynamic binding can be very confusing in an untyped language, and unfortunately, typed languages, in particular Hindley-Milner typed languages like Haskell, only support static scoping of variables.
However, by a simple extension to the type class system of Haskell, we can support dynamic binding. Basically, we express the use of a dynamically bound variable as a constraint on the type. These constraints lead to types of the form (?x::t') => t, which says "this function uses a dynamically-bound variable ?x of type t'". For example, the following expresses the type of a sort function, implicitly parameterized by a comparison function named cmp.
sort :: (?cmp :: a -> a -> Bool) => [a] -> [a]
The dynamic binding constraints are just a new form of predicate in the type class system.
An implicit parameter occurs in an expression using the special form ?x, where x is any valid identifier (e.g. ord ?x is a valid expression). Use of this construct also introduces a new dynamic-binding constraint in the type of the expression. For example, the following definition shows how we can define an implicitly parameterized sort function in terms of an explicitly parameterized sortBy function:
sortBy :: (a -> a -> Bool) -> [a] -> [a]

sort   :: (?cmp :: a -> a -> Bool) => [a] -> [a]
sort    = sortBy ?cmp
Dynamic binding constraints behave just like other type class
constraints in that they are automatically propagated. Thus, when a
function is used, its implicit parameters are inherited by the
function that called it. For example, our sort
function might be used
to pick out the least value in a list:
least   :: (?cmp :: a -> a -> Bool) => [a] -> a
least xs = head (sort xs)
Without lifting a finger, the ?cmp parameter is propagated to become a parameter of least as well. With explicit parameters, the default is that parameters must always be explicitly propagated. With implicit parameters, the default is to always propagate them.
An implicit-parameter type constraint differs from other type class constraints in the following way: All uses of a particular implicit parameter must have the same type. This means that the type of (?x, ?x) is (?x::a) => (a,a), and not (?x::a, ?x::b) => (a, b), as would be the case for type class constraints.
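A tiny sketch of this rule:

pair :: (?x :: a) => (a, a)
pair = (?x, ?x)     -- both occurrences share the single type a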
You can't have an implicit parameter in the context of a class or instance declaration. For example, both these declarations are illegal:
class (?x::Int) => C a where ...

instance (?x::a) => Foo [a] where ...
Reason: exactly which implicit parameter you pick up depends on exactly where you invoke a function. But the ``invocation'' of instance declarations is done behind the scenes by the compiler, so it's hard to figure out exactly where it is done. Easiest thing is to outlaw the offending types.
Implicit-parameter constraints do not cause ambiguity. For example, consider:
f :: (?x :: [a]) => Int -> Int
f n = n + length ?x

g :: (Read a, Show a) => String -> String
g s = show (read s)
Here, g has an ambiguous type, and is rejected, but f is fine. The binding for ?x at f's call site is quite unambiguous, and fixes the type a.
An implicit parameter is bound using the standard let or where binding forms. For example, we define the min function by binding ?cmp.
min :: Ord a => [a] -> a
min  = let ?cmp = (<=) in least
A group of implicit-parameter bindings may occur anywhere a normal group of Haskell
bindings can occur, except at top level. That is, they can occur in a let
(including in a list comprehension, or do-notation, or pattern guards),
or a where
clause.
Note the following points:
An implicit-parameter binding group must be a collection of simple bindings to implicit-style variables (no function-style bindings, and no type signatures); these bindings are neither polymorphic nor recursive.
You may not mix implicit-parameter bindings with ordinary bindings in a
single let
expression; use two nested let
s instead.
(In the case of where
you are stuck, since you can't nest where
clauses.)
You may put multiple implicit-parameter bindings in a single binding group; but they are not treated as a mutually recursive group (as ordinary let bindings are). Instead they are treated as a non-recursive group, simultaneously binding all the implicit parameters. The bindings are not nested, and may be re-ordered without changing the meaning of the program.
For example, consider:
f t = let { ?x = t; ?y = ?x+(1::Int) } in ?x + ?y
The use of ?x in the binding for ?y does not "see" the binding for ?x, so the type of f is
f :: (?x::Int) => Int -> Int
Consider these two definitions:
len1 :: [a] -> Int
len1 xs = let ?acc = 0 in len_acc1 xs

len_acc1 [] = ?acc
len_acc1 (x:xs) = let ?acc = ?acc + (1::Int) in len_acc1 xs

------------

len2 :: [a] -> Int
len2 xs = let ?acc = 0 in len_acc2 xs

len_acc2 :: (?acc :: Int) => [a] -> Int
len_acc2 [] = ?acc
len_acc2 (x:xs) = let ?acc = ?acc + (1::Int) in len_acc2 xs
The only difference between the two groups is that in the second group len_acc is given a type signature. In the former case, len_acc1 is monomorphic in its own right-hand side, so the implicit parameter ?acc is not passed to the recursive call. In the latter case, because len_acc2 has a type signature, the recursive call is made to the polymorphic version, which takes ?acc as an implicit parameter. So we get the following results in GHCi:
Prog> len1 "hello"
0
Prog> len2 "hello"
5
Adding a type signature dramatically changes the result! This is a rather counter-intuitive phenomenon, worth watching out for.
GHC applies the dreaded Monomorphism Restriction (section 4.5.5 of the Haskell Report) to implicit parameters. For example, consider:
f :: Int -> Int
f v = let ?x = 0 in
      let y = ?x + v in
      let ?x = 5 in
      y
Since the binding for y falls under the Monomorphism Restriction it is not generalised, so the type of y is simply Int, not (?x::Int) => Int. Hence, (f 9) returns result 9.
If you add a type signature for y, then y will get type (?x::Int) => Int, so the occurrence of y in the body of the let will see the inner binding of ?x, so (f 9) will return 14.
Linear implicit parameters are an idea developed by Koen Claessen, Mark Shields, and Simon PJ. They address the long-standing problem that monads seem over-kill for certain sorts of problem, notably:
distributing a supply of unique names
distributing a supply of random numbers
distributing an oracle (as in QuickCheck)
Linear implicit parameters are just like ordinary implicit parameters, except that they are "linear" -- that is, they cannot be copied, and must be explicitly "split" instead. Linear implicit parameters are written '%x' instead of '?x'. (The '/' in the '%' suggests the split!)
For example:
import GHC.Exts( Splittable )

data NameSupply = ...

splitNS :: NameSupply -> (NameSupply, NameSupply)
newName :: NameSupply -> Name

instance Splittable NameSupply where
  split = splitNS

f :: (%ns :: NameSupply) => Env -> Expr -> Expr
f env (Lam x e) = Lam x' (f env e)
  where
    x'   = newName %ns
    env' = extend env x x'

...more equations for f...
Notice that the implicit parameter %ns is consumed once by the call to newName, and once by the recursive call to f. So the translation done by the type checker makes the parameter explicit:
f :: NameSupply -> Env -> Expr -> Expr
f ns env (Lam x e) = Lam x' (f ns1 env e)
  where
    (ns1,ns2) = splitNS ns
    x'   = newName ns2
    env' = extend env x x'
Notice the call to 'split' introduced by the type checker. How did it know to use 'splitNS'? Because what it really did was to introduce a call to the overloaded function 'split', defined by the class Splittable:
class Splittable a where
  split :: a -> (a,a)
The instance for Splittable NameSupply
tells GHC how to implement
split for name supplies. But we can simply write
g x = (x, %ns, %ns)
and GHC will infer
g :: (Splittable a, %ns :: a) => b -> (b,a,a)
The Splittable class is built into GHC. It's exported by module GHC.Exts.
Other points:
'?x' and '%x' are entirely distinct implicit parameters: you can use them together and they won't interfere with each other.
You can bind linear implicit parameters in 'with' clauses.
You cannot have implicit parameters (whether linear or not) in the context of a class or instance declaration.
The monomorphism restriction is even more important than usual. Consider the example above:
f :: (%ns :: NameSupply) => Env -> Expr -> Expr
f env (Lam x e) = Lam x' (f env e)
  where
    x'   = newName %ns
    env' = extend env x x'
If we replaced the two occurrences of x' by (newName %ns), which is usually a harmless thing to do, we get:
f :: (%ns :: NameSupply) => Env -> Expr -> Expr
f env (Lam x e) = Lam (newName %ns) (f env e)
  where
    env' = extend env x (newName %ns)
But now the name supply is consumed in three places (the two calls to newName, and the recursive call to f), so the result is utterly different. Urk! We don't even have the beta rule.
Well, this is an experimental change. With implicit parameters we have already lost beta reduction anyway, and (as John Launchbury puts it) we can't sensibly reason about Haskell programs without knowing their typing.
Linear implicit parameters can be particularly tricky when you have a recursive function. Consider
foo :: %x::T => Int -> [Int]
foo 0 = []
foo n = %x : foo (n-1)
where T is some type in class Splittable.
Do you get a list of all the same T's or all different T's (assuming that split gives two distinct T's back)?
If you supply the type signature, taking advantage of polymorphic recursion, you get what you'd probably expect. Here's the translated term, where the implicit param is made explicit:
foo x 0 = []
foo x n = let (x1,x2) = split x
          in x1 : foo x2 (n-1)
But if you don't supply a type signature, GHC uses the Hindley Milner trick of using a single monomorphic instance of the function for the recursive calls. That is what makes Hindley Milner type inference work. So the translation becomes
foo x = let
          foom 0 = []
          foom n = x : foom (n-1)
        in foom
Result: 'x' is not split, and you get a list of identical T's. So the semantics of the program depends on whether or not foo has a type signature. Yikes!
You may say that this is a good reason to dislike linear implicit parameters and you'd be right. That is why they are an experimental feature.
Functional dependencies are implemented as described by Mark Jones in “Type Classes with Functional Dependencies” (Proceedings of the 9th European Symposium on Programming, ESOP 2000, Berlin, Germany, March 2000, Springer-Verlag LNCS 1782).
Functional dependencies are introduced by a vertical bar in the syntax of a class declaration; e.g.
class (Monad m) => MonadState s m | m -> s where ...

class Foo a b c | a b -> c where ...
There should be more documentation, but there isn't (yet). Yell if you need it.
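In the meantime, here is a small, hypothetical sketch (the Convert class is not from the paper) of what the dependency buys you: because a determines b, GHC improves the constraint Convert Int b to fix b, so the call below is unambiguous, whereas without the dependency it would be rejected.

class Convert a b | a -> b where
  convert :: a -> b

instance Convert Int Bool where
  convert = (/= 0)

test :: String
test = show (convert (3 :: Int))   -- the dependency fixes b = Bool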
Haskell infers the kind of each type variable. Sometimes it is nice to be able to give the kind explicitly as (machine-checked) documentation, just as it is nice to give a type signature for a function. On some occasions, it is essential to do so. For example, in his paper "Restricted Data Types in Haskell" (Haskell Workshop 1999) John Hughes had to define the data type:
data Set cxt a = Set [a] | Unused (cxt a -> ())
The only use for the Unused constructor was to force the correct kind for the type variable cxt.
GHC now instead allows you to specify the kind of a type variable directly, wherever a type variable is explicitly bound. Namely:
data declarations:

data Set (cxt :: * -> *) a = Set [a]

type declarations:

type T (f :: * -> *) = f Int

class declarations:

class (Eq a) => C (f :: * -> *) a where ...

forall's in type signatures:

f :: forall (cxt :: * -> *). Set cxt Int
The parentheses are required. Some of the spaces are required too, to separate the lexemes. If you write (f::*->*) you will get a parse error, because "::*->*" is a single lexeme in Haskell.
As part of the same extension, you can put kind annotations in types as well. Thus:
f :: (Int :: *) -> Int
g :: forall a. a -> (a :: *)
The syntax is

atype ::= '(' ctype '::' kind ')'
The parentheses are required.
Haskell type signatures are implicitly quantified. The new keyword forall
allows us to say exactly what this means. For example:
g :: b -> b
means this:
g :: forall b. (b -> b)
The two are treated identically.
However, GHC's type system supports arbitrary-rank explicit universal quantification in types. For example, all the following types are legal:
f1 :: forall a b. a -> b -> a
g1 :: forall a b. (Ord a, Eq b) => a -> b -> a

f2 :: (forall a. a->a) -> Int -> Int
g2 :: (forall a. Eq a => [a] -> a -> Bool) -> Int -> Int

f3 :: ((forall a. a->a) -> Int) -> Bool -> Bool
Here, f1 and g1 are rank-1 types, and can be written in standard Haskell (e.g. f1 :: a->b->a). The forall makes explicit the universal quantification that is implicitly added by Haskell.
The functions f2
and g2
have rank-2 types;
the forall
is on the left of a function arrow. As g2
shows, the polymorphic type on the left of the function arrow can be overloaded.
The function f3
has a rank-3 type;
it has rank-2 types on the left of a function arrow.
GHC allows types of arbitrary rank; you can nest forall
s
arbitrarily deep in function arrows. (GHC used to be restricted to rank 2, but
that restriction has now been lifted.)
In particular, a forall-type (also called a "type scheme"),
including an operational type class context, is legal:
On the left of a function arrow
On the right of a function arrow (see Section 7.4.3.2, “For-all hoisting”)
As the argument of a constructor, or type of a field, in a data type declaration. For
example, any of the f1,f2,f3,g1,g2
above would be valid
field type signatures.
As the type of an implicit parameter
In a pattern type signature (see Section 7.4.10, “Scoped type variables ”)
There is one place you cannot put a forall: you cannot instantiate a type variable with a forall-type. So you cannot make a forall-type the argument of a type constructor. So these types are illegal:
x1 :: [forall a. a->a]
x2 :: (forall a. a->a, Int)
x3 :: Maybe (forall a. a->a)
Of course forall
becomes a keyword; you can't use forall
as
a type variable any more!
In a data
or newtype
declaration one can quantify
the types of the constructor arguments. Here are several examples:
data T a = T1 (forall b. b -> b -> b) a

data MonadT m = MkMonad { return :: forall a. a -> m a,
                          bind   :: forall a b. m a -> (a -> m b) -> m b
                        }

newtype Swizzle = MkSwizzle (Ord a => [a] -> [a])
The constructors have rank-2 types:
T1 :: forall a. (forall b. b -> b -> b) -> a -> T a

MkMonad :: forall m. (forall a. a -> m a)
                  -> (forall a b. m a -> (a -> m b) -> m b)
                  -> MonadT m

MkSwizzle :: (Ord a => [a] -> [a]) -> Swizzle
Notice that you don't need to use a forall if there's an explicit context. For example in the first argument of the constructor MkSwizzle, an implicit "forall a." is prefixed to the argument type. The implicit forall quantifies all type variables that are not already in scope, and are mentioned in the type quantified over.
As for type signatures, implicit quantification happens for non-overloaded types too. So if you write this:
data T a = MkT (Either a b) (b -> b)
it's just as if you had written this:
data T a = MkT (forall b. Either a b) (forall b. b -> b)
That is, since the type variable b
isn't in scope, it's
implicitly universally quantified. (Arguably, it would be better
to require explicit quantification on constructor arguments
where that is what is wanted. Feedback welcomed.)
You construct values of types T1, MonadT, Swizzle
by applying
the constructor to suitable values, just as usual. For example,
a1 :: T Int
a1 = T1 (\x y -> x) 3

a2, a3 :: Swizzle
a2 = MkSwizzle sort
a3 = MkSwizzle reverse

a4 :: MonadT Maybe
a4 = let r x = Just x
         b m k = case m of
                   Just y  -> k y
                   Nothing -> Nothing
     in MkMonad r b

mkTs :: (forall b. b -> b -> b) -> a -> a -> [T a]
mkTs f x y = [T1 f x, T1 f y]
The type of the argument can, as usual, be more general than the type
required, as (MkSwizzle reverse)
shows. (reverse
does not need the Ord
constraint.)
When you use pattern matching, the bound variables may now have polymorphic types. For example:
f :: T a -> a -> (a, Char)
f (T1 w k) x = (w k x, w 'c' 'd')

g :: (Ord a, Ord b) => Swizzle -> [a] -> (a -> b) -> [b]
g (MkSwizzle s) xs f = s (map f (s xs))

h :: MonadT m -> [m a] -> m [a]
h m []     = return m []
h m (x:xs) = bind m x          $ \y ->
             bind m (h m xs)   $ \ys ->
             return m (y:ys)
In the function h
we use the record selectors return
and bind
to extract the polymorphic bind and return functions
from the MonadT
data structure, rather than using pattern
matching.
In general, type inference for arbitrary-rank types is undecidable. GHC uses an algorithm proposed by Odersky and Laufer ("Putting type annotations to work", POPL'96) to get a decidable algorithm by requiring some help from the programmer. We do not yet have a formal specification of "some help" but the rule is this:
For a lambda-bound or case-bound variable, x, either the programmer provides an explicit polymorphic type for x, or GHC's type inference will assume that x's type has no foralls in it.
What does it mean to "provide" an explicit type for x? You can do that by giving a type signature for x directly, using a pattern type signature (Section 7.4.10, “Scoped type variables ”), thus:
\ f :: (forall a. a->a) -> (f True, f 'c')
Alternatively, you can give a type signature to the enclosing context, which GHC can "push down" to find the type for the variable:
(\ f -> (f True, f 'c')) :: (forall a. a->a) -> (Bool,Char)
Here the type signature on the expression can be pushed inwards to give a type signature for f. Similarly, and more commonly, one can give a type signature for the function itself:
h :: (forall a. a->a) -> (Bool,Char) h f = (f True, f 'c')
You don't need to give a type signature if the lambda bound variable is a constructor argument. Here is an example we saw earlier:
f :: T a -> a -> (a, Char) f (T1 w k) x = (w k x, w 'c' 'd')
Here we do not need to give a type signature to w
, because
it is an argument of constructor T1
and that tells GHC all
it needs to know.
GHC performs implicit quantification as follows. At the top level (only) of
user-written types, if and only if there is no explicit forall
,
GHC finds all the type variables mentioned in the type that are not already
in scope, and universally quantifies them. For example, the following pairs are
equivalent:
f :: a -> a
f :: forall a. a -> a

g (x::a) = let
             h :: a -> b -> b
             h x y = y
           in ...
g (x::a) = let
             h :: forall b. a -> b -> b
             h x y = y
           in ...
Notice that GHC does not find the innermost possible quantification point. For example:
f :: (a -> a) -> Int
     -- MEANS
f :: forall a. (a -> a) -> Int
     -- NOT
f :: (forall a. a -> a) -> Int

g :: (Ord a => a -> a) -> Int
     -- MEANS the illegal type
g :: forall a. (Ord a => a -> a) -> Int
     -- NOT
g :: (forall a. Ord a => a -> a) -> Int
The latter produces an illegal type, which you might think is silly, but at least the rule is simple. If you want the latter type, you can write your for-alls explicitly. Indeed, doing so is strongly advised for rank-2 types.
A lexically scoped type variable can be bound by:
A declaration type signature (Section 7.4.10.3, “Declaration type signatures”)
A pattern type signature (Section 7.4.10.4, “Where a pattern type signature can occur”)
A result type signature (Section 7.4.10.5, “Result type signatures”)
For example:
f (xs::[a]) = ys ++ ys
  where
    ys :: [a]
    ys = reverse xs
The pattern (xs::[a]) includes a type signature for xs. This brings the type variable a into scope; it scopes over all the patterns and right hand sides for this equation for f. In particular, it is in scope at the type signature for ys.
At ordinary type signatures, such as that for ys, any type variables mentioned in the type signature that are not in scope are implicitly universally quantified. (If there are no type variables in scope, all type variables mentioned in the signature are universally quantified, which is just as in Haskell 98.) In this case, since a is in scope, it is not universally quantified, so the type of ys is the same as that of xs. In Haskell 98 it is not possible to declare a type for ys; a major benefit of scoped type variables is that it becomes possible to do so.
Scoped type variables are implemented in both GHC and Hugs. Where the implementations differ from the specification below, those differences are noted.
So much for the basic idea. Here are the details.
A lexically-scoped type variable is simply the name for a type. The restriction it expresses is that all occurrences of the same name mean the same type. For example:
f :: [Int] -> Int -> Int
f (xs::[a]) (y::a) = (head xs + y) :: a
The pattern type signatures on the left hand side of f express the fact that xs must be a list of things of some type a; and that y must have this same type. The type signature on the expression (head xs + y) specifies that this expression must have the same type a.
There is no requirement that the type named by "a" is in fact a type variable. Indeed, in this case, the type named by "a" is Int. (This is a slight liberalisation from the original rather complex rules, which specified that a pattern-bound type variable should be universally quantified.)
For example, all of these are legal:
t (x::a) (y::a) = x+y*2

f (x::a) (y::b) = [x,y]       -- a unifies with b

g (x::a) = x + 1::Int         -- a unifies with Int

h x = let k (y::a) = [x,y]    -- a is free in the environment
      in k x

k (x::a)   True  = ...        -- a unifies with Int
k (x::Int) False = ...

w :: [b] -> [b]
w (x::a) = x                  -- a unifies with [b]
All the type variables mentioned in a pattern, that are not already in scope, are brought into scope by the pattern. We describe this set as the type variables bound by the pattern. For example:
f (x::a) = let g (y::(a,b)) = fst y
           in g (x,True)
The pattern (x::a) brings the type variable a into scope, as well as the term variable x. The pattern (y::(a,b)) contains an occurrence of the already-in-scope type variable a, and brings into scope the type variable b.
The type variable(s) bound by the pattern have the same scope as the term variable(s) bound by the pattern. For example:
let f (x::a) = <...rhs of f...>
    (p::b, q::b) = (1,2)
in <...body of let...>
Here, the type variable a scopes over the right hand side of f, just like x does; while the type variable b scopes over the body of the let, and all the other definitions in the let, just like p and q do. Indeed, the newly bound type variables also scope over any ordinary, separate type signatures in the let group.
The type variables bound by the pattern may be mentioned in ordinary type signatures or pattern type signatures anywhere within their scope.
In ordinary type signatures, any type variable mentioned in the signature that is in scope is not universally quantified.
Ordinary type signatures do not bring any new type variables into scope (except in the type signature itself!). So this is illegal:
f :: a -> a
f x = x::a
It's illegal because a is not in scope in the body of f, so the ordinary signature x::a is equivalent to x::forall a.a; and that is an incorrect typing.
The pattern type signature is a monotype:
A pattern type signature cannot contain any explicit forall
quantification.
The type variables bound by a pattern type signature can only be instantiated to monotypes, not to type schemes.
There is no implicit universal quantification on pattern type signatures (in contrast to ordinary type signatures).
The type variables in the head of a class
or instance
declaration
scope over the methods defined in the where
part. For example:
class C a where
  op :: [a] -> a
  op xs = let ys::[a]
              ys = reverse xs
          in head ys
(Not implemented in Hugs yet, Dec 98).
A declaration type signature that has explicit
quantification (using forall
) brings into scope the
explicitly-quantified
type variables, in the definition of the named function(s). For example:
f :: forall a. [a] -> [a]
f (x:xs) = xs ++ [ x :: a ]
The "forall a" brings "a" into scope in the definition of "f". This only happens if the quantification in f's type signature is explicit. For example:
g :: [a] -> [a]
g (x:xs) = xs ++ [ x :: a ]
This program will be rejected, because "a" does not scope over the definition of "g", so "x::a" means "x::forall a. a" by Haskell's usual implicit quantification rules.
A pattern type signature can occur in any pattern. For example:
A pattern type signature can be on an arbitrary sub-pattern, not just on a variable:
f ((x,y)::(a,b)) = (y,x) :: (b,a)
Pattern type signatures, including the result part, can be used in lambda abstractions:
(\ (x::a, y) :: a -> x)
Pattern type signatures, including the result part, can be used
in case
expressions:
case e of { ((x::a, y) :: (a,b)) -> x }
Note that the ->
symbol in a case alternative
leads to difficulties when parsing a type signature in the pattern: in
the absence of the extra parentheses in the example above, the parser
would try to interpret the ->
as a function
arrow and give a parse error later.
To avoid ambiguity, the type after the “::” in a result pattern signature on a lambda or case must be atomic (i.e. a single token or a parenthesised type of some sort). To see why, consider how one would parse this:
\ x :: a -> b -> x
Pattern type signatures can bind existential type variables. For example:
data T = forall a. MkT [a]

f :: T -> T
f (MkT [t::a]) = MkT t3
  where
    t3::[a] = [t,t,t]
Pattern type signatures can be used in pattern bindings:
f x = let (y, z::a) = x in ...
f1 x = let (y, z::Int) = x in ...
f2 (x::(Int,a)) = let (y, z::a) = x in ...
f3 :: (b->b) = \x -> x
In all such cases, the binding is not generalised over the pattern-bound type variables. Thus f3 is monomorphic; f3 has type b -> b for some type b, and not forall b. b -> b.
In contrast, the binding
f4 :: b->b
f4 = \x -> x
makes a polymorphic function, but b is not in scope anywhere in f4's scope.
Pattern type signatures are completely orthogonal to ordinary, separate type signatures. The two can be used independently or together.
The result type of a function can be given a signature, thus:
f (x::a) :: [a] = [x,x,x]
The final :: [a]
after all the patterns gives a signature to the
result type. Sometimes this is the only way of naming the type variable
you want:
f :: Int -> [a] -> [a]
f n :: ([a] -> [a]) = let g (x::a, y::a) = (y,x)
                      in \xs -> map g (reverse xs `zip` xs)
The type variables bound in a result type signature scope over the right hand side of the definition. However, consider this corner-case:
rev1 :: [a] -> [a] = \xs -> reverse xs

foo ys = rev1 (ys::[a])
The signature on rev1
is considered a pattern type signature, not a result
type signature, and the type variables it binds have the same scope as rev1
itself (i.e. the right-hand side of rev1
and the rest of the module too).
In particular, the expression (ys::[a]) is OK, because the type variable a is in scope (otherwise it would mean (ys::forall a.[a]), which would be rejected).
As mentioned above, rev1
is made monomorphic by this scoping rule.
For example, the following program would be rejected, because it claims that rev1
is polymorphic:
rev1 :: [b] -> [b]
rev1 :: [a] -> [a] = \xs -> reverse xs
Result type signatures are not yet implemented in Hugs.
Haskell 98 allows the programmer to add "deriving( Eq, Ord )
" to a data type
declaration, to generate a standard instance declaration for classes specified in the deriving
clause.
In Haskell 98, the only classes that may appear in the deriving clause are the standard classes Eq, Ord, Enum, Ix, Bounded, Read, and Show.
GHC extends this list with two more classes that may be automatically derived (provided the -fglasgow-exts flag is specified): Typeable, and Data. These classes are defined in the library modules Data.Typeable and Data.Generics respectively, and the appropriate class must be in scope before it can be mentioned in the deriving clause.
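For example (a minimal sketch, assuming the -fglasgow-exts flag and the two imports shown):

import Data.Typeable ( Typeable )
import Data.Generics ( Data )

data Colour = Red | Green | Blue
              deriving ( Eq, Show, Typeable, Data )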
When you define an abstract type using newtype, you may want the new type to inherit some instances from its representation. In Haskell 98, you can inherit instances of Eq, Ord, Enum and Bounded by deriving them, but for any other classes you have to write an explicit instance declaration. For example, if you define
newtype Dollars = Dollars Int
and you want to use arithmetic on Dollars, you have to explicitly define an instance of Num:
instance Num Dollars where
  Dollars a + Dollars b = Dollars (a+b)
  ...
All the instance does is apply and remove the newtype
constructor. It is particularly galling that, since the constructor
doesn't appear at run-time, this instance declaration defines a
dictionary which is wholly equivalent to the Int
dictionary, only slower!
GHC now permits such instances to be derived instead, so one can write
newtype Dollars = Dollars Int deriving (Eq,Show,Num)
and the implementation uses the same Num dictionary for Dollars as for Int. Notionally, the compiler derives an instance declaration of the form

instance Num Int => Num Dollars

which just adds or removes the newtype constructor according to the type.
We can also derive instances of constructor classes in a similar way. For example, suppose we have implemented state and failure monad transformers, such that
instance Monad m => Monad (State s m)
instance Monad m => Monad (Failure m)
In Haskell 98, we can define a parsing monad by
type Parser tok m a = State [tok] (Failure m) a
which is automatically a monad thanks to the instance declarations above. With the extension, we can make the parser type abstract, without needing to write an instance of class Monad, via
newtype Parser tok m a = Parser (State [tok] (Failure m) a)
                         deriving Monad
In this case the derived instance declaration is of the form
instance Monad (State [tok] (Failure m)) => Monad (Parser tok m)
Notice that, since Monad
is a constructor class, the
instance is a partial application of the new type, not the
entire left hand side. We can imagine that the type declaration is
``eta-converted'' to generate the context of the instance
declaration.
We can even derive instances of multi-parameter classes, provided the
newtype is the last class parameter. In this case, a ``partial
application'' of the class appears in the deriving
clause. For example, given the class
class StateMonad s m | m -> s where ...
instance Monad m => StateMonad s (State s m) where ...
then we can derive an instance of StateMonad for Parsers by
newtype Parser tok m a = Parser (State [tok] (Failure m) a)
                         deriving (Monad, StateMonad [tok])
The derived instance is obtained by completing the application of the class to the new type:
instance StateMonad [tok] (State [tok] (Failure m)) => StateMonad [tok] (Parser tok m)
As a result of this extension, all derived instances in newtype declarations are treated uniformly (and implemented just by reusing the dictionary for the representation type), except Show and Read, which really behave differently for the newtype and its representation.
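A small sketch of the distinction, reusing the Dollars type from above (the example value is hypothetical):

example :: String
example = show (Dollars 3 + Dollars 4)
-- "Dollars 7": Num reuses Int's dictionary, but Show is derived in the
-- ordinary way, so the constructor still appears in the output.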
Derived instance declarations are constructed as follows. Consider the declaration (after expansion of any type synonyms)
newtype T v1...vn = T' (S t1...tk vk+1...vn) deriving (c1...cm)
where

S is a type constructor,

the t1...tk are types,

the vk+1...vn are type variables which do not occur in any of the ti, and

the ci are partial applications of classes of the form C t1'...tj', where the arity of C is exactly j+1. That is, C lacks exactly one type argument.

None of the ci is Read, Show, Typeable, or Data. These classes should not "look through" the type or its constructor. You can still derive these classes for a newtype, but it happens in the usual way, not via this new mechanism.
Then, for each ci, the derived instance declaration is:
instance ci (S t1...tk vk+1...v) => ci (T v1...vp)
where p is chosen so that T v1...vp is of the right kind for the last parameter of class Ci.
As an example which does not work, consider
newtype NonMonad m s = NonMonad (State s m s) deriving Monad
Here we cannot derive the instance
instance Monad (State s m) => Monad (NonMonad m)
because the type variable s occurs in State s m, and so cannot be "eta-converted" away. It is a good thing that this deriving clause is rejected, because NonMonad m is not, in fact, a monad --- for the same reason. Try defining >>= with the correct type: you won't be able to.
Notice also that the order of class parameters becomes
important, since we can only derive instances for the last one. If the
StateMonad
class above were instead defined as
class StateMonad m s | m -> s where ...
then we would not have been able to derive an instance for the
Parser
type above. We hypothesise that multi-parameter
classes usually have one "main" parameter for which deriving new
instances is most interesting.
Lastly, all of this applies only for classes other than Read, Show, Typeable, and Data, for which the built-in derivation applies (section 4.3.3. of the Haskell Report).
(For the standard classes Eq, Ord, Ix, and Bounded it is immaterial whether the standard method is used or the one described here.)