Indexed type families form an extension to facilitate type-level programming. Type families are a generalisation of associated data types (“Associated Types with Class”, M. Chakravarty, G. Keller, S. Peyton Jones, and S. Marlow. In Proceedings of “The 32nd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL'05)”, pages 1-13, ACM Press, 2005) and associated type synonyms (“Type Associated Type Synonyms”. M. Chakravarty, G. Keller, and S. Peyton Jones. In Proceedings of “The Tenth ACM SIGPLAN International Conference on Functional Programming”, ACM Press, pages 241-253, 2005). Type families themselves are described in the paper “Type Checking with Open Type Functions”, T. Schrijvers, S. Peyton-Jones, M. Chakravarty, and M. Sulzmann, in Proceedings of “ICFP 2008: The 13th ACM SIGPLAN International Conference on Functional Programming”, ACM Press, pages 51-62, 2008. Type families essentially provide type-indexed data types and named functions on types, which are useful for generic programming and highly parameterised library interfaces as well as interfaces with enhanced static information, much like dependent types. They might also be regarded as an alternative to functional dependencies, but provide a more functional style of type-level programming than the relational style of functional dependencies.
Indexed type families, or type families for short, are type constructors that represent sets of types. Set members are denoted by supplying the type family constructor with type parameters, which are called type indices. The difference between vanilla parametrised type constructors and family constructors is much like between parametrically polymorphic functions and (ad-hoc polymorphic) methods of type classes. Parametric polymorphic functions behave the same at all type instances, whereas class methods can change their behaviour in dependence on the class type parameters. Similarly, vanilla type constructors imply the same data representation for all type instances, but family constructors can have varying representation types for varying type indices.
Indexed type families come in three flavours: data families, open type synonym families, and closed type synonym families. They are the indexed family variants of algebraic data types and type synonyms, respectively. The instances of data families can be data types and newtypes.
Type families are enabled by the flag
Additional information on the use of type families in GHC is available on
Haskell wiki page on type families.
Data families appear in two flavours: (1) they can be defined on the toplevel or (2) they can appear inside type classes (in which case they are known as associated types). The former is the more general variant, as it lacks the requirement for the type-indexes to coincide with the class parameters. However, the latter can lead to more clearly structured code and compiler warnings if some type instances were - possibly accidentally - omitted. In the following, we always discuss the general toplevel form first and then cover the additional constraints placed on associated types.
Indexed data families are introduced by a signature, such as
data family GMap k :: * -> *
family distinguishes family from standard
data declarations. The result kind annotation is optional and, as
usual, defaults to
* if omitted. An example is
data family Array e
Named arguments can also be given explicit kind signatures if needed.
Just as with
declarations] named arguments are entirely optional, so that we can
Array alternatively with
data family Array :: * -> *
Instance declarations of data and newtype families are very similar to
standard data and newtype declarations. The only two differences are
that the keyword
is followed by
instance and that some or all of the
type arguments can be non-variable types, but may not contain forall
types or type synonym families. However, data families are generally
allowed in type parameters, and type synonyms are allowed as long as
they are fully applied and expand to a type that is itself admissible -
exactly as this is required for occurrences of type synonyms in class
instance parameters. For example, the
data instance GMap (Either a b) v = GMapEither (GMap a v) (GMap b v)
In this example, the declaration has only one variant. In general, it can be any number.
Data and newtype instance declarations are only permitted when an appropriate family declaration is in scope - just as a class instance declaration requires the class declaration to be visible. Moreover, each instance declaration has to conform to the kind determined by its family declaration. This implies that the number of parameters of an instance declaration matches the arity determined by the kind of the family.
A data family instance declaration can use the full expressiveness of
Although, a data family is introduced with
the keyword "
data", a data family instance can
newtype. For example:
data family T a data instance T Int = T1 Int | T2 Bool newtype instance T Char = TC Bool
data instance can use GADT syntax for the data constructors,
and indeed can define a GADT. For example:
data family G a b data instance G [a] b where G1 :: c -> G [Int] b G2 :: G [a] Bool
You can use a
deriving clause on a
data instance or
Even if data families are defined as toplevel declarations, functions that perform different computations for different family instances may still need to be defined as methods of type classes. In particular, the following is not possible:
data family T a data instance T Int = A data instance T Char = B foo :: T a -> Int foo A = 1 -- WRONG: These two equations together... foo B = 2 -- ...will produce a type error.
Instead, you would have to write
foo as a class operation, thus:
class Foo a where foo :: T a -> Int instance Foo Int where foo A = 1 instance Foo Char where foo B = 2
(Given the functionality provided by GADTs (Generalised Algebraic Data Types), it might seem as if a definition, such as the above, should be feasible. However, type families are - in contrast to GADTs - are open; i.e., new instances can always be added, possibly in other modules. Supporting pattern matching across different data instances would require a form of extensible case construct.)
Type families appear in three flavours: (1) they can be defined as open families on the toplevel, (2) they can be defined as closed families on the toplevel, or (3) they can appear inside type classes (in which case they are known as associated type synonyms). Toplevel families are more general, as they lack the requirement for the type-indexes to coincide with the class parameters. However, associated type synonyms can lead to more clearly structured code and compiler warnings if some type instances were - possibly accidentally - omitted. In the following, we always discuss the general toplevel forms first and then cover the additional constraints placed on associated types. Note that closed associated type synonyms do not exist.
Open indexed type families are introduced by a signature, such as
type family Elem c :: *
family distinguishes family from standard
type declarations. The result kind annotation is optional and, as
usual, defaults to
* if omitted. An example is
type family Elem c
Parameters can also be given explicit kind signatures if needed. We call the number of parameters in a type family declaration, the family's arity, and all applications of a type family must be fully saturated w.r.t. to that arity. This requirement is unlike ordinary type synonyms and it implies that the kind of a type family is not sufficient to determine a family's arity, and hence in general, also insufficient to determine whether a type family application is well formed. As an example, consider the following declaration:
type family F a b :: * -> * -- F's arity is 2, -- although its overall kind is * -> * -> * -> *
Given this declaration the following are examples of well-formed and malformed types:
F Char [Int] -- OK! Kind: * -> * F Char [Int] Bool -- OK! Kind: * F IO Bool -- WRONG: kind mismatch in the first argument F Bool -- WRONG: unsaturated application
Instance declarations of type families are very similar to
standard type synonym declarations. The only two differences are that
type is followed by
instance and that some or all of the type arguments
can be non-variable types, but may not contain forall types or type
synonym families. However, data families are generally allowed, and type
synonyms are allowed as long as they are fully applied and expand to a
type that is admissible - these are the exact same requirements as for
data instances. For example, the
[e] instance for
type instance Elem [e] = e
Type family instance declarations are only legitimate when an appropriate family declaration is in scope - just like class instances require the class declaration to be visible. Moreover, each instance declaration has to conform to the kind determined by its family declaration, and the number of type parameters in an instance declaration must match the number of type parameters in the family declaration. Finally, the right-hand side of a type instance must be a monotype (i.e., it may not include foralls) and after the expansion of all saturated vanilla type synonyms, no synonyms, except family synonyms may remain.
A type family can also be declared with a
defining the full set of equations for that family. For example:
type family F a where F Int = Double F Bool = Char F a = String
A closed type family's equations are tried in order, from top to bottom,
when simplifying a type family application. In this example, we declare
an instance for
F such that
Char, and for any other type
a that is known not to be
F a simplifies to
String. Note that GHC must be sure that
a cannot unify with
Bool in that last case; if a programmer specifies
F a in their code, GHC will not be able to
simplify the type. After all,
a might later be
A closed type family's equations have the same restrictions as the equations for an open type family instances.
Here are some examples of admissible and illegal type instances:
type family F a :: * type instance F [Int] = Int -- OK! type instance F String = Char -- OK! type instance F (F a) = a -- WRONG: type parameter mentions a type family type instance F (forall a. (a, b)) = b -- WRONG: a forall type appears in a type parameter type instance F Float = forall a.a -- WRONG: right-hand side may not be a forall type type family H a where -- OK! H Int = Int H Bool = Bool H a = String type instance H Char = Char -- WRONG: cannot have instances of closed family type family G a b :: * -> * type instance G Int = (,) -- WRONG: must be two type parameters type instance G Int Char Float = Double -- WRONG: must be two type parameters
There must be some restrictions on the equations of type families, lest we define an ambiguous rewrite system. So, equations of open type families are restricted to be compatible. Two type patterns are compatible if
all corresponding types in the patterns are apart, or
the two patterns unify producing a substitution, and the right-hand sides are equal under that substitution.
Two types are considered apart if, for all possible substitutions, the types cannot reduce to a common reduct.
The first clause of "compatible" is the more straightforward one. It says that the patterns of two distinct type family instances cannot overlap. For example, the following is disallowed:
type instance F Int = Bool type instance F Int = Char
The second clause is a little more interesting. It says that two overlapping type family instances are allowed if the right-hand sides coincide in the region of overlap. Some examples help here:
type instance F (a, Int) = [a] type instance F (Int, b) = [b] -- overlap permitted type instance G (a, Int) = [a] type instance G (Char, a) = [a] -- ILLEGAL overlap, as [Char] /= [Int]
Note that this compatibility condition is independent of whether the type family is associated or not, and it is not only a matter of consistency, but one of type safety.
The definition for "compatible" uses a notion of "apart", whose definition in turn relies on type family reduction. This condition of "apartness", as stated, is impossible to check, so we use this conservative approximation: two types are considered to be apart when the two types cannot be unified, even by a potentially infinite unifier. Allowing the unifier to be infinite disallows the following pair of instances:
type instance H x x = Int type instance H [x] x = Bool
The type patterns in this pair equal if
x is replaced
by an infinite nesting of lists. Rejecting instances such as these is
necessary for type soundness.
Compatibility also affects closed type families. When simplifying an application of a closed type family, GHC will select an equation only when it is sure that no incompatible previous equation will ever apply. Here are some examples:
type family F a where F Int = Bool F a = Char type family G a where G Int = Int G a = a
In the definition for
F, the two equations are
incompatible -- their patterns are not apart, and yet their
right-hand sides do not coincide. Thus, before GHC selects the
second equation, it must be sure that the first can never apply. So,
F a does not simplify; only a type such
F Double will simplify to
G, on the other hand,
the two equations are compatible. Thus, GHC can ignore the first
equation when looking at the second. So,
G a will
However see Section 2.4.4, “Type, class and other declarations” for the overlap rules in GHCi.
In order to guarantee that type inference in the presence of type families decidable, we need to place a number of additional restrictions on the formation of type instance declarations (c.f., Definition 5 (Relaxed Conditions) of “Type Checking with Open Type Functions”). Instance declarations have the general form
type instance F t1 .. tn = t
where we require that for every type family application
.. sm) in
s1 .. sm do not contain any type family
the total number of symbols (data type constructors and type
s1 .. sm is strictly smaller than
t1 .. tn, and
for every type
s1 .. sm at most as often as in
These restrictions are easily verified and ensure termination of type
inference. However, they are not sufficient to guarantee completeness
of type inference in the presence of, so called, ''loopy equalities'',
a ~ [F a], where a recursive occurrence of
a type variable is underneath a family application and data
constructor application - see the above mentioned paper for details.
If the option
-XUndecidableInstances is passed to the
compiler, the above restrictions are not enforced and it is on the
programmer to ensure termination of the normalisation of type families
during type inference.
A data or type synonym family can be declared as part of a type class, thus:
class GMapKey k where data GMap k :: * -> * ... class Collects ce where type Elem ce :: * ...
When doing so, we (optionally) may drop the "
The type parameters must all be type variables, of course, and some (but not necessarily all) of then can be the class parameters. Each class parameter may only be used at most once per associated type, but some may be omitted and they may be in an order other than in the class head. Hence, the following contrived example is admissible:
class C a b c where type T c a x :: *
a are class parameters,
but the type is also indexed on a third parameter
When an associated data or type synonym family instance is declared within a type
class instance, we (optionally) may drop the
instance keyword in the
instance (GMapKey a, GMapKey b) => GMapKey (Either a b) where data GMap (Either a b) v = GMapEither (GMap a v) (GMap b v) ... instance Eq (Elem [e]) => Collects [e] where type Elem [e] = e ...
Note the following points:
The type indexes corresponding to class parameters must have precisely the same shape the type given in the instance head. To have the same "shape" means that the two types are identical modulo renaming of type variables. For example:
instance Eq (Elem [e]) => Collects [e] where -- Choose one of the following alternatives: type Elem [e] = e -- OK type Elem [x] = x -- OK type Elem x = x -- BAD; shape of 'x' is different to '[e]' type Elem [Maybe x] = x -- BAD: shape of '[Maybe x]' is different to '[e]'
An instances for an associated family can only appear as part of an instance declarations of the class in which the family was declared, just as with the equations of the methods of a class.
The instance for an associated type can be omitted in class instances. In that case,
unless there is a default instance (see Section 22.214.171.124, “Associated type synonym defaults”),
the corresponding instance type is not inhabited;
i.e., only diverging expressions, such
undefined, can assume the type.
Although it is unusual, there (currently) can be multiple instances for an associated family in a single instance declaration. For example, this is legitimate:
instance GMapKey Flob where data GMap Flob [v] = G1 v data GMap Flob Int = G2 Int ...
Here we give two data instance declarations, one in which the last
[v], and one for which it is
Since you cannot give any subsequent instances for
(GMap Flob ...), this facility is most useful when
the free indexed parameter is of a kind with a finite number of alternatives
*). WARNING: this facility may be withdrawn in the future.
It is possible for the class defining the associated type to specify a default for associated type instances. So for example, this is OK:
class IsBoolMap v where type Key v type instance Key v = Int lookupKey :: Key v -> v -> Maybe Bool instance IsBoolMap [(Int, Bool)] where lookupKey = lookup
instance declaration for the class, if no explicit
type instance declaration is given for the associated type, the default declaration
is used instead, just as with default class methods.
Note the following points:
instance keyword is optional.
There can be at most one default declaration for an associated type synonym.
A default declaration is not permitted for an associated data type.
The default declaration must mention only type variables on the left hand side, and the right hand side must mention only type varaibels bound on the left hand side. However, unlike the associated type family declaration itself, the type variables of the default instance are independent of those of the parent class.
Here are some examples:
class C a where type F1 a :: * type instance F1 a = [a] -- OK type instance F1 a = a->a -- BAD; only one default instance is allowed type F2 b a -- OK; note the family has more type -- variables than the class type instance F2 c d = c->d -- OK; you don't have to use 'a' in the type instance type F3 a type F3 [b] = b -- BAD; only type variables allowed on the LHS type F4 a type F4 b = a -- BAD; 'a' is not in scope in the RHS
The visibility of class parameters in the right-hand side of associated family instances depends solely on the parameters of the family. As an example, consider the simple class declaration
class C a b where data T a
Only one of the two class parameters is a parameter to the data family. Hence, the following instance declaration is invalid:
instance C [c] d where data T [c] = MkT (c, d) -- WRONG!! 'd' is not in scope
Here, the right-hand side of the data instance mentions the type
d that does not occur in its left-hand
side. We cannot admit such data instances as they would compromise
Associated type and data instance declarations do not inherit any
context specified on the enclosing instance. For type instance declarations,
it is unclear what the context would mean. For data instance declarations,
it is unlikely a user would want the context repeated for every data constructor.
The only place where the context might likely be useful is in a
deriving clause of an associated data instance. However,
even here, the role of the outer instance context is murky. So, for
clarity, we just stick to the rule above: the enclosing instance context
is ignored. If you need to use
a non-trivial context on a derived instance,
use a standalone
deriving clause (at the top level).
The rules for export lists (Haskell Report Section 5.2) needs adjustment for type families:
is a data family, names the family
T and all the in-scope
constructors (whether in scope qualified or unqualified) that are data
T(.., ci, .., fj, ..), where
a data family, names
T and the specified constructors
fj as usual. The constructors and field names must
belong to some data instance of
T, but are not required to belong
to the same instance.
is a class, names the class
C and all its methods
and associated types.
C(.., mi, .., type Tj, ..), where
C is a class,
names the class
C, and the specified methods
and associated types
Tj. The types need a keyword "
to distinguish them from data constructors.
Recall our running
GMapKey class example:
class GMapKey k where data GMap k :: * -> * insert :: GMap k v -> k -> v -> GMap k v lookup :: GMap k v -> k -> Maybe v empty :: GMap k v instance (GMapKey a, GMapKey b) => GMapKey (Either a b) where data GMap (Either a b) v = GMapEither (GMap a v) (GMap b v) ...method declarations...
Here are some export lists and their meaning:
module GMap( GMapKey ): Exports
just the class name.
module GMap( GMapKey(..) ):
Exports the class, the associated type
and the member
insert. The data constructors of
(in this case
GMapEither) are not exported.
module GMap( GMapKey( type GMap, empty, lookup, insert ) ):
Same as the previous item. Note the "
module GMap( GMapKey(..), GMap(..) ):
Same as previous item, but also exports all the data
module GMap ( GMapKey( empty, lookup, insert), GMap(..) ):
Same as previous item.
module GMap ( GMapKey, empty, lookup, insert, GMap(..) ):
Same as previous item.
Two things to watch out for:
You cannot write
GMapKey(type GMap(..)) — i.e.,
sub-component specifications cannot be nested. To
GMap's data constructors, you have to list
Consider this example:
module X where data family D module Y where import X data instance D Int = D1 | D2
Module Y exports all the entities defined in Y, namely the data constructors
D2, but not the data family
That (annoyingly) means that you cannot selectively import Y selectively,
import Y( D(D1,D2) )", because Y does not export
Instead you should list the exports explicitly, thus:
module Y( D(..) ) where ... or module Y( module Y, D ) where ...
Type families require us to extend the rules for the form of instance heads, which are given in Section 126.96.36.199, “Relaxed rules for the instance head”. Specifically:
Data type families may appear in an instance head
Type synonym families may not appear (at all) in an instance head
The reason for the latter restriction is that there is no way to check for instance matching. Consider
type family F a type instance F Bool = Int class C a instance C Int instance C (F a)
Now a constraint
(C (F Bool)) would match both instances.
The situation is especially bad because the type instance for
might be in another module, or even in a module that is not yet written.
However, type class instances of instances of data families can be defined much like any other data type. For example, we can say
data instance T Int = T1 Int | T2 Bool instance Eq (T Int) where (T1 i) == (T1 j) = i==j (T2 i) == (T2 j) = i==j _ == _ = False
Note that class instances are always for particular instances of a data family and never for an entire family as a whole. This is for essentially the same reasons that we cannot define a toplevel function that performs pattern matching on the data constructors of different instances of a single type family. It would require a form of extensible case construct.
Data instance declarations can also
deriving clauses. For example, we can write
data GMap () v = GMapUnit (Maybe v) deriving Show
which implicitly defines an instance of the form
instance Show v => Show (GMap () v) where ...