Table Of Contents

Previous topic

10. GHC Language Features

Next topic

10.39. Concurrent and Parallel Haskell

This Page

As with all known Haskell systems, GHC implements some extensions to the standard Haskell language. They can all be enabled or disabled by command line flags or language pragmas. By default GHC understands the most recent Haskell version it supports, plus a handful of extensions.

Some of the Glasgow extensions serve to give you access to the underlying facilities with which we implement Haskell. Thus, you can get at the Raw Iron, if you are willing to write some non-portable code at a more primitive level. You need not be “stuck” on performance because of the implementation costs of Haskell’s “high-level” features—you can always code “under” them. In an extreme case, you can write all your time-critical code in C, and then just glue it together with Haskell!

Before you get too carried away working at the lowest level (e.g., sloshing MutableByteArray#s around your program), you may wish to check if there are libraries that provide a “Haskellised veneer” over the features you want. The separate libraries documentation describes all the libraries that come with GHC.

10.1. Language options

The language extensions control what variation of the language are permitted.

Language options can be controlled in two ways:

  • Every language option can switched on by a command-line flag “-X...” (e.g. -XTemplateHaskell), and switched off by the flag “-XNo...”; (e.g. -XNoTemplateHaskell).
  • Language options recognised by Cabal can also be enabled using the LANGUAGE pragma, thus {-# LANGUAGE TemplateHaskell #-} (see LANGUAGE pragma).

GHC supports these language options:

Extension Description
AllowAmbiguousTypes Allow the user to write ambiguous types, and the type inference engine to infer them.
ApplicativeDo Enable Applicative do-notation desugaring
Arrows Enable arrow notation extension
BangPatterns Enable bang patterns.
BinaryLiterals Enable support for binary literals.
BlockArguments Allow do blocks and other constructs as function arguments.
CApiFFI Enable the CAPI calling convention.
ConstrainedClassMethods Enable constrained class methods.
ConstraintKinds Enable a kind of constraints.
CPP Enable the C preprocessor.
DataKinds Enable datatype promotion.
DatatypeContexts Allow contexts on data types.
DefaultSignatures Enable default signatures.
DeriveAnyClass Enable deriving for any class.
DeriveDataTypeable Enable deriving for the Data class. Implied by AutoDeriveTypeable.
DeriveFoldable Enable deriving for the Foldable class. Implied by DeriveTraversable.
DeriveFunctor Enable deriving for the Functor class. Implied by DeriveTraversable.
DeriveGeneric Enable deriving for the Generic class.
DeriveLift Enable deriving for the Lift class
DeriveTraversable Enable deriving for the Traversable class. Implies DeriveFunctor and DeriveFoldable.
DerivingStrategies Enables deriving strategies.
DerivingVia Enable deriving instances via types of the same runtime representation. Implies DerivingStrategies.
DisambiguateRecordFields Enable record field disambiguation. Implied by RecordWildCards.
DuplicateRecordFields Allow definition of record types with identically-named fields.
EmptyCase Allow empty case alternatives.
EmptyDataDecls Allow definition of empty data types.
ExistentialQuantification Enable liberalised type synonyms.
ExplicitForAll Enable explicit universal quantification. Implied by ScopedTypeVariables, LiberalTypeSynonyms, RankNTypes and ExistentialQuantification.
ExplicitNamespaces Enable using the keyword type to specify the namespace of entries in imports and exports (Explicit namespaces in import/export). Implied by TypeOperators and TypeFamilies.
ExtendedDefaultRules Use GHCi’s extended default rules in a normal module.
FlexibleContexts Enable flexible contexts. Implied by ImplicitParams.
FlexibleInstances Enable flexible instances. Implies TypeSynonymInstances. Implied by ImplicitParams.
ForeignFunctionInterface Enable foreign function interface.
FunctionalDependencies Enable functional dependencies. Implies MultiParamTypeClasses.
GADTs Enable generalised algebraic data types. Implies GADTSyntax and MonoLocalBinds.
GADTSyntax Enable generalised algebraic data type syntax.
GeneralisedNewtypeDeriving Enable newtype deriving.
HexFloatLiterals Enable support for hexadecimal floating point literals.
ImplicitParams Enable Implicit Parameters. Implies FlexibleContexts and FlexibleInstances.
ImplicitPrelude Don’t implicitly import Prelude. Implied by RebindableSyntax.
ImpredicativeTypes Enable impredicative types. Implies RankNTypes.
IncoherentInstances Enable incoherent instances. Implies OverlappingInstances.
InstanceSigs Enable instance signatures.
InterruptibleFFI Enable interruptible FFI.
KindSignatures Enable kind signatures. Implied by TypeFamilies and PolyKinds.
LambdaCase Enable lambda-case expressions.
LiberalTypeSynonyms Enable liberalised type synonyms.
MagicHash Allow # as a postfix modifier on identifiers.
MonadComprehensions Enable monad comprehensions.
MonadFailDesugaring Enable monadfail desugaring.
MonoLocalBinds Enable do not generalise local bindings. Implied by TypeFamilies and GADTs.
MonomorphismRestriction Disable the monomorphism restriction.
MultiParamTypeClasses Enable multi parameter type classes. Implied by FunctionalDependencies.
MultiWayIf Enable multi-way if-expressions.
NamedFieldPuns Enable record puns.
NamedWildCards Enable named wildcards.
NegativeLiterals Enable support for negative literals.
NPlusKPatterns Enable support for n+k patterns. Implied by Haskell98.
NullaryTypeClasses Deprecated, does nothing. nullary (no parameter) type classes are now enabled using MultiParamTypeClasses.
NumDecimals Enable support for ‘fractional’ integer literals.
NumericUnderscores Enable support for numeric underscores.
OverlappingInstances Enable overlapping instances.
OverloadedLabels Enable overloaded labels.
OverloadedLists Enable overloaded lists.
OverloadedStrings Enable overloaded string literals.
PackageImports Enable package-qualified imports.
ParallelListComp Enable parallel list comprehensions.
PartialTypeSignatures Enable partial type signatures.
PatternGuards Disable pattern guards. Implied by Haskell98.
PatternSynonyms Enable pattern synonyms.
PolyKinds Enable kind polymorphism. Implies KindSignatures.
PostfixOperators Enable postfix operators.
QuantifiedConstraints Allow forall quantifiers in constraints.
QuasiQuotes Enable quasiquotation.
Rank2Types Enable rank-2 types. Synonym for RankNTypes.
RankNTypes Enable rank-N types. Implied by ImpredicativeTypes.
RebindableSyntax Employ rebindable syntax. Implies NoImplicitPrelude.
RecordWildCards Enable record wildcards. Implies DisambiguateRecordFields.
RecursiveDo Enable recursive do (mdo) notation.
RoleAnnotations Enable role annotations.
Safe Enable the Safe Haskell Safe mode.
ScopedTypeVariables Enable lexically-scoped type variables.
StandaloneDeriving Enable standalone deriving.
StarIsType Treat * as Data.Kind.Type.
StaticPointers Enable static pointers.
Strict Make bindings in the current module strict by default.
StrictData Enable default strict datatype fields.
TemplateHaskell Enable Template Haskell.
TemplateHaskellQuotes Enable quotation subset of Template Haskell.
TraditionalRecordSyntax Disable support for traditional record syntax (as supported by Haskell 98) C {f = x}
TransformListComp Enable generalised list comprehensions.
Trustworthy Enable the Safe Haskell Trustworthy mode.
TupleSections Enable tuple sections.
TypeApplications Enable type application syntax.
TypeFamilies Enable type families. Implies ExplicitNamespaces, KindSignatures, and MonoLocalBinds.
TypeFamilyDependencies Enable injective type families. Implies TypeFamilies.
TypeInType Deprecated. Enable kind polymorphism and datatype promotion.
TypeOperators Enable type operators. Implies ExplicitNamespaces.
TypeSynonymInstances Enable type synonyms in instance heads. Implied by FlexibleInstances.
UnboxedSums Enable unboxed sums.
UnboxedTuples Enable the use of unboxed tuple syntax.
UndecidableInstances Enable undecidable instances.
UndecidableSuperClasses Allow all superclass constraints, including those that may result in non-termination of the typechecker.
UnicodeSyntax Enable unicode syntax.
Unsafe Enable Safe Haskell Unsafe mode.
ViewPatterns Enable view patterns.

Although not recommended, the deprecated -fglasgow-exts flag enables a large swath of the extensions supported by GHC at once.

-fglasgow-exts

The flag -fglasgow-exts is equivalent to enabling the following extensions:

Enabling these options is the only effect of -fglasgow-exts. We are trying to move away from this portmanteau flag, and towards enabling features individually.

10.2. Unboxed types and primitive operations

GHC is built on a raft of primitive data types and operations; “primitive” in the sense that they cannot be defined in Haskell itself. While you really can use this stuff to write fast code, we generally find it a lot less painful, and more satisfying in the long run, to use higher-level language features and libraries. With any luck, the code you write will be optimised to the efficient unboxed version in any case. And if it isn’t, we’d like to know about it.

All these primitive data types and operations are exported by the library GHC.Prim, for which there is detailed online documentation <GHC.Prim.>. (This documentation is generated from the file compiler/prelude/primops.txt.pp.)

If you want to mention any of the primitive data types or operations in your program, you must first import GHC.Prim to bring them into scope. Many of them have names ending in #, and to mention such names you need the MagicHash extension.

The primops make extensive use of unboxed types and unboxed tuples, which we briefly summarise here.

10.2.1. Unboxed types

Most types in GHC are boxed, which means that values of that type are represented by a pointer to a heap object. The representation of a Haskell Int, for example, is a two-word heap object. An unboxed type, however, is represented by the value itself, no pointers or heap allocation are involved.

Unboxed types correspond to the “raw machine” types you would use in C: Int# (long int), Double# (double), Addr# (void *), etc. The primitive operations (PrimOps) on these types are what you might expect; e.g., (+#) is addition on Int#s, and is the machine-addition that we all know and love—usually one instruction.

Primitive (unboxed) types cannot be defined in Haskell, and are therefore built into the language and compiler. Primitive types are always unlifted; that is, a value of a primitive type cannot be bottom. (Note: a “boxed” type means that a value is represented by a pointer to a heap object; a “lifted” type means that terms of that type may be bottom. See the next paragraph for an example.) We use the convention (but it is only a convention) that primitive types, values, and operations have a # suffix (see The magic hash). For some primitive types we have special syntax for literals, also described in the same section.

Primitive values are often represented by a simple bit-pattern, such as Int#, Float#, Double#. But this is not necessarily the case: a primitive value might be represented by a pointer to a heap-allocated object. Examples include Array#, the type of primitive arrays. Thus, Array# is an unlifted, boxed type. A primitive array is heap-allocated because it is too big a value to fit in a register, and would be too expensive to copy around; in a sense, it is accidental that it is represented by a pointer. If a pointer represents a primitive value, then it really does point to that value: no unevaluated thunks, no indirections. Nothing can be at the other end of the pointer than the primitive value. A numerically-intensive program using unboxed types can go a lot faster than its “standard” counterpart—we saw a threefold speedup on one example.

10.2.2. Unboxed type kinds

Because unboxed types are represented without the use of pointers, we cannot store them in use a polymorphic datatype at an unboxed type. For example, the Just node of Just 42# would have to be different from the Just node of Just 42; the former stores an integer directly, while the latter stores a pointer. GHC currently does not support this variety of Just nodes (nor for any other datatype). Accordingly, the kind of an unboxed type is different from the kind of a boxed type.

The Haskell Report describes that * (spelled Type and imported from Data.Kind in the GHC dialect of Haskell) is the kind of ordinary datatypes, such as Int. Furthermore, type constructors can have kinds with arrows; for example, Maybe has kind Type -> Type. Unboxed types have a kind that specifies their runtime representation. For example, the type Int# has kind TYPE 'IntRep and Double# has kind TYPE 'DoubleRep. These kinds say that the runtime representation of an Int# is a machine integer, and the runtime representation of a Double# is a machine double-precision floating point. In contrast, the kind Type is actually just a synonym for TYPE 'LiftedRep. More details of the TYPE mechanisms appear in the section on runtime representation polymorphism.

Given that Int#‘s kind is not Type, it then it follows that Maybe Int# is disallowed. Similarly, because type variables tend to be of kind Type (for example, in (.) :: (b -> c) -> (a -> b) -> a -> c, all the type variables have kind Type), polymorphism tends not to work over primitive types. Stepping back, this makes some sense, because a polymorphic function needs to manipulate the pointers to its data, and most primitive types are unboxed.

There are some restrictions on the use of primitive types:

  • You cannot define a newtype whose representation type (the argument type of the data constructor) is an unboxed type. Thus, this is illegal:

    newtype A = MkA Int#
    
  • You cannot bind a variable with an unboxed type in a top-level binding.

  • You cannot bind a variable with an unboxed type in a recursive binding.

  • You may bind unboxed variables in a (non-recursive, non-top-level) pattern binding, but you must make any such pattern-match strict. (Failing to do so emits a warning -Wunbanged-strict-patterns.) For example, rather than:

    data Foo = Foo Int Int#
    
    f x = let (Foo a b, w) = ..rhs.. in ..body..
    

    you must write:

    data Foo = Foo Int Int#
    
    f x = let !(Foo a b, w) = ..rhs.. in ..body..
    

    since b has type Int#.

10.2.3. Unboxed tuples

UnboxedTuples
Since:6.8.1

Unboxed tuples aren’t really exported by GHC.Exts; they are a syntactic extension (UnboxedTuples). An unboxed tuple looks like this:

(# e_1, ..., e_n #)

where e_1..e_n are expressions of any type (primitive or non-primitive). The type of an unboxed tuple looks the same.

Note that when unboxed tuples are enabled, (# is a single lexeme, so for example when using operators like # and #- you need to write ( # ) and ( #- ) rather than (#) and (#-).

Unboxed tuples are used for functions that need to return multiple values, but they avoid the heap allocation normally associated with using fully-fledged tuples. When an unboxed tuple is returned, the components are put directly into registers or on the stack; the unboxed tuple itself does not have a composite representation. Many of the primitive operations listed in primops.txt.pp return unboxed tuples. In particular, the IO and ST monads use unboxed tuples to avoid unnecessary allocation during sequences of operations.

There are some restrictions on the use of unboxed tuples:

  • The typical use of unboxed tuples is simply to return multiple values, binding those multiple results with a case expression, thus:

    f x y = (# x+1, y-1 #)
    g x = case f x x of { (# a, b #) -> a + b }
    

    You can have an unboxed tuple in a pattern binding, thus

    f x = let (# p,q #) = h x in ..body..
    

    If the types of p and q are not unboxed, the resulting binding is lazy like any other Haskell pattern binding. The above example desugars like this:

    f x = let t = case h x of { (# p,q #) -> (p,q) }
              p = fst t
              q = snd t
          in ..body..
    

    Indeed, the bindings can even be recursive.

10.2.4. Unboxed sums

UnboxedSums
Since:8.2.1

Enable the use of unboxed sum syntax.

-XUnboxedSums enables new syntax for anonymous, unboxed sum types. The syntax for an unboxed sum type with N alternatives is

(# t_1 | t_2 | ... | t_N #)

where t_1 ... t_N are types (which can be unlifted, including unboxed tuples and sums).

Unboxed tuples can be used for multi-arity alternatives. For example:

(# (# Int, String #) | Bool #)

The term level syntax is similar. Leading and preceding bars (|) indicate which alternative it is. Here are two terms of the type shown above:

(# (# 1, "foo" #) | #) -- first alternative

(# | True #) -- second alternative

The pattern syntax reflects the term syntax:

case x of
  (# (# i, str #) | #) -> ...
  (# | bool #) -> ...

Unboxed sums are “unboxed” in the sense that, instead of allocating sums in the heap and representing values as pointers, unboxed sums are represented as their components, just like unboxed tuples. These “components” depend on alternatives of a sum type. Like unboxed tuples, unboxed sums are lazy in their lifted components.

The code generator tries to generate as compact layout as possible for each unboxed sum. In the best case, size of an unboxed sum is size of its biggest alternative plus one word (for a tag). The algorithm for generating the memory layout for a sum type works like this:

  • All types are classified as one of these classes: 32bit word, 64bit word, 32bit float, 64bit float, pointer.

  • For each alternative of the sum type, a layout that consists of these fields is generated. For example, if an alternative has Int, Float# and String fields, the layout will have an 32bit word, 32bit float and pointer fields.

  • Layout fields are then overlapped so that the final layout will be as compact as possible. For example, suppose we have the unboxed sum:

    (# (# Word32#, String, Float# #)
    |  (# Float#, Float#, Maybe Int #) #)
    

    The final layout will be something like

    Int32, Float32, Float32, Word32, Pointer
    

    The first Int32 is for the tag. There are two Float32 fields because floating point types can’t overlap with other types, because of limitations of the code generator that we’re hoping to overcome in the future. The second alternative needs two Float32 fields: The Word32 field is for the Word32# in the first alternative. The Pointer field is shared between String and Maybe Int values of the alternatives.

    As another example, this is the layout for the unboxed version of Maybe a type, (# (# #) | a #):

    Int32, Pointer
    

    The Pointer field is not used when tag says that it’s Nothing. Otherwise Pointer points to the value in Just. As mentioned above, this type is lazy in its lifted field. Therefore, the type

    data Maybe' a = Maybe' (# (# #) | a #)
    

    is precisely isomorphic to the type Maybe a, although its memory representation is different.

    In the degenerate case where all the alternatives have zero width, such as the Bool-like (# (# #) | (# #) #), the unboxed sum layout only has an Int32 tag field (i.e., the whole thing is represented by an integer).

10.3. Syntactic extensions

10.3.1. Unicode syntax

UnicodeSyntax
Since:6.8.1

Enable the use of Unicode characters in place of their equivalent ASCII sequences.

The language extension UnicodeSyntax enables Unicode characters to be used to stand for certain ASCII character sequences. The following alternatives are provided:

ASCII Unicode alternative Code point Name
:: 0x2237 PROPORTION
=> 0x21D2 RIGHTWARDS DOUBLE ARROW
-> 0x2192 RIGHTWARDS ARROW
<- 0x2190 LEFTWARDS ARROW
>- 0x291a RIGHTWARDS ARROW-TAIL
-< 0x2919 LEFTWARDS ARROW-TAIL
>>- 0x291C RIGHTWARDS DOUBLE ARROW-TAIL
-<< 0x291B LEFTWARDS DOUBLE ARROW-TAIL
* 0x2605 BLACK STAR
forall 0x2200 FOR ALL
(| 0x2987 Z NOTATION LEFT IMAGE BRACKET
|) 0x2988 Z NOTATION RIGHT IMAGE BRACKET
[| 0x27E6 MATHEMATICAL LEFT WHITE SQUARE BRACKET
|] 0x27E7 MATHEMATICAL RIGHT WHITE SQUARE BRACKET

10.3.2. The magic hash

MagicHash
Since:6.8.1

Enables the use of the hash character (#) as an identifier suffix.

The language extension MagicHash allows # as a postfix modifier to identifiers. Thus, x# is a valid variable, and T# is a valid type constructor or data constructor.

The hash sign does not change semantics at all. We tend to use variable names ending in “#” for unboxed values or types (e.g. Int#), but there is no requirement to do so; they are just plain ordinary variables. Nor does the MagicHash extension bring anything into scope. For example, to bring Int# into scope you must import GHC.Prim (see Unboxed types and primitive operations); the MagicHash extension then allows you to refer to the Int# that is now in scope. Note that with this option, the meaning of x#y = 0 is changed: it defines a function x# taking a single argument y; to define the operator #, put a space: x # y = 0.

The MagicHash also enables some new forms of literals (see Unboxed types):

  • 'x'# has type Char#
  • "foo"# has type Addr#
  • 3# has type Int#. In general, any Haskell integer lexeme followed by a # is an Int# literal, e.g. -0x3A# as well as 32#.
  • 3## has type Word#. In general, any non-negative Haskell integer lexeme followed by ## is a Word#.
  • 3.2# has type Float#.
  • 3.2## has type Double#

10.3.3. Negative literals

NegativeLiterals
Since:7.8.1

Enable the use of un-parenthesized negative numeric literals.

The literal -123 is, according to Haskell98 and Haskell 2010, desugared as negate (fromInteger 123). The language extension NegativeLiterals means that it is instead desugared as fromInteger (-123).

This can make a difference when the positive and negative range of a numeric data type don’t match up. For example, in 8-bit arithmetic -128 is representable, but +128 is not. So negate (fromInteger 128) will elicit an unexpected integer-literal-overflow message.

10.3.4. Fractional looking integer literals

NumDecimals
Since:7.8.1

Allow the use of floating-point literal syntax for integral types.

Haskell 2010 and Haskell 98 define floating literals with the syntax 1.2e6. These literals have the type Fractional a => a.

The language extension NumDecimals allows you to also use the floating literal syntax for instances of Integral, and have values like (1.2e6 :: Num a => a)

10.3.5. Binary integer literals

BinaryLiterals
Since:7.10.1

Allow the use of binary notation in integer literals.

Haskell 2010 and Haskell 98 allows for integer literals to be given in decimal, octal (prefixed by 0o or 0O), or hexadecimal notation (prefixed by 0x or 0X).

The language extension BinaryLiterals adds support for expressing integer literals in binary notation with the prefix 0b or 0B. For instance, the binary integer literal 0b11001001 will be desugared into fromInteger 201 when BinaryLiterals is enabled.

10.3.6. Hexadecimal floating point literals

HexFloatLiterals
Since:8.4.1

Allow writing floating point literals using hexadecimal notation.

The hexadecimal notation for floating point literals is useful when you need to specify floating point constants precisely, as the literal notation corresponds closely to the underlying bit-encoding of the number.

In this notation floating point numbers are written using hexadecimal digits, and so the digits are interpreted using base 16, rather then the usual 10. This means that digits left of the decimal point correspond to positive powers of 16, while the ones to the right correspond to negative ones.

You may also write an explicit exponent, which is similar to the exponent in decimal notation with the following differences: - the exponent begins with p instead of e - the exponent is written in base 10 (not 16) - the base of the exponent is 2 (not 16).

In terms of the underlying bit encoding, each hexadecimal digit corresponds to 4 bits, and you may think of the exponent as “moving” the floating point by one bit left (negative) or right (positive). Here are some examples:

  • 0x0.1 is the same as 1/16
  • 0x0.01 is the same as 1/256
  • 0xF.FF is the same as 15 + 15/16 + 15/256
  • 0x0.1p4 is the same as 1
  • 0x0.1p-4 is the same as 1/256
  • 0x0.1p12 is the same as 256

10.3.7. Numeric underscores

NumericUnderscores
Since:8.6.1

Allow the use of underscores in numeric literals.

GHC allows for numeric literals to be given in decimal, octal, hexadecimal, binary, or float notation.

The language extension NumericUnderscores adds support for expressing underscores in numeric literals. For instance, the numeric literal 1_000_000 will be parsed into 1000000 when NumericUnderscores is enabled. That is, underscores in numeric literals are ignored when NumericUnderscores is enabled. See also Trac #14473.

For example:

-- decimal
million    = 1_000_000
billion    = 1_000_000_000
lightspeed = 299_792_458
version    = 8_04_1
date       = 2017_12_31

-- hexadecimal
red_mask = 0xff_00_00
size1G   = 0x3fff_ffff

-- binary
bit8th   = 0b01_0000_0000
packbits = 0b1_11_01_0000_0_111
bigbits  = 0b1100_1011__1110_1111__0101_0011

-- float
pi       = 3.141_592_653_589_793
faraday  = 96_485.332_89
avogadro = 6.022_140_857e+23

-- function
isUnderMillion = (< 1_000_000)

clip64M x
    | x > 0x3ff_ffff = 0x3ff_ffff
    | otherwise = x

test8bit x = (0b01_0000_0000 .&. x) /= 0

About validity:

x0 = 1_000_000   -- valid
x1 = 1__000000   -- valid
x2 = 1000000_    -- invalid
x3 = _1000000    -- invalid

e0 = 0.0001      -- valid
e1 = 0.000_1     -- valid
e2 = 0_.0001     -- invalid
e3 = _0.0001     -- invalid
e4 = 0._0001     -- invalid
e5 = 0.0001_     -- invalid

f0 = 1e+23       -- valid
f1 = 1_e+23      -- valid
f2 = 1__e+23     -- valid
f3 = 1e_+23      -- invalid

g0 = 1e+23       -- valid
g1 = 1e+_23      -- invalid
g2 = 1e+23_      -- invalid

h0 = 0xffff      -- valid
h1 = 0xff_ff     -- valid
h2 = 0x_ffff     -- valid
h3 = 0x__ffff    -- valid
h4 = _0xffff     -- invalid

10.3.8. Pattern guards

NoPatternGuards
Implied by:Haskell98
Since:6.8.1

Disable pattern guards.

10.3.9. View patterns

ViewPatterns
Since:6.10.1

Allow use of view pattern syntax.

View patterns are enabled by the language extension ViewPatterns. More information and examples of view patterns can be found on the Wiki page.

View patterns are somewhat like pattern guards that can be nested inside of other patterns. They are a convenient way of pattern-matching against values of abstract types. For example, in a programming language implementation, we might represent the syntax of the types of the language as follows:

type Typ

data TypView = Unit
             | Arrow Typ Typ

view :: Typ -> TypView

-- additional operations for constructing Typ's ...

The representation of Typ is held abstract, permitting implementations to use a fancy representation (e.g., hash-consing to manage sharing). Without view patterns, using this signature is a little inconvenient:

size :: Typ -> Integer
size t = case view t of
  Unit -> 1
  Arrow t1 t2 -> size t1 + size t2

It is necessary to iterate the case, rather than using an equational function definition. And the situation is even worse when the matching against t is buried deep inside another pattern.

View patterns permit calling the view function inside the pattern and matching against the result:

size (view -> Unit) = 1
size (view -> Arrow t1 t2) = size t1 + size t2

That is, we add a new form of pattern, written ⟨expression⟩ -> ⟨pattern⟩ that means “apply the expression to whatever we’re trying to match against, and then match the result of that application against the pattern”. The expression can be any Haskell expression of function type, and view patterns can be used wherever patterns are used.

The semantics of a pattern ( ⟨exp⟩ -> ⟨pat⟩ ) are as follows:

  • Scoping: The variables bound by the view pattern are the variables bound by ⟨pat⟩.

    Any variables in ⟨exp⟩ are bound occurrences, but variables bound “to the left” in a pattern are in scope. This feature permits, for example, one argument to a function to be used in the view of another argument. For example, the function clunky from Pattern guards can be written using view patterns as follows:

    clunky env (lookup env -> Just val1) (lookup env -> Just val2) = val1 + val2
    ...other equations for clunky...
    

    More precisely, the scoping rules are:

    • In a single pattern, variables bound by patterns to the left of a view pattern expression are in scope. For example:

      example :: Maybe ((String -> Integer,Integer), String) -> Bool
      example Just ((f,_), f -> 4) = True
      

      Additionally, in function definitions, variables bound by matching earlier curried arguments may be used in view pattern expressions in later arguments:

      example :: (String -> Integer) -> String -> Bool
      example f (f -> 4) = True
      

      That is, the scoping is the same as it would be if the curried arguments were collected into a tuple.

    • In mutually recursive bindings, such as let, where, or the top level, view patterns in one declaration may not mention variables bound by other declarations. That is, each declaration must be self-contained. For example, the following program is not allowed:

      let {(x -> y) = e1 ;
           (y -> x) = e2 } in x
      

    (For some amplification on this design choice see Trac #4061.

  • Typing: If ⟨exp⟩ has type ⟨T1⟩ -> ⟨T2⟩ and ⟨pat⟩ matches a ⟨T2⟩, then the whole view pattern matches a ⟨T1⟩.

  • Matching: To the equations in Section 3.17.3 of the Haskell 98 Report, add the following:

    case v of { (e -> p) -> e1 ; _ -> e2 }
     =
    case (e v) of { p -> e1 ; _ -> e2 }
    

    That is, to match a variable ⟨v⟩ against a pattern ( ⟨exp⟩ -> ⟨pat⟩ ), evaluate ( ⟨exp⟩ ⟨v⟩ ) and match the result against ⟨pat⟩.

  • Efficiency: When the same view function is applied in multiple branches of a function definition or a case expression (e.g., in size above), GHC makes an attempt to collect these applications into a single nested case expression, so that the view function is only applied once. Pattern compilation in GHC follows the matrix algorithm described in Chapter 4 of The Implementation of Functional Programming Languages. When the top rows of the first column of a matrix are all view patterns with the “same” expression, these patterns are transformed into a single nested case. This includes, for example, adjacent view patterns that line up in a tuple, as in

    f ((view -> A, p1), p2) = e1
    f ((view -> B, p3), p4) = e2
    

    The current notion of when two view pattern expressions are “the same” is very restricted: it is not even full syntactic equality. However, it does include variables, literals, applications, and tuples; e.g., two instances of view ("hi", "there") will be collected. However, the current implementation does not compare up to alpha-equivalence, so two instances of (x, view x -> y) will not be coalesced.

10.3.10. n+k patterns

NPlusKPatterns
Implied by:Haskell98
Since:6.12.1

Enable use of n+k patterns.

10.3.11. The recursive do-notation

RecursiveDo
Since:6.8.1

Allow the use of recursive do notation.

The do-notation of Haskell 98 does not allow recursive bindings, that is, the variables bound in a do-expression are visible only in the textually following code block. Compare this to a let-expression, where bound variables are visible in the entire binding group.

It turns out that such recursive bindings do indeed make sense for a variety of monads, but not all. In particular, recursion in this sense requires a fixed-point operator for the underlying monad, captured by the mfix method of the MonadFix class, defined in Control.Monad.Fix as follows:

class Monad m => MonadFix m where
   mfix :: (a -> m a) -> m a

Haskell’s Maybe, [] (list), ST (both strict and lazy versions), IO, and many other monads have MonadFix instances. On the negative side, the continuation monad, with the signature (a -> r) -> r, does not.

For monads that do belong to the MonadFix class, GHC provides an extended version of the do-notation that allows recursive bindings. The RecursiveDo (language pragma: RecursiveDo) provides the necessary syntactic support, introducing the keywords mdo and rec for higher and lower levels of the notation respectively. Unlike bindings in a do expression, those introduced by mdo and rec are recursively defined, much like in an ordinary let-expression. Due to the new keyword mdo, we also call this notation the mdo-notation.

Here is a simple (albeit contrived) example:

{-# LANGUAGE RecursiveDo #-}
justOnes = mdo { xs <- Just (1:xs)
               ; return (map negate xs) }

or equivalently

{-# LANGUAGE RecursiveDo #-}
justOnes = do { rec { xs <- Just (1:xs) }
              ; return (map negate xs) }

As you can guess justOnes will evaluate to Just [-1,-1,-1,....

GHC’s implementation the mdo-notation closely follows the original translation as described in the paper A recursive do for Haskell, which in turn is based on the work Value Recursion in Monadic Computations. Furthermore, GHC extends the syntax described in the former paper with a lower level syntax flagged by the rec keyword, as we describe next.

10.3.11.1. Recursive binding groups

The extension RecursiveDo also introduces a new keyword rec, which wraps a mutually-recursive group of monadic statements inside a do expression, producing a single statement. Similar to a let statement inside a do, variables bound in the rec are visible throughout the rec group, and below it. For example, compare

do { a <- getChar            do { a <- getChar
   ; let { r1 = f a r2          ; rec { r1 <- f a r2
   ;     ; r2 = g r1 }          ;     ; r2 <- g r1 }
   ; return (r1 ++ r2) }        ; return (r1 ++ r2) }

In both cases, r1 and r2 are available both throughout the let or rec block, and in the statements that follow it. The difference is that let is non-monadic, while rec is monadic. (In Haskell let is really letrec, of course.)

The semantics of rec is fairly straightforward. Whenever GHC finds a rec group, it will compute its set of bound variables, and will introduce an appropriate call to the underlying monadic value-recursion operator mfix, belonging to the MonadFix class. Here is an example:

rec { b <- f a c     ===>    (b,c) <- mfix (\ ~(b,c) -> do { b <- f a c
    ; c <- f b a }                                         ; c <- f b a
                                                           ; return (b,c) })

As usual, the meta-variables b, c etc., can be arbitrary patterns. In general, the statement rec ss is desugared to the statement

vs <- mfix (\ ~vs -> do { ss; return vs })

where vs is a tuple of the variables bound by ss.

Note in particular that the translation for a rec block only involves wrapping a call to mfix: it performs no other analysis on the bindings. The latter is the task for the mdo notation, which is described next.

10.3.11.2. The mdo notation

A rec-block tells the compiler where precisely the recursive knot should be tied. It turns out that the placement of the recursive knots can be rather delicate: in particular, we would like the knots to be wrapped around as minimal groups as possible. This process is known as segmentation, and is described in detail in Section 3.2 of A recursive do for Haskell. Segmentation improves polymorphism and reduces the size of the recursive knot. Most importantly, it avoids unnecessary interference caused by a fundamental issue with the so-called right-shrinking axiom for monadic recursion. In brief, most monads of interest (IO, strict state, etc.) do not have recursion operators that satisfy this axiom, and thus not performing segmentation can cause unnecessary interference, changing the termination behavior of the resulting translation. (Details can be found in Sections 3.1 and 7.2.2 of Value Recursion in Monadic Computations.)

The mdo notation removes the burden of placing explicit rec blocks in the code. Unlike an ordinary do expression, in which variables bound by statements are only in scope for later statements, variables bound in an mdo expression are in scope for all statements of the expression. The compiler then automatically identifies minimal mutually recursively dependent segments of statements, treating them as if the user had wrapped a rec qualifier around them.

The definition is syntactic:

  • A generator ⟨g⟩ depends on a textually following generator ⟨g’⟩, if
    • ⟨g’⟩ defines a variable that is used by ⟨g⟩, or
    • ⟨g’⟩ textually appears between ⟨g⟩ and ⟨g’‘⟩, where ⟨g⟩ depends on ⟨g’‘⟩.
  • A segment of a given mdo-expression is a minimal sequence of generators such that no generator of the sequence depends on an outside generator. As a special case, although it is not a generator, the final expression in an mdo-expression is considered to form a segment by itself.

Segments in this sense are related to strongly-connected components analysis, with the exception that bindings in a segment cannot be reordered and must be contiguous.

Here is an example mdo-expression, and its translation to rec blocks:

mdo { a <- getChar      ===> do { a <- getChar
    ; b <- f a c                ; rec { b <- f a c
    ; c <- f b a                ;     ; c <- f b a }
    ; z <- h a b                ; z <- h a b
    ; d <- g d e                ; rec { d <- g d e
    ; e <- g a z                ;     ; e <- g a z }
    ; putChar c }               ; putChar c }

Note that a given mdo expression can cause the creation of multiple rec blocks. If there are no recursive dependencies, mdo will introduce no rec blocks. In this latter case an mdo expression is precisely the same as a do expression, as one would expect.

In summary, given an mdo expression, GHC first performs segmentation, introducing rec blocks to wrap over minimal recursive groups. Then, each resulting rec is desugared, using a call to Control.Monad.Fix.mfix as described in the previous section. The original mdo-expression typechecks exactly when the desugared version would do so.

Here are some other important points in using the recursive-do notation:

  • It is enabled with the extension RecursiveDo, or the LANGUAGE RecursiveDo pragma. (The same extension enables both mdo-notation, and the use of rec blocks inside do expressions.)
  • rec blocks can also be used inside mdo-expressions, which will be treated as a single statement. However, it is good style to either use mdo or rec blocks in a single expression.
  • If recursive bindings are required for a monad, then that monad must be declared an instance of the MonadFix class.
  • The following instances of MonadFix are automatically provided: List, Maybe, IO. Furthermore, the Control.Monad.ST and Control.Monad.ST.Lazy modules provide the instances of the MonadFix class for Haskell’s internal state monad (strict and lazy, respectively).
  • Like let and where bindings, name shadowing is not allowed within an mdo-expression or a rec-block; that is, all the names bound in a single rec must be distinct. (GHC will complain if this is not the case.)

10.3.12. Applicative do-notation

ApplicativeDo
Since:8.0.1

Allow use of Applicative do notation.

The language option ApplicativeDo enables an alternative translation for the do-notation, which uses the operators <$>, <*>, along with join as far as possible. There are two main reasons for wanting to do this:

  • We can use do-notation with types that are an instance of Applicative and Functor, but not Monad
  • In some monads, using the applicative operators is more efficient than monadic bind. For example, it may enable more parallelism.

Applicative do-notation desugaring preserves the original semantics, provided that the Applicative instance satisfies <*> = ap and pure = return (these are true of all the common monadic types). Thus, you can normally turn on ApplicativeDo without fear of breaking your program. There is one pitfall to watch out for; see Things to watch out for.

There are no syntactic changes with ApplicativeDo. The only way it shows up at the source level is that you can have a do expression that doesn’t require a Monad constraint. For example, in GHCi:

Prelude> :set -XApplicativeDo
Prelude> :t \m -> do { x <- m; return (not x) }
\m -> do { x <- m; return (not x) }
  :: Functor f => f Bool -> f Bool

This example only requires Functor, because it is translated into (\x -> not x) <$> m. A more complex example requires Applicative,

Prelude> :t \m -> do { x <- m 'a'; y <- m 'b'; return (x || y) }
\m -> do { x <- m 'a'; y <- m 'b'; return (x || y) }
  :: Applicative f => (Char -> f Bool) -> f Bool

Here GHC has translated the expression into

(\x y -> x || y) <$> m 'a' <*> m 'b'

It is possible to see the actual translation by using -ddump-ds, but be warned, the output is quite verbose.

Note that if the expression can’t be translated into uses of <$>, <*> only, then it will incur a Monad constraint as usual. This happens when there is a dependency on a value produced by an earlier statement in the do-block:

Prelude> :t \m -> do { x <- m True; y <- m x; return (x || y) }
\m -> do { x <- m True; y <- m x; return (x || y) }
  :: Monad m => (Bool -> m Bool) -> m Bool

Here, m x depends on the value of x produced by the first statement, so the expression cannot be translated using <*>.

In general, the rule for when a do statement incurs a Monad constraint is as follows. If the do-expression has the following form:

do p1 <- E1; ...; pn <- En; return E

where none of the variables defined by p1...pn are mentioned in E1...En, and p1...pn are all variables or lazy patterns, then the expression will only require Applicative. Otherwise, the expression will require Monad. The block may return a pure expression E depending upon the results p1...pn with either return or pure.

Note: the final statement must match one of these patterns exactly:

  • return E
  • return $ E
  • pure E
  • pure $ E

otherwise GHC cannot recognise it as a return statement, and the transformation to use <$> that we saw above does not apply. In particular, slight variations such as return . Just $ x or let x = e in return x would not be recognised.

If the final statement is not of one of these forms, GHC falls back to standard do desugaring, and the expression will require a Monad constraint.

When the statements of a do expression have dependencies between them, and ApplicativeDo cannot infer an Applicative type, it uses a heuristic algorithm to try to use <*> as much as possible. This algorithm usually finds the best solution, but in rare complex cases it might miss an opportunity. There is an algorithm that finds the optimal solution, provided as an option:

-foptimal-applicative-do
Since:8.0.1

Enables an alternative algorithm for choosing where to use <*> in conjunction with the ApplicativeDo language extension. This algorithm always finds the optimal solution, but it is expensive: O(n^3), so this option can lead to long compile times when there are very large do expressions (over 100 statements). The default ApplicativeDo algorithm is O(n^2).

10.3.12.1. Strict patterns

A strict pattern match in a bind statement prevents ApplicativeDo from transforming that statement to use Applicative. This is because the transformation would change the semantics by making the expression lazier.

For example, this code will require a Monad constraint:

> :t \m -> do { (x:xs) <- m; return x }
\m -> do { (x:xs) <- m; return x } :: Monad m => m [b] -> m b

but making the pattern match lazy allows it to have a Functor constraint:

> :t \m -> do { ~(x:xs) <- m; return x }
\m -> do { ~(x:xs) <- m; return x } :: Functor f => f [b] -> f b

A “strict pattern match” is any pattern match that can fail. For example, (), (x:xs), !z, and C x are strict patterns, but x and ~(1,2) are not. For the purposes of ApplicativeDo, a pattern match against a newtype constructor is considered strict.

When there’s a strict pattern match in a sequence of statements, ApplicativeDo places a >>= between that statement and the one that follows it. The sequence may be transformed to use <*> elsewhere, but the strict pattern match and the following statement will always be connected with >>=, to retain the same strictness semantics as the standard do-notation. If you don’t want this, simply put a ~ on the pattern match to make it lazy.

10.3.12.2. Things to watch out for

Your code should just work as before when ApplicativeDo is enabled, provided you use conventional Applicative instances. However, if you define a Functor or Applicative instance using do-notation, then it will likely get turned into an infinite loop by GHC. For example, if you do this:

instance Functor MyType where
    fmap f m = do x <- m; return (f x)

Then applicative desugaring will turn it into

instance Functor MyType where
    fmap f m = fmap (\x -> f x) m

And the program will loop at runtime. Similarly, an Applicative instance like this

instance Applicative MyType where
    pure = return
    x <*> y = do f <- x; a <- y; return (f a)

will result in an infinte loop when <*> is called.

Just as you wouldn’t define a Monad instance using the do-notation, you shouldn’t define Functor or Applicative instance using do-notation (when using ApplicativeDo) either. The correct way to define these instances in terms of Monad is to use the Monad operations directly, e.g.

instance Functor MyType where
    fmap f m = m >>= return . f

instance Applicative MyType where
    pure = return
    (<*>) = ap

10.3.13. Parallel List Comprehensions

ParallelListComp
Since:6.8.1

Allow parallel list comprehension syntax.

Parallel list comprehensions are a natural extension to list comprehensions. List comprehensions can be thought of as a nice syntax for writing maps and filters. Parallel comprehensions extend this to include the zipWith family.

A parallel list comprehension has multiple independent branches of qualifier lists, each separated by a | symbol. For example, the following zips together two lists:

[ (x, y) | x <- xs | y <- ys ]

The behaviour of parallel list comprehensions follows that of zip, in that the resulting list will have the same length as the shortest branch.

We can define parallel list comprehensions by translation to regular comprehensions. Here’s the basic idea:

Given a parallel comprehension of the form:

[ e | p1 <- e11, p2 <- e12, ...
    | q1 <- e21, q2 <- e22, ...
    ...
]

This will be translated to:

[ e | ((p1,p2), (q1,q2), ...) <- zipN [(p1,p2) | p1 <- e11, p2 <- e12, ...]
                                      [(q1,q2) | q1 <- e21, q2 <- e22, ...]
                                      ...
]

where zipN is the appropriate zip for the given number of branches.

10.3.14. Generalised (SQL-like) List Comprehensions

TransformListComp
Since:6.10.1

Allow use of generalised list (SQL-like) comprehension syntax. This introduces the group, by, and using keywords.

Generalised list comprehensions are a further enhancement to the list comprehension syntactic sugar to allow operations such as sorting and grouping which are familiar from SQL. They are fully described in the paper Comprehensive comprehensions: comprehensions with “order by” and “group by”, except that the syntax we use differs slightly from the paper.

The extension is enabled with the extension TransformListComp.

Here is an example:

employees = [ ("Simon", "MS", 80)
            , ("Erik", "MS", 100)
            , ("Phil", "Ed", 40)
            , ("Gordon", "Ed", 45)
            , ("Paul", "Yale", 60) ]

output = [ (the dept, sum salary)
         | (name, dept, salary) <- employees
         , then group by dept using groupWith
         , then sortWith by (sum salary)
         , then take 5 ]

In this example, the list output would take on the value:

[("Yale", 60), ("Ed", 85), ("MS", 180)]

There are three new keywords: group, by, and using. (The functions sortWith and groupWith are not keywords; they are ordinary functions that are exported by GHC.Exts.)

There are five new forms of comprehension qualifier, all introduced by the (existing) keyword then:

  • then f
    

    This statement requires that f have the type forall a. [a] -> [a] . You can see an example of its use in the motivating example, as this form is used to apply take 5 .

  • then f by e
    

    This form is similar to the previous one, but allows you to create a function which will be passed as the first argument to f. As a consequence f must have the type forall a. (a -> t) -> [a] -> [a]. As you can see from the type, this function lets f “project out” some information from the elements of the list it is transforming.

    An example is shown in the opening example, where sortWith is supplied with a function that lets it find out the sum salary for any item in the list comprehension it transforms.

  • then group by e using f
    

    This is the most general of the grouping-type statements. In this form, f is required to have type forall a. (a -> t) -> [a] -> [[a]]. As with the then f by e case above, the first argument is a function supplied to f by the compiler which lets it compute e on every element of the list being transformed. However, unlike the non-grouping case, f additionally partitions the list into a number of sublists: this means that at every point after this statement, binders occurring before it in the comprehension refer to lists of possible values, not single values. To help understand this, let’s look at an example:

    -- This works similarly to groupWith in GHC.Exts, but doesn't sort its input first
    groupRuns :: Eq b => (a -> b) -> [a] -> [[a]]
    groupRuns f = groupBy (\x y -> f x == f y)
    
    output = [ (the x, y)
    | x <- ([1..3] ++ [1..2])
    , y <- [4..6]
    , then group by x using groupRuns ]
    

    This results in the variable output taking on the value below:

    [(1, [4, 5, 6]), (2, [4, 5, 6]), (3, [4, 5, 6]), (1, [4, 5, 6]), (2, [4, 5, 6])]
    

    Note that we have used the the function to change the type of x from a list to its original numeric type. The variable y, in contrast, is left unchanged from the list form introduced by the grouping.

  • then group using f
    

    With this form of the group statement, f is required to simply have the type forall a. [a] -> [[a]], which will be used to group up the comprehension so far directly. An example of this form is as follows:

    output = [ x
    | y <- [1..5]
    , x <- "hello"
    , then group using inits]
    

    This will yield a list containing every prefix of the word “hello” written out 5 times:

    ["","h","he","hel","hell","hello","helloh","hellohe","hellohel","hellohell","hellohello","hellohelloh",...]
    

10.3.15. Monad comprehensions

MonadComprehensions
Since:7.2.1

Enable list comprehension syntax for arbitrary monads.

Monad comprehensions generalise the list comprehension notation, including parallel comprehensions (Parallel List Comprehensions) and transform comprehensions (Generalised (SQL-like) List Comprehensions) to work for any monad.

Monad comprehensions support:

  • Bindings:

    [ x + y | x <- Just 1, y <- Just 2 ]
    

    Bindings are translated with the (>>=) and return functions to the usual do-notation:

    do x <- Just 1
       y <- Just 2
       return (x+y)
    
  • Guards:

    [ x | x <- [1..10], x <= 5 ]
    

    Guards are translated with the guard function, which requires a MonadPlus instance:

    do x <- [1..10]
       guard (x <= 5)
       return x
    
  • Transform statements (as with TransformListComp):

    [ x+y | x <- [1..10], y <- [1..x], then take 2 ]
    

    This translates to:

    do (x,y) <- take 2 (do x <- [1..10]
                           y <- [1..x]
                           return (x,y))
       return (x+y)
    
  • Group statements (as with TransformListComp):

    [ x | x <- [1,1,2,2,3], then group by x using GHC.Exts.groupWith ]
    [ x | x <- [1,1,2,2,3], then group using myGroup ]
    
  • Parallel statements (as with ParallelListComp):

    [ (x+y) | x <- [1..10]
            | y <- [11..20]
            ]
    

    Parallel statements are translated using the mzip function, which requires a MonadZip instance defined in Control.Monad.Zip:

    do (x,y) <- mzip (do x <- [1..10]
                         return x)
                     (do y <- [11..20]
                         return y)
       return (x+y)
    

All these features are enabled by default if the MonadComprehensions extension is enabled. The types and more detailed examples on how to use comprehensions are explained in the previous chapters Generalised (SQL-like) List Comprehensions and Parallel List Comprehensions. In general you just have to replace the type [a] with the type Monad m => m a for monad comprehensions.

Note

Even though most of these examples are using the list monad, monad comprehensions work for any monad. The base package offers all necessary instances for lists, which make MonadComprehensions backward compatible to built-in, transform and parallel list comprehensions.

More formally, the desugaring is as follows. We write D[ e | Q] to mean the desugaring of the monad comprehension [ e | Q]:

Expressions: e
Declarations: d
Lists of qualifiers: Q,R,S

-- Basic forms
D[ e | ]               = return e
D[ e | p <- e, Q ]  = e >>= \p -> D[ e | Q ]
D[ e | e, Q ]          = guard e >> \p -> D[ e | Q ]
D[ e | let d, Q ]      = let d in D[ e | Q ]

-- Parallel comprehensions (iterate for multiple parallel branches)
D[ e | (Q | R), S ]    = mzip D[ Qv | Q ] D[ Rv | R ] >>= \(Qv,Rv) -> D[ e | S ]

-- Transform comprehensions
D[ e | Q then f, R ]                  = f D[ Qv | Q ] >>= \Qv -> D[ e | R ]

D[ e | Q then f by b, R ]             = f (\Qv -> b) D[ Qv | Q ] >>= \Qv -> D[ e | R ]

D[ e | Q then group using f, R ]      = f D[ Qv | Q ] >>= \ys ->
                                        case (fmap selQv1 ys, ..., fmap selQvn ys) of
                                         Qv -> D[ e | R ]

D[ e | Q then group by b using f, R ] = f (\Qv -> b) D[ Qv | Q ] >>= \ys ->
                                        case (fmap selQv1 ys, ..., fmap selQvn ys) of
                                           Qv -> D[ e | R ]

where  Qv is the tuple of variables bound by Q (and used subsequently)
       selQvi is a selector mapping Qv to the ith component of Qv

Operator     Standard binding       Expected type
--------------------------------------------------------------------
return       GHC.Base               t1 -> m t2
(>>=)        GHC.Base               m1 t1 -> (t2 -> m2 t3) -> m3 t3
(>>)         GHC.Base               m1 t1 -> m2 t2         -> m3 t3
guard        Control.Monad          t1 -> m t2
fmap         GHC.Base               forall a b. (a->b) -> n a -> n b
mzip         Control.Monad.Zip      forall a b. m a -> m b -> m (a,b)

The comprehension should typecheck when its desugaring would typecheck, except that (as discussed in Generalised (SQL-like) List Comprehensions) in the “then f” and “then group using f” clauses, when the “by b” qualifier is omitted, argument f should have a polymorphic type. In particular, “then Data.List.sort” and “then group using Data.List.group” are insufficiently polymorphic.

Monad comprehensions support rebindable syntax (Rebindable syntax and the implicit Prelude import). Without rebindable syntax, the operators from the “standard binding” module are used; with rebindable syntax, the operators are looked up in the current lexical scope. For example, parallel comprehensions will be typechecked and desugared using whatever “mzip” is in scope.

The rebindable operators must have the “Expected type” given in the table above. These types are surprisingly general. For example, you can use a bind operator with the type

(>>=) :: T x y a -> (a -> T y z b) -> T x z b

In the case of transform comprehensions, notice that the groups are parameterised over some arbitrary type n (provided it has an fmap, as well as the comprehension being over an arbitrary monad.

10.3.16. New monadic failure desugaring mechanism

MonadFailDesugaring
Since:8.0.1

Use the MonadFail.fail instead of the legacy Monad.fail function when desugaring refutable patterns in do blocks.

The -XMonadFailDesugaring extension switches the desugaring of do-blocks to use MonadFail.fail instead of Monad.fail.

This extension is enabled by default since GHC 8.6.1, under the MonadFail Proposal (MFP).

This extension is temporary, and will be deprecated in a future release.

10.3.17. Rebindable syntax and the implicit Prelude import

NoImplicitPrelude
Since:6.8.1

Don’t import Prelude by default.

GHC normally imports Prelude.hi files for you. If you’d rather it didn’t, then give it a -XNoImplicitPrelude option. The idea is that you can then import a Prelude of your own. (But don’t call it Prelude; the Haskell module namespace is flat, and you must not conflict with any Prelude module.)

RebindableSyntax
Implies:NoImplicitPrelude
Since:7.0.1

Enable rebinding of a variety of usually-built-in operations.

Suppose you are importing a Prelude of your own in order to define your own numeric class hierarchy. It completely defeats that purpose if the literal “1” means “Prelude.fromInteger 1”, which is what the Haskell Report specifies. So the RebindableSyntax extension causes the following pieces of built-in syntax to refer to whatever is in scope, not the Prelude versions:

  • An integer literal 368 means “fromInteger (368::Integer)”, rather than “Prelude.fromInteger (368::Integer)”.
  • Fractional literals are handed in just the same way, except that the translation is fromRational (3.68::Rational).
  • The equality test in an overloaded numeric pattern uses whatever (==) is in scope.
  • The subtraction operation, and the greater-than-or-equal test, in n+k patterns use whatever (-) and (>=) are in scope.
  • Negation (e.g. “- (f x)”) means “negate (f x)”, both in numeric patterns, and expressions.
  • Conditionals (e.g. “if e1 then e2 else e3”) means “ifThenElse e1 e2 e3”. However case expressions are unaffected.
  • “Do” notation is translated using whatever functions (>>=), (>>), and fail, are in scope (not the Prelude versions). List comprehensions, mdo (The recursive do-notation), and parallel array comprehensions, are unaffected.
  • Arrow notation (see Arrow notation) uses whatever arr, (>>>), first, app, (|||) and loop functions are in scope. But unlike the other constructs, the types of these functions must match the Prelude types very closely. Details are in flux; if you want to use this, ask!
  • List notation, such as [x,y] or [m..n] can also be treated via rebindable syntax if you use -XOverloadedLists; see Overloaded lists.
  • An overloaded label “#foo” means “fromLabel @"foo"”, rather than “GHC.OverloadedLabels.fromLabel @"foo"” (see Overloaded labels).

RebindableSyntax implies NoImplicitPrelude.

In all cases (apart from arrow notation), the static semantics should be that of the desugared form, even if that is a little unexpected. For example, the static semantics of the literal 368 is exactly that of fromInteger (368::Integer); it’s fine for fromInteger to have any of the types:

fromInteger :: Integer -> Integer
fromInteger :: forall a. Foo a => Integer -> a
fromInteger :: Num a => a -> Integer
fromInteger :: Integer -> Bool -> Bool

Be warned: this is an experimental facility, with fewer checks than usual. Use -dcore-lint to typecheck the desugared program. If Core Lint is happy you should be all right.

10.3.17.1. Things unaffected by RebindableSyntax

RebindableSyntax does not apply to any code generated from a deriving clause or declaration. To see why, consider the following code:

{-# LANGUAGE RebindableSyntax, OverloadedStrings #-}
newtype Text = Text String

fromString :: String -> Text
fromString = Text

data Foo = Foo deriving Show

This will generate code to the effect of:

instance Show Foo where
  showsPrec _ Foo = showString "Foo"

But because RebindableSyntax and OverloadedStrings are enabled, the "Foo" string literal would now be of type Text, not String, which showString doesn’t accept! This causes the generated Show instance to fail to typecheck. It’s hard to imagine any scenario where it would be desirable have RebindableSyntax behavior within derived code, so GHC simply ignores RebindableSyntax entirely when checking derived code.

10.3.18. Postfix operators

PostfixOperators
Since:7.10.1

Allow the use of post-fix operators

The PostfixOperators extension enables a small extension to the syntax of left operator sections, which allows you to define postfix operators. The extension is this: the left section

(e !)

is equivalent (from the point of view of both type checking and execution) to the expression

((!) e)

(for any expression e and operator (!). The strict Haskell 98 interpretation is that the section is equivalent to

(\y -> (!) e y)

That is, the operator must be a function of two arguments. GHC allows it to take only one argument, and that in turn allows you to write the function postfix.

The extension does not extend to the left-hand side of function definitions; you must define such a function in prefix form.

10.3.19. Tuple sections

TupleSections
Since:6.12

Allow the use of tuple section syntax

The TupleSections extension enables partially applied tuple constructors. For example, the following program

(, True)

is considered to be an alternative notation for the more unwieldy alternative

\x -> (x, True)

You can omit any combination of arguments to the tuple, as in the following

(, "I", , , "Love", , 1337)

which translates to

\a b c d -> (a, "I", b, c, "Love", d, 1337)

If you have unboxed tuples enabled, tuple sections will also be available for them, like so

(# , True #)

Because there is no unboxed unit tuple, the following expression

(# #)

continues to stand for the unboxed singleton tuple data constructor.

10.3.20. Lambda-case

LambdaCase
Since:7.6.1

Allow the use of lambda-case syntax.

The LambdaCase extension enables expressions of the form

\case { p1 -> e1; ...; pN -> eN }

which is equivalent to

\freshName -> case freshName of { p1 -> e1; ...; pN -> eN }

Note that \case starts a layout, so you can write

\case
  p1 -> e1
  ...
  pN -> eN

10.3.21. Empty case alternatives

EmptyCase
Since:7.8.1

Allow empty case expressions.

The EmptyCase extension enables case expressions, or lambda-case expressions, that have no alternatives, thus:

case e of { }   -- No alternatives

or

\case { }       -- -XLambdaCase is also required

This can be useful when you know that the expression being scrutinised has no non-bottom values. For example:

data Void
f :: Void -> Int
f x = case x of { }

With dependently-typed features it is more useful (see Trac #2431). For example, consider these two candidate definitions of absurd:

data a :~: b where
  Refl :: a :~: a

absurd :: True :~: False -> a
absurd x = error "absurd"    -- (A)
absurd x = case x of {}      -- (B)

We much prefer (B). Why? Because GHC can figure out that (True :~: False) is an empty type. So (B) has no partiality and GHC is able to compile with -Wincomplete-patterns and -Werror. On the other hand (A) looks dangerous, and GHC doesn’t check to make sure that, in fact, the function can never get called.

10.3.22. Multi-way if-expressions

MultiWayIf
Since:7.6.1

Allow the use of multi-way-if syntax.

With MultiWayIf extension GHC accepts conditional expressions with multiple branches:

if | guard1 -> expr1
   | ...
   | guardN -> exprN

which is roughly equivalent to

case () of
  _ | guard1 -> expr1
  ...
  _ | guardN -> exprN

Multi-way if expressions introduce a new layout context. So the example above is equivalent to:

if { | guard1 -> expr1
   ; | ...
   ; | guardN -> exprN
   }

The following behaves as expected:

if | guard1 -> if | guard2 -> expr2
                  | guard3 -> expr3
   | guard4 -> expr4

because layout translates it as

if { | guard1 -> if { | guard2 -> expr2
                    ; | guard3 -> expr3
                    }
   ; | guard4 -> expr4
   }

Layout with multi-way if works in the same way as other layout contexts, except that the semi-colons between guards in a multi-way if are optional. So it is not necessary to line up all the guards at the same column; this is consistent with the way guards work in function definitions and case expressions.

10.3.23. Local Fixity Declarations

A careful reading of the Haskell 98 Report reveals that fixity declarations (infix, infixl, and infixr) are permitted to appear inside local bindings such those introduced by let and where. However, the Haskell Report does not specify the semantics of such bindings very precisely.

In GHC, a fixity declaration may accompany a local binding:

let f = ...
    infixr 3 `f`
in
    ...

and the fixity declaration applies wherever the binding is in scope. For example, in a let, it applies in the right-hand sides of other let-bindings and the body of the letC. Or, in recursive do expressions (The recursive do-notation), the local fixity declarations of a let statement scope over other statements in the group, just as the bound name does.

Moreover, a local fixity declaration must accompany a local binding of that name: it is not possible to revise the fixity of name bound elsewhere, as in

let infixr 9 $ in ...

Because local fixity declarations are technically Haskell 98, no extension is necessary to enable them.

10.3.24. Import and export extensions

10.3.24.1. Hiding things the imported module doesn’t export

Technically in Haskell 2010 this is illegal:

module A( f ) where
  f = True

module B where
  import A hiding( g )  -- A does not export g
  g = f

The import A hiding( g ) in module B is technically an error (Haskell Report, 5.3.1) because A does not export g. However GHC allows it, in the interests of supporting backward compatibility; for example, a newer version of A might export g, and you want B to work in either case.

The warning -Wdodgy-imports, which is off by default but included with -W, warns if you hide something that the imported module does not export.

10.3.24.2. Package-qualified imports

PackageImports
Since:6.10.1

Allow the use of package-qualified import syntax.

With the PackageImports extension, GHC allows import declarations to be qualified by the package name that the module is intended to be imported from. For example:

import "network" Network.Socket

would import the module Network.Socket from the package network (any version). This may be used to disambiguate an import when the same module is available from multiple packages, or is present in both the current package being built and an external package.

The special package name this can be used to refer to the current package being built.

Note

You probably don’t need to use this feature, it was added mainly so that we can build backwards-compatible versions of packages when APIs change. It can lead to fragile dependencies in the common case: modules occasionally move from one package to another, rendering any package-qualified imports broken. See also Thinning and renaming modules for an alternative way of disambiguating between module names.

10.3.24.3. Safe imports

Safe
Since:7.2.1

Declare the Safe Haskell state of the current module.

Trustworthy
Since:7.2.1

Declare the Safe Haskell state of the current module.

Unsafe
Since:7.4.1

Declare the Safe Haskell state of the current module.

With the Safe, Trustworthy and Unsafe language flags, GHC extends the import declaration syntax to take an optional safe keyword after the import keyword. This feature is part of the Safe Haskell GHC extension. For example:

import safe qualified Network.Socket as NS

would import the module Network.Socket with compilation only succeeding if Network.Socket can be safely imported. For a description of when a import is considered safe see Safe Haskell.

10.3.24.4. Explicit namespaces in import/export

ExplicitNamespaces
Since:7.6.1

Enable use of explicit namespaces in module export lists.

In an import or export list, such as

module M( f, (++) ) where ...
  import N( f, (++) )
  ...

the entities f and (++) are values. However, with type operators (Type operators) it becomes possible to declare (++) as a type constructor. In that case, how would you export or import it?

The ExplicitNamespaces extension allows you to prefix the name of a type constructor in an import or export list with “type” to disambiguate this case, thus:

module M( f, type (++) ) where ...
  import N( f, type (++) )
  ...
module N( f, type (++) ) where
  data family a ++ b = L a | R b

The extension ExplicitNamespaces is implied by TypeOperators and (for some reason) by TypeFamilies.

In addition, with PatternSynonyms you can prefix the name of a data constructor in an import or export list with the keyword pattern, to allow the import or export of a data constructor without its parent type constructor (see Import and export of pattern synonyms).

10.3.25. More liberal syntax for function arguments

BlockArguments
Since:8.6.1

Allow do expressions, lambda expressions, etc. to be directly used as a function argument.

In Haskell 2010, certain kinds of expressions can be used without parentheses as an argument to an operator, but not as an argument to a function. They include do, lambda, if, case, and let expressions. Some GHC extensions also define language constructs of this type: mdo (The recursive do-notation), \case (Lambda-case), and proc (Arrow notation).

The BlockArguments extension allows these constructs to be directly used as a function argument. For example:

when (x > 0) do
  print x
  exitFailure

will be parsed as:

when (x > 0) (do
  print x
  exitFailure)

and

withForeignPtr fptr \ptr -> c_memcpy buf ptr size

will be parsed as:

withForeignPtr fptr (\ptr -> c_memcpy buf ptr size)

10.3.25.1. Changes to the grammar

The Haskell report defines the lexp nonterminal thus (* indicates a rule of interest):

lexp  →  \ apat1 … apatn -> exp            (lambda abstraction, n ≥ 1)  *
      |  let decls in exp                  (let expression)             *
      |  if exp [;] then exp [;] else exp  (conditional)                *
      |  case exp of { alts }              (case expression)            *
      |  do { stmts }                      (do expression)              *
      |  fexp

fexp  →  [fexp] aexp                       (function application)

aexp  →  qvar                              (variable)
      |  gcon                              (general constructor)
      |  literal
      |  ( exp )                           (parenthesized expression)
      |  qcon { fbind1 … fbindn }          (labeled construction)
      |  aexp { fbind1 … fbindn }          (labelled update)
      |  …

The BlockArguments extension moves these production rules under aexp:

lexp  →  fexp

fexp  →  [fexp] aexp                       (function application)

aexp  →  qvar                              (variable)
      |  gcon                              (general constructor)
      |  literal
      |  ( exp )                           (parenthesized expression)
      |  qcon { fbind1 … fbindn }          (labeled construction)
      |  aexp { fbind1 … fbindn }          (labelled update)
      |  \ apat1 … apatn -> exp            (lambda abstraction, n ≥ 1)  *
      |  let decls in exp                  (let expression)             *
      |  if exp [;] then exp [;] else exp  (conditional)                *
      |  case exp of { alts }              (case expression)            *
      |  do { stmts }                      (do expression)              *
      |  …

Now the lexp nonterminal is redundant and can be dropped from the grammar.

Note that this change relies on an existing meta-rule to resolve ambiguities:

The grammar is ambiguous regarding the extent of lambda abstractions, let expressions, and conditionals. The ambiguity is resolved by the meta-rule that each of these constructs extends as far to the right as possible.

For example, f \a -> a b will be parsed as f (\a -> a b), not as f (\a -> a) b.

10.3.26. Summary of stolen syntax

Turning on an option that enables special syntax might cause working Haskell 98 code to fail to compile, perhaps because it uses a variable name which has become a reserved word. This section lists the syntax that is “stolen” by language extensions. We use notation and nonterminal names from the Haskell 98 lexical syntax (see the Haskell 98 Report). We only list syntax changes here that might affect existing working programs (i.e. “stolen” syntax). Many of these extensions will also enable new context-free syntax, but in all cases programs written to use the new syntax would not be compilable without the option enabled.

There are two classes of special syntax:

  • New reserved words and symbols: character sequences which are no longer available for use as identifiers in the program.
  • Other special syntax: sequences of characters that have a different meaning when this particular option is turned on.

The following syntax is stolen:

forall

Stolen (in types) by: ExplicitForAll, and hence by ScopedTypeVariables, LiberalTypeSynonyms, RankNTypes, ExistentialQuantification

mdo

Stolen by: RecursiveDo

foreign

Stolen by: ForeignFunctionInterface

rec, proc, -<, >-, -<<, >>-, (|, |)

Stolen by: Arrows

?varid

Stolen by: ImplicitParams

[|, [e|, [p|, [d|, [t|, [||, [e||

Stolen by: QuasiQuotes. Moreover, this introduces an ambiguity with list comprehension syntax. See the discussion on quasi-quoting for details.

$(, $$(, $varid, $$varid

Stolen by: TemplateHaskell

[varid|

Stolen by: QuasiQuotes

⟨varid⟩, #⟨char⟩, #, ⟨string⟩, #, ⟨integer⟩, #, ⟨float⟩, #, ⟨float⟩, ##
Stolen by: MagicHash
(#, #)
Stolen by: UnboxedTuples
⟨varid⟩, !, ⟨varid⟩
Stolen by: BangPatterns
pattern
Stolen by: PatternSynonyms

10.4. Extensions to data types and type synonyms

10.4.1. Data types with no constructors

EmptyDataDecls
Since:6.8.1

Allow definition of empty data types.

With the EmptyDataDecls extension, GHC lets you declare a data type with no constructors. For example:

data S      -- S :: Type
data T a    -- T :: Type -> Type

Syntactically, the declaration lacks the “= constrs” part. The type can be parameterised over types of any kind, but if the kind is not Type then an explicit kind annotation must be used (see Explicitly-kinded quantification).

Such data types have only one value, namely bottom. Nevertheless, they can be useful when defining “phantom types”.

In conjunction with the -XEmptyDataDeriving extension, empty data declarations can also derive instances of standard type classes (see Deriving instances for empty data types).

10.4.2. Data type contexts

DatatypeContexts
Since:7.0.1

Allow contexts on data types.

Haskell allows datatypes to be given contexts, e.g.

data Eq a => Set a = NilSet | ConsSet a (Set a)

give constructors with types:

NilSet :: Set a
ConsSet :: Eq a => a -> Set a -> Set a

This is widely considered a misfeature, and is going to be removed from the language. In GHC, it is controlled by the deprecated extension DatatypeContexts.

10.4.3. Infix type constructors, classes, and type variables

GHC allows type constructors, classes, and type variables to be operators, and to be written infix, very much like expressions. More specifically:

  • A type constructor or class can be any non-reserved operator. Symbols used in types are always like capitalized identifiers; they are never variables. Note that this is different from the lexical syntax of data constructors, which are required to begin with a :.

  • Data type and type-synonym declarations can be written infix, parenthesised if you want further arguments. E.g.

    data a :*: b = Foo a b
    type a :+: b = Either a b
    class a :=: b where ...
    
    data (a :**: b) x = Baz a b x
    type (a :++: b) y = Either (a,b) y
    
  • Types, and class constraints, can be written infix. For example

    x :: Int :*: Bool
    f :: (a :=: b) => a -> b
    
  • Back-quotes work as for expressions, both for type constructors and type variables; e.g. Int `Either` Bool, or Int `a` Bool. Similarly, parentheses work the same; e.g. (:*:) Int Bool.

  • Fixities may be declared for type constructors, or classes, just as for data constructors. However, one cannot distinguish between the two in a fixity declaration; a fixity declaration sets the fixity for a data constructor and the corresponding type constructor. For example:

    infixl 7 T, :*:
    

    sets the fixity for both type constructor T and data constructor T, and similarly for :*:. Int `a` Bool.

  • Function arrow is infixr with fixity 0 (this might change; it’s not clear what it should be).

10.4.4. Type operators

TypeOperators
Implies:ExplicitNamespaces
Since:6.8.1

Allow the use and definition of types with operator names.

In types, an operator symbol like (+) is normally treated as a type variable, just like a. Thus in Haskell 98 you can say

type T (+) = ((+), (+))
-- Just like: type T a = (a,a)

f :: T Int -> Int
f (x,y)= x

As you can see, using operators in this way is not very useful, and Haskell 98 does not even allow you to write them infix.

The language TypeOperators changes this behaviour:

  • Operator symbols become type constructors rather than type variables.

  • Operator symbols in types can be written infix, both in definitions and uses. For example:

    data a + b = Plus a b
    type Foo = Int + Bool
    
  • There is now some potential ambiguity in import and export lists; for example if you write import M( (+) ) do you mean the function (+) or the type constructor (+)? The default is the former, but with ExplicitNamespaces (which is implied by TypeOperators) GHC allows you to specify the latter by preceding it with the keyword type, thus:

    import M( type (+) )
    

    See Explicit namespaces in import/export.

  • The fixity of a type operator may be set using the usual fixity declarations but, as in Infix type constructors, classes, and type variables, the function and type constructor share a single fixity.

10.4.5. Liberalised type synonyms

LiberalTypeSynonyms
Implies:ExplicitForAll
Since:6.8.1

Relax many of the Haskell 98 rules on type synonym definitions.

Type synonyms are like macros at the type level, but Haskell 98 imposes many rules on individual synonym declarations. With the LiberalTypeSynonyms extension, GHC does validity checking on types only after expanding type synonyms. That means that GHC can be very much more liberal about type synonyms than Haskell 98.

  • You can write a forall (including overloading) in a type synonym, thus:

    type Discard a = forall b. Show b => a -> b -> (a, String)
    
    f :: Discard a
    f x y = (x, show y)
    
    g :: Discard Int -> (Int,String)    -- A rank-2 type
    g f = f 3 True
    
  • If you also use UnboxedTuples, you can write an unboxed tuple in a type synonym:

    type Pr = (# Int, Int #)
    
    h :: Int -> Pr
    h x = (# x, x #)
    
  • You can apply a type synonym to a forall type:

    type Foo a = a -> a -> Bool
    
    f :: Foo (forall b. b->b)
    

    After expanding the synonym, f has the legal (in GHC) type:

    f :: (forall b. b->b) -> (forall b. b->b) -> Bool
    
  • You can apply a type synonym to a partially applied type synonym:

    type Generic i o = forall x. i x -> o x
    type Id x = x
    
    foo :: Generic Id []
    

    After expanding the synonym, foo has the legal (in GHC) type:

    foo :: forall x. x -> [x]
    

GHC currently does kind checking before expanding synonyms (though even that could be changed).

After expanding type synonyms, GHC does validity checking on types, looking for the following malformedness which isn’t detected simply by kind checking:

  • Type constructor applied to a type involving for-alls (if ImpredicativeTypes is off)
  • Partially-applied type synonym.

So, for example, this will be rejected:

type Pr = forall a. a

h :: [Pr]
h = ...

because GHC does not allow type constructors applied to for-all types.

10.4.6. Existentially quantified data constructors

ExistentialQuantification
Implies:ExplicitForAll
Since:6.8.1

Allow existentially quantified type variables in types.

The idea of using existential quantification in data type declarations was suggested by Perry, and implemented in Hope+ (Nigel Perry, The Implementation of Practical Functional Programming Languages, PhD Thesis, University of London, 1991). It was later formalised by Laufer and Odersky (Polymorphic type inference and abstract data types, TOPLAS, 16(5), pp. 1411-1430, 1994). It’s been in Lennart Augustsson’s hbc Haskell compiler for several years, and proved very useful. Here’s the idea. Consider the declaration:

data Foo = forall a. MkFoo a (a -> Bool)
         | Nil

The data type Foo has two constructors with types:

MkFoo :: forall a. a -> (a -> Bool) -> Foo
Nil   :: Foo

Notice that the type variable a in the type of MkFoo does not appear in the data type itself, which is plain Foo. For example, the following expression is fine:

[MkFoo 3 even, MkFoo 'c' isUpper] :: [Foo]

Here, (MkFoo 3 even) packages an integer with a function even that maps an integer to Bool; and MkFoo 'c' isUpper packages a character with a compatible function. These two things are each of type Foo and can be put in a list.

What can we do with a value of type Foo? In particular, what happens when we pattern-match on MkFoo?

f (MkFoo val fn) = ???

Since all we know about val and fn is that they are compatible, the only (useful) thing we can do with them is to apply fn to val to get a boolean. For example:

f :: Foo -> Bool
f (MkFoo val fn) = fn val

What this allows us to do is to package heterogeneous values together with a bunch of functions that manipulate them, and then treat that collection of packages in a uniform manner. You can express quite a bit of object-oriented-like programming this way.

10.4.6.1. Why existential?

What has this to do with existential quantification? Simply that MkFoo has the (nearly) isomorphic type

MkFoo :: (exists a . (a, a -> Bool)) -> Foo

But Haskell programmers can safely think of the ordinary universally quantified type given above, thereby avoiding adding a new existential quantification construct.

10.4.6.2. Existentials and type classes

An easy extension is to allow arbitrary contexts before the constructor. For example:

data Baz = forall a. Eq a => Baz1 a a
         | forall b. Show b => Baz2 b (b -> b)

The two constructors have the types you’d expect:

Baz1 :: forall a. Eq a => a -> a -> Baz
Baz2 :: forall b. Show b => b -> (b -> b) -> Baz

But when pattern matching on Baz1 the matched values can be compared for equality, and when pattern matching on Baz2 the first matched value can be converted to a string (as well as applying the function to it). So this program is legal:

f :: Baz -> String
f (Baz1 p q) | p == q    = "Yes"
             | otherwise = "No"
f (Baz2 v fn)            = show (fn v)

Operationally, in a dictionary-passing implementation, the constructors Baz1 and Baz2 must store the dictionaries for Eq and Show respectively, and extract it on pattern matching.

10.4.6.3. Record Constructors

GHC allows existentials to be used with records syntax as well. For example:

data Counter a = forall self. NewCounter
    { _this    :: self
    , _inc     :: self -> self
    , _display :: self -> IO ()
    , tag      :: a
    }

Here tag is a public field, with a well-typed selector function tag :: Counter a -> a. The self type is hidden from the outside; any attempt to apply _this, _inc or _display as functions will raise a compile-time error. In other words, GHC defines a record selector function only for fields whose type does not mention the existentially-quantified variables. (This example used an underscore in the fields for which record selectors will not be defined, but that is only programming style; GHC ignores them.)

To make use of these hidden fields, we need to create some helper functions:

inc :: Counter a -> Counter a
inc (NewCounter x i d t) = NewCounter
    { _this = i x, _inc = i, _display = d, tag = t }

display :: Counter a -> IO ()
display NewCounter{ _this = x, _display = d } = d x

Now we can define counters with different underlying implementations:

counterA :: Counter String
counterA = NewCounter
    { _this = 0, _inc = (1+), _display = print, tag = "A" }

counterB :: Counter String
counterB = NewCounter
    { _this = "", _inc = ('#':), _display = putStrLn, tag = "B" }

main = do
    display (inc counterA)         -- prints "1"
    display (inc (inc counterB))   -- prints "##"

Record update syntax is supported for existentials (and GADTs):

setTag :: Counter a -> a -> Counter a
setTag obj t = obj{ tag = t }

The rule for record update is this:

the types of the updated fields may mention only the universally-quantified type variables of the data constructor. For GADTs, the field may mention only types that appear as a simple type-variable argument in the constructor’s result type.

For example:

data T a b where { T1 { f1::a, f2::b, f3::(b,c) } :: T a b } -- c is existential
upd1 t x = t { f1=x }   -- OK:   upd1 :: T a b -> a' -> T a' b
upd2 t x = t { f3=x }   -- BAD   (f3's type mentions c, which is
                        --        existentially quantified)

data G a b where { G1 { g1::a, g2::c } :: G a [c] }
upd3 g x = g { g1=x }   -- OK:   upd3 :: G a b -> c -> G c b
upd4 g x = g { g2=x }   -- BAD (f2's type mentions c, which is not a simple
                        --      type-variable argument in G1's result type)

10.4.6.4. Restrictions

There are several restrictions on the ways in which existentially-quantified constructors can be used.

  • When pattern matching, each pattern match introduces a new, distinct, type for each existential type variable. These types cannot be unified with any other type, nor can they escape from the scope of the pattern match. For example, these fragments are incorrect:

    f1 (MkFoo a f) = a
    

    Here, the type bound by MkFoo “escapes”, because a is the result of f1. One way to see why this is wrong is to ask what type f1 has:

    f1 :: Foo -> a             -- Weird!
    

    What is this “a” in the result type? Clearly we don’t mean this:

    f1 :: forall a. Foo -> a   -- Wrong!
    

    The original program is just plain wrong. Here’s another sort of error

    f2 (Baz1 a b) (Baz1 p q) = a==q
    

    It’s ok to say a==b or p==q, but a==q is wrong because it equates the two distinct types arising from the two Baz1 constructors.

  • You can’t pattern-match on an existentially quantified constructor in a let or where group of bindings. So this is illegal:

    f3 x = a==b where { Baz1 a b = x }
    

    Instead, use a case expression:

    f3 x = case x of Baz1 a b -> a==b
    

    In general, you can only pattern-match on an existentially-quantified constructor in a case expression or in the patterns of a function definition. The reason for this restriction is really an implementation one. Type-checking binding groups is already a nightmare without existentials complicating the picture. Also an existential pattern binding at the top level of a module doesn’t make sense, because it’s not clear how to prevent the existentially-quantified type “escaping”. So for now, there’s a simple-to-state restriction. We’ll see how annoying it is.

  • You can’t use existential quantification for newtype declarations. So this is illegal:

    newtype T = forall a. Ord a => MkT a
    

    Reason: a value of type T must be represented as a pair of a dictionary for Ord t and a value of type t. That contradicts the idea that newtype should have no concrete representation. You can get just the same efficiency and effect by using data instead of newtype. If there is no overloading involved, then there is more of a case for allowing an existentially-quantified newtype, because the data version does carry an implementation cost, but single-field existentially quantified constructors aren’t much use. So the simple restriction (no existential stuff on newtype) stands, unless there are convincing reasons to change it.

  • You can’t use deriving to define instances of a data type with existentially quantified data constructors. Reason: in most cases it would not make sense. For example:;

    data T = forall a. MkT [a] deriving( Eq )
    

    To derive Eq in the standard way we would need to have equality between the single component of two MkT constructors:

    instance Eq T where
      (MkT a) == (MkT b) = ???
    

    But a and b have distinct types, and so can’t be compared. It’s just about possible to imagine examples in which the derived instance would make sense, but it seems altogether simpler simply to prohibit such declarations. Define your own instances!

10.4.7. Declaring data types with explicit constructor signatures

GADTSyntax
Since:7.2.1

Allow the use of GADT syntax in data type definitions (but not GADTs themselves; for this see GADTs)

When the GADTSyntax extension is enabled, GHC allows you to declare an algebraic data type by giving the type signatures of constructors explicitly. For example:

data Maybe a where
    Nothing :: Maybe a
    Just    :: a -> Maybe a

The form is called a “GADT-style declaration” because Generalised Algebraic Data Types, described in Generalised Algebraic Data Types (GADTs), can only be declared using this form.

Notice that GADT-style syntax generalises existential types (Existentially quantified data constructors). For example, these two declarations are equivalent:

data Foo = forall a. MkFoo a (a -> Bool)
data Foo' where { MKFoo :: a -> (a->Bool) -> Foo' }

Any data type that can be declared in standard Haskell 98 syntax can also be declared using GADT-style syntax. The choice is largely stylistic, but GADT-style declarations differ in one important respect: they treat class constraints on the data constructors differently. Specifically, if the constructor is given a type-class context, that context is made available by pattern matching. For example:

data Set a where
  MkSet :: Eq a => [a] -> Set a

makeSet :: Eq a => [a] -> Set a
makeSet xs = MkSet (nub xs)

insert :: a -> Set a -> Set a
insert a (MkSet as) | a `elem` as = MkSet as
                    | otherwise   = MkSet (a:as)

A use of MkSet as a constructor (e.g. in the definition of makeSet) gives rise to a (Eq a) constraint, as you would expect. The new feature is that pattern-matching on MkSet (as in the definition of insert) makes available an (Eq a) context. In implementation terms, the MkSet constructor has a hidden field that stores the (Eq a) dictionary that is passed to MkSet; so when pattern-matching that dictionary becomes available for the right-hand side of the match. In the example, the equality dictionary is used to satisfy the equality constraint generated by the call to elem, so that the type of insert itself has no Eq constraint.

For example, one possible application is to reify dictionaries:

data NumInst a where
  MkNumInst :: Num a => NumInst a

intInst :: NumInst Int
intInst = MkNumInst

plus :: NumInst a -> a -> a -> a
plus MkNumInst p q = p + q

Here, a value of type NumInst a is equivalent to an explicit (Num a) dictionary.

All this applies to constructors declared using the syntax of Existentials and type classes. For example, the NumInst data type above could equivalently be declared like this:

data NumInst a
   = Num a => MkNumInst (NumInst a)

Notice that, unlike the situation when declaring an existential, there is no forall, because the Num constrains the data type’s universally quantified type variable a. A constructor may have both universal and existential type variables: for example, the following two declarations are equivalent:

data T1 a
 = forall b. (Num a, Eq b) => MkT1 a b
data T2 a where
 MkT2 :: (Num a, Eq b) => a -> b -> T2 a

All this behaviour contrasts with Haskell 98’s peculiar treatment of contexts on a data type declaration (Section 4.2.1 of the Haskell 98 Report). In Haskell 98 the definition

data Eq a => Set' a = MkSet' [a]

gives MkSet' the same type as MkSet above. But instead of making available an (Eq a) constraint, pattern-matching on MkSet' requires an (Eq a) constraint! GHC faithfully implements this behaviour, odd though it is. But for GADT-style declarations, GHC’s behaviour is much more useful, as well as much more intuitive.

The rest of this section gives further details about GADT-style data type declarations.

  • The result type of each data constructor must begin with the type constructor being defined. If the result type of all constructors has the form T a1 ... an, where a1 ... an are distinct type variables, then the data type is ordinary; otherwise is a generalised data type (Generalised Algebraic Data Types (GADTs)).

  • As with other type signatures, you can give a single signature for several data constructors. In this example we give a single signature for T1 and T2:

    data T a where
      T1,T2 :: a -> T a
      T3 :: T a
    
  • The type signature of each constructor is independent, and is implicitly universally quantified as usual. In particular, the type variable(s) in the “data T a where” header have no scope, and different constructors may have different universally-quantified type variables:

    data T a where        -- The 'a' has no scope
      T1,T2 :: b -> T b   -- Means forall b. b -> T b
      T3 :: T a           -- Means forall a. T a
    
  • A constructor signature may mention type class constraints, which can differ for different constructors. For example, this is fine:

    data T a where
      T1 :: Eq b => b -> b -> T b
      T2 :: (Show c, Ix c) => c -> [c] -> T c
    

    When pattern matching, these constraints are made available to discharge constraints in the body of the match. For example:

    f :: T a -> String
    f (T1 x y) | x==y      = "yes"
               | otherwise = "no"
    f (T2 a b)             = show a
    

    Note that f is not overloaded; the Eq constraint arising from the use of == is discharged by the pattern match on T1 and similarly the Show constraint arising from the use of show.

  • Unlike a Haskell-98-style data type declaration, the type variable(s) in the “data Set a where” header have no scope. Indeed, one can write a kind signature instead:

    data Set :: Type -> Type where ...
    

    or even a mixture of the two:

    data Bar a :: (Type -> Type) -> Type where ...
    

    The type variables (if given) may be explicitly kinded, so we could also write the header for Foo like this:

    data Bar a (b :: Type -> Type) where ...
    
  • You can use strictness annotations, in the obvious places in the constructor type:

    data Term a where
        Lit    :: !Int -> Term Int
        If     :: Term Bool -> !(Term a) -> !(Term a) -> Term a
        Pair   :: Term a -> Term b -> Term (a,b)
    
  • You can use a deriving clause on a GADT-style data type declaration. For example, these two declarations are equivalent

    data Maybe1 a where {
        Nothing1 :: Maybe1 a ;
        Just1    :: a -> Maybe1 a
      } deriving( Eq, Ord )
    
    data Maybe2 a = Nothing2 | Just2 a
         deriving( Eq, Ord )
    
  • The type signature may have quantified type variables that do not appear in the result type:

    data Foo where
       MkFoo :: a -> (a->Bool) -> Foo
       Nil   :: Foo
    

    Here the type variable a does not appear in the result type of either constructor. Although it is universally quantified in the type of the constructor, such a type variable is often called “existential”. Indeed, the above declaration declares precisely the same type as the data Foo in Existentially quantified data constructors.

    The type may contain a class context too, of course:

    data Showable where
      MkShowable :: Show a => a -> Showable
    
  • You can use record syntax on a GADT-style data type declaration:

    data Person where
        Adult :: { name :: String, children :: [Person] } -> Person
        Child :: Show a => { name :: !String, funny :: a } -> Person
    

    As usual, for every constructor that has a field f, the type of field f must be the same (modulo alpha conversion). The Child constructor above shows that the signature may have a context, existentially-quantified variables, and strictness annotations, just as in the non-record case. (NB: the “type” that follows the double-colon is not really a type, because of the record syntax and strictness annotations. A “type” of this form can appear only in a constructor signature.)

  • Record updates are allowed with GADT-style declarations, only fields that have the following property: the type of the field mentions no existential type variables.

  • As in the case of existentials declared using the Haskell-98-like record syntax (Record Constructors), record-selector functions are generated only for those fields that have well-typed selectors. Here is the example of that section, in GADT-style syntax:

    data Counter a where
        NewCounter :: { _this    :: self
                      , _inc     :: self -> self
                      , _display :: self -> IO ()
                      , tag      :: a
                      } -> Counter a
    

    As before, only one selector function is generated here, that for tag. Nevertheless, you can still use all the field names in pattern matching and record construction.

  • In a GADT-style data type declaration there is no obvious way to specify that a data constructor should be infix, which makes a difference if you derive Show for the type. (Data constructors declared infix are displayed infix by the derived show.) So GHC implements the following design: a data constructor declared in a GADT-style data type declaration is displayed infix by Show iff (a) it is an operator symbol, (b) it has two arguments, (c) it has a programmer-supplied fixity declaration. For example

    infix 6 (:--:)
    data T a where
      (:--:) :: Int -> Bool -> T Int
    

10.4.8. Generalised Algebraic Data Types (GADTs)

GADTs
Implies:MonoLocalBinds, GADTSyntax
Since:6.8.1

Allow use of Generalised Algebraic Data Types (GADTs).

Generalised Algebraic Data Types generalise ordinary algebraic data types by allowing constructors to have richer return types. Here is an example:

data Term a where
    Lit    :: Int -> Term Int
    Succ   :: Term Int -> Term Int
    IsZero :: Term Int -> Term Bool
    If     :: Term Bool -> Term a -> Term a -> Term a
    Pair   :: Term a -> Term b -> Term (a,b)

Notice that the return type of the constructors is not always Term a, as is the case with ordinary data types. This generality allows us to write a well-typed eval function for these Terms:

eval :: Term a -> a
eval (Lit i)      = i
eval (Succ t)     = 1 + eval t
eval (IsZero t)   = eval t == 0
eval (If b e1 e2) = if eval b then eval e1 else eval e2
eval (Pair e1 e2) = (eval e1, eval e2)

The key point about GADTs is that pattern matching causes type refinement. For example, in the right hand side of the equation

eval :: Term a -> a
eval (Lit i) =  ...

the type a is refined to Int. That’s the whole point! A precise specification of the type rules is beyond what this user manual aspires to, but the design closely follows that described in the paper Simple unification-based type inference for GADTs, (ICFP 2006). The general principle is this: type refinement is only carried out based on user-supplied type annotations. So if no type signature is supplied for eval, no type refinement happens, and lots of obscure error messages will occur. However, the refinement is quite general. For example, if we had:

eval :: Term a -> a -> a
eval (Lit i) j =  i+j

the pattern match causes the type a to be refined to Int (because of the type of the constructor Lit), and that refinement also applies to the type of j, and the result type of the case expression. Hence the addition i+j is legal.

These and many other examples are given in papers by Hongwei Xi, and Tim Sheard. There is a longer introduction on the wiki, and Ralf Hinze’s Fun with phantom types also has a number of examples. Note that papers may use different notation to that implemented in GHC.

The rest of this section outlines the extensions to GHC that support GADTs. The extension is enabled with GADTs. The GADTs extension also sets GADTSyntax and MonoLocalBinds.

  • A GADT can only be declared using GADT-style syntax (Declaring data types with explicit constructor signatures); the old Haskell 98 syntax for data declarations always declares an ordinary data type. The result type of each constructor must begin with the type constructor being defined, but for a GADT the arguments to the type constructor can be arbitrary monotypes. For example, in the Term data type above, the type of each constructor must end with Term ty, but the ty need not be a type variable (e.g. the Lit constructor).

  • It is permitted to declare an ordinary algebraic data type using GADT-style syntax. What makes a GADT into a GADT is not the syntax, but rather the presence of data constructors whose result type is not just T a b.

  • You cannot use a deriving clause for a GADT; only for an ordinary data type.

  • As mentioned in Declaring data types with explicit constructor signatures, record syntax is supported. For example:

    data Term a where
        Lit    :: { val  :: Int }      -> Term Int
        Succ   :: { num  :: Term Int } -> Term Int
        Pred   :: { num  :: Term Int } -> Term Int
        IsZero :: { arg  :: Term Int } -> Term Bool
        Pair   :: { arg1 :: Term a
                  , arg2 :: Term b
                  }                    -> Term (a,b)
        If     :: { cnd  :: Term Bool
                  , tru  :: Term a
                  , fls  :: Term a
                  }                    -> Term a
    

    However, for GADTs there is the following additional constraint: every constructor that has a field f must have the same result type (modulo alpha conversion) Hence, in the above example, we cannot merge the num and arg fields above into a single name. Although their field types are both Term Int, their selector functions actually have different types:

    num :: Term Int -> Term Int
    arg :: Term Bool -> Term Int
    
  • When pattern-matching against data constructors drawn from a GADT, for example in a case expression, the following rules apply:

    • The type of the scrutinee must be rigid.
    • The type of the entire case expression must be rigid.
    • The type of any free variable mentioned in any of the case alternatives must be rigid.

    A type is “rigid” if it is completely known to the compiler at its binding site. The easiest way to ensure that a variable a rigid type is to give it a type signature. For more precise details see Simple unification-based type inference for GADTs. The criteria implemented by GHC are given in the Appendix.

10.5. Extensions to the record system

10.5.1. Traditional record syntax

NoTraditionalRecordSyntax
Since:7.4.1

Disallow use of record syntax.

Traditional record syntax, such as C {f = x}, is enabled by default. To disable it, you can use the NoTraditionalRecordSyntax extension.

10.5.2. Record field disambiguation

DisambiguateRecordFields
Since:6.8.1
Since:6.8.1

Allow the compiler to automatically choose between identically-named record selectors based on type (if the choice is unambiguous).

In record construction and record pattern matching it is entirely unambiguous which field is referred to, even if there are two different data types in scope with a common field name. For example:

module M where
  data S = MkS { x :: Int, y :: Bool }

module Foo where
  import M

  data T = MkT { x :: Int }

  ok1 (MkS { x = n }) = n+1   -- Unambiguous
  ok2 n = MkT { x = n+1 }     -- Unambiguous

  bad1 k = k { x = 3 }        -- Ambiguous
  bad2 k = x k                -- Ambiguous

Even though there are two x‘s in scope, it is clear that the x in the pattern in the definition of ok1 can only mean the field x from type S. Similarly for the function ok2. However, in the record update in bad1 and the record selection in bad2 it is not clear which of the two types is intended.

Haskell 98 regards all four as ambiguous, but with the DisambiguateRecordFields extension, GHC will accept the former two. The rules are precisely the same as those for instance declarations in Haskell 98, where the method names on the left-hand side of the method bindings in an instance declaration refer unambiguously to the method of that class (provided they are in scope at all), even if there are other variables in scope with the same name. This reduces the clutter of qualified names when you import two records from different modules that use the same field name.

Some details:

  • Field disambiguation can be combined with punning (see Record puns). For example:

    module Foo where
      import M
      x=True
      ok3 (MkS { x }) = x+1   -- Uses both disambiguation and punning
    
  • With DisambiguateRecordFields you can use unqualified field names even if the corresponding selector is only in scope qualified For example, assuming the same module M as in our earlier example, this is legal:

    module Foo where
      import qualified M    -- Note qualified
    
      ok4 (M.MkS { x = n }) = n+1   -- Unambiguous
    

    Since the constructor MkS is only in scope qualified, you must name it M.MkS, but the field x does not need to be qualified even though M.x is in scope but x is not (In effect, it is qualified by the constructor).

10.5.3. Duplicate record fields

DuplicateRecordFields
Implies:DisambiguateRecordFields
Since:8.0.1

Allow definition of record types with identically-named fields.

Going beyond DisambiguateRecordFields (see Record field disambiguation), the DuplicateRecordFields extension allows multiple datatypes to be declared using the same field names in a single module. For example, it allows this:

module M where
  data S = MkS { x :: Int }
  data T = MkT { x :: Bool }

Uses of fields that are always unambiguous because they mention the constructor, including construction and pattern-matching, may freely use duplicated field names. For example, the following are permitted (just as with DisambiguateRecordFields):

s = MkS { x = 3 }

f (MkT { x = b }) = b

Field names used as selector functions or in record updates must be unambiguous, either because there is only one such field in scope, or because a type signature is supplied, as described in the following sections.

10.5.3.1. Selector functions

Fields may be used as selector functions only if they are unambiguous, so this is still not allowed if both S(x) and T(x) are in scope:

bad r = x r

An ambiguous selector may be disambiguated by the type being “pushed down” to the occurrence of the selector (see Type inference for more details on what “pushed down” means). For example, the following are permitted:

ok1 = x :: S -> Int

ok2 :: S -> Int
ok2 = x

ok3 = k x -- assuming we already have k :: (S -> Int) -> _

In addition, the datatype that is meant may be given as a type signature on the argument to the selector:

ok4 s = x (s :: S)

However, we do not infer the type of the argument to determine the datatype, or have any way of deferring the choice to the constraint solver. Thus the following is ambiguous:

bad :: S -> Int
bad s = x s

Even though a field label is duplicated in its defining module, it may be possible to use the selector unambiguously elsewhere. For example, another module could import S(x) but not T(x), and then use x unambiguously.

10.5.3.2. Record updates

In a record update such as e { x = 1 }, if there are multiple x fields in scope, then the type of the context must fix which record datatype is intended, or a type annotation must be supplied. Consider the following definitions:

data S = MkS { foo :: Int }
data T = MkT { foo :: Int, bar :: Int }
data U = MkU { bar :: Int, baz :: Int }

Without DuplicateRecordFields, an update mentioning foo will always be ambiguous if all these definitions were in scope. When the extension is enabled, there are several options for disambiguating updates:

  • Check for types that have all the fields being updated. For example:

    f x = x { foo = 3, bar = 2 }
    

    Here f must be updating T because neither S nor U have both fields.

  • Use the type being pushed in to the record update, as in the following:

    g1 :: T -> T
    g1 x = x { foo = 3 }
    
    g2 x = x { foo = 3 } :: T
    
    g3 = k (x { foo = 3 }) -- assuming we already have k :: T -> _
    
  • Use an explicit type signature on the record expression, as in:

    h x = (x :: T) { foo = 3 }
    

The type of the expression being updated will not be inferred, and no constraint-solving will be performed, so the following will be rejected as ambiguous:

let x :: T
    x = blah
in x { foo = 3 }

\x -> [x { foo = 3 },  blah :: T ]

\ (x :: T) -> x { foo = 3 }

10.5.3.3. Import and export of record fields

When DuplicateRecordFields is enabled, an ambiguous field must be exported as part of its datatype, rather than at the top level. For example, the following is legal:

module M (S(x), T(..)) where
  data S = MkS { x :: Int }
  data T = MkT { x :: Bool }

However, this would not be permitted, because x is ambiguous:

module M (x) where ...

Similar restrictions apply on import.

10.5.4. Record puns

NamedFieldPuns
Since:6.10.1

Allow use of record puns.

Record puns are enabled by the language extension NamedFieldPuns.

When using records, it is common to write a pattern that binds a variable with the same name as a record field, such as:

data C = C {a :: Int}
f (C {a = a}) = a

Record punning permits the variable name to be elided, so one can simply write

f (C {a}) = a

to mean the same pattern as above. That is, in a record pattern, the pattern a expands into the pattern a = a for the same name a.

Note that:

  • Record punning can also be used in an expression, writing, for example,

    let a = 1 in C {a}
    

    instead of

    let a = 1 in C {a = a}
    

    The expansion is purely syntactic, so the expanded right-hand side expression refers to the nearest enclosing variable that is spelled the same as the field name.

  • Puns and other patterns can be mixed in the same record:

    data C = C {a :: Int, b :: Int}
    f (C {a, b = 4}) = a
    
  • Puns can be used wherever record patterns occur (e.g. in let bindings or at the top-level).

  • A pun on a qualified field name is expanded by stripping off the module qualifier. For example:

    f (C {M.a}) = a
    

    means

    f (M.C {M.a = a}) = a
    

    (This is useful if the field selector a for constructor M.C is only in scope in qualified form.)

10.5.5. Record wildcards

RecordWildCards
Implies:DisambiguateRecordFields.
Since:6.8.1

Allow the use of wildcards in record construction and pattern matching.

Record wildcards are enabled by the language extension RecordWildCards. This exension implies DisambiguateRecordFields.

For records with many fields, it can be tiresome to write out each field individually in a record pattern, as in

data C = C {a :: Int, b :: Int, c :: Int, d :: Int}
f (C {a = 1, b = b, c = c, d = d}) = b + c + d

Record wildcard syntax permits a “..” in a record pattern, where each elided field f is replaced by the pattern f = f. For example, the above pattern can be written as

f (C {a = 1, ..}) = b + c + d

More details:

  • Record wildcards in patterns can be mixed with other patterns, including puns (Record puns); for example, in a pattern (C {a = 1, b, ..}). Additionally, record wildcards can be used wherever record patterns occur, including in let bindings and at the top-level. For example, the top-level binding

    C {a = 1, ..} = e
    

    defines b, c, and d.

  • Record wildcards can also be used in an expression, when constructing a record. For example,

    let {a = 1; b = 2; c = 3; d = 4} in C {..}
    

    in place of

    let {a = 1; b = 2; c = 3; d = 4} in C {a=a, b=b, c=c, d=d}
    

    The expansion is purely syntactic, so the record wildcard expression refers to the nearest enclosing variables that are spelled the same as the omitted field names.

  • For both pattern and expression wildcards, the “..” expands to the missing in-scope record fields. Specifically the expansion of “C {..}” includes f if and only if:

    • f is a record field of constructor C.
    • The record field f is in scope somehow (either qualified or unqualified).

    These rules restrict record wildcards to the situations in which the user could have written the expanded version. For example

    module M where
      data R = R { a,b,c :: Int }
    module X where
      import M( R(a,c) )
      f b = R { .. }
    

    The R{..} expands to R{M.a=a}, omitting b since the record field is not in scope, and omitting c since the variable c is not in scope (apart from the binding of the record selector c, of course).

  • When record wildcards are use in record construction, a field f is initialised only if f is in scope, and is not imported or bound at top level. For example, f can be bound by an enclosing pattern match or let/where-binding. For example

    module M where
      import A( a )
    
      data R = R { a,b,c,d :: Int }
    
      c = 3 :: Int
    
      f b = R { .. }  -- Expands to R { b = b, d = d }
        where
          d = b+1
    

    Here, a is imported, and c is bound at top level, so neither contribute to the expansion of the “..”. The motivation here is that it should be easy for the reader to figure out what the “..” expands to.

  • Record wildcards cannot be used (a) in a record update construct, and (b) for data constructors that are not declared with record fields. For example:

    f x = x { v=True, .. }   -- Illegal (a)
    
    data T = MkT Int Bool
    g = MkT { .. }           -- Illegal (b)
    h (MkT { .. }) = True    -- Illegal (b)
    

10.5.6. Record field selector polymorphism

The module GHC.Records defines the following:

class HasField (x :: k) r a | x r -> a where
  getField :: r -> a

A HasField x r a constraint represents the fact that x is a field of type a belonging to a record type r. The getField method gives the record selector function.

This allows definitions that are polymorphic over record types with a specified field. For example, the following works with any record type that has a field name :: String:

foo :: HasField "name" r String => r -> String
foo r = reverse (getField @"name" r)

HasField is a magic built-in typeclass (similar to Coercible, for example). It is given special treatment by the constraint solver (see Solving HasField constraints). Users may define their own instances of HasField also (see Virtual record fields).

10.5.6.1. Solving HasField constraints

If the constraint solver encounters a constraint HasField x r a where r is a concrete datatype with a field x in scope, it will automatically solve the constraint using the field selector as the dictionary, unifying a with the type of the field if necessary. This happens irrespective of which extensions are enabled.

For example, if the following datatype is in scope

data Person = Person { name :: String }

the end result is rather like having an instance

instance HasField "name" Person String where
  getField = name

except that this instance is not actually generated anywhere, rather the constraint is solved directly by the constraint solver.

A field must be in scope for the corresponding HasField constraint to be solved. This retains the existing representation hiding mechanism, whereby a module may choose not to export a field, preventing client modules from accessing or updating it directly.

Solving HasField constraints depends on the field selector functions that are generated for each datatype definition:

  • If a record field does not have a selector function because its type would allow an existential variable to escape, the corresponding HasField constraint will not be solved. For example,

    {-# LANGUAGE ExistentialQuantification #-}
    data Exists t = forall x . MkExists { unExists :: t x }
    

    does not give rise to a selector unExists :: Exists t -> t x and we will not solve HasField "unExists" (Exists t) a automatically.

  • If a record field has a polymorphic type (and hence the selector function is higher-rank), the corresponding HasField constraint will not be solved, because doing so would violate the functional dependency on HasField and/or require impredicativity. For example,

    {-# LANGUAGE RankNTypes #-}
    data Higher = MkHigher { unHigher :: forall t . t -> t }
    

    gives rise to a selector unHigher :: Higher -> (forall t . t -> t) but does not lead to solution of the constraint HasField "unHigher" Higher a.

  • A record GADT may have a restricted type for a selector function, which may lead to additional unification when solving HasField constraints. For example,

    {-# LANGUAGE GADTs #-}
    data Gadt t where
      MkGadt :: { unGadt :: Maybe v } -> Gadt [v]
    

    gives rise to a selector unGadt :: Gadt [v] -> Maybe v, so the solver will reduce the constraint HasField "unGadt" (Gadt t) b by unifying t ~ [v] and b ~ Maybe v for some fresh metavariable v, rather as if we had an instance

    instance (t ~ [v], b ~ Maybe v) => HasField "unGadt" (Gadt t) b
    
  • If a record type has an old-fashioned datatype context, the HasField constraint will be reduced to solving the constraints from the context. For example,

    {-# LANGUAGE DatatypeContexts #-}
    data Eq a => Silly a = MkSilly { unSilly :: a }
    

    gives rise to a selector unSilly :: Eq a => Silly a -> a, so the solver will reduce the constraint HasField "unSilly" (Silly a) b to Eq a (and unify a with b), rather as if we had an instance

    instance (Eq a, a ~ b) => HasField "unSilly" (Silly a) b
    

10.5.6.2. Virtual record fields

Users may define their own instances of HasField, provided they do not conflict with the built-in constraint solving behaviour. This allows “virtual” record fields to be defined for datatypes that do not otherwise have them.

For example, this instance would make the name field of Person accessible using #fullname as well:

instance HasField "fullname" Person String where
  getField = name

More substantially, an anonymous records library could provide HasField instances for its anonymous records, and thus be compatible with the polymorphic record selectors introduced by this proposal. For example, something like this makes it possible to use getField to access Record values with the appropriate string in the type-level list of fields:

data Record (xs :: [(k, Type)]) where
  Nil  :: Record '[]
  Cons :: Proxy x -> a -> Record xs -> Record ('(x, a) ': xs)

instance HasField x (Record ('(x, a) ': xs)) a where
  getField (Cons _ v _) = v
instance HasField x (Record xs) a => HasField x (Record ('(y, b) ': xs)) a where
  getField (Cons _ _ r) = getField @x r

r :: Record '[ '("name", String) ]
r = Cons Proxy "R" Nil)

x = getField @"name" r

Since representations such as this can support field labels with kinds other than Symbol, the HasField class is poly-kinded (even though the built-in constraint solving works only at kind Symbol). In particular, this allows users to declare scoped field labels such as in the following example:

data PersonFields = Name

s :: Record '[ '(Name, String) ]
s = Cons Proxy "S" Nil

y = getField @Name s

In order to avoid conflicting with the built-in constraint solving, the following user-defined HasField instances are prohibited (in addition to the usual rules, such as the prohibition on type families appearing in instance heads):

  • HasField _ r _ where r is a variable;
  • HasField _ (T ...) _ if T is a data family (because it might have fields introduced later, using data instance declarations);
  • HasField x (T ...) _ if x is a variable and T has any fields at all (but this instance is permitted if T has no fields);
  • HasField "foo" (T ...) _ if T has a field foo (but this instance is permitted if it does not).

If a field has a higher-rank or existential type, the corresponding HasField constraint will not be solved automatically (as described above), but in the interests of simplicity we do not permit users to define their own instances either. If a field is not in scope, the corresponding instance is still prohibited, to avoid conflicts in downstream modules.

10.6. Extensions to the “deriving” mechanism

Haskell 98 allows the programmer to add a deriving clause to a data type declaration, to generate a standard instance declaration for specified class. GHC extends this mechanism along several axes:

  • The derivation mechanism can be used separtely from the data type declaration, using the standalone deriving mechanism.

  • In Haskell 98, the only derivable classes are Eq, Ord, Enum, Ix, Bounded, Read, and Show. Various langauge extensions extend this list.

  • Besides the stock approach to deriving instances by generating all method definitions, GHC supports two additional deriving strategies, which can derive arbitrary classes:

    The user can optionally declare the desired deriving strategy, especially if the compiler chooses the wrong one by default.

10.6.1. Deriving instances for empty data types

-XEmptyDataDeriving
Since:8.4.1

Allow deriving instances of standard type classes for empty data types.

One can write data types with no constructors using the -XEmptyDataDecls flag (see Data types with no constructors), which is on by default in Haskell 2010. What is not on by default is the ability to derive type class instances for these types. This ability is enabled through use of the -XEmptyDataDeriving flag. For instance, this lets one write:

data Empty deriving (Eq, Ord, Read, Show)

This would generate the following instances:

instance Eq Empty where
  _ == _ = True

instance Ord Empty where
  compare _ _ = EQ

instance Read Empty where
  readPrec = pfail

instance Show Empty where
  showsPrec _ x = case x of {}

The -XEmptyDataDeriving flag is only required to enable deriving of these four “standard” type classes (which are mentioned in the Haskell Report). Other extensions to the deriving mechanism, which are explained below in greater detail, do not require -XEmptyDataDeriving to be used in conjunction with empty data types. These include:

10.6.2. Inferred context for deriving clauses

The Haskell Report is vague about exactly when a deriving clause is legal. For example:

data T0 f a = MkT0 a         deriving( Eq )
data T1 f a = MkT1 (f a)     deriving( Eq )
data T2 f a = MkT2 (f (f a)) deriving( Eq )

The natural generated Eq code would result in these instance declarations:

instance Eq a         => Eq (T0 f a) where ...
instance Eq (f a)     => Eq (T1 f a) where ...
instance Eq (f (f a)) => Eq (T2 f a) where ...

The first of these is obviously fine. The second is still fine, although less obviously. The third is not Haskell 98, and risks losing termination of instances.

GHC takes a conservative position: it accepts the first two, but not the third. The rule is this: each constraint in the inferred instance context must consist only of type variables, with no repetitions.

This rule is applied regardless of flags. If you want a more exotic context, you can write it yourself, using the standalone deriving mechanism.

10.6.3. Stand-alone deriving declarations

StandaloneDeriving
Since:6.8.1

Allow the use of stand-alone deriving declarations.

GHC allows stand-alone deriving declarations, enabled by StandaloneDeriving:

data Foo a = Bar a | Baz String

deriving instance Eq a => Eq (Foo a)

The syntax is identical to that of an ordinary instance declaration apart from (a) the keyword deriving, and (b) the absence of the where part.

However, standalone deriving differs from a deriving clause in a number of important ways:

  • The standalone deriving declaration does not need to be in the same module as the data type declaration. (But be aware of the dangers of orphan instances (Orphan modules and instance declarations).

  • In most cases, you must supply an explicit context (in the example the context is (Eq a)), exactly as you would in an ordinary instance declaration. (In contrast, in a deriving clause attached to a data type declaration, the context is inferred.)

    The exception to this rule is that the context of a standalone deriving declaration can infer its context when a single, extra-wildcards constraint is used as the context, such as in:

    deriving instance _ => Eq (Foo a)
    

    This is essentially the same as if you had written deriving Foo after the declaration for data Foo a. Using this feature requires the use of PartialTypeSignatures (Partial Type Signatures).

  • Unlike a deriving declaration attached to a data declaration, the instance can be more specific than the data type (assuming you also use FlexibleInstances, Relaxed rules for instance contexts). Consider for example

    data Foo a = Bar a | Baz String
    
    deriving instance Eq a => Eq (Foo [a])
    deriving instance Eq a => Eq (Foo (Maybe a))
    

    This will generate a derived instance for (Foo [a]) and (Foo (Maybe a)), but other types such as (Foo (Int,Bool)) will not be an instance of Eq.

  • Unlike a deriving declaration attached to a data declaration, GHC does not restrict the form of the data type. Instead, GHC simply generates the appropriate boilerplate code for the specified class, and typechecks it. If there is a type error, it is your problem. (GHC will show you the offending code if it has a type error.)

    The merit of this is that you can derive instances for GADTs and other exotic data types, providing only that the boilerplate code does indeed typecheck. For example:

    data T a where
       T1 :: T Int
       T2 :: T Bool
    
    deriving instance Show (T a)
    

    In this example, you cannot say ... deriving( Show ) on the data type declaration for T, because T is a GADT, but you can generate the instance declaration using stand-alone deriving.

    The down-side is that, if the boilerplate code fails to typecheck, you will get an error message about that code, which you did not write. Whereas, with a deriving clause the side-conditions are necessarily more conservative, but any error message may be more comprehensible.

  • Under most circumstances, you cannot use standalone deriving to create an instance for a data type whose constructors are not all in scope. This is because the derived instance would generate code that uses the constructors behind the scenes, which would break abstraction.

    The one exception to this rule is DeriveAnyClass, since deriving an instance via DeriveAnyClass simply generates an empty instance declaration, which does not require the use of any constructors. See the deriving any class section for more details.

In other ways, however, a standalone deriving obeys the same rules as ordinary deriving:

  • A deriving instance declaration must obey the same rules concerning form and termination as ordinary instance declarations, controlled by the same flags; see Instance declarations.

  • The stand-alone syntax is generalised for newtypes in exactly the same way that ordinary deriving clauses are generalised (Generalised derived instances for newtypes). For example:

    newtype Foo a = MkFoo (State Int a)
    
    deriving instance MonadState Int Foo
    

    GHC always treats the last parameter of the instance (Foo in this example) as the type whose instance is being derived.

10.6.4. Deriving instances of extra classes (Data, etc.)

Haskell 98 allows the programmer to add “deriving( Eq, Ord )” to a data type declaration, to generate a standard instance declaration for classes specified in the deriving clause. In Haskell 98, the only classes that may appear in the deriving clause are the standard classes Eq, Ord, Enum, Ix, Bounded, Read, and Show.

GHC extends this list with several more classes that may be automatically derived:

  • With DeriveGeneric, you can derive instances of the classes Generic and Generic1, defined in GHC.Generics. You can use these to define generic functions, as described in Generic programming.
  • With DeriveFunctor, you can derive instances of the class Functor, defined in GHC.Base.
  • With DeriveDataTypeable, you can derive instances of the class Data, defined in Data.Data.
  • With DeriveFoldable, you can derive instances of the class Foldable, defined in Data.Foldable.
  • With DeriveTraversable, you can derive instances of the class Traversable, defined in Data.Traversable. Since the Traversable instance dictates the instances of Functor and Foldable, you’ll probably want to derive them too, so DeriveTraversable implies DeriveFunctor and DeriveFoldable.
  • With DeriveLift, you can derive instances of the class Lift, defined in the Language.Haskell.TH.Syntax module of the template-haskell package.

You can also use a standalone deriving declaration instead (see Stand-alone deriving declarations).

In each case the appropriate class must be in scope before it can be mentioned in the deriving clause.

10.6.4.1. Deriving Functor instances

DeriveFunctor
Since:7.10.1

Allow automatic deriving of instances for the Functor typeclass.

With DeriveFunctor, one can derive Functor instances for data types of kind Type -> Type. For example, this declaration:

data Example a = Ex a Char (Example a) (Example Char)
  deriving Functor

would generate the following instance:

instance Functor Example where
  fmap f (Ex a1 a2 a3 a4) = Ex (f a1) a2 (fmap f a3) a4

The basic algorithm for DeriveFunctor walks the arguments of each constructor of a data type, applying a mapping function depending on the type of each argument. If a plain type variable is found that is syntactically equivalent to the last type parameter of the data type (a in the above example), then we apply the function f directly to it. If a type is encountered that is not syntactically equivalent to the last type parameter but does mention the last type parameter somewhere in it, then a recursive call to fmap is made. If a type is found which doesn’t mention the last type parameter at all, then it is left alone.

The second of those cases, in which a type is unequal to the type parameter but does contain the type parameter, can be surprisingly tricky. For example, the following example compiles:

newtype Right a = Right (Either Int a) deriving Functor

Modifying the code slightly, however, produces code which will not compile:

newtype Wrong a = Wrong (Either a Int) deriving Functor

The difference involves the placement of the last type parameter, a. In the Right case, a occurs within the type Either Int a, and moreover, it appears as the last type argument of Either. In the Wrong case, however, a is not the last type argument to Either; rather, Int is.

This distinction is important because of the way DeriveFunctor works. The derived Functor Right instance would be:

instance Functor Right where
  fmap f (Right a) = Right (fmap f a)

Given a value of type Right a, GHC must produce a value of type Right b. Since the argument to the Right constructor has type Either Int a, the code recursively calls fmap on it to produce a value of type Either Int b, which is used in turn to construct a final value of type Right b.

The generated code for the Functor Wrong instance would look exactly the same, except with Wrong replacing every occurrence of Right. The problem is now that fmap is being applied recursively to a value of type Either a Int. This cannot possibly produce a value of type Either b Int, as fmap can only change the last type parameter! This causes the generated code to be ill-typed.

As a general rule, if a data type has a derived Functor instance and its last type parameter occurs on the right-hand side of the data declaration, then either it must (1) occur bare (e.g., newtype Id a = Id a), or (2) occur as the last argument of a type constructor (as in Right above).

There are two exceptions to this rule:

  1. Tuple types. When a non-unit tuple is used on the right-hand side of a data declaration, DeriveFunctor treats it as a product of distinct types. In other words, the following code:

    newtype Triple a = Triple (a, Int, [a]) deriving Functor
    

    Would result in a generated Functor instance like so:

    instance Functor Triple where
      fmap f (Triple a) =
        Triple (case a of
                     (a1, a2, a3) -> (f a1, a2, fmap f a3))
    

    That is, DeriveFunctor pattern-matches its way into tuples and maps over each type that constitutes the tuple. The generated code is reminiscient of what would be generated from data Triple a = Triple a Int [a], except with extra machinery to handle the tuple.

  2. Function types. The last type parameter can appear anywhere in a function type as long as it occurs in a covariant position. To illustrate what this means, consider the following three examples:

    newtype CovFun1 a = CovFun1 (Int -> a) deriving Functor
    newtype CovFun2 a = CovFun2 ((a -> Int) -> a) deriving Functor
    newtype CovFun3 a = CovFun3 (((Int -> a) -> Int) -> a) deriving Functor
    

    All three of these examples would compile without issue. On the other hand:

    newtype ContraFun1 a = ContraFun1 (a -> Int) deriving Functor
    newtype ContraFun2 a = ContraFun2 ((Int -> a) -> Int) deriving Functor
    newtype ContraFun3 a = ContraFun3 (((a -> Int) -> a) -> Int) deriving Functor
    

    While these examples look similar, none of them would successfully compile. This is because all occurrences of the last type parameter a occur in contravariant positions, not covariant ones.

    Intuitively, a covariant type is produced, and a contravariant type is consumed. Most types in Haskell are covariant, but the function type is special in that the lefthand side of a function arrow reverses variance. If a function type a -> b appears in a covariant position (e.g., CovFun1 above), then a is in a contravariant position and b is in a covariant position. Similarly, if a -> b appears in a contravariant position (e.g., CovFun2 above), then a is in a covariant position and b is in a contravariant position.

    To see why a data type with a contravariant occurrence of its last type parameter cannot have a derived Functor instance, let’s suppose that a Functor ContraFun1 instance exists. The implementation would look something like this:

    instance Functor ContraFun1 where
      fmap f (ContraFun g) = ContraFun (\x -> _)
    

    We have f :: a -> b, g :: a -> Int, and x :: b. Using these, we must somehow fill in the hole (denoted with an underscore) with a value of type Int. What are our options?

    We could try applying g to x. This won’t work though, as g expects an argument of type a, and x :: b. Even worse, we can’t turn x into something of type a, since f also needs an argument of type a! In short, there’s no good way to make this work.

    On the other hand, a derived Functor instances for the CovFuns are within the realm of possibility:

    instance Functor CovFun1 where
      fmap f (CovFun1 g) = CovFun1 (\x -> f (g x))
    
    instance Functor CovFun2 where
      fmap f (CovFun2 g) = CovFun2 (\h -> f (g (\x -> h (f x))))
    
    instance Functor CovFun3 where
      fmap f (CovFun3 g) = CovFun3 (\h -> f (g (\k -> h (\x -> f (k x)))))
    

There are some other scenarios in which a derived Functor instance will fail to compile:

  1. A data type has no type parameters (e.g., data Nothing = Nothing).

  2. A data type’s last type variable is used in a DatatypeContexts constraint (e.g., data Ord a => O a = O a).

  3. A data type’s last type variable is used in an ExistentialQuantification constraint, or is refined in a GADT. For example,

    data T a b where
        T4 :: Ord b => b -> T a b
        T5 :: b -> T b b
        T6 :: T a (b,b)
    
    deriving instance Functor (T a)
    

    would not compile successfully due to the way in which b is constrained.

When the last type parameter has a phantom role (see Roles), the derived Functor instance will not be produced using the usual algorithm. Instead, the entire value will be coerced.

data Phantom a = Z | S (Phantom a) deriving Functor

will produce the following instance:

instance Functor Phantom where
  fmap _ = coerce

When a type has no constructors, the derived Functor instance will simply force the (bottom) value of the argument using EmptyCase.

data V a deriving Functor
type role V nominal

will produce

instance Functor V where
fmap _ z = case z of

10.6.4.2. Deriving Foldable instances

DeriveFoldable
Since:7.10.1

Allow automatic deriving of instances for the Foldable typeclass.

With DeriveFoldable, one can derive Foldable instances for data types of kind Type -> Type. For example, this declaration:

data Example a = Ex a Char (Example a) (Example Char)
  deriving Foldable

would generate the following instance:

instance Foldable Example where
  foldr f z (Ex a1 a2 a3 a4) = f a1 (foldr f z a3)
  foldMap f (Ex a1 a2 a3 a4) = mappend (f a1) (foldMap f a3)

The algorithm for DeriveFoldable is adapted from the DeriveFunctor algorithm, but it generates definitions for foldMap, foldr, and null instead of fmap. In addition, DeriveFoldable filters out all constructor arguments on the RHS expression whose types do not mention the last type parameter, since those arguments do not need to be folded over.

When the type parameter has a phantom role (see Roles), DeriveFoldable derives a trivial instance. For example, this declaration:

data Phantom a = Z | S (Phantom a)

will generate the following instance.

instance Foldable Phantom where
  foldMap _ _ = mempty

Similarly, when the type has no constructors, DeriveFoldable will derive a trivial instance:

data V a deriving Foldable
type role V nominal

will generate the following.

instance Foldable V where
  foldMap _ _ = mempty

Here are the differences between the generated code for Functor and Foldable:

#. When a bare type variable a is encountered, DeriveFunctor would generate f a for an fmap definition. DeriveFoldable would generate f a z for foldr, f a for foldMap, and False for null.

  1. When a type that is not syntactically equivalent to a, but which does contain a, is encountered, DeriveFunctor recursively calls fmap on it. Similarly, DeriveFoldable would recursively call foldr and foldMap. Depending on the context, null may recursively call null or all null. For example, given

    data F a = F (P a)
    data G a = G (P (a, Int))
    data H a = H (P (Q a))
    

    Foldable deriving will produce

    null (F x) = null x
    null (G x) = null x
    null (H x) = all null x
    
  2. DeriveFunctor puts everything back together again at the end by invoking the constructor. DeriveFoldable, however, builds up a value of some type. For foldr, this is accomplished by chaining applications of f and recursive foldr calls on the state value z. For foldMap, this happens by combining all values with mappend. For null, the values are usually combined with &&. However, if any of the values is known to be False, all the rest will be dropped. For example,

    data SnocList a = Nil | Snoc (SnocList a) a
    

    will not produce

    null (Snoc xs _) = null xs && False
    

    (which would walk the whole list), but rather

    null (Snoc _ _) = False
    

There are some other differences regarding what data types can have derived Foldable instances:

  1. Data types containing function types on the right-hand side cannot have derived Foldable instances.

  2. Foldable instances can be derived for data types in which the last type parameter is existentially constrained or refined in a GADT. For example, this data type:

    data E a where
        E1 :: (a ~ Int) => a   -> E a
        E2 ::              Int -> E Int
        E3 :: (a ~ Int) => a   -> E Int
        E4 :: (a ~ Int) => Int -> E a
    
    deriving instance Foldable E
    

    would have the following generated Foldable instance:

    instance Foldable E where
        foldr f z (E1 e) = f e z
        foldr f z (E2 e) = z
        foldr f z (E3 e) = z
        foldr f z (E4 e) = z
    
        foldMap f (E1 e) = f e
        foldMap f (E2 e) = mempty
        foldMap f (E3 e) = mempty
        foldMap f (E4 e) = mempty
    

    Notice how every constructor of E utilizes some sort of existential quantification, but only the argument of E1 is actually “folded over”. This is because we make a deliberate choice to only fold over universally polymorphic types that are syntactically equivalent to the last type parameter. In particular:

  • We don’t fold over the arguments of E1 or E4 beacause even though (a ~ Int), Int is not syntactically equivalent to a.
  • We don’t fold over the argument of E3 because a is not universally polymorphic. The a in E3 is (implicitly) existentially quantified, so it is not the same as the last type parameter of E.

10.6.4.3. Deriving Traversable instances

DeriveTraversable
Implies:DeriveFoldable, DeriveFunctor
Since:7.10.1

Allow automatic deriving of instances for the Traversable typeclass.

With DeriveTraversable, one can derive Traversable instances for data types of kind Type -> Type. For example, this declaration:

data Example a = Ex a Char (Example a) (Example Char)
  deriving (Functor, Foldable, Traversable)

would generate the following Traversable instance:

instance Traversable Example where
  traverse f (Ex a1 a2 a3 a4)
    = fmap (\b1 b3 -> Ex b1 a2 b3 a4) (f a1) <*> traverse f a3

The algorithm for DeriveTraversable is adapted from the DeriveFunctor algorithm, but it generates a definition for traverse instead of fmap. In addition, DeriveTraversable filters out all constructor arguments on the RHS expression whose types do not mention the last type parameter, since those arguments do not produce any effects in a traversal.

When the type parameter has a phantom role (see Roles), DeriveTraversable coerces its argument. For example, this declaration:

data Phantom a = Z | S (Phantom a) deriving Traversable

will generate the following instance:

instance Traversable Phantom where
  traverse _ z = pure (coerce z)

When the type has no constructors, DeriveTraversable will derive the laziest instance it can.

data V a deriving Traversable
type role V nominal

will generate the following, using EmptyCase:

instance Traversable V where
  traverse _ z = pure (case z of)

Here are the differences between the generated code in each extension:

  1. When a bare type variable a is encountered, both DeriveFunctor and DeriveTraversable would generate f a for an fmap and traverse definition, respectively.
  2. When a type that is not syntactically equivalent to a, but which does contain a, is encountered, DeriveFunctor recursively calls fmap on it. Similarly, DeriveTraversable would recursively call traverse.
  3. DeriveFunctor puts everything back together again at the end by invoking the constructor. DeriveTraversable does something similar, but it works in an Applicative context by chaining everything together with (<*>).

Unlike DeriveFunctor, DeriveTraversable cannot be used on data types containing a function type on the right-hand side.

For a full specification of the algorithms used in DeriveFunctor, DeriveFoldable, and DeriveTraversable, see this wiki page.

10.6.4.4. Deriving Data instances

DeriveDataTypeable
Since:6.8.1

Enable automatic deriving of instances for the Data typeclass

10.6.4.5. Deriving Typeable instances

The class Typeable is very special:

  • Typeable is kind-polymorphic (see Kind polymorphism).

  • GHC has a custom solver for discharging constraints that involve class Typeable, and handwritten instances are forbidden. This ensures that the programmer cannot subvert the type system by writing bogus instances.

  • Derived instances of Typeable may be declared if the DeriveDataTypeable extension is enabled, but they are ignored, and they may be reported as an error in a later version of the compiler.

  • The rules for solving `Typeable` constraints are as follows:

    • A concrete type constructor applied to some types.

      instance (Typeable t1, .., Typeable t_n) =>
        Typeable (T t1 .. t_n)
      

      This rule works for any concrete type constructor, including type constructors with polymorphic kinds. The only restriction is that if the type constructor has a polymorphic kind, then it has to be applied to all of its kinds parameters, and these kinds need to be concrete (i.e., they cannot mention kind variables).

    • A type variable applied to some types:

      instance (Typeable f, Typeable t1, .., Typeable t_n) =>
        Typeable (f t1 .. t_n)
      
    • A concrete type literal.:

      instance Typeable 0       -- Type natural literals
      instance Typeable "Hello" -- Type-level symbols
      

10.6.4.6. Deriving Lift instances

DeriveLift
Since:7.2.1

Enable automatic deriving of instances for the Lift typeclass for Template Haskell.

The class Lift, unlike other derivable classes, lives in template-haskell instead of base. Having a data type be an instance of Lift permits its values to be promoted to Template Haskell expressions (of type ExpQ), which can then be spliced into Haskell source code.

Here is an example of how one can derive Lift:

{-# LANGUAGE DeriveLift #-}
module Bar where

import Language.Haskell.TH.Syntax

data Foo a = Foo a | a :^: a deriving Lift

{-
instance (Lift a) => Lift (Foo a) where
    lift (Foo a)
    = appE
        (conE
            (mkNameG_d "package-name" "Bar" "Foo"))
        (lift a)
    lift (u :^: v)
    = infixApp
        (lift u)
        (conE
            (mkNameG_d "package-name" "Bar" ":^:"))
        (lift v)
-}

-----
{-# LANGUAGE TemplateHaskell #-}
module Baz where

import Bar
import Language.Haskell.TH.Lift

foo :: Foo String
foo = $(lift $ Foo "foo")

fooExp :: Lift a => Foo a -> Q Exp
fooExp f = [| f |]

DeriveLift also works for certain unboxed types (Addr#, Char#, Double#, Float#, Int#, and Word#):

{-# LANGUAGE DeriveLift, MagicHash #-}
module Unboxed where

import GHC.Exts
import Language.Haskell.TH.Syntax

data IntHash = IntHash Int# deriving Lift

{-
instance Lift IntHash where
    lift (IntHash i)
    = appE
        (conE
            (mkNameG_d "package-name" "Unboxed" "IntHash"))
        (litE
            (intPrimL (toInteger (I# i))))
-}

10.6.5. Generalised derived instances for newtypes

GeneralisedNewtypeDeriving
GeneralizedNewtypeDeriving
Since:6.8.1. British spelling since 8.6.1.

Enable GHC’s cunning generalised deriving mechanism for newtypes

When you define an abstract type using newtype, you may want the new type to inherit some instances from its representation. In Haskell 98, you can inherit instances of Eq, Ord, Enum and Bounded by deriving them, but for any other classes you have to write an explicit instance declaration. For example, if you define

newtype Dollars = Dollars Int

and you want to use arithmetic on Dollars, you have to explicitly define an instance of Num:

instance Num Dollars where
  Dollars a + Dollars b = Dollars (a+b)
  ...

All the instance does is apply and remove the newtype constructor. It is particularly galling that, since the constructor doesn’t appear at run-time, this instance declaration defines a dictionary which is wholly equivalent to the Int dictionary, only slower!

DerivingVia (see Deriving via) is a generalization of this idea.

10.6.5.1. Generalising the deriving clause

GHC now permits such instances to be derived instead, using the extension GeneralizedNewtypeDeriving, so one can write

newtype Dollars = Dollars { getDollars :: Int } deriving (Eq,Show,Num)

and the implementation uses the same Num dictionary for Dollars as for Int. In other words, GHC will generate something that resembles the following code

instance Num Int => Num Dollars

and then attempt to simplify the Num Int context as much as possible. GHC knows that there is a Num Int instance in scope, so it is able to discharge the Num Int constraint, leaving the code that GHC actually generates

instance Num Dollars

One can think of this instance being implemented with the same code as the Num Int instance, but with Dollars and getDollars added wherever necessary in order to make it typecheck. (In practice, GHC uses a somewhat different approach to code generation. See the A more precise specification section below for more details.)

We can also derive instances of constructor classes in a similar way. For example, suppose we have implemented state and failure monad transformers, such that

instance Monad m => Monad (State s m)
instance Monad m => Monad (Failure m)

In Haskell 98, we can define a parsing monad by

type Parser tok m a = State [tok] (Failure m) a

which is automatically a monad thanks to the instance declarations above. With the extension, we can make the parser type abstract, without needing to write an instance of class Monad, via

newtype Parser tok m a = Parser (State [tok] (Failure m) a)
                       deriving Monad

In this case the derived instance declaration is of the form

instance Monad (State [tok] (Failure m)) => Monad (Parser tok m)

Notice that, since Monad is a constructor class, the instance is a partial application of the newtype, not the entire left hand side. We can imagine that the type declaration is “eta-converted” to generate the context of the instance declaration.

We can even derive instances of multi-parameter classes, provided the newtype is the last class parameter. In this case, a “partial application” of the class appears in the deriving clause. For example, given the class

class StateMonad s m | m -> s where ...
instance Monad m => StateMonad s (State s m) where ...

then we can derive an instance of StateMonad for Parser by

newtype Parser tok m a = Parser (State [tok] (Failure m) a)
                       deriving (Monad, StateMonad [tok])

The derived instance is obtained by completing the application of the class to the new type:

instance StateMonad [tok] (State [tok] (Failure m)) =>
         StateMonad [tok] (Parser tok m)

As a result of this extension, all derived instances in newtype declarations are treated uniformly (and implemented just by reusing the dictionary for the representation type), except Show and Read, which really behave differently for the newtype and its representation.

Note

It is sometimes necessary to enable additional language extensions when deriving instances via GeneralizedNewtypeDeriving. For instance, consider a simple class and instance using UnboxedTuples syntax:

{-# LANGUAGE UnboxedTuples #-}

module Lib where

class AClass a where
  aMethod :: a -> (# Int, a #)

instance AClass Int where
  aMethod x = (# x, x #)

The following will fail with an “Illegal unboxed tuple” error, since the derived instance produced by the compiler makes use of unboxed tuple syntax,

{-# LANGUAGE GeneralizedNewtypeDeriving #-}

import Lib

newtype Int' = Int' Int
             deriving (AClass)

However, enabling the UnboxedTuples extension allows the module to compile. Similar errors may occur with a variety of extensions, including:

10.6.5.2. A more precise specification

A derived instance is derived only for declarations of these forms (after expansion of any type synonyms)

newtype T v1..vn                   = MkT (t vk+1..vn) deriving (C t1..tj)
newtype instance T s1..sk vk+1..vn = MkT (t vk+1..vn) deriving (C t1..tj)

where

  • v1..vn are type variables, and t, s1..sk, t1..tj are types.
  • The (C t1..tj) is a partial applications of the class C, where the arity of C is exactly j+1. That is, C lacks exactly one type argument.
  • k is chosen so that C t1..tj (T v1...vk) is well-kinded. (Or, in the case of a data instance, so that C t1..tj (T s1..sk) is well kinded.)
  • The type t is an arbitrary type.
  • The type variables vk+1...vn do not occur in the types t, s1..sk, or t1..tj.
  • C is not Read, Show, Typeable, or Data. These classes should not “look through” the type or its constructor. You can still derive these classes for a newtype, but it happens in the usual way, not via this new mechanism. Confer with Default deriving strategy.
  • It is safe to coerce each of the methods of C. That is, the missing last argument to C is not used at a nominal role in any of the C‘s methods. (See Roles.)
  • C is allowed to have associated type families, provided they meet the requirements laid out in the section on GND and associated types.

Then the derived instance declaration is of the form

instance C t1..tj t => C t1..tj (T v1...vk)

Note that if C does not contain any class methods, the instance context is wholly unnecessary, and as such GHC will instead generate:

instance C t1..tj (T v1..vk)

As an example which does not work, consider

newtype NonMonad m s = NonMonad (State s m s) deriving Monad

Here we cannot derive the instance

instance Monad (State s m) => Monad (NonMonad m)

because the type variable s occurs in State s m, and so cannot be “eta-converted” away. It is a good thing that this deriving clause is rejected, because NonMonad m is not, in fact, a monad — for the same reason. Try defining >>= with the correct type: you won’t be able to.

Notice also that the order of class parameters becomes important, since we can only derive instances for the last one. If the StateMonad class above were instead defined as

class StateMonad m s | m -> s where ...

then we would not have been able to derive an instance for the Parser type above. We hypothesise that multi-parameter classes usually have one “main” parameter for which deriving new instances is most interesting.

Lastly, all of this applies only for classes other than Read, Show, Typeable, and Data, for which the stock derivation applies (section 4.3.3. of the Haskell Report). (For the standard classes Eq, Ord, Ix, and Bounded it is immaterial whether the stock method is used or the one described here.)

10.6.5.3. Associated type families

GeneralizedNewtypeDeriving also works for some type classes with associated type families. Here is an example:

class HasRing a where
  type Ring a

newtype L1Norm a = L1Norm a
  deriving HasRing

The derived HasRing instance would look like

instance HasRing (L1Norm a) where
  type Ring (L1Norm a) = Ring a

To be precise, if the class being derived is of the form

class C c_1 c_2 ... c_m where
  type T1 t1_1 t1_2 ... t1_n
  ...
  type Tk tk_1 tk_2 ... tk_p

and the newtype is of the form

newtype N n_1 n_2 ... n_q = MkN <rep-type>

then you can derive a C c_1 c_2 ... c_(m-1) instance for N n_1 n_2 ... n_q, provided that:

  • The type parameter c_m occurs once in each of the type variables of T1 through Tk. Imagine a class where this condition didn’t hold. For example:

    class Bad a b where
      type B a
    
    instance Bad Int a where
      type B Int = Char
    
    newtype Foo a = Foo a
      deriving (Bad Int)
    

    For the derived Bad Int instance, GHC would need to generate something like this:

    instance Bad Int (Foo a) where
      type B Int = B ???
    

    Now we’re stuck, since we have no way to refer to a on the right-hand side of the B family instance, so this instance doesn’t really make sense in a GeneralizedNewtypeDeriving setting.

  • C does not have any associated data families (only type families). To see why data families are forbidden, imagine the following scenario:

    class Ex a where
      data D a
    
    instance Ex Int where
      data D Int = DInt Bool
    
    newtype Age = MkAge Int deriving Ex
    

    For the derived Ex instance, GHC would need to generate something like this:

    instance Ex Age where
      data D Age = ???
    

    But it is not clear what GHC would fill in for ???, as each data family instance must generate fresh data constructors.

If both of these conditions are met, GHC will generate this instance:

instance C c_1 c_2 ... c_(m-1) <rep-type> =>
         C c_1 c_2 ... c_(m-1) (N n_1 n_2 ... n_q) where
  type T1 t1_1 t1_2 ... (N n_1 n_2 ... n_q) ... t1_n
     = T1 t1_1 t1_2 ... <rep-type>          ... t1_n
  ...
  type Tk tk_1 tk_2 ... (N n_1 n_2 ... n_q) ... tk_p
     = Tk tk_1 tk_2 ... <rep-type>          ... tk_p

Again, if C contains no class methods, the instance context will be redundant, so GHC will instead generate instance C c_1 c_2 ... c_(m-1) (N n_1 n_2 ... n_q).

Beware that in some cases, you may need to enable the UndecidableInstances extension in order to use this feature. Here’s a pathological case that illustrates why this might happen:

class C a where
  type T a

newtype Loop = MkLoop Loop
  deriving C

This will generate the derived instance:

instance C Loop where
  type T Loop = T Loop

Here, it is evident that attempting to use the type T Loop will throw the typechecker into an infinite loop, as its definition recurses endlessly. In other cases, you might need to enable UndecidableInstances even if the generated code won’t put the typechecker into a loop. For example:

instance C Int where
  type C Int = Int

newtype MyInt = MyInt Int
  deriving C

This will generate the derived instance:

instance C MyInt where
  type T MyInt = T Int

Although typechecking T MyInt will terminate, GHC’s termination checker isn’t sophisticated enough to determine this, so you’ll need to enable UndecidableInstances in order to use this derived instance. If you do go down this route, make sure you can convince yourself that all of the type family instances you’re deriving will eventually terminate if used!

Note that DerivingVia (see Deriving via) uses essentially the same specification to derive instances of associated type families as well (except that it uses the via type instead of the underlying rep-type of a newtype).

10.6.6. Deriving any other class

DeriveAnyClass
Since:7.10.1

Allow use of any typeclass in deriving clauses.

With DeriveAnyClass you can derive any other class. The compiler will simply generate an instance declaration with no explicitly-defined methods. This is mostly useful in classes whose minimal set is empty, and especially when writing generic functions.

As an example, consider a simple pretty-printer class SPretty, which outputs pretty strings:

{-# LANGUAGE DefaultSignatures, DeriveAnyClass #-}

class SPretty a where
  sPpr :: a -> String
  default sPpr :: Show a => a -> String
  sPpr = show

If a user does not provide a manual implementation for sPpr, then it will default to show. Now we can leverage the DeriveAnyClass extension to easily implement a SPretty instance for a new data type:

data Foo = Foo deriving (Show, SPretty)

The above code is equivalent to:

data Foo = Foo deriving Show
instance SPretty Foo

That is, an SPretty Foo instance will be created with empty implementations for all methods. Since we are using DefaultSignatures in this example, a default implementation of sPpr is filled in automatically.

Note the following details

  • In case you try to derive some class on a newtype, and GeneralizedNewtypeDeriving is also on, DeriveAnyClass takes precedence.

  • The instance context is determined by the type signatures of the derived class’s methods. For instance, if the class is:

    class Foo a where
      bar :: a -> String
      default bar :: Show a => a -> String
      bar = show
    
      baz :: a -> a -> Bool
      default baz :: Ord a => a -> a -> Bool
      baz x y = compare x y == EQ
    

    And you attempt to derive it using DeriveAnyClass:

    instance Eq   a => Eq   (Option a) where ...
    instance Ord  a => Ord  (Option a) where ...
    instance Show a => Show (Option a) where ...
    
    data Option a = None | Some a deriving Foo
    

    Then the derived Foo instance will be:

    instance (Show a, Ord a) => Foo (Option a)
    

    Since the default type signatures for bar and baz require Show a and Ord a constraints, respectively.

    Constraints on the non-default type signatures can play a role in inferring the instance context as well. For example, if you have this class:

    class HigherEq f where
      (==#) :: f a -> f a -> Bool
      default (==#) :: Eq (f a) => f a -> f a -> Bool
      x ==# y = (x == y)
    

    And you tried to derive an instance for it:

    instance Eq a => Eq (Option a) where ...
    data Option a = None | Some a deriving HigherEq
    

    Then it will fail with an error to the effect of:

    No instance for (Eq a)
        arising from the 'deriving' clause of a data type declaration
    

    That is because we require an Eq (Option a) instance from the default type signature for (==#), which in turn requires an Eq a instance, which we don’t have in scope. But if you tweak the definition of HigherEq slightly:

    class HigherEq f where
      (==#) :: Eq a => f a -> f a -> Bool
      default (==#) :: Eq (f a) => f a -> f a -> Bool
      x ==# y = (x == y)
    

    Then it becomes possible to derive a HigherEq Option instance. Note that the only difference is that now the non-default type signature for (==#) brings in an Eq a constraint. Constraints from non-default type signatures never appear in the derived instance context itself, but they can be used to discharge obligations that are demanded by the default type signatures. In the example above, the default type signature demanded an Eq a instance, and the non-default signature was able to satisfy that request, so the derived instance is simply:

    instance HigherEq Option
    
  • DeriveAnyClass can be used with partially applied classes, such as

    data T a = MKT a deriving( D Int )
    

    which generates

    instance D Int a => D Int (T a) where {}
    
  • DeriveAnyClass can be used to fill in default instances for associated type families:

    {-# LANGUAGE DeriveAnyClass, TypeFamilies #-}
    
    class Sizable a where
      type Size a
      type Size a = Int
    
    data Bar = Bar deriving Sizable
    
    doubleBarSize :: Size Bar -> Size Bar
    doubleBarSize s = 2*s
    

    The deriving( Sizable ) is equivalent to saying

    instance Sizeable Bar where {}
    

    and then the normal rules for filling in associated types from the default will apply, making Size Bar equal to Int.

10.6.7. Deriving strategies

DerivingStrategies
Since:8.2.1

Allow multiple deriving, each optionally qualified with a strategy.

In most scenarios, every deriving statement generates a typeclass instance in an unambiguous fashion. There is a corner case, however, where simultaneously enabling both the GeneralizedNewtypeDeriving and DeriveAnyClass extensions can make deriving become ambiguous. Consider the following example

{-# LANGUAGE DeriveAnyClass, GeneralizedNewtypeDeriving #-}
newtype Foo = MkFoo Bar deriving C

One could either pick the DeriveAnyClass approach to deriving C or the GeneralizedNewtypeDeriving approach to deriving C, both of which would be equally as valid. GHC defaults to favoring DeriveAnyClass in such a dispute, but this is not a satisfying solution, since that leaves users unable to use both language extensions in a single module.

To make this more robust, GHC has a notion of deriving strategies, which allow the user to explicitly request which approach to use when deriving an instance. To enable this feature, one must enable the DerivingStrategies language extension. A deriving strategy can be specified in a deriving clause

newtype Foo = MkFoo Bar
  deriving newtype C

Or in a standalone deriving declaration

deriving anyclass instance C Foo

DerivingStrategies also allows the use of multiple deriving clauses per data declaration so that a user can derive some instance with one deriving strategy and other instances with another deriving strategy. For example

newtype Baz = Baz Quux
  deriving          (Eq, Ord)
  deriving stock    (Read, Show)
  deriving newtype  (Num, Floating)
  deriving anyclass C

Currently, the deriving strategies are:

10.6.7.1. Default deriving strategy

If an explicit deriving strategy is not given, multiple strategies may apply. In that case, GHC chooses the strategy as follows:

  1. Stock type classes, i.e. those specified in the report and those enabled by language extensions, are derived using the stock strategy, with the following exception:

    • For newtypes, Eq, Ord, Ix and Bounded are always derived using the newtype strategy, even without GeneralizedNewtypeDeriving enabled. (There should be no observable difference to instances derived using the stock strategy.)
    • Also for newtypes, Functor, Foldable and Enum are derived using the newtype strategy if GeneralizedNewtypeDeriving is enabled and the derivation succeeds.
  2. For other any type class:

    1. When DeriveAnyClass is enabled, use anyclass.
    2. When GeneralizedNewtypeDeriving is enabled and we are deriving for a newtype, then use newytype.

    If both rules apply to a deriving clause, then anyclass is used and the user is warned about the ambiguity. The warning can be avoided by explicitly stating the desired deriving strategy.

10.6.8. Deriving via

DerivingVia
Implies:DerivingStrategies
Since:8.6.1

This allows deriving a class instance for a type by specifying another type of equal runtime representation (such that there exists a Coercible instance between the two: see The Coercible constraint) that is already an instance of the that class.

DerivingVia is indicated by the use of the via deriving strategy. via requires specifying another type (the via type) to coerce through. For example, this code:

{-# LANGUAGE DerivingVia #-}

import Numeric

newtype Hex a = Hex a

instance (Integral a, Show a) => Show (Hex a) where
  show (Hex a) = "0x" ++ showHex a ""

newtype Unicode = U Int
  deriving Show
    via (Hex Int)

-- >>> euroSign
-- 0x20ac
euroSign :: Unicode
euroSign = U 0x20ac

Generates the following instance

instance Show Unicode where
  show :: Unicode -> String
  show = Data.Coerce.coerce
    @(Hex Int -> String)
    @(Unicode -> String)
    show

This extension generalizes GeneralizedNewtypeDeriving. To derive Num Unicode with GND (deriving newtype Num) it must reuse the Num Int instance. With DerivingVia, we can explicitly specify the representation type Int:

newtype Unicode = U Int
  deriving Num
    via Int

  deriving Show
    via (Hex Int)

euroSign :: Unicode
euroSign = 0x20ac

Code duplication is common in instance declarations. A familiar pattern is lifting operations over an Applicative functor. Instead of having catch-all instances for f a which overlap with all other such instances, like so:

instance (Applicative f, Semigroup a) => Semigroup (f a) ..
instance (Applicative f, Monoid    a) => Monoid    (f a) ..

We can instead create a newtype App (where App f a and f a are represented the same in memory) and use DerivingVia to explicitly enable uses of this pattern:

{-# LANGUAGE DerivingVia, DeriveFunctor, GeneralizedNewtypeDeriving #-}

import Control.Applicative

newtype App f a = App (f a) deriving newtype (Functor, Applicative)

instance (Applicative f, Semigroup a) => Semigroup (App f a) where
  (<>) = liftA2 (<>)

instance (Applicative f, Monoid a) => Monoid (App f a) where
  mempty = pure mempty

data Pair a = MkPair a a
  deriving stock
    Functor

  deriving (Semigroup, Monoid)
    via (App Pair a)

instance Applicative Pair where
  pure a = MkPair a a

  MkPair f g <*> MkPair a b = MkPair (f a) (g b)

Note that the via type does not have to be a newtype. The only restriction is that it is coercible with the original data type. This means there can be arbitrary nesting of newtypes, as in the following example:

newtype Kleisli m a b = (a -> m b)
  deriving (Semigroup, Monoid)
    via (a -> App m b)

Here we make use of the Monoid ((->) a) instance.

10.7. Pattern synonyms

PatternSynonyms
Since:7.8.1

Allow the definition of pattern synonyms.

Pattern synonyms are enabled by the language extension PatternSynonyms, which is required for defining them, but not for using them. More information and examples of pattern synonyms can be found on the Wiki page.

Pattern synonyms enable giving names to parametrized pattern schemes. They can also be thought of as abstract constructors that don’t have a bearing on data representation. For example, in a programming language implementation, we might represent types of the language as follows:

data Type = App String [Type]

Here are some examples of using said representation. Consider a few types of the Type universe encoded like this:

App "->" [t1, t2]          -- t1 -> t2
App "Int" []               -- Int
App "Maybe" [App "Int" []] -- Maybe Int

This representation is very generic in that no types are given special treatment. However, some functions might need to handle some known types specially, for example the following two functions collect all argument types of (nested) arrow types, and recognize the Int type, respectively:

collectArgs :: Type -> [Type]
collectArgs (App "->" [t1, t2]) = t1 : collectArgs t2
collectArgs _                   = []

isInt :: Type -> Bool
isInt (App "Int" []) = True
isInt _              = False

Matching on App directly is both hard to read and error prone to write. And the situation is even worse when the matching is nested:

isIntEndo :: Type -> Bool
isIntEndo (App "->" [App "Int" [], App "Int" []]) = True
isIntEndo _                                       = False

Pattern synonyms permit abstracting from the representation to expose matchers that behave in a constructor-like manner with respect to pattern matching. We can create pattern synonyms for the known types we care about, without committing the representation to them (note that these don’t have to be defined in the same module as the Type type):

pattern Arrow t1 t2 = App "->"    [t1, t2]
pattern Int         = App "Int"   []
pattern Maybe t     = App "Maybe" [t]

Which enables us to rewrite our functions in a much cleaner style:

collectArgs :: Type -> [Type]
collectArgs (Arrow t1 t2) = t1 : collectArgs t2
collectArgs _             = []

isInt :: Type -> Bool
isInt Int = True
isInt _   = False

isIntEndo :: Type -> Bool
isIntEndo (Arrow Int Int) = True
isIntEndo _               = False

In general there are three kinds of pattern synonyms. Unidirectional, bidirectional and explicitly bidirectional. The examples given so far are examples of bidirectional pattern synonyms. A bidirectional synonym behaves the same as an ordinary data constructor. We can use it in a pattern context to deconstruct values and in an expression context to construct values. For example, we can construct the value intEndo using the pattern synonyms Arrow and Int as defined previously.

intEndo :: Type
intEndo = Arrow Int Int

This example is equivalent to the much more complicated construction if we had directly used the Type constructors.

intEndo :: Type
intEndo = App "->" [App "Int" [], App "Int" []]

Unidirectional synonyms can only be used in a pattern context and are defined as follows:

pattern Head x <- x:xs

In this case, Head ⟨x⟩ cannot be used in expressions, only patterns, since it wouldn’t specify a value for the ⟨xs⟩ on the right-hand side. However, we can define an explicitly bidirectional pattern synonym by separately specifying how to construct and deconstruct a type. The syntax for doing this is as follows:

pattern HeadC x <- x:xs where
  HeadC x = [x]

We can then use HeadC in both expression and pattern contexts. In a pattern context it will match the head of any list with length at least one. In an expression context it will construct a singleton list.

Explicitly bidirectional pattern synonyms offer greater flexibility than implicitly bidirectional ones in terms of the syntax that is permitted. For instance, the following is not a legal implicitly bidirectional pattern synonym:

pattern StrictJust a = Just !a

This is illegal because the use of BangPatterns on the right-hand sides prevents it from being a well formed expression. However, constructing a strict pattern synonym is quite possible with an explicitly bidirectional pattern synonym:

pattern StrictJust a <- Just !a where
  StrictJust !a = Just a

The table below summarises where each kind of pattern synonym can be used.

Context Unidirectional Bidirectional Explicitly Bidirectional
Pattern Yes Yes Yes
Expression No Yes (Inferred) Yes (Explicit)

10.7.1. Record Pattern Synonyms

It is also possible to define pattern synonyms which behave just like record constructors. The syntax for doing this is as follows:

pattern Point :: Int -> Int -> (Int, Int)
pattern Point{x, y} = (x, y)

The idea is that we can then use Point just as if we had defined a new datatype MyPoint with two fields x and y.

data MyPoint = Point { x :: Int, y :: Int }

Whilst a normal pattern synonym can be used in two ways, there are then seven ways in which to use Point. Precisely the ways in which a normal record constructor can be used.

Usage Example
As a constructor zero = Point 0 0
As a constructor with record syntax zero = Point { x = 0, y = 0}
In a pattern context isZero (Point 0 0) = True
In a pattern context with record syntax isZero (Point { x = 0, y = 0 }
In a pattern context with field puns getX (Point {x}) = x
In a record update (0, 0) { x = 1 } == (1,0)
Using record selectors x (0,0) == 0

For a unidirectional record pattern synonym we define record selectors but do not allow record updates or construction.

The syntax and semantics of pattern synonyms are elaborated in the following subsections. There are also lots more details in the paper.

See the Wiki page for more details.

10.7.2. Syntax and scoping of pattern synonyms

A pattern synonym declaration can be either unidirectional, bidirectional or explicitly bidirectional. The syntax for unidirectional pattern synonyms is:

pattern pat_lhs <- pat

the syntax for bidirectional pattern synonyms is:

pattern pat_lhs = pat

and the syntax for explicitly bidirectional pattern synonyms is:

pattern pat_lhs <- pat where
  pat_lhs = expr

We can define either prefix, infix or record pattern synonyms by modifying the form of pat_lhs. The syntax for these is as follows:

Prefix Name args
Infix arg1 `Name` arg2 or arg1 op arg2
Record Name{arg1,arg2,...,argn}

Pattern synonym declarations can only occur in the top level of a module. In particular, they are not allowed as local definitions.

The variables in the left-hand side of the definition are bound by the pattern on the right-hand side. For bidirectional pattern synonyms, all the variables of the right-hand side must also occur on the left-hand side; also, wildcard patterns and view patterns are not allowed. For unidirectional and explicitly bidirectional pattern synonyms, there is no restriction on the right-hand side pattern.

Pattern synonyms cannot be defined recursively.

COMPLETE pragmas can be specified in order to tell the pattern match exhaustiveness checker that a set of pattern synonyms is complete.

10.7.3. Import and export of pattern synonyms

The name of the pattern synonym is in the same namespace as proper data constructors. Like normal data constructors, pattern synonyms can be imported and exported through association with a type constructor or independently.

To export them on their own, in an export or import specification, you must prefix pattern names with the pattern keyword, e.g.:

module Example (pattern Zero) where

data MyNum = MkNum Int

pattern Zero :: MyNum
pattern Zero = MkNum 0

Without the pattern prefix, Zero would be interpreted as a type constructor in the export list.

You may also use the pattern keyword in an import/export specification to import or export an ordinary data constructor. For example:

import Data.Maybe( pattern Just )

would bring into scope the data constructor Just from the Maybe type, without also bringing the type constructor Maybe into scope.

To bundle a pattern synonym with a type constructor, we list the pattern synonym in the export list of a module which exports the type constructor. For example, to bundle Zero with MyNum we could write the following:

module Example ( MyNum(Zero) ) where

If a module was then to import MyNum from Example, it would also import the pattern synonym Zero.

It is also possible to use the special token .. in an export list to mean all currently bundled constructors. For example, we could write:

module Example ( MyNum(.., Zero) ) where

in which case, Example would export the type constructor MyNum with the data constructor MkNum and also the pattern synonym Zero.

Bundled pattern synonyms are type checked to ensure that they are of the same type as the type constructor which they are bundled with. A pattern synonym P can not be bundled with a type constructor T if P‘s type is visibly incompatible with T.

A module which imports MyNum(..) from Example and then re-exports MyNum(..) will also export any pattern synonyms bundled with MyNum in Example. A more complete specification can be found on the wiki.

10.7.4. Typing of pattern synonyms

Given a pattern synonym definition of the form

pattern P var1 var2 ... varN <- pat

it is assigned a pattern type of the form

pattern P :: CReq => CProv => t1 -> t2 -> ... -> tN -> t

where ⟨CReq⟩ and ⟨CProv⟩ are type contexts, and ⟨t1⟩, ⟨t2⟩, ..., ⟨tN⟩ and ⟨t⟩ are types. Notice the unusual form of the type, with two contexts ⟨CReq⟩ and ⟨CProv⟩:

  • ⟨CReq⟩ are the constraints required to match the pattern.
  • ⟨CProv⟩ are the constraints made available (provided) by a successful pattern match.

For example, consider

data T a where
  MkT :: (Show b) => a -> b -> T a

f1 :: (Num a, Eq a) => T a -> String
f1 (MkT 42 x) = show x

pattern ExNumPat :: (Num a, Eq a) => (Show b) => b -> T a
pattern ExNumPat x = MkT 42 x

f2 :: (Eq a, Num a) => T a -> String
f2 (ExNumPat x) = show x

Here f1 does not use pattern synonyms. To match against the numeric pattern 42 requires the caller to satisfy the constraints (Num a, Eq a), so they appear in f1‘s type. The call to show generates a (Show b) constraint, where b is an existentially type variable bound by the pattern match on MkT. But the same pattern match also provides the constraint (Show b) (see MkT‘s type), and so all is well.

Exactly the same reasoning applies to ExNumPat: matching against ExNumPat requires the constraints (Num a, Eq a), and provides the constraint (Show b).

Note also the following points

  • In the common case where CProv is empty, (i.e., ()), it can be omitted altogether in the above pattern type signature for P.

  • However, if CProv is non-empty, while CReq is, the above pattern type signature for P must be specified as

    P :: () => CProv => t1 -> t2 -> .. -> tN -> t
    
  • The GHCi :info command shows pattern types in this format.

  • You may specify an explicit pattern signature, as we did for ExNumPat above, to specify the type of a pattern, just as you can for a function. As usual, the type signature can be less polymorphic than the inferred type. For example

    -- Inferred type would be 'a -> [a]'
    pattern SinglePair :: (a, a) -> [(a, a)]
    pattern SinglePair x = [x]
    

    Just like signatures on value-level bindings, pattern synonym signatures can apply to more than one pattern. For instance,

    pattern Left', Right' :: a -> Either a a
    pattern Left' x  = Left x
    pattern Right' x = Right x
    
  • The rules for lexically-scoped type variables (see Lexically scoped type variables) apply to pattern-synonym signatures. As those rules specify, only the type variables from an explicit, syntactically-visible outer forall (the universals) scope over the definition of the pattern synonym; the existentials, bound by the inner forall, do not. For example

    data T a where
       MkT :: Bool -> b -> (b->Int) -> a -> T a
    
    pattern P :: forall a. forall b. b -> (b->Int) -> a -> T a
    pattern P x y v <- MkT True x y (v::a)
    

    Here the universal type variable a scopes over the definition of P, but the existential b does not. (c.f. disussion on Trac #14998.)

  • For a bidirectional pattern synonym, a use of the pattern synonym as an expression has the type

    (CReq, CProv) => t1 -> t2 -> ... -> tN -> t
    

    So in the previous example, when used in an expression, ExNumPat has type

    ExNumPat :: (Num a, Eq a, Show b) => b -> T t
    

    Notice that this is a tiny bit more restrictive than the expression MkT 42 x which would not require (Eq a).

  • Consider these two pattern synonyms:

    data S a where
       S1 :: Bool -> S Bool
    
    pattern P1 :: Bool -> Maybe Bool
    pattern P1 b = Just b
    
    pattern P2 :: () => (b ~ Bool) => Bool -> S b
    pattern P2 b = S1 b
    
    f :: Maybe a -> String
    f (P1 x) = "no no no"     -- Type-incorrect
    
    g :: S a -> String
    g (P2 b) = "yes yes yes"  -- Fine
    

    Pattern P1 can only match against a value of type Maybe Bool, so function f is rejected because the type signature is Maybe a. (To see this, imagine expanding the pattern synonym.)

    On the other hand, function g works fine, because matching against P2 (which wraps the GADT S) provides the local equality (a~Bool). If you were to give an explicit pattern signature P2 :: Bool -> S Bool, then P2 would become less polymorphic, and would behave exactly like P1 so that g would then be rejected.

    In short, if you want GADT-like behaviour for pattern synonyms, then (unlike concrete data constructors like S1) you must write its type with explicit provided equalities. For a concrete data constructor like S1 you can write its type signature as either S1 :: Bool -> S Bool or S1 :: (b~Bool) => Bool -> S b; the two are equivalent. Not so for pattern synonyms: the two forms are different, in order to distinguish the two cases above. (See Trac #9953 for discussion of this choice.)

10.7.5. Matching of pattern synonyms

A pattern synonym occurrence in a pattern is evaluated by first matching against the pattern synonym itself, and then on the argument patterns.

More precisely, the semantics of pattern matching is given in Section 3.17 of the Haskell 2010 report. To the informal semantics in Section 3.17.2 we add this extra rule:

  • If the pattern is a constructor pattern (P p1 ... pn), where P is a pattern synonym defined by P x1 ... xn = p or P x1 ... xn <- p, then:
    1. Match the value v against p. If this match fails or diverges, so does the whole (pattern synonym) match. Otherwise the match against p must bind the variables x1 ... xn; let them be bound to values v1 ... vn.
    2. Match v1 against p1, v2 against p2 and so on. If any of these matches fail or diverge, so does the whole match.
    3. If all the matches against the pi succeed, the match succeeds, binding the variables bound by the pi . (The xi are not bound; they remain local to the pattern synonym declaration.)

For example, in the following program, f and f' are equivalent:

pattern Pair x y <- [x, y]

f (Pair True True) = True
f _                = False

f' [x, y] | True <- x, True <- y = True
f' _                              = False

Note that the strictness of f differs from that of g defined below:

g [True, True] = True
g _            = False

*Main> f (False:undefined)
*** Exception: Prelude.undefined
*Main> g (False:undefined)
False

10.8. Class and instances declarations

10.8.1. Class declarations

This section, and the next one, documents GHC’s type-class extensions. There’s lots of background in the paper Type classes: exploring the design space (Simon Peyton Jones, Mark Jones, Erik Meijer).

10.8.1.1. Multi-parameter type classes

MultiParamTypeClasses
Implies:ConstrainedClassMethods
Since:6.8.1

Allow the definition of typeclasses with more than one parameter.

Multi-parameter type classes are permitted, with extension MultiParamTypeClasses. For example:

class Collection c a where
    union :: c a -> c a -> c a
    ...etc.

10.8.1.2. The superclasses of a class declaration

FlexibleContexts
Since:6.8.1

Allow the use of complex constraints in class declaration contexts.

In Haskell 98 the context of a class declaration (which introduces superclasses) must be simple; that is, each predicate must consist of a class applied to type variables. The extension FlexibleContexts (The context of a type signature) lifts this restriction, so that the only restriction on the context in a class declaration is that the class hierarchy must be acyclic. So these class declarations are OK:

class Functor (m k) => FiniteMap m k where
  ...

class (Monad m, Monad (t m)) => Transform t m where
  lift :: m a -> (t m) a

As in Haskell 98, the class hierarchy must be acyclic. However, the definition of “acyclic” involves only the superclass relationships. For example, this is okay:

class C a where
  op :: D b => a -> b -> b

class C a => D a where ...

Here, C is a superclass of D, but it’s OK for a class operation op of C to mention D. (It would not be OK for D to be a superclass of C.)

With the extension that adds a kind of constraints, you can write more exotic superclass definitions. The superclass cycle check is even more liberal in these case. For example, this is OK:

class A cls c where
  meth :: cls c => c -> c

class A B c => B c where

A superclass context for a class C is allowed if, after expanding type synonyms to their right-hand-sides, and uses of classes (other than C) to their superclasses, C does not occur syntactically in the context.

10.8.1.3. Constrained class method types

ConstrainedClassMethods
Since:6.8.1

Allows the definition of further constraints on individual class methods.

Haskell 98 prohibits class method types to mention constraints on the class type variable, thus:

class Seq s a where
  fromList :: [a] -> s a
  elem     :: Eq a => a -> s a -> Bool

The type of elem is illegal in Haskell 98, because it contains the constraint Eq a, which constrains only the class type variable (in this case a). this case a). More precisely, a constraint in a class method signature is rejected if

  • The constraint mentions at least one type variable. So this is allowed:

    class C a where
      op1 :: HasCallStack => a -> a
      op2 :: (?x::Int) => Int -> a
    
  • All of the type variables mentioned are bound by the class declaration, and none is locally quantified. Examples:

    class C a where
      op3 :: Eq a => a -> a    -- Rejected: constrains class variable only
      op4 :: D b => a -> b     -- Accepted: constrains a locally-quantified variable `b`
      op5 :: D (a,b) => a -> b -- Accepted: constrains a locally-quantified variable `b`
    

GHC lifts this restriction with language extension ConstrainedClassMethods. The restriction is a pretty stupid one in the first place, so ConstrainedClassMethods is implied by MultiParamTypeClasses.

10.8.1.4. Default method signatures

DefaultSignatures
Since:7.2.1

Allows the definition of default method signatures in class definitions.

Haskell 98 allows you to define a default implementation when declaring a class:

class Enum a where
  enum :: [a]
  enum = []

The type of the enum method is [a], and this is also the type of the default method. You can lift this restriction and give another type to the default method using the extension DefaultSignatures. For instance, if you have written a generic implementation of enumeration in a class GEnum with method genum in terms of GHC.Generics, you can specify a default method that uses that generic implementation:

class Enum a where
  enum :: [a]
  default enum :: (Generic a, GEnum (Rep a)) => [a]
  enum = map to genum

We reuse the keyword default to signal that a signature applies to the default method only; when defining instances of the Enum class, the original type [a] of enum still applies. When giving an empty instance, however, the default implementation (map to genum) is filled-in, and type-checked with the type (Generic a, GEnum (Rep a)) => [a].

The type signature for a default method of a type class must take on the same form as the corresponding main method’s type signature. Otherwise, the typechecker will reject that class’s definition. By “take on the same form”, we mean that the default type signature should differ from the main type signature only in their contexts. Therefore, if you have a method bar:

class Foo a where
  bar :: forall b. C => a -> b -> b

Then a default method for bar must take on the form:

default bar :: forall b. C' => a -> b -> b

C is allowed to be different from C', but the right-hand sides of the type signatures must coincide. We require this because when you declare an empty instance for a class that uses DefaultSignatures, GHC implicitly fills in the default implementation like this:

instance Foo Int where
  bar = default_bar @Int

Where @Int utilizes visible type application (Visible type application) to instantiate the b in default bar :: forall b. C' => a -> b -> b. In order for this type application to work, the default type signature for bar must have the same type variable order as the non-default signature! But there is no obligation for C and C' to be the same (see, for instance, the Enum example above, which relies on this).

To further explain this example, the right-hand side of the default type signature for bar must be something that is alpha-equivalent to forall b. a -> b -> b (where a is bound by the class itself, and is thus free in the methods’ type signatures). So this would also be an acceptable default type signature:

default bar :: forall x. C' => a -> x -> x

But not this (since the free variable a is in the wrong place):

default bar :: forall b. C' => b -> a -> b

Nor this, since we can’t match the type variable b with the concrete type Int:

default bar :: C' => a -> Int -> Int

That last one deserves a special mention, however, since a -> Int -> Int is a straightforward instantiation of forall b. a -> b -> b. You can still write such a default type signature, but you now must use type equalities to do so:

default bar :: forall b. (C', b ~ Int) => a -> b -> b

We use default signatures to simplify generic programming in GHC (Generic programming).

10.8.1.5. Nullary type classes

NullaryTypeClasses
Since:7.8.1

Allows the use definition of type classes with no parameters. This extension has been replaced by MultiParamTypeClasses.

Nullary (no parameter) type classes are enabled with MultiParamTypeClasses; historically, they were enabled with the (now deprecated) NullaryTypeClasses. Since there are no available parameters, there can be at most one instance of a nullary class. A nullary type class might be used to document some assumption in a type signature (such as reliance on the Riemann hypothesis) or add some globally configurable settings in a program. For example,

class RiemannHypothesis where
  assumeRH :: a -> a

-- Deterministic version of the Miller test
-- correctness depends on the generalised Riemann hypothesis
isPrime :: RiemannHypothesis => Integer -> Bool
isPrime n = assumeRH (...)

The type signature of isPrime informs users that its correctness depends on an unproven conjecture. If the function is used, the user has to acknowledge the dependence with:

instance RiemannHypothesis where
  assumeRH = id

10.8.2. Functional dependencies

FunctionalDependencies
Implies:MultiParamTypeClasses
Since:6.8.1

Allow use of functional dependencies in class declarations.

Functional dependencies are implemented as described by Mark Jones in [Jones2000].

Functional dependencies are introduced by a vertical bar in the syntax of a class declaration; e.g.

class (Monad m) => MonadState s m | m -> s where ...

class Foo a b c | a b -> c where ...

More documentation can be found in the Haskell Wiki.

[Jones2000]Type Classes with Functional Dependencies”, Mark P. Jones, In Proceedings of the 9th European Symposium on Programming, ESOP 2000, Berlin, Germany, March 2000, Springer-Verlag LNCS 1782, .

10.8.2.1. Rules for functional dependencies

In a class declaration, all of the class type variables must be reachable (in the sense mentioned in The context of a type signature) from the free variables of each method type. For example:

class Coll s a where
  empty  :: s
  insert :: s -> a -> s

is not OK, because the type of empty doesn’t mention a. Functional dependencies can make the type variable reachable:

class Coll s a | s -> a where
  empty  :: s
  insert :: s -> a -> s

Alternatively Coll might be rewritten

class Coll s a where
  empty  :: s a
  insert :: s a -> a -> s a

which makes the connection between the type of a collection of a‘s (namely (s a)) and the element type a. Occasionally this really doesn’t work, in which case you can split the class like this:

class CollE s where
  empty  :: s

class CollE s => Coll s a where
  insert :: s -> a -> s

10.8.2.2. Background on functional dependencies

The following description of the motivation and use of functional dependencies is taken from the Hugs user manual, reproduced here (with minor changes) by kind permission of Mark Jones.

Consider the following class, intended as part of a library for collection types:

class Collects e ce where
    empty  :: ce
    insert :: e -> ce -> ce
    member :: e -> ce -> Bool

The type variable e used here represents the element type, while ce is the type of the container itself. Within this framework, we might want to define instances of this class for lists or characteristic functions (both of which can be used to represent collections of any equality type), bit sets (which can be used to represent collections of characters), or hash tables (which can be used to represent any collection whose elements have a hash function). Omitting standard implementation details, this would lead to the following declarations:

instance Eq e => Collects e [e] where ...
instance Eq e => Collects e (e -> Bool) where ...
instance Collects Char BitSet where ...
instance (Hashable e, Collects a ce)
           => Collects e (Array Int ce) where ...

All this looks quite promising; we have a class and a range of interesting implementations. Unfortunately, there are some serious problems with the class declaration. First, the empty function has an ambiguous type:

empty :: Collects e ce => ce

By “ambiguous” we mean that there is a type variable e that appears on the left of the => symbol, but not on the right. The problem with this is that, according to the theoretical foundations of Haskell overloading, we cannot guarantee a well-defined semantics for any term with an ambiguous type.

We can sidestep this specific problem by removing the empty member from the class declaration. However, although the remaining members, insert and member, do not have ambiguous types, we still run into problems when we try to use them. For example, consider the following two functions:

f x y = insert x . insert y
g     = f True 'a'

for which GHC infers the following types:

f :: (Collects a c, Collects b c) => a -> b -> c -> c
g :: (Collects Bool c, Collects Char c) => c -> c

Notice that the type for f allows the two parameters x and y to be assigned different types, even though it attempts to insert each of the two values, one after the other, into the same collection. If we’re trying to model collections that contain only one type of value, then this is clearly an inaccurate type. Worse still, the definition for g is accepted, without causing a type error. As a result, the error in this code will not be flagged at the point where it appears. Instead, it will show up only when we try to use g, which might even be in a different module.

10.8.2.2.1. An attempt to use constructor classes

Faced with the problems described above, some Haskell programmers might be tempted to use something like the following version of the class declaration:

class Collects e c where
   empty  :: c e
   insert :: e -> c e -> c e
   member :: e -> c e -> Bool

The key difference here is that we abstract over the type constructor c that is used to form the collection type c e, and not over that collection type itself, represented by ce in the original class declaration. This avoids the immediate problems that we mentioned above: empty has type Collects e c => c e, which is not ambiguous.

The function f from the previous section has a more accurate type:

f :: (Collects e c) => e -> e -> c e -> c e

The function g from the previous section is now rejected with a type error as we would hope because the type of f does not allow the two arguments to have different types. This, then, is an example of a multiple parameter class that does actually work quite well in practice, without ambiguity problems. There is, however, a catch. This version of the Collects class is nowhere near as general as the original class seemed to be: only one of the four instances for Collects given above can be used with this version of Collects because only one of them—the instance for lists—has a collection type that can be written in the form c e, for some type constructor c, and element type e.

10.8.2.2.2. Adding functional dependencies

To get a more useful version of the Collects class, GHC provides a mechanism that allows programmers to specify dependencies between the parameters of a multiple parameter class (For readers with an interest in theoretical foundations and previous work: The use of dependency information can be seen both as a generalisation of the proposal for “parametric type classes” that was put forward by Chen, Hudak, and Odersky, or as a special case of Mark Jones’s later framework for “improvement” of qualified types. The underlying ideas are also discussed in a more theoretical and abstract setting in a manuscript [Jones1999], where they are identified as one point in a general design space for systems of implicit parameterisation). To start with an abstract example, consider a declaration such as:

class C a b where ...
[Jones1999]Exploring the Design Space for Type-based Implicit Parameterization”, Mark P. Jones, Oregon Graduate Institute of Science & Technology, Technical Report, July 1999.

which tells us simply that C can be thought of as a binary relation on types (or type constructors, depending on the kinds of a and b). Extra clauses can be included in the definition of classes to add information about dependencies between parameters, as in the following examples:

class D a b | a -> b where ...
class E a b | a -> b, b -> a where ...

The notation a -> b used here between the | and where symbols — not to be confused with a function type — indicates that the a parameter uniquely determines the b parameter, and might be read as “a determines b.” Thus D is not just a relation, but actually a (partial) function. Similarly, from the two dependencies that are included in the definition of E, we can see that E represents a (partial) one-to-one mapping between types.

More generally, dependencies take the form x1 ... xn -> y1 ... ym, where x1, ..., xn, and y1, ..., yn are type variables with n>0 and m>=0, meaning that the y parameters are uniquely determined by the x parameters. Spaces can be used as separators if more than one variable appears on any single side of a dependency, as in t -> a b. Note that a class may be annotated with multiple dependencies using commas as separators, as in the definition of E above. Some dependencies that we can write in this notation are redundant, and will be rejected because they don’t serve any useful purpose, and may instead indicate an error in the program. Examples of dependencies like this include a -> a, a -> a a, a ->, etc. There can also be some redundancy if multiple dependencies are given, as in a->b, b->c, a->c, and in which some subset implies the remaining dependencies. Examples like this are not treated as errors. Note that dependencies appear only in class declarations, and not in any other part of the language. In particular, the syntax for instance declarations, class constraints, and types is completely unchanged.

By including dependencies in a class declaration, we provide a mechanism for the programmer to specify each multiple parameter class more precisely. The compiler, on the other hand, is responsible for ensuring that the set of instances that are in scope at any given point in the program is consistent with any declared dependencies. For example, the following pair of instance declarations cannot appear together in the same scope because they violate the dependency for D, even though either one on its own would be acceptable:

instance D Bool Int where ...
instance D Bool Char where ...

Note also that the following declaration is not allowed, even by itself:

instance D [a] b where ...

The problem here is that this instance would allow one particular choice of [a] to be associated with more than one choice for b, which contradicts the dependency specified in the definition of D. More generally, this means that, in any instance of the form:

instance D t s where ...

for some particular types t and s, the only variables that can appear in s are the ones that appear in t, and hence, if the type t is known, then s will be uniquely determined.

The benefit of including dependency information is that it allows us to define more general multiple parameter classes, without ambiguity problems, and with the benefit of more accurate types. To illustrate this, we return to the collection class example, and annotate the original definition of Collects with a simple dependency:

class Collects e ce | ce -> e where
   empty  :: ce
   insert :: e -> ce -> ce
   member :: e -> ce -> Bool

The dependency ce -> e here specifies that the type e of elements is uniquely determined by the type of the collection ce. Note that both parameters of Collects are of kind Type; there are no constructor classes here. Note too that all of the instances of Collects that we gave earlier can be used together with this new definition.

What about the ambiguity problems that we encountered with the original definition? The empty function still has type Collects e ce => ce, but it is no longer necessary to regard that as an ambiguous type: Although the variable e does not appear on the right of the => symbol, the dependency for class Collects tells us that it is uniquely determined by ce, which does appear on the right of the => symbol. Hence the context in which empty is used can still give enough information to determine types for both ce and e, without ambiguity. More generally, we need only regard a type as ambiguous if it contains a variable on the left of the => that is not uniquely determined (either directly or indirectly) by the variables on the right.

Dependencies also help to produce more accurate types for user defined functions, and hence to provide earlier detection of errors, and less cluttered types for programmers to work with. Recall the previous definition for a function f:

f x y = insert x y = insert x . insert y

for which we originally obtained a type:

f :: (Collects a c, Collects b c) => a -> b -> c -> c

Given the dependency information that we have for Collects, however, we can deduce that a and b must be equal because they both appear as the second parameter in a Collects constraint with the same first parameter c. Hence we can infer a shorter and more accurate type for f:

f :: (Collects a c) => a -> a -> c -> c

In a similar way, the earlier definition of g will now be flagged as a type error.

Although we have given only a few examples here, it should be clear that the addition of dependency information can help to make multiple parameter classes more useful in practice, avoiding ambiguity problems, and allowing more general sets of instance declarations.

10.8.3. Instance declarations

An instance declaration has the form

instance ( assertion1, ..., assertionn) => class type1 ... typem where ...

The part before the “=>” is the context, while the part after the “=>” is the head of the instance declaration.

10.8.3.1. Instance resolution

When GHC tries to resolve, say, the constraint C Int Bool, it tries to match every instance declaration against the constraint, by instantiating the head of the instance declaration. Consider these declarations:

instance context1 => C Int a     where ...  -- (A)
instance context2 => C a   Bool  where ...  -- (B)

GHC’s default behaviour is that exactly one instance must match the constraint it is trying to resolve. For example, the constraint C Int Bool matches instances (A) and (B), and hence would be rejected; while C Int Char matches only (A) and hence (A) is chosen.

Notice that

  • When matching, GHC takes no account of the context of the instance declaration (context1 etc).
  • It is fine for there to be a potential of overlap (by including both declarations (A) and (B), say); an error is only reported if a particular constraint matches more than one.

See also Overlapping instances for flags that loosen the instance resolution rules.

10.8.3.2. Relaxed rules for the instance head

TypeSynonymInstances
Since:6.8.1

Allow definition of type class instances for type synonyms.

FlexibleInstances
Implies:TypeSynonymInstances
Since:6.8.1

Allow definition of type class instances with arbitrary nested types in the instance head.

In Haskell 98 the head of an instance declaration must be of the form C (T a1 ... an), where C is the class, T is a data type constructor, and the a1 ... an are distinct type variables. In the case of multi-parameter type classes, this rule applies to each parameter of the instance head (Arguably it should be okay if just one has this form and the others are type variables, but that’s the rules at the moment).

GHC relaxes this rule in two ways:

  • With the TypeSynonymInstances extension, instance heads may use type synonyms. As always, using a type synonym is just shorthand for writing the RHS of the type synonym definition. For example:

    type Point a = (a,a)
    instance C (Point a)   where ...
    

    is legal. The instance declaration is equivalent to

    instance C (a,a) where ...
    

    As always, type synonyms must be fully applied. You cannot, for example, write:

    instance Monad Point where ...
    
  • The FlexibleInstances extension allows the head of the instance declaration to mention arbitrary nested types. For example, this becomes a legal instance declaration

    instance C (Maybe Int) where ...
    

    See also the rules on overlap.

    The FlexibleInstances extension implies TypeSynonymInstances.

However, the instance declaration must still conform to the rules for instance termination: see Instance termination rules.

10.8.3.3. Relaxed rules for instance contexts

In Haskell 98, the class constraints in the context of the instance declaration must be of the form C a where a is a type variable that occurs in the head.

The FlexibleContexts extension relaxes this rule, as well as relaxing the corresponding rule for type signatures (see The context of a type signature). Specifically, FlexibleContexts, allows (well-kinded) class constraints of form (C t1 ... tn) in the context of an instance declaration.

Notice that the extension does not affect equality constraints in an instance context; they are permitted by TypeFamilies or GADTs.

However, the instance declaration must still conform to the rules for instance termination: see Instance termination rules.

10.8.3.4. Instance termination rules

UndecidableInstances
Since:6.8.1

Permit definition of instances which may lead to type-checker non-termination.

Regardless of FlexibleInstances and FlexibleContexts, instance declarations must conform to some rules that ensure that instance resolution will terminate. The restrictions can be lifted with UndecidableInstances (see Undecidable instances).

The rules are these:

  1. The Paterson Conditions: for each class constraint (C t1 ... tn) in the context
    1. No type variable has more occurrences in the constraint than in the head
    2. The constraint has fewer constructors and variables (taken together and counting repetitions) than the head
    3. The constraint mentions no type functions. A type function application can in principle expand to a type of arbitrary size, and so are rejected out of hand
  2. The Coverage Condition. For each functional dependency, ⟨tvs⟩left -> ⟨tvs⟩right, of the class, every type variable in S(⟨tvs⟩right) must appear in S(⟨tvs⟩left), where S is the substitution mapping each type variable in the class declaration to the corresponding type in the instance head.

These restrictions ensure that instance resolution terminates: each reduction step makes the problem smaller by at least one constructor. You can find lots of background material about the reason for these restrictions in the paper Understanding functional dependencies via Constraint Handling Rules.

For example, these are okay:

instance C Int [a]          -- Multiple parameters
instance Eq (S [a])         -- Structured type in head

    -- Repeated type variable in head
instance C4 a a => C4 [a] [a]
instance Stateful (ST s) (MutVar s)

    -- Head can consist of type variables only
instance C a
instance (Eq a, Show b) => C2 a b

    -- Non-type variables in context
instance Show (s a) => Show (Sized s a)
instance C2 Int a => C3 Bool [a]
instance C2 Int a => C3 [a] b

But these are not:

    -- Context assertion no smaller than head
instance C a => C a where ...
    -- (C b b) has more occurrences of b than the head
instance C b b => Foo [b] where ...

The same restrictions apply to instances generated by deriving clauses. Thus the following is accepted:

data MinHeap h a = H a (h a)
  deriving (Show)

because the derived instance

instance (Show a, Show (h a)) => Show (MinHeap h a)

conforms to the above rules.

A useful idiom permitted by the above rules is as follows. If one allows overlapping instance declarations then it’s quite convenient to have a “default instance” declaration that applies if something more specific does not:

instance C a where
  op = ... -- Default

10.8.3.5. Undecidable instances

Sometimes even the termination rules of Instance termination rules are too onerous. So GHC allows you to experiment with more liberal rules: if you use the experimental extension UndecidableInstances, both the Paterson Conditions and the Coverage Condition (described in Instance termination rules) are lifted. Termination is still ensured by having a fixed-depth recursion stack. If you exceed the stack depth you get a sort of backtrace, and the opportunity to increase the stack depth with -freduction-depth=⟨n⟩. However, if you should exceed the default reduction depth limit, it is probably best just to disable depth checking, with -freduction-depth=0. The exact depth your program requires depends on minutiae of your code, and it may change between minor GHC releases. The safest bet for released code – if you’re sure that it should compile in finite time – is just to disable the check.

For example, sometimes you might want to use the following to get the effect of a “class synonym”:

class (C1 a, C2 a, C3 a) => C a where { }

instance (C1 a, C2 a, C3 a) => C a where { }

This allows you to write shorter signatures:

f :: C a => ...

instead of

f :: (C1 a, C2 a, C3 a) => ...

The restrictions on functional dependencies (Functional dependencies) are particularly troublesome. It is tempting to introduce type variables in the context that do not appear in the head, something that is excluded by the normal rules. For example:

class HasConverter a b | a -> b where
   convert :: a -> b

data Foo a = MkFoo a

instance (HasConverter a b,Show b) => Show (Foo a) where
   show (MkFoo value) = show (convert value)

This is dangerous territory, however. Here, for example, is a program that would make the typechecker loop:

class D a
class F a b | a->b
instance F [a] [[a]]
instance (D c, F a c) => D [a]   -- 'c' is not mentioned in the head

Similarly, it can be tempting to lift the coverage condition:

class Mul a b c | a b -> c where
  (.*.) :: a -> b -> c

instance Mul Int Int Int where (.*.) = (*)
instance Mul Int Float Float where x .*. y = fromIntegral x * y
instance Mul a b c => Mul a [b] [c] where x .*. v = map (x.*.) v

The third instance declaration does not obey the coverage condition; and indeed the (somewhat strange) definition:

f = \ b x y -> if b then x .*. [y] else y

makes instance inference go into a loop, because it requires the constraint (Mul a [b] b).

The UndecidableInstances extension is also used to lift some of the restrictions imposed on type family instances. See Decidability of type synonym instances.

10.8.3.6. Overlapping instances

OverlappingInstances

Deprecated extension to weaken checks intended to ensure instance resolution termination.

IncoherentInstances
Since:6.8.1

Deprecated extension to weaken checks intended to ensure instance resolution termination.

In general, as discussed in Instance resolution, GHC requires that it be unambiguous which instance declaration should be used to resolve a type-class constraint. GHC also provides a way to loosen the instance resolution, by allowing more than one instance to match, provided there is a most specific one. Moreover, it can be loosened further, by allowing more than one instance to match irrespective of whether there is a most specific one. This section gives the details.

To control the choice of instance, it is possible to specify the overlap behavior for individual instances with a pragma, written immediately after the instance keyword. The pragma may be one of: {-# OVERLAPPING #-}, {-# OVERLAPPABLE #-}, {-# OVERLAPS #-}, or {-# INCOHERENT #-}.

The matching behaviour is also influenced by two module-level language extension flags: OverlappingInstances and IncoherentInstances. These extensions are now deprecated (since GHC 7.10) in favour of the fine-grained per-instance pragmas.

A more precise specification is as follows. The willingness to be overlapped or incoherent is a property of the instance declaration itself, controlled as follows:

  • An instance is incoherent if: it has an INCOHERENT pragma; or if the instance has no pragma and it appears in a module compiled with IncoherentInstances.
  • An instance is overlappable if: it has an OVERLAPPABLE or OVERLAPS pragma; or if the instance has no pragma and it appears in a module compiled with OverlappingInstances; or if the instance is incoherent.
  • An instance is overlapping if: it has an OVERLAPPING or OVERLAPS pragma; or if the instance has no pragma and it appears in a module compiled with OverlappingInstances; or if the instance is incoherent.

Now suppose that, in some client module, we are searching for an instance of the target constraint (C ty1 .. tyn). The search works like this:

  • Find all instances \(I\) that match the target constraint; that is, the target constraint is a substitution instance of \(I\). These instance declarations are the candidates.
  • Eliminate any candidate \(IX\) for which both of the following hold:
    • There is another candidate \(IY\) that is strictly more specific; that is, \(IY\) is a substitution instance of \(IX\) but not vice versa.
    • Either \(IX\) is overlappable, or \(IY\) is overlapping. (This “either/or” design, rather than a “both/and” design, allow a client to deliberately override an instance from a library, without requiring a change to the library.)
  • If exactly one non-incoherent candidate remains, select it. If all remaining candidates are incoherent, select an arbitrary one. Otherwise the search fails (i.e. when more than one surviving candidate is not incoherent).
  • If the selected candidate (from the previous step) is incoherent, the search succeeds, returning that candidate.
  • If not, find all instances that unify with the target constraint, but do not match it. Such non-candidate instances might match when the target constraint is further instantiated. If all of them are incoherent, the search succeeds, returning the selected candidate; if not, the search fails.

Notice that these rules are not influenced by flag settings in the client module, where the instances are used. These rules make it possible for a library author to design a library that relies on overlapping instances without the client having to know.

Errors are reported lazily (when attempting to solve a constraint), rather than eagerly (when the instances themselves are defined). Consider, for example

instance C Int  b where ..
instance C a Bool where ..

These potentially overlap, but GHC will not complain about the instance declarations themselves, regardless of flag settings. If we later try to solve the constraint (C Int Char) then only the first instance matches, and all is well. Similarly with (C Bool Bool). But if we try to solve (C Int Bool), both instances match and an error is reported.

As a more substantial example of the rules in action, consider

instance {-# OVERLAPPABLE #-} context1 => C Int b     where ...  -- (A)
instance {-# OVERLAPPABLE #-} context2 => C a   Bool  where ...  -- (B)
instance {-# OVERLAPPABLE #-} context3 => C a   [b]   where ...  -- (C)
instance {-# OVERLAPPING  #-} context4 => C Int [Int] where ...  -- (D)

Now suppose that the type inference engine needs to solve the constraint C Int [Int]. This constraint matches instances (A), (C) and (D), but the last is more specific, and hence is chosen.

If (D) did not exist then (A) and (C) would still be matched, but neither is most specific. In that case, the program would be rejected, unless IncoherentInstances is enabled, in which case it would be accepted and (A) or (C) would be chosen arbitrarily.

An instance declaration is more specific than another iff the head of former is a substitution instance of the latter. For example (D) is “more specific” than (C) because you can get from (C) to (D) by substituting a := Int.

GHC is conservative about committing to an overlapping instance. For example:

f :: [b] -> [b]
f x = ...

Suppose that from the RHS of f we get the constraint C b [b]. But GHC does not commit to instance (C), because in a particular call of f, b might be instantiate to Int, in which case instance (D) would be more specific still. So GHC rejects the program.

If, however, you enable the extension IncoherentInstances when compiling the module that contains (D), GHC will instead pick (C), without complaining about the problem of subsequent instantiations.

Notice that we gave a type signature to f, so GHC had to check that f has the specified type. Suppose instead we do not give a type signature, asking GHC to infer it instead. In this case, GHC will refrain from simplifying the constraint C Int [b] (for the same reason as before) but, rather than rejecting the program, it will infer the type

f :: C b [b] => [b] -> [b]

That postpones the question of which instance to pick to the call site for f by which time more is known about the type b. You can write this type signature yourself if you use the FlexibleContexts extension.

Exactly the same situation can arise in instance declarations themselves. Suppose we have

class Foo a where
   f :: a -> a
instance Foo [b] where
   f x = ...

and, as before, the constraint C Int [b] arises from f‘s right hand side. GHC will reject the instance, complaining as before that it does not know how to resolve the constraint C Int [b], because it matches more than one instance declaration. The solution is to postpone the choice by adding the constraint to the context of the instance declaration, thus:

instance C Int [b] => Foo [b] where
   f x = ...

(You need FlexibleInstances to do this.)

Warning

Overlapping instances must be used with care. They can give rise to incoherence (i.e. different instance choices are made in different parts of the program) even without IncoherentInstances. Consider:

{-# LANGUAGE OverlappingInstances #-}
module Help where

    class MyShow a where
    myshow :: a -> String

    instance MyShow a => MyShow [a] where
    myshow xs = concatMap myshow xs

    showHelp :: MyShow a => [a] -> String
    showHelp xs = myshow xs

{-# LANGUAGE FlexibleInstances, OverlappingInstances #-}
module Main where
    import Help

    data T = MkT

    instance MyShow T where
    myshow x = "Used generic instance"

    instance MyShow [T] where
    myshow xs = "Used more specific instance"

    main = do { print (myshow [MkT]); print (showHelp [MkT]) }

In function showHelp GHC sees no overlapping instances, and so uses the MyShow [a] instance without complaint. In the call to myshow in main, GHC resolves the MyShow [T] constraint using the overlapping instance declaration in module Main. As a result, the program prints

"Used more specific instance"
"Used generic instance"

(An alternative possible behaviour, not currently implemented, would be to reject module Help on the grounds that a later instance declaration might overlap the local one.)

10.8.3.7. Instance signatures: type signatures in instance declarations

InstanceSigs
Since:7.6.1

Allow type signatures for members in instance definitions.

In Haskell, you can’t write a type signature in an instance declaration, but it is sometimes convenient to do so, and the language extension InstanceSigs allows you to do so. For example:

data T a = MkT a a
instance Eq a => Eq (T a) where
  (==) :: T a -> T a -> Bool   -- The signature
  (==) (MkT x1 x2) (MkTy y1 y2) = x1==y1 && x2==y2

Some details

  • The type signature in the instance declaration must be more polymorphic than (or the same as) the one in the class declaration, instantiated with the instance type. For example, this is fine:

    instance Eq a => Eq (T a) where
       (==) :: forall b. b -> b -> Bool
       (==) x y = True
    

    Here the signature in the instance declaration is more polymorphic than that required by the instantiated class method.

  • The code for the method in the instance declaration is typechecked against the type signature supplied in the instance declaration, as you would expect. So if the instance signature is more polymorphic than required, the code must be too.

  • One stylistic reason for wanting to write a type signature is simple documentation. Another is that you may want to bring scoped type variables into scope. For example:

    class C a where
      foo :: b -> a -> (a, [b])
    
    instance C a => C (T a) where
      foo :: forall b. b -> T a -> (T a, [b])
      foo x (T y) = (T y, xs)
         where
           xs :: [b]
           xs = [x,x,x]
    

    Provided that you also specify ScopedTypeVariables (Lexically scoped type variables), the forall b scopes over the definition of foo, and in particular over the type signature for xs.

10.8.4. Overloaded string literals

OverloadedStrings
Since:6.8.1

Enable overloaded string literals (e.g. string literals desugared via the IsString class).

GHC supports overloaded string literals. Normally a string literal has type String, but with overloaded string literals enabled (with OverloadedStrings) a string literal has type (IsString a) => a.

This means that the usual string syntax can be used, e.g., for ByteString, Text, and other variations of string like types. String literals behave very much like integer literals, i.e., they can be used in both expressions and patterns. If used in a pattern the literal will be replaced by an equality test, in the same way as an integer literal is.

The class IsString is defined as:

class IsString a where
    fromString :: String -> a

The only predefined instance is the obvious one to make strings work as usual:

instance IsString [Char] where
    fromString cs = cs

The class IsString is not in scope by default. If you want to mention it explicitly (for example, to give an instance declaration for it), you can import it from module GHC.Exts.

Haskell’s defaulting mechanism (Haskell Report, Section 4.3.4) is extended to cover string literals, when OverloadedStrings is specified. Specifically:

  • Each type in a default declaration must be an instance of Num or of IsString.
  • If no default declaration is given, then it is just as if the module contained the declaration default( Integer, Double, String).
  • The standard defaulting rule is extended thus: defaulting applies when all the unresolved constraints involve standard classes or IsString; and at least one is a numeric class or IsString.

So, for example, the expression length "foo" will give rise to an ambiguous use of IsString a0 which, because of the above rules, will default to String.

A small example:

module Main where

import GHC.Exts( IsString(..) )

newtype MyString = MyString String deriving (Eq, Show)
instance IsString MyString where
    fromString = MyString

greet :: MyString -> MyString
greet "hello" = "world"
greet other = other

main = do
    print $ greet "hello"
    print $ greet "fool"

Note that deriving Eq is necessary for the pattern matching to work since it gets translated into an equality comparison.

10.8.5. Overloaded labels

OverloadedLabels
Since:8.0.1

Enable use of the #foo overloaded label syntax.

GHC supports overloaded labels, a form of identifier whose interpretation may depend both on its type and on its literal text. When the OverloadedLabels extension is enabled, an overloaded label can be written with a prefix hash, for example #foo. The type of this expression is IsLabel "foo" a => a.

The class IsLabel is defined as:

class IsLabel (x :: Symbol) a where
  fromLabel :: a

This is rather similar to the class IsString (see Overloaded string literals), but with an additional type parameter that makes the text of the label available as a type-level string (see Type-Level Literals). Note that fromLabel had an extra Proxy# x argument in GHC 8.0, but this was removed in GHC 8.2 as a type application (see Visible type application) can be used instead.

There are no predefined instances of this class. It is not in scope by default, but can be brought into scope by importing GHC.OverloadedLabels. Unlike IsString, there are no special defaulting rules for IsLabel.

During typechecking, GHC will replace an occurrence of an overloaded label like #foo with fromLabel @"foo". This will have some type alpha and require the solution of a class constraint IsLabel "foo" alpha.

The intention is for IsLabel to be used to support overloaded record fields and perhaps anonymous records. Thus, it may be given instances for base datatypes (in particular (->)) in the future.

If RebindableSyntax is enabled, overloaded labels will be desugared using whatever fromLabel function is in scope, rather than always using GHC.OverloadedLabels.fromLabel.

When writing an overloaded label, there must be no space between the hash sign and the following identifier. The MagicHash extension makes use of postfix hash signs; if OverloadedLabels and MagicHash are both enabled then x#y means x# y, but if only OverloadedLabels is enabled then it means x #y. The UnboxedTuples extension makes (# a single lexeme, so when UnboxedTuples is enabled you must write a space between an opening parenthesis and an overloaded label. To avoid confusion, you are strongly encouraged to put a space before the hash when using OverloadedLabels.

When using OverloadedLabels (or other extensions that make use of hash signs) in a .hsc file (see Writing Haskell interfaces to C code: hsc2hs), the hash signs must be doubled (write ##foo instead of #foo) to avoid them being treated as hsc2hs directives.

Here is an extension of the record access example in Type-Level Literals showing how an overloaded label can be used as a record selector:

{-# LANGUAGE DataKinds, KindSignatures, MultiParamTypeClasses,
             FunctionalDependencies, FlexibleInstances,
             OverloadedLabels, ScopedTypeVariables #-}

import GHC.OverloadedLabels (IsLabel(..))
import GHC.TypeLits (Symbol)

data Label (l :: Symbol) = Get

class Has a l b | a l -> b where
  from :: a -> Label l -> b

data Point = Point Int Int deriving Show

instance Has Point "x" Int where from (Point x _) _ = x
instance Has Point "y" Int where from (Point _ y) _ = y

instance Has a l b => IsLabel l (a -> b) where
  fromLabel x = from x (Get :: Label l)

example = #x (Point 1 2)

10.8.6. Overloaded lists

OverloadedLists
Since:7.8.1

Enable overloaded list syntax (e.g. desugaring of lists via the IsList class).

GHC supports overloading of the list notation. Let us recap the notation for constructing lists. In Haskell, the list notation can be used in the following seven ways:

[]          -- Empty list
[x]         -- x : []
[x,y,z]     -- x : y : z : []
[x .. ]     -- enumFrom x
[x,y ..]    -- enumFromThen x y
[x .. y]    -- enumFromTo x y
[x,y .. z]  -- enumFromThenTo x y z

When the OverloadedLists extension is turned on, the aforementioned seven notations are desugared as follows:

[]          -- fromListN 0 []
[x]         -- fromListN 1 (x : [])
[x,y,z]     -- fromListN 3 (x : y : z : [])
[x .. ]     -- fromList (enumFrom x)
[x,y ..]    -- fromList (enumFromThen x y)
[x .. y]    -- fromList (enumFromTo x y)
[x,y .. z]  -- fromList (enumFromThenTo x y z)

This extension allows programmers to use the list notation for construction of structures like: Set, Map, IntMap, Vector, Text and Array. The following code listing gives a few examples:

['0' .. '9']             :: Set Char
[1 .. 10]                :: Vector Int
[("default",0), (k1,v1)] :: Map String Int
['a' .. 'z']             :: Text

List patterns are also overloaded. When the OverloadedLists extension is turned on, these definitions are desugared as follows

f [] = ...          -- f (toList -> []) = ...
g [x,y,z] = ...     -- g (toList -> [x,y,z]) = ...

(Here we are using view-pattern syntax for the translation, see View patterns.)

10.8.6.1. The IsList class

In the above desugarings, the functions toList, fromList and fromListN are all methods of the IsList class, which is itself exported from the GHC.Exts module. The type class is defined as follows:

class IsList l where
  type Item l

  fromList :: [Item l] -> l
  toList   :: l -> [Item l]

  fromListN :: Int -> [Item l] -> l
  fromListN _ = fromList

The IsList class and its methods are intended to be used in conjunction with the OverloadedLists extension.

  • The type function Item returns the type of items of the structure l.
  • The function fromList constructs the structure l from the given list of Item l.
  • The function fromListN takes the input list’s length as a hint. Its behaviour should be equivalent to fromList. The hint can be used for more efficient construction of the structure l compared to fromList. If the given hint is not equal to the input list’s length the behaviour of fromListN is not specified.
  • The function toList should be the inverse of fromList.

It is perfectly fine to declare new instances of IsList, so that list notation becomes useful for completely new data types. Here are several example instances:

instance IsList [a] where
  type Item [a] = a
  fromList = id
  toList = id

instance (Ord a) => IsList (Set a) where
  type Item (Set a) = a
  fromList = Set.fromList
  toList = Set.toList

instance (Ord k) => IsList (Map k v) where
  type Item (Map k v) = (k,v)
  fromList = Map.fromList
  toList = Map.toList

instance IsList (IntMap v) where
  type Item (IntMap v) = (Int,v)
  fromList = IntMap.fromList
  toList = IntMap.toList

instance IsList Text where
  type Item Text = Char
  fromList = Text.pack
  toList = Text.unpack

instance IsList (Vector a) where
  type Item (Vector a) = a
  fromList  = Vector.fromList
  fromListN = Vector.fromListN
  toList = Vector.toList

10.8.6.2. Rebindable syntax

When desugaring list notation with OverloadedLists GHC uses the fromList (etc) methods from module GHC.Exts. You do not need to import GHC.Exts for this to happen.

However if you use RebindableSyntax, then GHC instead uses whatever is in scope with the names of toList, fromList and fromListN. That is, these functions are rebindable; c.f. Rebindable syntax and the implicit Prelude import.

10.8.6.3. Defaulting

Currently, the IsList class is not accompanied with defaulting rules. Although feasible, not much thought has gone into how to specify the meaning of the default declarations like:

default ([a])

10.8.6.4. Speculation about the future

The current implementation of the OverloadedLists extension can be improved by handling the lists that are only populated with literals in a special way. More specifically, the compiler could allocate such lists statically using a compact representation and allow IsList instances to take advantage of the compact representation. Equipped with this capability the OverloadedLists extension will be in a good position to subsume the OverloadedStrings extension (currently, as a special case, string literals benefit from statically allocated compact representation).

10.8.7. Undecidable (or recursive) superclasses

UndecidableSuperClasses
Since:8.0.1

Allow all superclass constraints, including those that may result in non-termination of the typechecker.

The language extension UndecidableSuperClasses allows much more flexible constraints in superclasses.

A class cannot generally have itself as a superclass. So this is illegal

class C a => D a where ...
class D a => C a where ...

GHC implements this test conservatively when type functions, or type variables, are involved. For example

type family F a :: Constraint
class F a => C a where ...

GHC will complain about this, because you might later add

type instance F Int = C Int

and now we’d be in a superclass loop. Here’s an example involving a type variable

class f (C f) => C f
class c       => Id c

If we expanded the superclasses of C Id we’d get first Id (C Id) and thence C Id again.

But superclass constraints like these are sometimes useful, and the conservative check is annoying where no actual recursion is involved.

Moreover genuninely-recursive superclasses are sometimes useful. Here’s a real-life example (Trac #10318)

class (Frac (Frac a) ~ Frac a,
       Fractional (Frac a),
       IntegralDomain (Frac a))
    => IntegralDomain a where
 type Frac a :: Type

Here the superclass cycle does terminate but it’s not entirely straightforward to see that it does.

With the language extension UndecidableSuperClasses GHC lifts all restrictions on superclass constraints. If there really is a loop, GHC will only expand it to finite depth.

10.9. Type families

TypeFamilies
Implies:MonoLocalBinds, KindSignatures, ExplicitNamespaces
Since:6.8.1

Allow use and definition of indexed type and data families.

Indexed type families form an extension to facilitate type-level programming. Type families are a generalisation of associated data types [AssocDataTypes2005] and associated type synonyms [AssocTypeSyn2005] Type families themselves are described in Schrijvers 2008 [TypeFamilies2008]. Type families essentially provide type-indexed data types and named functions on types, which are useful for generic programming and highly parameterised library interfaces as well as interfaces with enhanced static information, much like dependent types. They might also be regarded as an alternative to functional dependencies, but provide a more functional style of type-level programming than the relational style of functional dependencies.

Indexed type families, or type families for short, are type constructors that represent sets of types. Set members are denoted by supplying the type family constructor with type parameters, which are called type indices. The difference between vanilla parametrised type constructors and family constructors is much like between parametrically polymorphic functions and (ad-hoc polymorphic) methods of type classes. Parametric polymorphic functions behave the same at all type instances, whereas class methods can change their behaviour in dependence on the class type parameters. Similarly, vanilla type constructors imply the same data representation for all type instances, but family constructors can have varying representation types for varying type indices.

Indexed type families come in three flavours: data families, open type synonym families, and closed type synonym families. They are the indexed family variants of algebraic data types and type synonyms, respectively. The instances of data families can be data types and newtypes.

Type families are enabled by the language extension TypeFamilies. Additional information on the use of type families in GHC is available on the Haskell wiki page on type families.

[AssocDataTypes2005]Associated Types with Class”, M. Chakravarty, G. Keller, S. Peyton Jones, and S. Marlow. In Proceedings of “The 32nd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL‘05)”, pages 1-13, ACM Press, 2005.
[AssocTypeSyn2005]Type Associated Type Synonyms”. M. Chakravarty, G. Keller, and S. Peyton Jones. In Proceedings of “The Tenth ACM SIGPLAN International Conference on Functional Programming”, ACM Press, pages 241-253, 2005.
[TypeFamilies2008]Type Checking with Open Type Functions”, T. Schrijvers, S. Peyton-Jones, M. Chakravarty, and M. Sulzmann, in Proceedings of “ICFP 2008: The 13th ACM SIGPLAN International Conference on Functional Programming”, ACM Press, pages 51-62, 2008.

10.9.1. Data families

Data families appear in two flavours: (1) they can be defined on the toplevel or (2) they can appear inside type classes (in which case they are known as associated types). The former is the more general variant, as it lacks the requirement for the type-indexes to coincide with the class parameters. However, the latter can lead to more clearly structured code and compiler warnings if some type instances were - possibly accidentally - omitted. In the following, we always discuss the general toplevel form first and then cover the additional constraints placed on associated types.

10.9.1.1. Data family declarations

Indexed data families are introduced by a signature, such as

data family GMap k :: Type -> Type

The special family distinguishes family from standard data declarations. The result kind annotation is optional and, as usual, defaults to Type if omitted. An example is

data family Array e

Named arguments can also be given explicit kind signatures if needed. Just as with GADT declarations named arguments are entirely optional, so that we can declare Array alternatively with

data family Array :: Type -> Type

Unlike with ordinary data definitions, the result kind of a data family does not need to be Type: it can alternatively be a kind variable (with PolyKinds). Data instances’ kinds must end in Type, however.

10.9.1.2. Data instance declarations

Instance declarations of data and newtype families are very similar to standard data and newtype declarations. The only two differences are that the keyword data or newtype is followed by instance and that some or all of the type arguments can be non-variable types, but may not contain forall types or type synonym families. However, data families are generally allowed in type parameters, and type synonyms are allowed as long as they are fully applied and expand to a type that is itself admissible - exactly as this is required for occurrences of type synonyms in class instance parameters. For example, the Either instance for GMap is

data instance GMap (Either a b) v = GMapEither (GMap a v) (GMap b v)

In this example, the declaration has only one variant. In general, it can be any number.

When the flag -Wunused-type-patterns is enabled, type variables that are mentioned in the patterns on the left hand side, but not used on the right hand side are reported. Variables that occur multiple times on the left hand side are also considered used. To suppress the warnings, unused variables should be either replaced or prefixed with underscores. Type variables starting with an underscore (_x) are otherwise treated as ordinary type variables.

This resembles the wildcards that can be used in Partial Type Signatures. However, there are some differences. No error messages reporting the inferred types are generated, nor does the extension PartialTypeSignatures have any effect.

Data and newtype instance declarations are only permitted when an appropriate family declaration is in scope - just as a class instance declaration requires the class declaration to be visible. Moreover, each instance declaration has to conform to the kind determined by its family declaration. This implies that the number of parameters of an instance declaration matches the arity determined by the kind of the family.

A data family instance declaration can use the full expressiveness of ordinary data or newtype declarations:

  • Although, a data family is introduced with the keyword “data”, a data family instance can use either data or newtype. For example:

    data family T a
    data    instance T Int  = T1 Int | T2 Bool
    newtype instance T Char = TC Bool
    
  • A data instance can use GADT syntax for the data constructors, and indeed can define a GADT. For example:

    data family G a b
    data instance G [a] b where
       G1 :: c -> G [Int] b
       G2 :: G [a] Bool
    
  • You can use a deriving clause on a data instance or newtype instance declaration.

Even if data families are defined as toplevel declarations, functions that perform different computations for different family instances may still need to be defined as methods of type classes. In particular, the following is not possible:

data family T a
data instance T Int  = A
data instance T Char = B
foo :: T a -> Int
foo A = 1
foo B = 2

Instead, you would have to write foo as a class operation, thus:

class Foo a where
  foo :: T a -> Int
instance Foo Int where
  foo A = 1
instance Foo Char where
  foo B = 2

Given the functionality provided by GADTs (Generalised Algebraic Data Types), it might seem as if a definition, such as the above, should be feasible. However, type families - in contrast to GADTs - are open; i.e., new instances can always be added, possibly in other modules. Supporting pattern matching across different data instances would require a form of extensible case construct.

10.9.1.3. Overlap of data instances

The instance declarations of a data family used in a single program may not overlap at all, independent of whether they are associated or not. In contrast to type class instances, this is not only a matter of consistency, but one of type safety.

10.9.2. Synonym families

Type families appear in three flavours: (1) they can be defined as open families on the toplevel, (2) they can be defined as closed families on the toplevel, or (3) they can appear inside type classes (in which case they are known as associated type synonyms). Toplevel families are more general, as they lack the requirement for the type-indexes to coincide with the class parameters. However, associated type synonyms can lead to more clearly structured code and compiler warnings if some type instances were - possibly accidentally - omitted. In the following, we always discuss the general toplevel forms first and then cover the additional constraints placed on associated types. Note that closed associated type synonyms do not exist.

10.9.2.1. Type family declarations

Open indexed type families are introduced by a signature, such as

type family Elem c :: Type

The special family distinguishes family from standard type declarations. The result kind annotation is optional and, as usual, defaults to Type if omitted. An example is

type family Elem c

Parameters can also be given explicit kind signatures if needed. We call the number of parameters in a type family declaration, the family’s arity, and all applications of a type family must be fully saturated with respect to that arity. This requirement is unlike ordinary type synonyms and it implies that the kind of a type family is not sufficient to determine a family’s arity, and hence in general, also insufficient to determine whether a type family application is well formed. As an example, consider the following declaration:

type family F a b :: Type -> Type
  -- F's arity is 2,
  -- although its overall kind is Type -> Type -> Type -> Type

Given this declaration the following are examples of well-formed and malformed types:

F Char [Int]       -- OK!  Kind: Type -> Type
F Char [Int] Bool  -- OK!  Kind: Type
F IO Bool          -- WRONG: kind mismatch in the first argument
F Bool             -- WRONG: unsaturated application

The result kind annotation is optional and defaults to Type (like argument kinds) if omitted. Polykinded type families can be declared using a parameter in the kind annotation:

type family F a :: k

In this case the kind parameter k is actually an implicit parameter of the type family.

10.9.2.2. Type instance declarations

Instance declarations of type families are very similar to standard type synonym declarations. The only two differences are that the keyword type is followed by instance and that some or all of the type arguments can be non-variable types, but may not contain forall types or type synonym families. However, data families are generally allowed, and type synonyms are allowed as long as they are fully applied and expand to a type that is admissible - these are the exact same requirements as for data instances. For example, the [e] instance for Elem is

type instance Elem [e] = e

Type arguments can be replaced with underscores (_) if the names of the arguments don’t matter. This is the same as writing type variables with unique names. Unused type arguments can be replaced or prefixed with underscores to avoid warnings when the -Wunused-type-patterns flag is enabled. The same rules apply as for Data instance declarations.

Type family instance declarations are only legitimate when an appropriate family declaration is in scope - just like class instances require the class declaration to be visible. Moreover, each instance declaration has to conform to the kind determined by its family declaration, and the number of type parameters in an instance declaration must match the number of type parameters in the family declaration. Finally, the right-hand side of a type instance must be a monotype (i.e., it may not include foralls) and after the expansion of all saturated vanilla type synonyms, no synonyms, except family synonyms may remain.

10.9.2.3. Closed type families

A type family can also be declared with a where clause, defining the full set of equations for that family. For example:

type family F a where
  F Int  = Double
  F Bool = Char
  F a    = String

A closed type family’s equations are tried in order, from top to bottom, when simplifying a type family application. In this example, we declare an instance for F such that F Int simplifies to Double, F Bool simplifies to Char, and for any other type a that is known not to be Int or Bool, F a simplifies to String. Note that GHC must be sure that a cannot unify with Int or Bool in that last case; if a programmer specifies just F a in their code, GHC will not be able to simplify the type. After all, a might later be instantiated with Int.

A closed type family’s equations have the same restrictions as the equations for open type family instances.

A closed type family may be declared with no equations. Such closed type families are opaque type-level definitions that will never reduce, are not necessarily injective (unlike empty data types), and cannot be given any instances. This is different from omitting the equations of a closed type family in a hs-boot file, which uses the syntax where .., as in that case there may or may not be equations given in the hs file.

10.9.2.4. Type family examples

Here are some examples of admissible and illegal type instances:

type family F a :: Type
type instance F [Int]   = Int   -- OK!
type instance F String  = Char  -- OK!
type instance F (F a)   = a     -- WRONG: type parameter mentions a type family
type instance
  F (forall a. (a, b))  = b     -- WRONG: a forall type appears in a type parameter
type instance
  F Float = forall a.a          -- WRONG: right-hand side may not be a forall type
type family H a where          -- OK!
  H Int  = Int
  H Bool = Bool
  H a    = String
type instance H Char = Char    -- WRONG: cannot have instances of closed family
type family K a where          -- OK!

type family G a b :: Type -> Type
type instance G Int            = (,)     -- WRONG: must be two type parameters
type instance G Int Char Float = Double  -- WRONG: must be two type parameters

10.9.2.5. Compatibility and apartness of type family equations

There must be some restrictions on the equations of type families, lest we define an ambiguous rewrite system. So, equations of open type families are restricted to be compatible. Two type patterns are compatible if

  1. all corresponding types and implicit kinds in the patterns are apart, or
  2. the two patterns unify producing a substitution, and the right-hand sides are equal under that substitution.

Two types are considered apart if, for all possible substitutions, the types cannot reduce to a common reduct.

The first clause of “compatible” is the more straightforward one. It says that the patterns of two distinct type family instances cannot overlap. For example, the following is disallowed:

type instance F Int = Bool
type instance F Int = Char

The second clause is a little more interesting. It says that two overlapping type family instances are allowed if the right-hand sides coincide in the region of overlap. Some examples help here:

type instance F (a, Int) = [a]
type instance F (Int, b) = [b]   -- overlap permitted

type instance G (a, Int)  = [a]
type instance G (Char, a) = [a]  -- ILLEGAL overlap, as [Char] /= [Int]

Note that this compatibility condition is independent of whether the type family is associated or not, and it is not only a matter of consistency, but one of type safety.

For a polykinded type family, the kinds are checked for apartness just like types. For example, the following is accepted:

type family J a :: k
type instance J Int = Bool
type instance J Int = Maybe

These instances are compatible because they differ in their implicit kind parameter; the first uses Type while the second uses Type -> Type.

The definition for “compatible” uses a notion of “apart”, whose definition in turn relies on type family reduction. This condition of “apartness”, as stated, is impossible to check, so we use this conservative approximation: two types are considered to be apart when the two types cannot be unified, even by a potentially infinite unifier. Allowing the unifier to be infinite disallows the following pair of instances:

type instance H x   x = Int
type instance H [x] x = Bool

The type patterns in this pair equal if x is replaced by an infinite nesting of lists. Rejecting instances such as these is necessary for type soundness.

Compatibility also affects closed type families. When simplifying an application of a closed type family, GHC will select an equation only when it is sure that no incompatible previous equation will ever apply. Here are some examples:

type family F a where
  F Int = Bool
  F a   = Char

type family G a where
  G Int = Int
  G a   = a

In the definition for F, the two equations are incompatible – their patterns are not apart, and yet their right-hand sides do not coincide. Thus, before GHC selects the second equation, it must be sure that the first can never apply. So, the type F a does not simplify; only a type such as F Double will simplify to Char. In G, on the other hand, the two equations are compatible. Thus, GHC can ignore the first equation when looking at the second. So, G a will simplify to a.

However see Type, class and other declarations for the overlap rules in GHCi.

10.9.2.6. Decidability of type synonym instances

UndecidableInstances

Relax restrictions on the decidability of type synonym family instances.

In order to guarantee that type inference in the presence of type families decidable, we need to place a number of additional restrictions on the formation of type instance declarations (c.f., Definition 5 (Relaxed Conditions) of “Type Checking with Open Type Functions”). Instance declarations have the general form

type instance F t1 .. tn = t

where we require that for every type family application (G s1 .. sm) in t,

  1. s1 .. sm do not contain any type family constructors,
  2. the total number of symbols (data type constructors and type variables) in s1 .. sm is strictly smaller than in t1 .. tn, and
  3. for every type variable a, a occurs in s1 .. sm at most as often as in t1 .. tn.

These restrictions are easily verified and ensure termination of type inference. However, they are not sufficient to guarantee completeness of type inference in the presence of, so called, ‘’loopy equalities’‘, such as a ~ [F a], where a recursive occurrence of a type variable is underneath a family application and data constructor application - see the above mentioned paper for details.

If the option UndecidableInstances is passed to the compiler, the above restrictions are not enforced and it is on the programmer to ensure termination of the normalisation of type families during type inference.

10.9.3. Wildcards on the LHS of data and type family instances

When the name of a type argument of a data or type instance declaration doesn’t matter, it can be replaced with an underscore (_). This is the same as writing a type variable with a unique name.

data family F a b :: Type
data instance F Int _ = Int
-- Equivalent to  data instance F Int b = Int

type family T a :: Type
type instance T (a,_) = a
-- Equivalent to  type instance T (a,b) = a

This use of underscore for wildcard in a type pattern is exactly like pattern matching in the term language, but is rather different to the use of a underscore in a partial type signature (see Type Wildcards).

A type variable beginning with an underscore is not treated specially in a type or data instance declaration. For example:

data instance F Bool _a = _a -> Int
-- Equivalent to  data instance F Bool a = a -> Int

Contrast this with the special treatment of named wildcards in type signatures (Named Wildcards).

10.9.4. Associated data and type families

A data or type synonym family can be declared as part of a type class, thus:

class GMapKey k where
  data GMap k :: Type -> Type
  ...

class Collects ce where
  type Elem ce :: Type
  ...

When doing so, we (optionally) may drop the “family” keyword.

The type parameters must all be type variables, of course, and some (but not necessarily all) of then can be the class parameters. Each class parameter may only be used at most once per associated type, but some may be omitted and they may be in an order other than in the class head. Hence, the following contrived example is admissible:

class C a b c where
  type T c a x :: Type

Here c and a are class parameters, but the type is also indexed on a third parameter x.

10.9.4.1. Associated instances

When an associated data or type synonym family instance is declared within a type class instance, we (optionally) may drop the instance keyword in the family instance:

instance (GMapKey a, GMapKey b) => GMapKey (Either a b) where
  data GMap (Either a b) v = GMapEither (GMap a v) (GMap b v)
  ...

instance Eq (Elem [e]) => Collects [e] where
  type Elem [e] = e
  ...

The data or type family instance for an assocated type must follow the rule that the type indexes corresponding to class parameters must have precisely the same as type given in the instance head. For example:

class Collects ce where
  type Elem ce :: Type

instance Eq (Elem [e]) => Collects [e] where
  -- Choose one of the following alternatives:
  type Elem [e] = e       -- OK
  type Elem [x] = x       -- BAD; '[x]' is different to '[e]' from head
  type Elem x   = x       -- BAD; 'x' is different to '[e]'
  type Elem [Maybe x] = x -- BAD: '[Maybe x]' is different to '[e]'

Note the following points:

  • An instance for an associated family can only appear as part of an instance declarations of the class in which the family was declared, just as with the equations of the methods of a class.

  • The type variables on the right hand side of the type family equation must, as usual, be explicitly bound by the left hand side. This restriction is relaxed for kind variables, however, as the right hand side is allowed to mention kind variables that are implicitly bound. For example, these are legitimate:

    data family Nat :: k -> k -> Type
    -- k is implicitly bound by an invisible kind pattern
    newtype instance Nat :: (k -> Type) -> (k -> Type) -> Type where
      Nat :: (forall xx. f xx -> g xx) -> Nat f g
    
    class Funct f where
      type Codomain f :: Type
    instance Funct ('KProxy :: KProxy o) where
      -- o is implicitly bound by the kind signature
      -- of the LHS type pattern ('KProxy)
      type Codomain 'KProxy = NatTr (Proxy :: o -> Type)
    
  • The instance for an associated type can be omitted in class instances. In that case, unless there is a default instance (see Associated type synonym defaults), the corresponding instance type is not inhabited; i.e., only diverging expressions, such as undefined, can assume the type.

  • Although it is unusual, there (currently) can be multiple instances for an associated family in a single instance declaration. For example, this is legitimate:

    instance GMapKey Flob where
      data GMap Flob [v] = G1 v
      data GMap Flob Int = G2 Int
      ...
    

    Here we give two data instance declarations, one in which the last parameter is [v], and one for which it is Int. Since you cannot give any subsequent instances for (GMap Flob ...), this facility is most useful when the free indexed parameter is of a kind with a finite number of alternatives (unlike Type).

10.9.4.2. Associated type synonym defaults

It is possible for the class defining the associated type to specify a default for associated type instances. So for example, this is OK:

class IsBoolMap v where
  type Key v
  type instance Key v = Int

  lookupKey :: Key v -> v -> Maybe Bool

instance IsBoolMap [(Int, Bool)] where
  lookupKey = lookup

In an instance declaration for the class, if no explicit type instance declaration is given for the associated type, the default declaration is used instead, just as with default class methods.

Note the following points:

  • The instance keyword is optional.
  • There can be at most one default declaration for an associated type synonym.
  • A default declaration is not permitted for an associated data type.
  • The default declaration must mention only type variables on the left hand side, and the right hand side must mention only type variables that are explicitly bound on the left hand side. This restriction is relaxed for kind variables, however, as the right hand side is allowed to mention kind variables that are implicitly bound on the left hand side.
  • Unlike the associated type family declaration itself, the type variables of the default instance are independent of those of the parent class.

Here are some examples:

class C (a :: Type) where
  type F1 a :: Type
  type instance F1 a = [a]     -- OK
  type instance F1 a = a->a    -- BAD; only one default instance is allowed

  type F2 b a                  -- OK; note the family has more type
                               --     variables than the class
  type instance F2 c d = c->d  -- OK; you don't have to use 'a' in the type instance

  type F3 a
  type F3 [b] = b              -- BAD; only type variables allowed on the LHS

  type F4 a
  type F4 b = a                -- BAD; 'a' is not in scope  in the RHS

  type F5 a :: [k]
  type F5 a = ('[] :: [x])     -- OK; the kind variable x is implicitly
                                      bound by an invisible kind pattern
                                      on the LHS

  type F6 a
  type F6 a =
    Proxy ('[] :: [x])         -- BAD; the kind variable x is not bound,
                                       even by an invisible kind pattern

  type F7 (x :: a) :: [a]
  type F7 x = ('[] :: [a])     -- OK; the kind variable a is implicitly
                                      bound by the kind signature of the
                                      LHS type pattern

10.9.4.3. Scoping of class parameters

The visibility of class parameters in the right-hand side of associated family instances depends solely on the parameters of the family. As an example, consider the simple class declaration

class C a b where
  data T a

Only one of the two class parameters is a parameter to the data family. Hence, the following instance declaration is invalid:

instance C [c] d where
  data T [c] = MkT (c, d)    -- WRONG!!  'd' is not in scope

Here, the right-hand side of the data instance mentions the type variable d that does not occur in its left-hand side. We cannot admit such data instances as they would compromise type safety.

Bear in mind that it is also possible for the right-hand side of an associated family instance to contain kind parameters (by using the PolyKinds extension). For instance, this class and instance are perfectly admissible:

class C k where
  type T :: k

instance C (Maybe a) where
  type T = (Nothing :: Maybe a)

Here, although the right-hand side (Nothing :: Maybe a) mentions a kind variable a which does not occur on the left-hand side, this is acceptable, because a is implicitly bound by T‘s kind pattern.

A kind variable can also be bound implicitly in a LHS type pattern, as in this example:

class C a where
  type T (x :: a) :: [a]

instance C (Maybe a) where
  type T x = ('[] :: [Maybe a])

In ('[] :: [Maybe a]), the kind variable a is implicitly bound by the kind signature of the LHS type pattern x.

10.9.4.4. Instance contexts and associated type and data instances

Associated type and data instance declarations do not inherit any context specified on the enclosing instance. For type instance declarations, it is unclear what the context would mean. For data instance declarations, it is unlikely a user would want the context repeated for every data constructor. The only place where the context might likely be useful is in a deriving clause of an associated data instance. However, even here, the role of the outer instance context is murky. So, for clarity, we just stick to the rule above: the enclosing instance context is ignored. If you need to use a non-trivial context on a derived instance, use a standalone deriving clause (at the top level).

10.9.5. Import and export

The rules for export lists (Haskell Report Section 5.2) needs adjustment for type families:

  • The form T(..), where T is a data family, names the family T and all the in-scope constructors (whether in scope qualified or unqualified) that are data instances of T.
  • The form T(.., ci, .., fj, ..), where T is a data family, names T and the specified constructors ci and fields fj as usual. The constructors and field names must belong to some data instance of T, but are not required to belong to the same instance.
  • The form C(..), where C is a class, names the class C and all its methods and associated types.
  • The form C(.., mi, .., type Tj, ..), where C is a class, names the class C, and the specified methods mi and associated types Tj. The types need a keyword “type” to distinguish them from data constructors.
  • Whenever there is no export list and a data instance is defined, the corresponding data family type constructor is exported along with the new data constructors, regardless of whether the data family is defined locally or in another module.

10.9.5.1. Examples

Recall our running GMapKey class example:

class GMapKey k where
  data GMap k :: Type -> Type
  insert :: GMap k v -> k -> v -> GMap k v
  lookup :: GMap k v -> k -> Maybe v
  empty  :: GMap k v

instance (GMapKey a, GMapKey b) => GMapKey (Either a b) where
  data GMap (Either a b) v = GMapEither (GMap a v) (GMap b v)
  ...method declarations...

Here are some export lists and their meaning:

  • module GMap( GMapKey )
    

    Exports just the class name.

  • module GMap( GMapKey(..) )
    

    Exports the class, the associated type GMap and the member functions empty, lookup, and insert. The data constructors of GMap (in this case GMapEither) are not exported.

  • module GMap( GMapKey( type GMap, empty, lookup, insert ) )
    

    Same as the previous item. Note the “type” keyword.

  • module GMap( GMapKey(..), GMap(..) )
    

    Same as previous item, but also exports all the data constructors for GMap, namely GMapEither.

  • module GMap ( GMapKey( empty, lookup, insert), GMap(..) )
    

    Same as previous item.

  • module GMap ( GMapKey, empty, lookup, insert, GMap(..) )
    

    Same as previous item.

Two things to watch out for:

  • You cannot write GMapKey(type GMap(..)) — i.e., sub-component specifications cannot be nested. To specify GMap‘s data constructors, you have to list it separately.

  • Consider this example:

    module X where
      data family D
    
    module Y where
      import X
      data instance D Int = D1 | D2
    

    Module Y exports all the entities defined in Y, namely the data constructors D1 and D2, and implicitly the data family D, even though it’s defined in X. This means you can write import Y( D(D1,D2) ) without giving an explicit export list like this:

         module Y( D(..) ) where ...
    or   module Y( module Y, D ) where ...
    

10.9.5.2. Instances

Family instances are implicitly exported, just like class instances. However, this applies only to the heads of instances, not to the data constructors an instance defines.

10.9.6. Type families and instance declarations

Type families require us to extend the rules for the form of instance heads, which are given in Relaxed rules for the instance head. Specifically:

  • Data type families may appear in an instance head
  • Type synonym families may not appear (at all) in an instance head

The reason for the latter restriction is that there is no way to check for instance matching. Consider

type family F a
type instance F Bool = Int

class C a

instance C Int
instance C (F a)

Now a constraint (C (F Bool)) would match both instances. The situation is especially bad because the type instance for F Bool might be in another module, or even in a module that is not yet written.

However, type class instances of instances of data families can be defined much like any other data type. For example, we can say

data instance T Int = T1 Int | T2 Bool
instance Eq (T Int) where
  (T1 i) == (T1 j) = i==j
  (T2 i) == (T2 j) = i==j
  _      == _      = False

Note that class instances are always for particular instances of a data family and never for an entire family as a whole. This is for essentially the same reasons that we cannot define a toplevel function that performs pattern matching on the data constructors of different instances of a single type family. It would require a form of extensible case construct.

Data instance declarations can also have deriving clauses. For example, we can write

data GMap () v = GMapUnit (Maybe v)
               deriving Show

which implicitly defines an instance of the form

instance Show v => Show (GMap () v) where ...

10.9.7. Injective type families

TypeFamilyDependencies
Implies:TypeFamilies
Since:8.0.1

Allow functional dependency annotations on type families. This allows one to define injective type families.

Starting with GHC 8.0 type families can be annotated with injectivity information. This information is then used by GHC during type checking to resolve type ambiguities in situations where a type variable appears only under type family applications. Consider this contrived example:

type family Id a
type instance Id Int = Int
type instance Id Bool = Bool

id :: Id t -> Id t
id x = x

Here the definition of id will be rejected because type variable t appears only under type family applications and is thus ambiguous. But this code will be accepted if we tell GHC that Id is injective, which means it will be possible to infer t at call sites from the type of the argument:

type family Id a = r | r -> a

Injective type families are enabled with -XTypeFamilyDependencies language extension. This extension implies -XTypeFamilies.

For full details on injective type families refer to Haskell Symposium 2015 paper Injective type families for Haskell.

10.9.7.1. Syntax of injectivity annotation

Injectivity annotation is added after type family head and consists of two parts:

  • a type variable that names the result of a type family. Syntax: = tyvar or = (tyvar :: kind). Type variable must be fresh.
  • an injectivity annotation of the form | A -> B, where A is the result type variable (see previous bullet) and B is a list of argument type and kind variables in which type family is injective. It is possible to omit some variables if type family is not injective in them.

Examples:

type family Id a = result | result -> a where
type family F a b c = d | d -> a c b
type family G (a :: k) b c = foo | foo -> k b where

For open and closed type families it is OK to name the result but skip the injectivity annotation. This is not the case for associated type synonyms, where the named result without injectivity annotation will be interpreted as associated type synonym default.

10.9.7.2. Verifying injectivity annotation against type family equations

Once the user declares type family to be injective GHC must verify that this declaration is correct, ie. type family equations don’t violate the injectivity annotation. A general idea is that if at least one equation (bullets (1), (2) and (3) below) or a pair of equations (bullets (4) and (5) below) violates the injectivity annotation then a type family is not injective in a way user claims and an error is reported. In the bullets below RHS refers to the right-hand side of the type family equation being checked for injectivity. LHS refers to the arguments of that type family equation. Below are the rules followed when checking injectivity of a type family:

  1. If a RHS of a type family equation is a type family application GHC reports that the type family is not injective.

  2. If a RHS of a type family equation is a bare type variable we require that all LHS variables (including implicit kind variables) are also bare. In other words, this has to be a sole equation of that type family and it has to cover all possible patterns. If the patterns are not covering GHC reports that the type family is not injective.

  3. If a LHS type variable that is declared as injective is not mentioned on injective position in the RHS GHC reports that the type family is not injective. Injective position means either argument to a type constructor or injective argument to a type family.

  4. Open type families Open type families are typechecked incrementally. This means that when a module is imported type family instances contained in that module are checked against instances present in already imported modules.

    A pair of an open type family equations is checked by attempting to unify their RHSs. If the RHSs don’t unify this pair does not violate injectivity annotation. If unification succeeds with a substitution then LHSs of unified equations must be identical under that substitution. If they are not identical then GHC reports that the type family is not injective.

  5. In a closed type family all equations are ordered and in one place. Equations are also checked pair-wise but this time an equation has to be paired with all the preceeding equations. Of course a single-equation closed type family is trivially injective (unless (1), (2) or (3) above holds).

    When checking a pair of closed type family equations GHC tried to unify their RHSs. If they don’t unify this pair of equations does not violate injectivity annotation. If the RHSs can be unified under some substitution (possibly empty) then either the LHSs unify under the same substitution or the LHS of the latter equation is subsumed by earlier equations. If neither condition is met GHC reports that a type family is not injective.

Note that for the purpose of injectivity check in bullets (4) and (5) GHC uses a special variant of unification algorithm that treats type family applications as possibly unifying with anything.

10.10. Datatype promotion

DataKinds
Since:7.4.1

Allow promotion of data types to kind level.

This section describes data type promotion, an extension to the kind system that complements kind polymorphism. It is enabled by DataKinds, and described in more detail in the paper Giving Haskell a Promotion, which appeared at TLDI 2012.

10.10.1. Motivation

Standard Haskell has a rich type language. Types classify terms and serve to avoid many common programming mistakes. The kind language, however, is relatively simple, distinguishing only regular types (kind Type) and type constructors (e.g. kind Type -> Type -> Type). In particular when using advanced type system features, such as type families (Type families) or GADTs (Generalised Algebraic Data Types (GADTs)), this simple kind system is insufficient, and fails to prevent simple errors. Consider the example of type-level natural numbers, and length-indexed vectors:

data Ze
data Su n

data Vec :: Type -> Type -> Type where
  Nil  :: Vec a Ze
  Cons :: a -> Vec a n -> Vec a (Su n)

The kind of Vec is Type -> Type -> Type. This means that, e.g., Vec Int Char is a well-kinded type, even though this is not what we intend when defining length-indexed vectors.

With DataKinds, the example above can then be rewritten to:

data Nat = Ze | Su Nat

data Vec :: Type -> Nat -> Type where
  Nil  :: Vec a 'Ze
  Cons :: a -> Vec a n -> Vec a ('Su n)

With the improved kind of Vec, things like Vec Int Char are now ill-kinded, and GHC will report an error.

10.10.2. Overview

With DataKinds, GHC automatically promotes every datatype to be a kind and its (value) constructors to be type constructors. The following types

data Nat = Zero | Succ Nat

data List a = Nil | Cons a (List a)

data Pair a b = Pair a b

data Sum a b = L a | R b

give rise to the following kinds and type constructors (where promoted constructors are prefixed by a tick '):

Nat :: Type
'Zero :: Nat
'Succ :: Nat -> Nat

List :: Type -> Type
'Nil  :: forall k. List k
'Cons :: forall k. k -> List k -> List k

Pair  :: Type -> Type -> Type
'Pair :: forall k1 k2. k1 -> k2 -> Pair k1 k2

Sum :: Type -> Type -> Type
'L :: k1 -> Sum k1 k2
'R :: k2 -> Sum k1 k2

Virtually all data constructors, even those with rich kinds, can be promoted. There are only a couple of exceptions to this rule:

  • Data family instance constructors cannot be promoted at the moment. GHC’s type theory just isn’t up to the task of promoting data families, which requires full dependent types.

  • Data constructors with contexts that contain non-equality constraints cannot be promoted. For example:

    data Foo :: Type -> Type where
      MkFoo1 :: a ~ Int         => Foo a    -- promotable
      MkFoo2 :: a ~~ Int        => Foo a    -- promotable
      MkFoo3 :: Show a          => Foo a    -- not promotable
    

    MkFoo1 and MkFoo2 can be promoted, since their contexts only involve equality-oriented constraints. However, MkFoo3‘s context contains a non-equality constraint Show a, and thus cannot be promoted.

10.10.3. Distinguishing between types and constructors

In the examples above, all promoted constructors are prefixed with a single quote mark '. This mark tells GHC to look in the data constructor namespace for a name, not the type (constructor) namespace. Consider

data P = MkP    -- 1

data Prom = P   -- 2

We can thus distinguish the type P (which has a constructor MkP) from the promoted data constructor 'P (of kind Prom).

As a convenience, GHC allows you to omit the quote mark when the name is unambiguous. However, our experience has shown that the quote mark helps to make code more readable and less error-prone. GHC thus supports -Wunticked-promoted-constructors that will warn you if you use a promoted data constructor without a preceding quote mark.

Just as in the case of Template Haskell (Syntax), GHC gets confused if you put a quote mark before a data constructor whose second character is a quote mark. In this case, just put a space between the promotion quote and the data constructor:

data T = A'
type S = 'A'   -- ERROR: looks like a character
type R = ' A'  -- OK: promoted `A'`

10.10.5. Promoting existential data constructors

Note that we do promote existential data constructors that are otherwise suitable. For example, consider the following:

data Ex :: Type where
  MkEx :: forall a. a -> Ex

Both the type Ex and the data constructor MkEx get promoted, with the polymorphic kind 'MkEx :: forall k. k -> Ex. Somewhat surprisingly, you can write a type family to extract the member of a type-level existential:

type family UnEx (ex :: Ex) :: k
type instance UnEx (MkEx x) = x

At first blush, UnEx seems poorly-kinded. The return kind k is not mentioned in the arguments, and thus it would seem that an instance would have to return a member of k for any k. However, this is not the case. The type family UnEx is a kind-indexed type family. The return kind k is an implicit parameter to UnEx. The elaborated definitions are as follows (where implicit parameters are denoted by braces):

type family UnEx {k :: Type} (ex :: Ex) :: k
type instance UnEx {k} (MkEx @k x) = x

Thus, the instance triggers only when the implicit parameter to UnEx matches the implicit parameter to MkEx. Because k is actually a parameter to UnEx, the kind is not escaping the existential, and the above code is valid.

See also Trac #7347.

10.11. Kind polymorphism

TypeInType
Implies:PolyKinds, DataKinds, KindSignatures
Since:8.0.1

In the past this extension used to enable advanced type-level programming techniques. Now it’s a shorthand for a couple of other extensions.

PolyKinds
Implies:KindSignatures
Since:7.4.1

Allow kind polymorphic types.

This section describes GHC’s kind system, as it appears in version 8.0 and beyond. The kind system as described here is always in effect, with or without extensions, although it is a conservative extension beyond standard Haskell. The extensions above simply enable syntax and tweak the inference algorithm to allow users to take advantage of the extra expressiveness of GHC’s kind system.

10.11.1. Overview of kind polymorphism

Consider inferring the kind for

data App f a = MkApp (f a)

In Haskell 98, the inferred kind for App is (Type -> Type) -> Type -> Type. But this is overly specific, because another suitable Haskell 98 kind for App is ((Type -> Type) -> Type) -> (Type -> Type) -> Type, where the kind assigned to a is Type -> Type. Indeed, without kind signatures (KindSignatures), it is necessary to use a dummy constructor to get a Haskell compiler to infer the second kind. With kind polymorphism (PolyKinds), GHC infers the kind forall k. (k -> Type) -> k -> Type for App, which is its most general kind.

Thus, the chief benefit of kind polymorphism is that we can now infer these most general kinds and use App at a variety of kinds:

App Maybe Int   -- `k` is instantiated to Type

data T a = MkT (a Int)    -- `a` is inferred to have kind (Type -> Type)
App T Maybe     -- `k` is instantiated to (Type -> Type)

10.11.2. Overview of Type-in-Type

GHC 8 extends the idea of kind polymorphism by declaring that types and kinds are indeed one and the same. Nothing within GHC distinguishes between types and kinds. Another way of thinking about this is that the type Bool and the “promoted kind” Bool are actually identical. (Note that term True and the type 'True are still distinct, because the former can be used in expressions and the latter in types.) This lack of distinction between types and kinds is a hallmark of dependently typed languages. Full dependently typed languages also remove the difference between expressions and types, but doing that in GHC is a story for another day.

One simplification allowed by combining types and kinds is that the type of Type is just Type. It is true that the Type :: Type axiom can lead to non-termination, but this is not a problem in GHC, as we already have other means of non-terminating programs in both types and expressions. This decision (among many, many others) does mean that despite the expressiveness of GHC’s type system, a “proof” you write in Haskell is not an irrefutable mathematical proof. GHC promises only partial correctness, that if your programs compile and run to completion, their results indeed have the types assigned. It makes no claim about programs that do not finish in a finite amount of time.

To learn more about this decision and the design of GHC under the hood please see the paper introducing this kind system to GHC/Haskell.

10.11.3. Principles of kind inference

Generally speaking, when PolyKinds is on, GHC tries to infer the most general kind for a declaration. In many cases (for example, in a datatype declaration) the definition has a right-hand side to inform kind inference. But that is not always the case. Consider

type family F a

Type family declarations have no right-hand side, but GHC must still infer a kind for F. Since there are no constraints, it could infer F :: forall k1 k2. k1 -> k2, but that seems too polymorphic. So GHC defaults those entirely-unconstrained kind variables to Type and we get F :: Type -> Type. You can still declare F to be kind-polymorphic using kind signatures:

type family F1 a                -- F1 :: Type -> Type
type family F2 (a :: k)         -- F2 :: forall k. k -> Type
type family F3 a :: k           -- F3 :: forall k. Type -> k
type family F4 (a :: k1) :: k2  -- F4 :: forall k1 k2. k1 -> k2

The general principle is this:

  • When there is a right-hand side, GHC infers the most polymorphic kind consistent with the right-hand side. Examples: ordinary data type and GADT declarations, class declarations. In the case of a class declaration the role of “right hand side” is played by the class method signatures.
  • When there is no right hand side, GHC defaults argument and result kinds to ``Type``, except when directed otherwise by a kind signature. Examples: data and open type family declarations.

This rule has occasionally-surprising consequences (see Trac #10132.

class C a where    -- Class declarations are generalised
                   -- so C :: forall k. k -> Constraint
  data D1 a        -- No right hand side for these two family
  type F1 a        -- declarations, but the class forces (a :: k)
                   -- so   D1, F1 :: forall k. k -> Type

data D2 a   -- No right-hand side so D2 :: Type -> Type
type F2 a   -- No right-hand side so F2 :: Type -> Type

The kind-polymorphism from the class declaration makes D1 kind-polymorphic, but not so D2; and similarly F1, F1.

10.11.4. Complete user-supplied kind signatures and polymorphic recursion

Just as in type inference, kind inference for recursive types can only use monomorphic recursion. Consider this (contrived) example:

data T m a = MkT (m a) (T Maybe (m a))
-- GHC infers kind  T :: (Type -> Type) -> Type -> Type

The recursive use of T forced the second argument to have kind Type. However, just as in type inference, you can achieve polymorphic recursion by giving a complete user-supplied kind signature (or CUSK) for T. A CUSK is present when all argument kinds and the result kind are known, without any need for inference. For example:

data T (m :: k -> Type) :: k -> Type where
  MkT :: m a -> T Maybe (m a) -> T m a

The complete user-supplied kind signature specifies the polymorphic kind for T, and this signature is used for all the calls to T including the recursive ones. In particular, the recursive use of T is at kind Type.

What exactly is considered to be a “complete user-supplied kind signature” for a type constructor? These are the forms:

  • For a datatype, every type variable must be annotated with a kind. In a GADT-style declaration, there may also be a kind signature (with a top-level :: in the header), but the presence or absence of this annotation does not affect whether or not the declaration has a complete signature.

    data T1 :: (k -> Type) -> k -> Type       where ...
    -- Yes;  T1 :: forall k. (k->Type) -> k -> Type
    
    data T2 (a :: k -> Type) :: k -> Type     where ...
    -- Yes;  T2 :: forall k. (k->Type) -> k -> Type
    
    data T3 (a :: k -> Type) (b :: k) :: Type where ...
    -- Yes;  T3 :: forall k. (k->Type) -> k -> Type
    
    data T4 (a :: k -> Type) (b :: k)      where ...
    -- Yes;  T4 :: forall k. (k->Type) -> k -> Type
    
    data T5 a (b :: k) :: Type             where ...
    -- No;  kind is inferred
    
    data T6 a b                         where ...
    -- No;  kind is inferred
    
  • For a datatype with a top-level ::: all kind variables introduced after the :: must be explicitly quantified.

    data T1 :: k -> Type            -- No CUSK: `k` is not explicitly quantified
    data T2 :: forall k. k -> Type  -- CUSK: `k` is bound explicitly
    data T3 :: forall (k :: Type). k -> Type   -- still a CUSK
    
  • For a class, every type variable must be annotated with a kind.

  • For a type synonym, every type variable and the result type must all be annotated with kinds:

    type S1 (a :: k) = (a :: k)    -- Yes   S1 :: forall k. k -> k
    type S2 (a :: k) = a           -- No    kind is inferred
    type S3 (a :: k) = Proxy a     -- No    kind is inferred
    

    Note that in S2 and S3, the kind of the right-hand side is rather apparent, but it is still not considered to have a complete signature – no inference can be done before detecting the signature.

  • An un-associated open type or data family declaration always has a CUSK; un-annotated type variables default to kind Type:

    data family D1 a                  -- D1 :: Type -> Type
    data family D2 (a :: k)           -- D2 :: forall k. k -> Type
    data family D3 (a :: k) :: Type   -- D3 :: forall k. k -> Type
    type family S1 a :: k -> Type     -- S1 :: forall k. Type -> k -> Type
    
  • An associated type or data family declaration has a CUSK precisely if its enclosing class has a CUSK.

    class C a where                -- no CUSK
      type AT a b                  -- no CUSK, b is defaulted
    
    class D (a :: k) where         -- yes CUSK
      type AT2 a b                 -- yes CUSK, b is defaulted
    
  • A closed type family has a complete signature when all of its type variables are annotated and a return kind (with a top-level ::) is supplied.

It is possible to write a datatype that syntactically has a CUSK (according to the rules above) but actually requires some inference. As a very contrived example, consider

data Proxy a           -- Proxy :: forall k. k -> Type
data X (a :: Proxy k)

According to the rules above X has a CUSK. Yet, the kind of k is undetermined. It is thus quantified over, giving X the kind forall k1 (k :: k1). Proxy k -> Type.

10.11.5. Kind inference in closed type families

Although all open type families are considered to have a complete user-supplied kind signature, we can relax this condition for closed type families, where we have equations on which to perform kind inference. GHC will infer kinds for the arguments and result types of a closed type family.

GHC supports kind-indexed type families, where the family matches both on the kind and type. GHC will not infer this behaviour without a complete user-supplied kind signature, as doing so would sometimes infer non-principal types. Indeed, we can see kind-indexing as a form of polymorphic recursion, where a type is used at a kind other than its most general in its own definition.

For example:

type family F1 a where
  F1 True  = False
  F1 False = True
  F1 x     = x
-- F1 fails to compile: kind-indexing is not inferred

type family F2 (a :: k) where
  F2 True  = False
  F2 False = True
  F2 x     = x
-- F2 fails to compile: no complete signature

type family F3 (a :: k) :: k where
  F3 True  = False
  F3 False = True
  F3 x     = x
-- OK

10.11.6. Kind inference in class instance declarations

Consider the following example of a poly-kinded class and an instance for it:

class C a where
  type F a

instance C b where
  type F b = b -> b

In the class declaration, nothing constrains the kind of the type a, so it becomes a poly-kinded type variable (a :: k). Yet, in the instance declaration, the right-hand side of the associated type instance b -> b says that b must be of kind Type. GHC could theoretically propagate this information back into the instance head, and make that instance declaration apply only to type of kind Type, as opposed to types of any kind. However, GHC does not do this.

In short: GHC does not propagate kind information from the members of a class instance declaration into the instance declaration head.

This lack of kind inference is simply an engineering problem within GHC, but getting it to work would make a substantial change to the inference infrastructure, and it’s not clear the payoff is worth it. If you want to restrict b‘s kind in the instance above, just use a kind signature in the instance head.

10.11.7. Kind inference in type signatures

When kind-checking a type, GHC considers only what is written in that type when figuring out how to generalise the type’s kind.

For example, consider these definitions (with ScopedTypeVariables):

data Proxy a    -- Proxy :: forall k. k -> Type
p :: forall a. Proxy a
p = Proxy :: Proxy (a :: Type)

GHC reports an error, saying that the kind of a should be a kind variable k, not Type. This is because, by looking at the type signature forall a. Proxy a, GHC assumes a‘s kind should be generalised, not restricted to be Type. The function definition is then rejected for being more specific than its type signature.

10.11.8. Explicit kind quantification

Enabled by PolyKinds, GHC supports explicit kind quantification, as in these examples:

data Proxy :: forall k. k -> Type
f :: (forall k (a :: k). Proxy a -> ()) -> Int

Note that the second example has a forall that binds both a kind k and a type variable a of kind k. In general, there is no limit to how deeply nested this sort of dependency can work. However, the dependency must be well-scoped: forall (a :: k) k. ... is an error.

For backward compatibility, kind variables do not need to be bound explicitly, even if the type starts with forall.

Accordingly, the rule for kind quantification in higher-rank contexts has changed slightly. In GHC 7, if a kind variable was mentioned for the first time in the kind of a variable bound in a non-top-level forall, the kind variable was bound there, too. That is, in f :: (forall (a :: k). ...) -> ..., the k was bound by the same forall as the a. In GHC 8, however, all kind variables mentioned in a type are bound at the outermost level. If you want one bound in a higher-rank forall, include it explicitly.

10.11.9. Kind-indexed GADTs

Consider the type

data G (a :: k) where
  GInt    :: G Int
  GMaybe  :: G Maybe

This datatype G is GADT-like in both its kind and its type. Suppose you have g :: G a, where a :: k. Then pattern matching to discover that g is in fact GMaybe tells you both that k ~ (Type -> Type) and a ~ Maybe. The definition for G requires that PolyKinds be in effect, but pattern-matching on G requires no extension beyond GADTs. That this works is actually a straightforward extension of regular GADTs and a consequence of the fact that kinds and types are the same.

Note that the datatype G is used at different kinds in its body, and therefore that kind-indexed GADTs use a form of polymorphic recursion. It is thus only possible to use this feature if you have provided a complete user-supplied kind signature for the datatype (Complete user-supplied kind signatures and polymorphic recursion).

10.11.10. Higher-rank kinds

In concert with RankNTypes, GHC supports higher-rank kinds. Here is an example:

-- Heterogeneous propositional equality
data (a :: k1) :~~: (b :: k2) where
  HRefl :: a :~~: a

class HTestEquality (t :: forall k. k -> Type) where
  hTestEquality :: forall k1 k2 (a :: k1) (b :: k2). t a -> t b -> Maybe (a :~~: b)

Note that hTestEquality takes two arguments where the type variable t is applied to types of different kinds. That type variable must then be polykinded. Accordingly, the kind of HTestEquality (the class) is (forall k. k -> Type) -> Constraint, a higher-rank kind.

A big difference with higher-rank kinds as compared with higher-rank types is that foralls in kinds cannot be moved. This is best illustrated by example. Suppose we want to have an instance of HTestEquality for (:~~:).

instance HTestEquality ((:~~:) a) where
  hTestEquality HRefl HRefl = Just HRefl

With the declaration of (:~~:) above, it gets kind forall k1 k2. k1 -> k2 -> Type. Thus, the type (:~~:) a has kind k2 -> Type for some k2. GHC cannot then regeneralize this kind to become forall k2. k2 -> Type as desired. Thus, the instance is rejected as ill-kinded.

To allow for such an instance, we would have to define (:~~:) as follows:

data (:~~:) :: forall k1. k1 -> forall k2. k2 -> Type where
  HRefl :: a :~~: a

In this redefinition, we give an explicit kind for (:~~:), deferring the choice of k2 until after the first argument (a) has been given. With this declaration for (:~~:), the instance for HTestEquality is accepted.

Another difference between higher-rank kinds and types can be found in their treatment of inferred and user-specified type variables. Consider the following program:

newtype Foo (f :: forall k. k -> Type) = MkFoo (f Int)
data Proxy a = Proxy

foo :: Foo Proxy
foo = MkFoo Proxy

The kind of Foo‘s parameter is forall k. k -> Type, but the kind of Proxy is forall {k}. k -> Type, where {k} denotes that the kind variable k is to be inferred, not specified by the user. (See Visible type application for more discussion on the inferred-specified distinction). GHC does not consider forall k. k -> Type and forall {k}. k -> Type to be equal at the kind level, and thus rejects Foo Proxy as ill-kinded.

10.11.11. Constraints in kinds

As kinds and types are the same, kinds can (with PolyKinds) contain type constraints. Only equality constraints are currently supported, however. We expect this to extend to other constraints in the future.

Here is an example of a constrained kind:

type family IsTypeLit a where
  IsTypeLit Nat    = 'True
  IsTypeLit Symbol = 'True
  IsTypeLit a      = 'False

data T :: forall a. (IsTypeLit a ~ 'True) => a -> Type where
  MkNat    :: T 42
  MkSymbol :: T "Don't panic!"

The declarations above are accepted. However, if we add MkOther :: T Int, we get an error that the equality constraint is not satisfied; Int is not a type literal. Note that explicitly quantifying with forall a is not necessary here.

10.11.12. The kind Type

StarIsType
Since:8.6.1

Treat the unqualified uses of the * type operator as nullary and desugar to Data.Kind.Type.

The kind Type (imported from Data.Kind) classifies ordinary types. With StarIsType (currently enabled by default), * is desugared to Type, but using this legacy syntax is not recommended due to conflicts with TypeOperators. This also applies to , the Unicode variant of *.

10.11.13. Inferring dependency in datatype declarations

If a type variable a in a datatype, class, or type family declaration depends on another such variable k in the same declaration, two properties must hold:

  • a must appear after k in the declaration, and
  • k must appear explicitly in the kind of some type variable in that declaration.

The first bullet simply means that the dependency must be well-scoped. The second bullet concerns GHC’s ability to infer dependency. Inferring this dependency is difficult, and GHC currently requires the dependency to be made explicit, meaning that k must appear in the kind of a type variable, making it obvious to GHC that dependency is intended. For example:

data Proxy k (a :: k)            -- OK: dependency is "obvious"
data Proxy2 k a = P (Proxy k a)  -- ERROR: dependency is unclear

In the second declaration, GHC cannot immediately tell that k should be a dependent variable, and so the declaration is rejected.

It is conceivable that this restriction will be relaxed in the future, but it is (at the time of writing) unclear if the difficulties around this scenario are theoretical (inferring this dependency would mean our type system does not have principal types) or merely practical (inferring this dependency is hard, given GHC’s implementation). So, GHC takes the easy way out and requires a little help from the user.

10.11.14. Inferring dependency in user-written foralls

A programmer may use forall in a type to introduce new quantified type variables. These variables may depend on each other, even in the same forall. However, GHC requires that the dependency be inferrable from the body of the forall. Here are some examples:

data Proxy k (a :: k) = MkProxy   -- just to use below

f :: forall k a. Proxy k a        -- This is just fine. We see that (a :: k).
f = undefined

g :: Proxy k a -> ()              -- This is to use below.
g = undefined

data Sing a
h :: forall k a. Sing k -> Sing a -> ()  -- No obvious relationship between k and a
h _ _ = g (MkProxy :: Proxy k a)  -- This fails. We didn't know that a should have kind k.

Note that in the last example, it’s impossible to learn that a depends on k in the body of the forall (that is, the Sing k -> Sing a -> ()). And so GHC rejects the program.

10.11.15. Kind defaulting without PolyKinds

Without PolyKinds, GHC refuses to generalise over kind variables. It thus defaults kind variables to Type when possible; when this is not possible, an error is issued.

Here is an example of this in action:

{-# LANGUAGE PolyKinds #-}
import Data.Kind (Type)
data Proxy a = P   -- inferred kind: Proxy :: k -> Type
data Compose f g x = MkCompose (f (g x))
  -- inferred kind: Compose :: (b -> Type) -> (a -> b) -> a -> Type

-- separate module having imported the first
{-# LANGUAGE NoPolyKinds, DataKinds #-}
z = Proxy :: Proxy 'MkCompose

In the last line, we use the promoted constructor 'MkCompose, which has kind

forall (a :: Type) (b :: Type) (f :: b -> Type) (g :: a -> b) (x :: a).
  f (g x) -> Compose f g x

Now we must infer a type for z. To do so without generalising over kind variables, we must default the kind variables of 'MkCompose. We can easily default a and b to Type, but f and g would be ill-kinded if defaulted. The definition for z is thus an error.

10.11.16. Pretty-printing in the presence of kind polymorphism

With kind polymorphism, there is quite a bit going on behind the scenes that may be invisible to a Haskell programmer. GHC supports several flags that control how types are printed in error messages and at the GHCi prompt. See the discussion of type pretty-printing options for further details. If you are using kind polymorphism and are confused as to why GHC is rejecting (or accepting) your program, we encourage you to turn on these flags, especially -fprint-explicit-kinds.

10.12. Levity polymorphism

In order to allow full flexibility in how kinds are used, it is necessary to use the kind system to differentiate between boxed, lifted types (normal, everyday types like Int and [Bool]) and unboxed, primitive types (Unboxed types and primitive operations) like Int#. We thus have so-called levity polymorphism.

Here are the key definitions, all available from GHC.Exts:

TYPE :: RuntimeRep -> Type   -- highly magical, built into GHC

data RuntimeRep = LiftedRep     -- for things like `Int`
                | UnliftedRep   -- for things like `Array#`
                | IntRep        -- for `Int#`
                | TupleRep [RuntimeRep]  -- unboxed tuples, indexed by the representations of the elements
                | SumRep [RuntimeRep]    -- unboxed sums, indexed by the representations of the disjuncts
                | ...

type Type = TYPE LiftedRep    -- Type is just an ordinary type synonym

The idea is that we have a new fundamental type constant TYPE, which is parameterised by a RuntimeRep. We thus get Int# :: TYPE 'IntRep and Bool :: TYPE 'LiftedRep. Anything with a type of the form TYPE x can appear to either side of a function arrow ->. We can thus say that -> has type TYPE r1 -> TYPE r2 -> TYPE 'LiftedRep. The result is always lifted because all functions are lifted in GHC.

10.12.1. No levity-polymorphic variables or arguments

If GHC didn’t have to compile programs that run in the real world, that would be the end of the story. But representation polymorphism can cause quite a bit of trouble for GHC’s code generator. Consider

bad :: forall (r1 :: RuntimeRep) (r2 :: RuntimeRep)
              (a :: TYPE r1) (b :: TYPE r2).
       (a -> b) -> a -> b
bad f x = f x

This seems like a generalisation of the standard $ operator. If we think about compiling this to runnable code, though, problems appear. In particular, when we call bad, we must somehow pass x into bad. How wide (that is, how many bits) is x? Is it a pointer? What kind of register (floating-point or integral) should x go in? It’s all impossible to say, because x‘s type, a :: TYPE r1 is levity polymorphic. We thus forbid such constructions, via the following straightforward rule:

No variable may have a levity-polymorphic type.

This eliminates bad because the variable x would have a representation-polymorphic type.

However, not all is lost. We can still do this:

($) :: forall r (a :: Type) (b :: TYPE r).
       (a -> b) -> a -> b
f $ x = f x

Here, only b is levity polymorphic. There are no variables with a levity-polymorphic type. And the code generator has no trouble with this. Indeed, this is the true type of GHC’s $ operator, slightly more general than the Haskell 98 version.

Because the code generator must store and move arguments as well as variables, the logic above applies equally well to function arguments, which may not be levity-polymorphic.

10.12.2. Levity-polymorphic bottoms

We can use levity polymorphism to good effect with error and undefined, whose types are given here:

undefined :: forall (r :: RuntimeRep) (a :: TYPE r).
             HasCallStack => a
error :: forall (r :: RuntimeRep) (a :: TYPE r).
         HasCallStack => String -> a

These functions do not bind a levity-polymorphic variable, and so are accepted. Their polymorphism allows users to use these to conveniently stub out functions that return unboxed types.

10.12.3. Printing levity-polymorphic types

-fprint-explicit-runtime-reps

Print RuntimeRep parameters as they appear; otherwise, they are defaulted to 'LiftedRep.

Most GHC users will not need to worry about levity polymorphism or unboxed types. For these users, seeing the levity polymorphism in the type of $ is unhelpful. And thus, by default, it is suppressed, by supposing all type variables of type RuntimeRep to be 'LiftedRep when printing, and printing TYPE 'LiftedRep as Type (or * when StarIsType is on).

Should you wish to see levity polymorphism in your types, enable the flag -fprint-explicit-runtime-reps.

10.13. Type-Level Literals

GHC supports numeric and string literals at the type level, giving convenient access to a large number of predefined type-level constants. Numeric literals are of kind Nat, while string literals are of kind Symbol. This feature is enabled by the DataKinds language extension.

The kinds of the literals and all other low-level operations for this feature are defined in module GHC.TypeLits. Note that the module defines some type-level operators that clash with their value-level counterparts (e.g. (+)). Import and export declarations referring to these operators require an explicit namespace annotation (see Explicit namespaces in import/export).

Here is an example of using type-level numeric literals to provide a safe interface to a low-level function:

import GHC.TypeLits
import Data.Word
import Foreign

newtype ArrPtr (n :: Nat) a = ArrPtr (Ptr a)

clearPage :: ArrPtr 4096 Word8 -> IO ()
clearPage (ArrPtr p) = ...

Here is an example of using type-level string literals to simulate simple record operations:

data Label (l :: Symbol) = Get

class Has a l b | a l -> b where
  from :: a -> Label l -> b

data Point = Point Int Int deriving Show

instance Has Point "x" Int where from (Point x _) _ = x
instance Has Point "y" Int where from (Point _ y) _ = y

example = from (Point 1 2) (Get :: Label "x")

10.13.1. Runtime Values for Type-Level Literals

Sometimes it is useful to access the value-level literal associated with a type-level literal. This is done with the functions natVal and symbolVal. For example:

GHC.TypeLits> natVal (Proxy :: Proxy 2)
2

These functions are overloaded because they need to return a different result, depending on the type at which they are instantiated.

natVal :: KnownNat n => proxy n -> Integer

-- instance KnownNat 0
-- instance KnownNat 1
-- instance KnownNat 2
-- ...

GHC discharges the constraint as soon as it knows what concrete type-level literal is being used in the program. Note that this works only for literals and not arbitrary type expressions. For example, a constraint of the form KnownNat (a + b) will not be simplified to (KnownNat a, KnownNat b); instead, GHC will keep the constraint as is, until it can simplify a + b to a constant value.

It is also possible to convert a run-time integer or string value to the corresponding type-level literal. Of course, the resulting type literal will be unknown at compile-time, so it is hidden in an existential type. The conversion may be performed using someNatVal for integers and someSymbolVal for strings:

someNatVal :: Integer -> Maybe SomeNat
SomeNat    :: KnownNat n => Proxy n -> SomeNat

The operations on strings are similar.

10.13.2. Computing With Type-Level Naturals

GHC 7.8 can evaluate arithmetic expressions involving type-level natural numbers. Such expressions may be constructed using the type-families (+), (*), (^) for addition, multiplication, and exponentiation. Numbers may be compared using (<=?), which returns a promoted boolean value, or (<=), which compares numbers as a constraint. For example:

GHC.TypeLits> natVal (Proxy :: Proxy (2 + 3))
5

At present, GHC is quite limited in its reasoning about arithmetic: it will only evaluate the arithmetic type functions and compare the results— in the same way that it does for any other type function. In particular, it does not know more general facts about arithmetic, such as the commutativity and associativity of (+), for example.

However, it is possible to perform a bit of “backwards” evaluation. For example, here is how we could get GHC to compute arbitrary logarithms at the type level:

lg :: Proxy base -> Proxy (base ^ pow) -> Proxy pow
lg _ _ = Proxy

GHC.TypeLits> natVal (lg (Proxy :: Proxy 2) (Proxy :: Proxy 8))
3

10.14. Equality constraints, Coercible, and the kind Constraint

10.14.1. Equality constraints

A type context can include equality constraints of the form t1 ~ t2, which denote that the types t1 and t2 need to be the same. In the presence of type families, whether two types are equal cannot generally be decided locally. Hence, the contexts of function signatures may include equality constraints, as in the following example:

sumCollects :: (Collects c1, Collects c2, Elem c1 ~ Elem c2) => c1 -> c2 -> c2

where we require that the element type of c1 and c2 are the same. In general, the types t1 and t2 of an equality constraint may be arbitrary monotypes; i.e., they may not contain any quantifiers, independent of whether higher-rank types are otherwise enabled.

Equality constraints can also appear in class and instance contexts. The former enable a simple translation of programs using functional dependencies into programs using family synonyms instead. The general idea is to rewrite a class declaration of the form

class C a b | a -> b

to

class (F a ~ b) => C a b where
  type F a

That is, we represent every functional dependency (FD) a1 .. an -> b by an FD type family F a1 .. an and a superclass context equality F a1 .. an ~ b, essentially giving a name to the functional dependency. In class instances, we define the type instances of FD families in accordance with the class head. Method signatures are not affected by that process.

10.14.2. Heterogeneous equality

GHC also supports kind-heterogeneous equality, which relates two types of potentially different kinds. Heterogeneous equality is spelled ~~. Here are the kinds of ~ and ~~ to better understand their difference:

(~)  :: forall k. k -> k -> Constraint
(~~) :: forall k1 k2. k1 -> k2 -> Constraint

Users will most likely want ~, but ~~ is available if GHC cannot know, a priori, that the two types of interest have the same kind. Evidence that (a :: k1) ~~ (b :: k2) tells GHC both that k1 and k2 are the same and that a and b are the same.

Because ~ is the more common equality relation, GHC prints out ~~ like ~ unless -fprint-equality-relations is set.

10.14.3. Unlifted heterogeneous equality

Internal to GHC is yet a third equality relation (~#). It is heterogeneous (like ~~) and is used only internally. It may appear in error messages and other output only when -fprint-equality-relations is enabled.

10.14.4. The Coercible constraint

The constraint Coercible t1 t2 is similar to t1 ~ t2, but denotes representational equality between t1 and t2 in the sense of Roles (Roles). It is exported by Data.Coerce, which also contains the documentation. More details and discussion can be found in the paper “Safe Coercions”.

10.14.5. The Constraint kind

ConstraintKinds
Since:7.4.1

Allow types of kind Constraint to be used in contexts.

Normally, constraints (which appear in types to the left of the => arrow) have a very restricted syntax. They can only be:

With the ConstraintKinds extension, GHC becomes more liberal in what it accepts as constraints in your program. To be precise, with this flag any type of the new kind Constraint can be used as a constraint. The following things have kind Constraint:

  • Anything which is already valid as a constraint without the flag: saturated applications to type classes, implicit parameter and equality constraints.

  • Tuples, all of whose component types have kind Constraint. So for example the type (Show a, Ord a) is of kind Constraint.

  • Anything whose form is not yet known, but the user has declared to have kind Constraint (for which they need to import it from GHC.Exts). So for example type Foo (f :: Type -> Constraint) = forall b. f b => b -> b is allowed, as well as examples involving type families:

    type family Typ a b :: Constraint
    type instance Typ Int  b = Show b
    type instance Typ Bool b = Num b
    
    func :: Typ a b => a -> b -> b
    func = ...
    

Note that because constraints are just handled as types of a particular kind, this extension allows type constraint synonyms:

type Stringy a = (Read a, Show a)
foo :: Stringy a => a -> (String, String -> a)
foo x = (show x, read)

Presently, only standard constraints, tuples and type synonyms for those two sorts of constraint are permitted in instance contexts and superclasses (without extra flags). The reason is that permitting more general constraints can cause type checking to loop, as it would with these two programs:

type family Clsish u a
type instance Clsish () a = Cls a
class Clsish () a => Cls a where
class OkCls a where

type family OkClsish u a
type instance OkClsish () a = OkCls a
instance OkClsish () a => OkCls a where

You may write programs that use exotic sorts of constraints in instance contexts and superclasses, but to do so you must use UndecidableInstances to signal that you don’t mind if the type checker fails to terminate.

10.15. Quantified constraints

QuantifiedConstraints
Since:8.6.1

Allow constraints to quantify over types.

The extension QuantifiedConstraints introduces quantified constraints, which give a new level of expressiveness in constraints. For example, consider

data Rose f a = Branch a (f (Rose f a))

instance (Eq a, ???) => Eq (Rose f a)
  where
    (Branch x1 c1) == (Branch x2 c2)
       = x1==x1 && c1==c2

From the x1==x2 we need Eq a, which is fine. From c1==c2 we need Eq (f (Rose f a)) which is not fine in Haskell today; we have no way to solve such a constraint.

QuantifiedConstraints lets us write this

instance (Eq a, forall b. (Eq b) => Eq (f b))
       => Eq (Rose f a)
  where
    (Branch x1 c1) == (Branch x2 c2)
       = x1==x1 && c1==c2

Here, the quantified constraint forall b. (Eq b) => Eq (f b) behaves a bit like a local instance declaration, and makes the instance typeable.

The paper Quantified class constraints (by Bottu, Karachalias, Schrijvers, Oliveira, Wadler, Haskell Symposium 2017) describes this feature in technical detail, with examples, and so is a primary reference source for this proposal.

10.15.1. Motivation

Introducing quantified constraints offers two main benefits:

  • Firstly, they enable terminating resolution where this was not possible before. Consider for instance the following instance declaration for the general rose datatype

    data Rose f x = Rose x (f (Rose f x))
    
    instance (Eq a, forall b. Eq b => Eq (f b)) => Eq (Rose f a) where
      (Rose x1 rs1) == (Rose x2 rs2) = x1 == x2 && rs1 == rs2
    

    This extension allows us to write constraints of the form forall b. Eq b => Eq