7.20. Generic programming

Using a combination of -XDeriveGeneric (Section 7.5.3, “Deriving clause for extra classes (Typeable, Data, etc)”) and -XDefaultSignatures (Section 7.6.1.4, “Default method signatures”), you can easily do datatype-generic programming using the GHC.Generics framework. This section gives a very brief overview of how to do it.

Generic programming support in GHC allows defining classes with methods that do not need a user specification when instantiating: the method body is automatically derived by GHC. This is similar to what happens for standard classes such as Read and Show, for instance, but now for user-defined classes.

7.20.1. Deriving representations

The first thing we need is generic representations. The GHC.Generics module defines a couple of primitive types that are used to represent Haskell datatypes:

-- | Unit: used for constructors without arguments
data U1 p = U1

-- | Constants, additional parameters and recursion of kind *
newtype K1 i c p = K1 { unK1 :: c }

-- | Meta-information (constructor names, etc.)
newtype M1 i c f p = M1 { unM1 :: f p }

-- | Sums: encode choice between constructors
infixr 5 :+:
data (:+:) f g p = L1 (f p) | R1 (g p)

-- | Products: encode multiple arguments to constructors
infixr 6 :*:
data (:*:) f g p = f p :*: g p

The Generic class mediates between user-defined datatypes and their internal representation as a sum-of-products:

class Generic a where
  -- Encode the representation of a user datatype
  type Rep a :: * -> *
  -- Convert from the datatype to its representation
  from  :: a -> (Rep a) x
  -- Convert from the representation to the datatype
  to    :: (Rep a) x -> a

Instances of this class can be derived by GHC with the -XDeriveGeneric (Section 7.5.3, “Deriving clause for extra classes (Typeable, Data, etc)”), and are necessary to be able to define generic instances automatically.

For example, a user-defined datatype of trees data UserTree a = Node a (UserTree a) (UserTree a) | Leaf gets the following representation:

instance Generic (UserTree a) where
  -- Representation type
  type Rep (UserTree a) =
    M1 D D1UserTree (
          M1 C C1_0UserTree (
                M1 S NoSelector (K1 P a)
            :*: M1 S NoSelector (K1 R (UserTree a))
            :*: M1 S NoSelector (K1 R (UserTree a)))
      :+: M1 C C1_1UserTree U1)

  -- Conversion functions
  from (Node x l r) = M1 (L1 (M1 (M1 (K1 x) :*: M1 (K1 l) :*: M1 (K1 r))))
  from Leaf         = M1 (R1 (M1 U1))
  to (M1 (L1 (M1 (M1 (K1 x) :*: M1 (K1 l) :*: M1 (K1 r))))) = Node x l r
  to (M1 (R1 (M1 U1)))                                      = Leaf

-- Meta-information
data D1UserTree
data C1_0UserTree
data C1_1UserTree

instance Datatype D1UserTree where
  datatypeName _ = "UserTree"
  moduleName _   = "Main"

instance Constructor C1_0UserTree where
  conName _ = "Node"

instance Constructor C1_1UserTree where
  conName _ = "Leaf"

This representation is generated automatically if a deriving Generic clause is attached to the datatype. Standalone deriving can also be used.

7.20.2. Writing generic functions

A generic function is defined by creating a class and giving instances for each of the representation types of GHC.Generics. As an example we show generic serialization:

data Bin = O | I

class GSerialize f where
  gput :: f a -> [Bin]

instance GSerialize U1 where
  gput U1 = []

instance (GSerialize a, GSerialize b) => GSerialize (a :*: b) where
  gput (x :*: y) = gput x ++ gput y

instance (GSerialize a, GSerialize b) => GSerialize (a :+: b) where
  gput (L1 x) = O : gput x
  gput (R1 x) = I : gput x

instance (GSerialize a) => GSerialize (M1 i c a) where
  gput (M1 x) = gput x

instance (Serialize a) => GSerialize (K1 i a) where
  gput (K1 x) = put x

Typically this class will not be exported, as it only makes sense to have instances for the representation types.

7.20.3. Generic defaults

The only thing left to do now is to define a "front-end" class, which is exposed to the user:

class Serialize a where
  put :: a -> [Bin]

  default put :: (Generic a, GSerialize (Rep a)) => a -> [Bit]
  put = gput . from

Here we use a default signature to specify that the user does not have to provide an implementation for put, as long as there is a Generic instance for the type to instantiate. For the UserTree type, for instance, the user can just write:

instance (Serialize a) => Serialize (UserTree a)

The default method for put is then used, corresponding to the generic implementation of serialization.

7.20.4. More information

For more detail please refer to the HaskellWiki page or the original paper:

  • Jose Pedro Magalhaes, Atze Dijkstra, Johan Jeuring, and Andres Loeh. A generic deriving mechanism for Haskell. Proceedings of the third ACM Haskell symposium on Haskell (Haskell'2010), pp. 37-48, ACM, 2010.

Note: the current support for generic programming in GHC is preliminary. In particular, we only allow deriving instances for the Generic class. Support for deriving Generic1 (and thus enabling generic functions of kind * -> * such as fmap) will come at a later stage.