The programmer can specify rewrite rules as part of the source program
(in a pragma). GHC applies these rewrite rules wherever it can, provided (a)
the `-O` flag (Section 4.9) is on,
and (b) the `-frules-off` flag
(Section 4.9.2) is not specified.

Here is an example:

{-# RULES "map/map" forall f g xs. map f (map g xs) = map (f.g) xs #-} |

From a syntactic point of view:

There may be zero or more rules in a

`RULES`pragma.Each rule has a name, enclosed in double quotes. The name itself has no significance at all. It is only used when reporting how many times the rule fired.

A rule may optionally have a phase-control number (see Section 7.8.2.3), immediately after the name of the rule. Thus:

The "[2]" means that the rule is active in Phase 2 and subsequent phases. The inverse notation "[~2]" is also accepted, meaning that the rule is active up to, but not including, Phase 2.{-# RULES "map/map" [2] forall f g xs. map f (map g xs) = map (f.g) xs #-}

Layout applies in a

`RULES`pragma. Currently no new indentation level is set, so you must lay out your rules starting in the same column as the enclosing definitions.Each variable mentioned in a rule must either be in scope (e.g.

`map`), or bound by the`forall`(e.g.`f`,`g`,`xs`). The variables bound by the`forall`are called the*pattern*variables. They are separated by spaces, just like in a type`forall`.A pattern variable may optionally have a type signature. If the type of the pattern variable is polymorphic, it

*must*have a type signature. For example, here is the`foldr/build`rule:

Since"fold/build" forall k z (g::forall b. (a->b->b) -> b -> b) . foldr k z (build g) = g k z

`g`has a polymorphic type, it must have a type signature.The left hand side of a rule must consist of a top-level variable applied to arbitrary expressions. For example, this is

*not*OK:

In"wrong1" forall e1 e2. case True of { True -> e1; False -> e2 } = e1 "wrong2" forall f. f True = True

`"wrong1"`, the LHS is not an application; in`"wrong2"`, the LHS has a pattern variable in the head.A rule does not need to be in the same module as (any of) the variables it mentions, though of course they need to be in scope.

Rules are automatically exported from a module, just as instance declarations are.

From a semantic point of view:

Rules are only applied if you use the

`-O`flag.Rules are regarded as left-to-right rewrite rules. When GHC finds an expression that is a substitution instance of the LHS of a rule, it replaces the expression by the (appropriately-substituted) RHS. By "a substitution instance" we mean that the LHS can be made equal to the expression by substituting for the pattern variables.

The LHS and RHS of a rule are typechecked, and must have the same type.

GHC makes absolutely no attempt to verify that the LHS and RHS of a rule have the same meaning. That is undecideable in general, and infeasible in most interesting cases. The responsibility is entirely the programmer's!

GHC makes no attempt to make sure that the rules are confluent or terminating. For example:

This rule will cause the compiler to go into an infinite loop."loop" forall x,y. f x y = f y x

If more than one rule matches a call, GHC will choose one arbitrarily to apply.

GHC currently uses a very simple, syntactic, matching algorithm for matching a rule LHS with an expression. It seeks a substitution which makes the LHS and expression syntactically equal modulo alpha conversion. The pattern (rule), but not the expression, is eta-expanded if necessary. (Eta-expanding the epression can lead to laziness bugs.) But not beta conversion (that's called higher-order matching).

Matching is carried out on GHC's intermediate language, which includes type abstractions and applications. So a rule only matches if the types match too. See Section 7.9.4 below.

GHC keeps trying to apply the rules as it optimises the program. For example, consider:

The expressionlet s = map f t = map g in s (t xs)

`s (t xs)`does not match the rule`"map/map"`, but GHC will substitute for`s`and`t`, giving an expression which does match. If`s`or`t`was (a) used more than once, and (b) large or a redex, then it would not be substituted, and the rule would not fire.In the earlier phases of compilation, GHC inlines

*nothing that appears on the LHS of a rule*, because once you have substituted for something you can't match against it (given the simple minded matching). So if you write the rule

this"map/map" forall f,g. map f . map g = map (f.g)

*won't*match the expression`map f (map g xs)`. It will only match something written with explicit use of ".". Well, not quite. It*will*match the expression

wherewibble f g xs

`wibble`is defined:

becausewibble f g = map f . map g

`wibble`will be inlined (it's small). Later on in compilation, GHC starts inlining even things on the LHS of rules, but still leaves the rules enabled. This inlining policy is controlled by the per-simplification-pass flag`-finline-phase`*n*.All rules are implicitly exported from the module, and are therefore in force in any module that imports the module that defined the rule, directly or indirectly. (That is, if A imports B, which imports C, then C's rules are in force when compiling A.) The situation is very similar to that for instance declarations.

The RULES mechanism is used to implement fusion (deforestation) of common list functions. If a "good consumer" consumes an intermediate list constructed by a "good producer", the intermediate list should be eliminated entirely.

The following are good producers:

List comprehensions

Enumerations of

`Int`and`Char`(e.g.`['a'..'z']`).Explicit lists (e.g.

`[True, False]`)The cons constructor (e.g

`3:4:[]`)`++``map``filter``iterate`,`repeat``zip`,`zipWith`

The following are good consumers:

List comprehensions

`array`(on its second argument)`length``++`(on its first argument)`foldr``map``filter``concat``unzip`,`unzip2`,`unzip3`,`unzip4``zip`,`zipWith`(but on one argument only; if both are good producers,`zip`will fuse with one but not the other)`partition``head``and`,`or`,`any`,`all``sequence_``msum``sortBy`

So, for example, the following should generate no intermediate lists:

array (1,10) [(i,i*i) | i <- map (+ 1) [0..9]] |

This list could readily be extended; if there are Prelude functions that you use a lot which are not included, please tell us.

If you want to write your own good consumers or producers, look at the Prelude definitions of the above functions to see how to do so.

Rewrite rules can be used to get the same effect as a feature present in earlier version of GHC:

{-# SPECIALIZE fromIntegral :: Int8 -> Int16 = int8ToInt16 #-} |

This feature is no longer in GHC. But rewrite rules let you do the same thing:

{-# RULES "fromIntegral/Int8/Int16" fromIntegral = int8ToInt16 #-} |

forall (d1::Integral Int8) (d2::Num Int16) . fromIntegral Int8 Int16 d1 d2 = int8ToInt16 |

Use

`-ddump-rules`to see what transformation rules GHC is using.Use

`-ddump-simpl-stats`to see what rules are being fired. If you add`-dppr-debug`you get a more detailed listing.The defintion of (say)

`build`in`GHC/Base.lhs`looks llike this:

Notice thebuild :: forall a. (forall b. (a -> b -> b) -> b -> b) -> [a] {-# INLINE build #-} build g = g (:) []

`INLINE`! That prevents`(:)`from being inlined when compiling`PrelBase`, so that an importing module will “see” the`(:)`, and can match it on the LHS of a rule.`INLINE`prevents any inlining happening in the RHS of the`INLINE`thing. I regret the delicacy of this.In

`libraries/base/GHC/Base.lhs`look at the rules for`map`to see how to write rules that will do fusion and yet give an efficient program even if fusion doesn't happen. More rules in`GHC/List.lhs`.

The external core format supports "Note" annotations;
the `CORE` pragma gives a way to specify what these
should be in your Haskell source code. Syntactically, core
annotations are attached to expressions and take a Haskell string
literal as an argument. The following function definition shows an
example:

f x = ({-# CORE "foo" #-} show) ({-# CORE "bar" #-} x) |

g x = show x |

However, when external for is generated (via
`-fext-core`), there will be Notes attached to the
expressions `show` and `x`.
The core function declaration for `f` is:

f :: %forall a . GHCziShow.ZCTShow a -> a -> GHCziBase.ZMZN GHCziBase.Char = \ @ a (zddShow::GHCziShow.ZCTShow a) (eta::a) -> (%note "foo" %case zddShow %of (tpl::GHCziShow.ZCTShow a) {GHCziShow.ZCDShow (tpl1::GHCziBase.Int -> a -> GHCziBase.ZMZN GHCziBase.Char -> GHCziBase.ZMZN GHCziBase.Cha r) (tpl2::a -> GHCziBase.ZMZN GHCziBase.Char) (tpl3::GHCziBase.ZMZN a -> GHCziBase.ZMZN GHCziBase.Char -> GHCziBase.ZMZN GHCziBase.Cha r) -> tpl2}) (%note "foo" eta); |

Here, we can see that the function `show` (which
has been expanded out to a case expression over the Show dictionary)
has a `%note` attached to it, as does the
expression `eta` (which used to be called
`x`).