ghc-7.10.3: The GHC API

Safe HaskellNone
LanguageHaskell2010

CoreSyn

Contents

Description

CoreSyn holds all the main data types for use by for the Glasgow Haskell Compiler midsection

Synopsis

Main data types

data Expr b Source

This is the data type that represents GHCs core intermediate language. Currently GHC uses System FC http://research.microsoft.com/~simonpj/papers/ext-f/ for this purpose, which is closely related to the simpler and better known System F http://en.wikipedia.org/wiki/System_F.

We get from Haskell source to this Core language in a number of stages:

  1. The source code is parsed into an abstract syntax tree, which is represented by the data type HsExpr with the names being RdrNames
  2. This syntax tree is renamed, which attaches a Unique to every RdrName (yielding a Name) to disambiguate identifiers which are lexically identical. For example, this program:
     f x = let f x = x + 1
           in f (x - 2)

Would be renamed by having Uniques attached so it looked something like this:

     f_1 x_2 = let f_3 x_4 = x_4 + 1
               in f_3 (x_2 - 2)

But see Note [Shadowing] below.

  1. The resulting syntax tree undergoes type checking (which also deals with instantiating type class arguments) to yield a HsExpr type that has Id as it's names.
  2. Finally the syntax tree is desugared from the expressive HsExpr type into this Expr type, which has far fewer constructors and hence is easier to perform optimization, analysis and code generation on.

The type parameter b is for the type of binders in the expression tree.

The language consists of the following elements:

  • Variables
  • Primitive literals
  • Applications: note that the argument may be a Type.

See CoreSyn for another invariant

  • Lambda abstraction
  • Recursive and non recursive lets. Operationally this corresponds to allocating a thunk for the things bound and then executing the sub-expression.

The right hand sides of all top-level and recursive lets must be of lifted type (see Type for the meaning of lifted vs. unlifted).

See Note [CoreSyn let/app invariant]

We allow a non-recursive let to bind a type variable, thus:

Let (NonRec tv (Type ty)) body

This can be very convenient for postponing type substitutions until the next run of the simplifier.

At the moment, the rest of the compiler only deals with type-let in a Let expression, rather than at top level. We may want to revist this choice.

  • Case split. Operationally this corresponds to evaluating the scrutinee (expression examined) to weak head normal form and then examining at most one level of resulting constructor (i.e. you cannot do nested pattern matching directly with this).

The binder gets bound to the value of the scrutinee, and the Type must be that of all the case alternatives

This is one of the more complicated elements of the Core language, and comes with a number of restrictions:

  1. The list of alternatives may be empty; See Note [Empty case alternatives]
  2. The DEFAULT case alternative must be first in the list, if it occurs at all.
  3. The remaining cases are in order of increasing tag (for DataAlts) or lit (for LitAlts). This makes finding the relevant constructor easy, and makes comparison easier too.
  4. The list of alternatives must be exhaustive. An exhaustive case does not necessarily mention all constructors:
           data Foo = Red | Green | Blue
      ... case x of
           Red   -> True
           other -> f (case x of
                           Green -> ...
                           Blue  -> ... ) ...
      

The inner case does not need a Red alternative, because x can't be Red at that program point.

  1. Floating-point values must not be scrutinised against literals. See Trac #9238 and Note [Rules for floating-point comparisons] in PrelRules for rationale.
  • Cast an expression to a particular type. This is used to implement newtypes (a newtype constructor or destructor just becomes a Cast in Core) and GADTs.
  • Notes. These allow general information to be added to expressions in the syntax tree
  • A type: this should only show up at the top level of an Arg
  • A coercion

Constructors

Var Id 
Lit Literal 
App (Expr b) (Arg b) infixl 4 
Lam b (Expr b) 
Let (Bind b) (Expr b) 
Case (Expr b) b Type [Alt b] 
Cast (Expr b) Coercion 
Tick (Tickish Id) (Expr b) 
Type Type 
Coercion Coercion 

Instances

Data b => Data (Expr b) 

type Alt b = (AltCon, [b], Expr b) Source

A case split alternative. Consists of the constructor leading to the alternative, the variables bound from the constructor, and the expression to be executed given that binding. The default alternative is (DEFAULT, [], rhs)

data Bind b Source

Binding, used for top level bindings in a module and local bindings in a let.

Constructors

NonRec b (Expr b) 
Rec [(b, Expr b)] 

Instances

Data b => Data (Bind b) 

data AltCon Source

A case alternative constructor (i.e. pattern match)

Constructors

DataAlt DataCon 
LitAlt Literal

A literal: case e of { 1 -> ... } Invariant: always an *unlifted* literal See Note [Literal alternatives]

DEFAULT

Trivial alternative: case e of { _ -> ... }

type Arg b = Expr b Source

Type synonym for expressions that occur in function argument positions. Only Arg should contain a Type at top level, general Expr should not

data Tickish id Source

Allows attaching extra information to points in expressions

Constructors

ProfNote

An {--} profiling annotation, either automatically added by the desugarer as a result of -auto-all, or added by the user.

Fields

profNoteCC :: CostCentre

the cost centre

profNoteCount :: !Bool

bump the entry count?

profNoteScope :: !Bool

scopes over the enclosed expression (i.e. not just a tick)

HpcTick

A "tick" used by HPC to track the execution of each subexpression in the original source code.

Fields

tickModule :: Module
 
tickId :: !Int
 
Breakpoint

A breakpoint for the GHCi debugger. This behaves like an HPC tick, but has a list of free variables which will be available for inspection in GHCi when the program stops at the breakpoint.

NB. we must take account of these Ids when (a) counting free variables, and (b) substituting (don't substitute for them)

Fields

breakpointId :: !Int
 
breakpointFVs :: [id]

the order of this list is important: it matches the order of the lists in the appropriate entry in HscTypes.ModBreaks.

Careful about substitution! See Note [substTickish] in CoreSubst.

SourceNote

A source note.

Source notes are pure annotations: Their presence should neither influence compilation nor execution. The semantics are given by causality: The presence of a source note means that a local change in the referenced source code span will possibly provoke the generated code to change. On the flip-side, the functionality of annotated code *must* be invariant against changes to all source code *except* the spans referenced in the source notes (see "Causality of optimized Haskell" paper for details).

Therefore extending the scope of any given source note is always valid. Note that it is still undesirable though, as this reduces their usefulness for debugging and profiling. Therefore we will generally try only to make use of this property where it is neccessary to enable optimizations.

Fields

sourceSpan :: RealSrcSpan

Source covered

sourceName :: String

Name for source location (uses same names as CCs)

Instances

Eq id => Eq (Tickish id) 
Data id => Data (Tickish id) 
Ord id => Ord (Tickish id) 

data TickishScoping Source

Specifies the scoping behaviour of ticks. This governs the behaviour of ticks that care about the covered code and the cost associated with it. Important for ticks relating to profiling.

Constructors

NoScope

No scoping: The tick does not care about what code it covers. Transformations can freely move code inside as well as outside without any additional annotation obligations

SoftScope

Soft scoping: We want all code that is covered to stay covered. Note that this scope type does not forbid transformations from happening, as as long as all results of the transformations are still covered by this tick or a copy of it. For example

let x = tick... (let y = foo in bar) in baz ===> let x = tick... bar; y = tick... foo in baz

Is a valid transformation as far as "bar" and "foo" is concerned, because both still are scoped over by the tick.

Note though that one might object to the "let" not being covered by the tick any more. However, we are generally lax with this - constant costs don't matter too much, and given that the "let" was effectively merged we can view it as having lost its identity anyway.

Also note that this scoping behaviour allows floating a tick "upwards" in pretty much any situation. For example:

case foo of x -> tick... bar ==> tick... case foo of x -> bar

While this is always leagl, we want to make a best effort to only make us of this where it exposes transformation opportunities.

CostCentreScope

Cost centre scoping: We don't want any costs to move to other cost-centre stacks. This means we not only want no code or cost to get moved out of their cost centres, but we also object to code getting associated with new cost-centre ticks - or changing the order in which they get applied.

A rule of thumb is that we don't want any code to gain new annotations. However, there are notable exceptions, for example:

let f = y -> foo in tick... ... (f x) ... ==> tick... ... foo[x/y] ...

In-lining lambdas like this is always legal, because inlining a function does not change the cost-centre stack when the function is called.

Instances

data TickishPlacement Source

Governs the kind of expression that the tick gets placed on when annotating for example using mkTick. If we find that we want to put a tickish on an expression ruled out here, we try to float it inwards until we find a suitable expression.

Constructors

PlaceRuntime

Place ticks exactly on run-time expressions. We can still move the tick through pure compile-time constructs such as other ticks, casts or type lambdas. This is the most restrictive placement rule for ticks, as all tickishs have in common that they want to track runtime processes. The only legal placement rule for counting ticks.

PlaceNonLam

As PlaceRuntime, but we float the tick through all lambdas. This makes sense where there is little difference between annotating the lambda and annotating the lambda's code.

PlaceCostCentre

In addition to floating through lambdas, cost-centre style tickishs can also be moved from constructors, non-function variables and literals. For example:

let x = scc... C (scc... y) (scc... 3) in ...

Neither the constructor application, the variable or the literal are likely to have any cost worth mentioning. And even if y names a thunk, the call would not care about the evaluation context. Therefore removing all annotations in the above example is safe.

Instances

type CoreExpr = Expr CoreBndr Source

Expressions where binders are CoreBndrs

type CoreAlt = Alt CoreBndr Source

Case alternatives where binders are CoreBndrs

type CoreBind = Bind CoreBndr Source

Binding groups where binders are CoreBndrs

type CoreArg = Arg CoreBndr Source

Argument expressions where binders are CoreBndrs

type CoreBndr = Var Source

The common case for the type of binders and variables when we are manipulating the Core language within GHC

data TaggedBndr t Source

Binders are tagged with a t

Constructors

TB CoreBndr t 

Expr construction

mkLets :: [Bind b] -> Expr b -> Expr b Source

Bind all supplied binding groups over an expression in a nested let expression. Assumes that the rhs satisfies the let/app invariant. Prefer to use mkCoreLets if possible, which does guarantee the invariant

mkLams :: [b] -> Expr b -> Expr b Source

Bind all supplied binders over an expression in a nested lambda expression. Prefer to use mkCoreLams if possible

mkApps :: Expr b -> [Arg b] -> Expr b infixl 4 Source

Apply a list of argument expressions to a function expression in a nested fashion. Prefer to use mkCoreApps if possible

mkTyApps :: Expr b -> [Type] -> Expr b infixl 4 Source

Apply a list of type argument expressions to a function expression in a nested fashion

mkCoApps :: Expr b -> [Coercion] -> Expr b infixl 4 Source

Apply a list of coercion argument expressions to a function expression in a nested fashion

mkVarApps :: Expr b -> [Var] -> Expr b infixl 4 Source

Apply a list of type or value variables to a function expression in a nested fashion

mkIntLit :: DynFlags -> Integer -> Expr b Source

Create a machine integer literal expression of type Int# from an Integer. If you want an expression of type Int use mkIntExpr

mkIntLitInt :: DynFlags -> Int -> Expr b Source

Create a machine integer literal expression of type Int# from an Int. If you want an expression of type Int use mkIntExpr

mkWordLit :: DynFlags -> Integer -> Expr b Source

Create a machine word literal expression of type Word# from an Integer. If you want an expression of type Word use mkWordExpr

mkWordLitWord :: DynFlags -> Word -> Expr b Source

Create a machine word literal expression of type Word# from a Word. If you want an expression of type Word use mkWordExpr

mkCharLit :: Char -> Expr b Source

Create a machine character literal expression of type Char#. If you want an expression of type Char use mkCharExpr

mkStringLit :: String -> Expr b Source

Create a machine string literal expression of type Addr#. If you want an expression of type String use mkStringExpr

mkFloatLit :: Rational -> Expr b Source

Create a machine single precision literal expression of type Float# from a Rational. If you want an expression of type Float use mkFloatExpr

mkFloatLitFloat :: Float -> Expr b Source

Create a machine single precision literal expression of type Float# from a Float. If you want an expression of type Float use mkFloatExpr

mkDoubleLit :: Rational -> Expr b Source

Create a machine double precision literal expression of type Double# from a Rational. If you want an expression of type Double use mkDoubleExpr

mkDoubleLitDouble :: Double -> Expr b Source

Create a machine double precision literal expression of type Double# from a Double. If you want an expression of type Double use mkDoubleExpr

mkConApp :: DataCon -> [Arg b] -> Expr b Source

Apply a list of argument expressions to a data constructor in a nested fashion. Prefer to use mkCoreConApps if possible

mkConApp2 :: DataCon -> [Type] -> [Var] -> Expr b Source

mkTyBind :: TyVar -> Type -> CoreBind Source

Create a binding group where a type variable is bound to a type. Per CoreSyn, this can only be used to bind something in a non-recursive let expression

mkCoBind :: CoVar -> Coercion -> CoreBind Source

Create a binding group where a type variable is bound to a type. Per CoreSyn, this can only be used to bind something in a non-recursive let expression

varToCoreExpr :: CoreBndr -> Expr b Source

Convert a binder into either a Var or Type Expr appropriately

cmpAltCon :: AltCon -> AltCon -> Ordering Source

Compares AltCons within a single list of alternatives

cmpAlt :: (AltCon, a, b) -> (AltCon, a, b) -> Ordering Source

ltAlt :: (AltCon, a, b) -> (AltCon, a, b) -> Bool Source

Simple Expr access functions and predicates

bindersOf :: Bind b -> [b] Source

Extract every variable by this group

bindersOfBinds :: [Bind b] -> [b] Source

bindersOf applied to a list of binding groups

rhssOfAlts :: [Alt b] -> [Expr b] Source

collectBinders :: Expr b -> ([b], Expr b) Source

We often want to strip off leading lambdas before getting down to business. This function is your friend.

collectTyBinders :: CoreExpr -> ([TyVar], CoreExpr) Source

Collect as many type bindings as possible from the front of a nested lambda

collectValBinders :: CoreExpr -> ([Id], CoreExpr) Source

Collect as many value bindings as possible from the front of a nested lambda

collectTyAndValBinders :: CoreExpr -> ([TyVar], [Id], CoreExpr) Source

Collect type binders from the front of the lambda first, then follow up by collecting as many value bindings as possible from the resulting stripped expression

collectArgs :: Expr b -> (Expr b, [Arg b]) Source

Takes a nested application expression and returns the the function being applied and the arguments to which it is applied

collectArgsTicks :: (Tickish Id -> Bool) -> Expr b -> (Expr b, [Arg b], [Tickish Id]) Source

Like collectArgs, but also collects looks through floatable ticks if it means that we can find more arguments.

flattenBinds :: [Bind b] -> [(b, Expr b)] Source

Collapse all the bindings in the supplied groups into a single list of lhs/rhs pairs suitable for binding in a Rec binding group

isValArg :: Expr b -> Bool Source

Returns True for value arguments, false for type args NB: coercions are value arguments (zero width, to be sure, like State#, but still value args).

isTypeArg :: Expr b -> Bool Source

Returns True iff the expression is a Type expression at its top level. Note this does NOT include Coercions.

isTyCoArg :: Expr b -> Bool Source

Returns True iff the expression is a Type or Coercion expression at its top level

valArgCount :: [Arg b] -> Int Source

The number of argument expressions that are values rather than types at their top level

valBndrCount :: [CoreBndr] -> Int Source

The number of binders that bind values rather than types

isRuntimeArg :: CoreExpr -> Bool Source

Will this argument expression exist at runtime?

isRuntimeVar :: Var -> Bool Source

Will this variable exist at runtime?

tickishCounts :: Tickish id -> Bool Source

A "counting tick" (where tickishCounts is True) is one that counts evaluations in some way. We cannot discard a counting tick, and the compiler should preserve the number of counting ticks as far as possible.

However, we still allow the simplifier to increase or decrease sharing, so in practice the actual number of ticks may vary, except that we never change the value from zero to non-zero or vice versa.

tickishScoped :: Tickish id -> TickishScoping Source

Returns the intended scoping rule for a Tickish

tickishScopesLike :: Tickish id -> TickishScoping -> Bool Source

Returns whether the tick scoping rule is at least as permissive as the given scoping rule.

tickishFloatable :: Tickish id -> Bool Source

Returns True for ticks that can be floated upwards easily even where it might change execution counts, such as:

Just (tick... foo) ==> tick... (Just foo)

This is a combination of tickishSoftScope and tickishCounts. Note that in principle splittable ticks can become floatable using mkNoTick -- even though there's currently no tickish for which that is the case.

tickishCanSplit :: Tickish id -> Bool Source

Returns True for a tick that is both counting and scoping and can be split into its (tick, scope) parts using mkNoScope and mkNoTick respectively.

tickishIsCode :: Tickish id -> Bool Source

Return True if this source annotation compiles to some backend code. Without this flag, the tickish is seen as a simple annotation that does not have any associated evaluation code.

What this means that we are allowed to disregard the tick if doing so means that we can skip generating any code in the first place. A typical example is top-level bindings:

foo = tick... y -> ... ==> foo = y -> tick... ...

Here there is just no operational difference between the first and the second version. Therefore code generation should simply translate the code as if it found the latter.

tickishPlace :: Tickish id -> TickishPlacement Source

Placement behaviour we want for the ticks

tickishContains :: Eq b => Tickish b -> Tickish b -> Bool Source

Returns whether one tick "contains" the other one, therefore making the second tick redundant.

Unfolding data types

data Unfolding Source

Records the unfolding of an identifier, which is approximately the form the identifier would have if we substituted its definition in for the identifier. This type should be treated as abstract everywhere except in CoreUnfold

Constructors

NoUnfolding

We have no information about the unfolding

OtherCon [AltCon]

It ain't one of these constructors. OtherCon xs also indicates that something has been evaluated and hence there's no point in re-evaluating it. OtherCon [] is used even for non-data-type values to indicated evaluated-ness. Notably:

data C = C !(Int -> Int)
case x of { C f -> ... }

Here, f gets an OtherCon [] unfolding.

DFunUnfolding 

Fields

df_bndrs :: [Var]
 
df_con :: DataCon
 
df_args :: [CoreExpr]
 
CoreUnfolding

An unfolding with redundant cached information. Parameters:

uf_tmpl: Template used to perform unfolding; NB: Occurrence info is guaranteed correct: see Note [OccInfo in unfoldings and rules]

uf_is_top: Is this a top level binding?

uf_is_value: exprIsHNF template (cached); it is ok to discard a seq on this variable

uf_is_work_free: Does this waste only a little work if we expand it inside an inlining? Basically this is a cached version of exprIsWorkFree

uf_guidance: Tells us about the size of the unfolding template

data UnfoldingGuidance Source

UnfoldingGuidance says when unfolding should take place

Constructors

UnfWhen 
UnfIfGoodArgs 

Fields

ug_args :: [Int]
 
ug_size :: Int
 
ug_res :: Int
 
UnfNever 

Constructing Unfoldings

evaldUnfolding :: Unfolding Source

This unfolding marks the associated thing as being evaluated

Predicates and deconstruction on Unfolding

unfoldingTemplate :: Unfolding -> CoreExpr Source

Retrieves the template of an unfolding: panics if none is known

maybeUnfoldingTemplate :: Unfolding -> Maybe CoreExpr Source

Retrieves the template of an unfolding if possible maybeUnfoldingTemplate is used mainly wnen specialising, and we do want to specialise DFuns, so it's important to return a template for DFunUnfoldings

otherCons :: Unfolding -> [AltCon] Source

The constructors that the unfolding could never be: returns [] if no information is available

isValueUnfolding :: Unfolding -> Bool Source

Determines if it is certainly the case that the unfolding will yield a value (something in HNF): returns False if unsure

isEvaldUnfolding :: Unfolding -> Bool Source

Determines if it possibly the case that the unfolding will yield a value. Unlike isValueUnfolding it returns True for OtherCon

isCheapUnfolding :: Unfolding -> Bool Source

Is the thing we will unfold into certainly cheap?

isConLikeUnfolding :: Unfolding -> Bool Source

True if the unfolding is a constructor application, the application of a CONLIKE function or OtherCon

hasSomeUnfolding :: Unfolding -> Bool Source

Only returns False if there is no unfolding information available at all

Strictness

Annotated expression data types

type AnnExpr bndr annot = (annot, AnnExpr' bndr annot) Source

Annotated core: allows annotation at every node in the tree

data AnnExpr' bndr annot Source

A clone of the Expr type but allowing annotation at every tree node

Constructors

AnnVar Id 
AnnLit Literal 
AnnLam bndr (AnnExpr bndr annot) 
AnnApp (AnnExpr bndr annot) (AnnExpr bndr annot) 
AnnCase (AnnExpr bndr annot) bndr Type [AnnAlt bndr annot] 
AnnLet (AnnBind bndr annot) (AnnExpr bndr annot) 
AnnCast (AnnExpr bndr annot) (annot, Coercion) 
AnnTick (Tickish Id) (AnnExpr bndr annot) 
AnnType Type 
AnnCoercion Coercion 

data AnnBind bndr annot Source

A clone of the Bind type but allowing annotation at every tree node

Constructors

AnnNonRec bndr (AnnExpr bndr annot) 
AnnRec [(bndr, AnnExpr bndr annot)] 

type AnnAlt bndr annot = (AltCon, [bndr], AnnExpr bndr annot) Source

A clone of the Alt type but allowing annotation at every tree node

Operations on annotated expressions

collectAnnArgs :: AnnExpr b a -> (AnnExpr b a, [AnnExpr b a]) Source

Takes a nested application expression and returns the the function being applied and the arguments to which it is applied

Operations on annotations

deAnnotate :: AnnExpr bndr annot -> Expr bndr Source

deAnnotate' :: AnnExpr' bndr annot -> Expr bndr Source

deAnnAlt :: AnnAlt bndr annot -> Alt bndr Source

collectAnnBndrs :: AnnExpr bndr annot -> ([bndr], AnnExpr bndr annot) Source

As collectBinders but for AnnExpr rather than Expr

Core rule data types

data CoreRule Source

A CoreRule is:

  • "Local" if the function it is a rule for is defined in the same module as the rule itself.
  • "Orphan" if nothing on the LHS is defined in the same module as the rule itself

Constructors

Rule 

Fields

ru_name :: RuleName

Name of the rule, for communication with the user

ru_act :: Activation

When the rule is active

ru_fn :: Name

Name of the Id at the head of this rule

ru_rough :: [Maybe Name]

Name at the head of each argument to the left hand side

ru_bndrs :: [CoreBndr]

Variables quantified over

ru_args :: [CoreExpr]

Left hand side arguments

ru_rhs :: CoreExpr

Right hand side of the rule Occurrence info is guaranteed correct See Note [OccInfo in unfoldings and rules]

ru_auto :: Bool

True = this rule is auto-generated False = generated at the users behest Main effect: reporting of orphan-hood

ru_local :: Bool

True iff the fn at the head of the rule is defined in the same module as the rule and is not an implicit Id (like a record selector, class operation, or data constructor)

BuiltinRule

Built-in rules are used for constant folding and suchlike. They have no free variables.

Fields

ru_name :: RuleName

Name of the rule, for communication with the user

ru_fn :: Name

Name of the Id at the head of this rule

ru_nargs :: Int

Number of arguments that ru_try consumes, if it fires, including type arguments

ru_try :: RuleFun

This function does the rewrite. It given too many arguments, it simply discards them; the returned CoreExpr is just the rewrite of ru_fn applied to the first ru_nargs args

Operations on CoreRules

ruleArity :: CoreRule -> Int Source

The number of arguments the ru_fn must be applied to before the rule can match on it

ruleIdName :: CoreRule -> Name Source

The Name of the Id at the head of the rule left hand side

setRuleIdName :: Name -> CoreRule -> CoreRule Source

Set the Name of the Id at the head of the rule left hand side

Core vectorisation declarations data type