ghc-8.8.1: The GHC API

StgLiftLams

Description

Implements a selective lambda lifter, running late in the optimisation pipeline.

The transformation itself is implemented in StgLiftLams.Transformation. If you are interested in the cost model that is employed to decide whether to lift a binding or not, look at StgLiftLams.Analysis. StgLiftLams.LiftM contains the transformation monad that hides away some plumbing of the transformation.

Synopsis

# Late lambda lifting in STG

The basic idea behind lambda lifting is to turn locally defined functions into top-level functions. Free variables are then passed as additional arguments at *call sites* instead of having a closure allocated for them at *definition site*. Example:

   let x = ...; y = ... in
let f = {x y} a -> a + x + y in
let g = {f x} b -> f b + x in
g 5


Lambda lifting f would

1. Turn f's free variables into formal parameters
2. Update f's call site within g to f x y b
3. Update g's closure: Add y as an additional free variable, while removing f, because f no longer allocates and can be floated to top-level.
4. Actually float the binding of f to top-level, eliminating the let in the process.

This results in the following program (with free var annotations):

   f x y a = a + x + y;
let x = ...; y = ... in
let g = {x y} b -> f x y b + x in
g 5


This optimisation is all about lifting only when it is beneficial to do so. The above seems like a worthwhile lift, judging from heap allocation: We eliminate f's closure, saving to allocate a closure with 2 words, while not changing the size of g's closure.

You can probably sense that there's some kind of cost model at play here. And you are right! But we also employ a couple of other heuristics for the lifting decision which are outlined in StgLiftLams.Analysis.

The transformation is done in StgLiftLams.Transformation, which calls out to goodToLift for its lifting decision. It relies on StgLiftLams.LiftM, which abstracts some subtle STG invariants into a monadic substrate.

Suffice to say: We trade heap allocation for stack allocation. The additional arguments have to passed on the stack (or in registers, depending on architecture) every time we call the function to save a single heap allocation when entering the let binding. Nofib suggests a mean improvement of about 1% for this pass, so it seems like a worthwhile thing to do. Compile-times went up by 0.6%, so all in all a very modest change.

For a concrete example, look at spectral/atom. There's a call to zipWith that is ultimately compiled to something like this (module desugaring/lowering to actual STG):

   propagate dt = ...;
runExperiment ... =
let xs = ... in
let ys = ... in
let go = {dt go} xs ys -> case (xs, ys) of
([], []) -> []
(x:xs', y:ys') -> propagate dt x y : go xs' ys'
in go xs ys


This will lambda lift go to top-level, speeding up the resulting program by roughly one percent:

   propagate dt = ...;
go dt xs ys = case (xs, ys) of
([], []) -> []
(x:xs', y:ys') -> propagate dt x y : go dt xs' ys'
runExperiment ... =
let xs = ... in
let ys = ... in
in go dt xs ys


Lambda lifts bindings to top-level deemed worth lifting (see goodToLift).