-- | Implements a selective lambda lifter, running late in the optimisation -- pipeline. -- -- The transformation itself is implemented in "StgLiftLams.Transformation". -- If you are interested in the cost model that is employed to decide whether -- to lift a binding or not, look at "StgLiftLams.Analysis". -- "StgLiftLams.LiftM" contains the transformation monad that hides away some -- plumbing of the transformation. module StgLiftLams ( -- * Late lambda lifting in STG -- $note Transformation.stgLiftLams ) where import qualified StgLiftLams.Transformation as Transformation -- Note [Late lambda lifting in STG] -- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -- $note -- See also the <https://ghc.haskell.org/trac/ghc/wiki/LateLamLift wiki page> -- and Trac #9476. -- -- The basic idea behind lambda lifting is to turn locally defined functions -- into top-level functions. Free variables are then passed as additional -- arguments at *call sites* instead of having a closure allocated for them at -- *definition site*. Example: -- -- @ -- let x = ...; y = ... in -- let f = {x y} \a -> a + x + y in -- let g = {f x} \b -> f b + x in -- g 5 -- @ -- -- Lambda lifting @f@ would -- -- 1. Turn @f@'s free variables into formal parameters -- 2. Update @f@'s call site within @g@ to @f x y b@ -- 3. Update @g@'s closure: Add @y@ as an additional free variable, while -- removing @f@, because @f@ no longer allocates and can be floated to -- top-level. -- 4. Actually float the binding of @f@ to top-level, eliminating the @let@ -- in the process. -- -- This results in the following program (with free var annotations): -- -- @ -- f x y a = a + x + y; -- let x = ...; y = ... in -- let g = {x y} \b -> f x y b + x in -- g 5 -- @ -- -- This optimisation is all about lifting only when it is beneficial to do so. -- The above seems like a worthwhile lift, judging from heap allocation: -- We eliminate @f@'s closure, saving to allocate a closure with 2 words, while -- not changing the size of @g@'s closure. -- -- You can probably sense that there's some kind of cost model at play here. -- And you are right! But we also employ a couple of other heuristics for the -- lifting decision which are outlined in "StgLiftLams.Analysis#when". -- -- The transformation is done in "StgLiftLams.Transformation", which calls out -- to 'StgLiftLams.Analysis.goodToLift' for its lifting decision. -- It relies on "StgLiftLams.LiftM", which abstracts some subtle STG invariants -- into a monadic substrate. -- -- Suffice to say: We trade heap allocation for stack allocation. -- The additional arguments have to passed on the stack (or in registers, -- depending on architecture) every time we call the function to save a single -- heap allocation when entering the let binding. Nofib suggests a mean -- improvement of about 1% for this pass, so it seems like a worthwhile thing to -- do. Compile-times went up by 0.6%, so all in all a very modest change. -- -- For a concrete example, look at @spectral/atom@. There's a call to 'zipWith' -- that is ultimately compiled to something like this -- (module desugaring/lowering to actual STG): -- -- @ -- propagate dt = ...; -- runExperiment ... = -- let xs = ... in -- let ys = ... in -- let go = {dt go} \xs ys -> case (xs, ys) of -- ([], []) -> [] -- (x:xs', y:ys') -> propagate dt x y : go xs' ys' -- in go xs ys -- @ -- -- This will lambda lift @go@ to top-level, speeding up the resulting program -- by roughly one percent: -- -- @ -- propagate dt = ...; -- go dt xs ys = case (xs, ys) of -- ([], []) -> [] -- (x:xs', y:ys') -> propagate dt x y : go dt xs' ys' -- runExperiment ... = -- let xs = ... in -- let ys = ... in -- in go dt xs ys -- @