{-# LANGUAGE ScopedTypeVariables #-} module MkZipCfg ( AGraph, (<*>), catAGraphs , freshBlockId , emptyAGraph, withFreshLabel, withUnique , mkMiddle, mkMiddles, mkLast, mkZTail, mkBranch, mkLabel, mkIfThenElse, mkWhileDo , outOfLine , emptyGraph, graphOfMiddles, graphOfZTail , lgraphOfAGraph, graphOfAGraph, labelAGraph, pprAGraph ) where import BlockId (BlockId(..), emptyBlockEnv, plusBlockEnv) import ZipCfg import Outputable import Unique import UniqSupply import Util import Prelude hiding (zip, unzip, last) #include "HsVersions.h" ------------------------------------------------------------------------- -- GENERIC ZIPPER-BASED CONTROL-FLOW GRAPH (CONSTRUCTOR VIEW) -- ------------------------------------------------------------------------- {- You can think of an AGraph like this: it is the program built by composing in sequence three kinds of nodes: * Label nodes (e.g. L2:) * Middle nodes (e.g. x = y*3) * Last nodes (e.g. if b then goto L1 else goto L2) The constructors mkLabel, mkMiddle, and mkLast build single-node AGraphs of the indicated type. The composition operator <*> glues AGraphs together in sequence (in constant time). For example: x = 0 L1: x = x+1 if x<10 then goto L1 else goto L2 L2: y = y*x x = 0 Notice that the AGraph may begin without a label, and may end without a control transfer. Control *always* falls through a label and middle node, and *never* falls through a Last node. A 'AGraph m l' is simply an abstract version of a 'Graph m l' from module 'ZipCfg'. The only difference is that the 'AGraph m l' supports a constant-time splicing operation, written infix <*>. That splicing operation, together with the constructor functions in this module (and with 'labelAGraph'), is the recommended way to build large graphs. Each construction or splice has constant cost, and to turn an AGraph into a Graph requires time linear in the number of nodes and N log N in the number of basic blocks. The splicing operation warrants careful explanation. Like a Graph, an AGraph is a control-flow graph which begins with a distinguished, unlabelled sequence of middle nodes called the *entry*. An unlabelled graph may also end with a sequence of middle nodes called the *exit*. The entry may fall straight through to the exit, or it may fall into the rest of the graph, which may include arbitrary control flow. Using ASCII art, here are examples of the two kinds of graph. On the left, the entry and exit sequences are labelled A and B, where the control flow in the middle is labelled X. On the right, there is no exit sequence: | | | A | C | | / \ / \ / \ / \ | X | | Y | \ / \ / \ / \_/ | | B | The AGraph has these properties: * A AGraph is opaque; nothing about its structure can be observed. * A AGraph may be turned into a LGraph in time linear in the number of nodes and O(N log N) in the number of basic blocks. * Two AGraphs may be spliced in constant time by writing g1 <*> g2 There are two rules for splicing, depending on whether the left-hand graph falls through. If it does, the rule is as follows: | | | | A | C | A | | | / \ / \ / \ / \ / \ / \ | X | <*> | Y | = | X | \ / \ / \ / \ / \_/ \ / | | | | B | D | B | | | | | C | / \ / \ | Y | \ / \ / | | D | And in the case where the left-hand graph does not fall through, the rule is | | | | A | C | A | | | / \ / \ / \ / \ / \ / \ | X | <*> | Y | = | X | \ / \ / \ / \_/ \_/ \_/ | | D _ | / \ / \ | Y | \ / \ / | | D | In this case C will become unreachable and is lost; when such a graph is converted into a data structure, the system will bleat about unreachable code. Also it must be assumed that there are branches from somewhere in X to labelled blocks in Y; otherwise Y and D are unreachable as well. (However, it may be the case that X branches into some third AGraph, which in turn branches into D; the representation is agnostic on this point.) -} infixr 3 <*> (<*>) :: AGraph m l -> AGraph m l -> AGraph m l catAGraphs :: [AGraph m l] -> AGraph m l -- | A graph is built up by splicing together graphs each containing a -- single node (where a label is considered a 'first' node. The empty -- graph is a left and right unit for splicing. All of the AGraph -- constructors (even complex ones like 'mkIfThenElse', as well as the -- splicing operation <*>, are constant-time operations. emptyAGraph :: AGraph m l mkLabel :: (LastNode l) => BlockId -> AGraph m l -- graph contains the label mkMiddle :: m -> AGraph m l -- graph contains the node mkLast :: (Outputable m, Outputable l, LastNode l) => l -> AGraph m l -- graph contains the node -- | This function provides access to fresh labels without requiring -- clients to be programmed monadically. withFreshLabel :: String -> (BlockId -> AGraph m l) -> AGraph m l withUnique :: (Unique -> AGraph m l) -> AGraph m l outOfLine :: (LastNode l, Outputable m, Outputable l) => AGraph m l -> AGraph m l -- ^ The argument is an AGraph that has an -- empty entry sequence and no exit sequence. -- The result is a new AGraph that has an empty entry sequence -- connected to an empty exit sequence, with the original graph -- sitting to the side out-of-line. -- -- Example: mkMiddle (x = 3) -- <*> outOfLine (mkLabel L <*> ...stuff...) -- <*> mkMiddle (y = x) -- Control will flow directly from x=3 to y=x; -- the block starting with L is "on the side". -- -- N.B. algebraically forall g g' : g <*> outOfLine g' == outOfLine g' <*> g -- below for convenience mkMiddles :: [m] -> AGraph m l mkZTail :: (Outputable m, Outputable l, LastNode l) => ZTail m l -> AGraph m l mkBranch :: (Outputable m, Outputable l, LastNode l) => BlockId -> AGraph m l -- | For the structured control-flow constructs, a condition is -- represented as a function that takes as arguments the labels to -- goto on truth or falsehood. -- -- mkIfThenElse mk_cond then else -- = (mk_cond L1 L2) <*> L1: then <*> goto J -- <*> L2: else <*> goto J -- <*> J: -- -- where L1, L2, J are fresh mkIfThenElse :: (Outputable m, Outputable l, LastNode l) => (BlockId -> BlockId -> AGraph m l) -- branch condition -> AGraph m l -- code in the 'then' branch -> AGraph m l -- code in the 'else' branch -> AGraph m l -- resulting if-then-else construct mkWhileDo :: (Outputable m, Outputable l, LastNode l) => (BlockId -> BlockId -> AGraph m l) -- loop condition -> AGraph m l -- body of the bloop -> AGraph m l -- the final while loop -- | Converting an abstract graph to a concrete form is expensive: the -- cost is linear in the number of nodes in the answer, plus N log N -- in the number of basic blocks. The conversion is also monadic -- because it may require the allocation of fresh, unique labels. graphOfAGraph :: AGraph m l -> UniqSM (Graph m l) lgraphOfAGraph :: AGraph m l -> UniqSM (LGraph m l) -- ^ allocate a fresh label for the entry point labelAGraph :: BlockId -> AGraph m l -> UniqSM (LGraph m l) -- ^ use the given BlockId as the label of the entry point -- | The functions below build Graphs directly; for convenience, they -- are included here with the rest of the constructor functions. emptyGraph :: Graph m l graphOfMiddles :: [m] -> Graph m l graphOfZTail :: ZTail m l -> Graph m l -- ================================================================ -- IMPLEMENTATION -- ================================================================ newtype AGraph m l = AGraph (Graph m l -> UniqSM (Graph m l)) -- an AGraph is a monadic function from a successor Graph to a new Graph AGraph f1 <*> AGraph f2 = AGraph f where f g = f2 g >>= f1 -- note right associativity catAGraphs = foldr (<*>) emptyAGraph emptyAGraph = AGraph return graphOfAGraph (AGraph f) = f emptyGraph emptyGraph = Graph (ZLast LastExit) emptyBlockEnv labelAGraph id g = do Graph tail blocks <- graphOfAGraph g return $ LGraph id $ insertBlock (Block id tail) blocks lgraphOfAGraph g = do id <- freshBlockId "graph entry" labelAGraph id g ------------------------------------- -- constructors mkLabel id = AGraph f where f (Graph tail blocks) = return $ Graph (ZLast (mkBranchNode id)) (insertBlock (Block id tail) blocks) mkBranch target = mkLast $ mkBranchNode target mkMiddle m = AGraph f where f (Graph tail blocks) = return $ Graph (ZTail m tail) blocks mkMiddles ms = AGraph f where f (Graph tail blocks) = return $ Graph (foldr ZTail tail ms) blocks graphOfMiddles ms = Graph (foldr ZTail (ZLast LastExit) ms) emptyBlockEnv graphOfZTail t = Graph t emptyBlockEnv mkLast l = AGraph f where f (Graph tail blocks) = do note_this_code_becomes_unreachable "mkLast" (ppr l <+> ppr blocks) tail return $ Graph (ZLast (LastOther l)) blocks mkZTail tail = AGraph f where f (Graph utail blocks) = do note_this_code_becomes_unreachable "mkZTail" (ppr tail) utail return $ Graph tail blocks withFreshLabel name ofId = AGraph f where f g = do id <- freshBlockId name let AGraph f' = ofId id f' g withUnique ofU = AGraph f where f g = do u <- getUniqueM let AGraph f' = ofU u f' g outOfLine (AGraph f) = AGraph f' where f' (Graph tail' blocks') = do Graph emptyEntrance blocks <- f emptyGraph note_this_code_becomes_unreachable "outOfLine" (ppr tail') emptyEntrance return $ Graph tail' (blocks `plusBlockEnv` blocks') mkIfThenElse cbranch tbranch fbranch = withFreshLabel "end of if" $ \endif -> withFreshLabel "start of then" $ \tid -> withFreshLabel "start of else" $ \fid -> cbranch tid fid <*> mkLabel tid <*> tbranch <*> mkBranch endif <*> mkLabel fid <*> fbranch <*> mkLabel endif mkWhileDo cbranch body = withFreshLabel "loop test" $ \test -> withFreshLabel "loop head" $ \head -> withFreshLabel "end while" $ \endwhile -> -- Forrest Baskett's while-loop layout mkBranch test <*> mkLabel head <*> body <*> mkLabel test <*> cbranch head endwhile <*> mkLabel endwhile -- | Bleat if the insertion of a last node will create unreachable code note_this_code_becomes_unreachable :: (Monad m, LastNode l, Outputable middle, Outputable l) => String -> SDoc -> ZTail middle l -> m () note_this_code_becomes_unreachable str old = if debugIsOn then u else \_ -> return () where u (ZLast LastExit) = return () u (ZLast (LastOther l)) | isBranchNode l = return () -- Note [Branch follows branch] u tail = fail ("unreachable code in " ++ str ++ ": " ++ (showSDoc ((ppr tail) <+> old))) -- | The string argument to 'freshBlockId' was originally helpful in debugging -- the Quick C-- compiler, so I have kept it here even though at present it is -- thrown away at this spot---there's no reason a BlockId couldn't one day carry -- a string. freshBlockId :: MonadUnique m => String -> m BlockId freshBlockId _s = getUniqueM >>= return . BlockId ------------------------------------- -- Debugging pprAGraph :: (Outputable m, LastNode l, Outputable l) => AGraph m l -> UniqSM SDoc pprAGraph g = graphOfAGraph g >>= return . ppr {- Note [Branch follows branch] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Why do we say it's ok for a Branch to follow a Branch? Because the standard constructor mkLabel-- has fall-through semantics. So if you do a mkLabel, you finish the current block, giving it a label, and start a new one that branches to that label. Emitting a Branch at this point is fine: goto L1; L2: ...stuff... -}