This section describes how GHC supports separate compilation.
When GHC compiles a source file A.hs which contains a module A, say, it generates an object A.o, and a companion interface file A.hi. The interface file is merely there to help the compiler compile other modules in the same program. Interfaces are in a binary format, so don't try to look at one; however you can see the contents of an interface file by using GHC with the --show-iface option (see Section 4.9.4, below).
NOTE: In general, a file containing module M should be named M.hs or M.lhs. The only exception to this rule is module Main, which can be placed in any file.
The interface file for A contains information needed by the compiler when it compiles any module B that imports A, whether directly or indirectly. When compiling B, GHC will read A.hi to find the details that it needs to know about things defined in A.
The interface file may contain all sorts of things that aren't explicitly exported from A by the programmer. For example, even though a data type is exported abstractly, A.hi will contain the full data type definition. For small function definitions, A.hi will contain the complete definition of the function. For bigger functions, A.hi will contain strictness information about the function. And so on. GHC puts much more information into .hi files when optimisation is turned on with the -O flag (see Section 4.11). Without -O it puts in just the minimum; with -O it lobs in a whole pile of stuff.
A.hi should really be thought of as a compiler-readable version of A.o. If you use a .hi file that wasn't generated by the same compilation run that generated the .o file, the compiler may assume all sorts of incorrect things about A, resulting in core dumps and other unpleasant happenings.
In your program, you import a module Foo by saying import Foo. GHC goes looking for an interface file, Foo.hi. It has a builtin list of directories (notably including .) where it looks.
-i<dirs>: This flag appends a colon-separated list of dirs to the “import directories” list, which initially contains a single entry: ".".
This list is scanned before any package directories (see Section 4.10) when looking for imports, but note that if you have a home module with the same name as a package module then this is likely to cause trouble in other ways, with link errors being the least nasty thing that can go wrong...
See also Section 4.9.5 for the significance of using relative and absolute pathnames in the -i list.
-i: Given on its own, with no directories, this resets the “import directories” list back to nothing.
See also the section on packages (Section 4.10), which describes how to use installed libraries.
GHC supports a hierarchical module namespace as an extension to Haskell 98 (see Section 7.3.1).
A module name in general consists of a sequence of components separated by dots (‘.’). When looking for interface files for a hierarchical module, the compiler turns the dots into path separators, so for example a module A.B.C becomes A/B/C (or A\B\C under Windows). Then each component of the import directories list is tested in turn; so for example if the list contains directories D1 to Dn, then the compiler will look for the interface in D1/A/B/C.hi first, then D2/A/B/C.hi and so on.
Note that it's perfectly reasonable to have a module which is both a leaf and a branch of the tree. For example, if we have modules A.B and A.B.C, then A.B's interface file will be in A/B.hi and A.B.C's interface file will be in A/B/C.hi.
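The search order described above can be sketched in a few lines of Haskell. This is only an illustration, not GHC's own code: the names splitModuleName, moduleRelPath and candidateInterfaces are invented for the example, and '/' is assumed as the path separator.

```haskell
import Data.List (intercalate)

-- | Split a dotted module name such as "A.B.C" into its components.
splitModuleName :: String -> [String]
splitModuleName name = case break (== '.') name of
  (component, [])       -> [component]
  (component, _ : rest) -> component : splitModuleName rest

-- | Relative path of a module's file with the given suffix,
-- using '/' as the path separator (an assumption of this sketch).
moduleRelPath :: String -> String -> FilePath
moduleRelPath suffix name =
  intercalate "/" (splitModuleName name) ++ "." ++ suffix

-- | Candidate interface files, in the order the import directories are listed.
candidateInterfaces :: [FilePath] -> String -> [FilePath]
candidateInterfaces importDirs name =
  [ dir ++ "/" ++ moduleRelPath "hi" name | dir <- importDirs ]

main :: IO ()
main = do
  -- Hierarchical module looked up in two hypothetical directories.
  mapM_ putStrLn (candidateInterfaces ["D1", "D2"] "A.B.C")
  -- A.B is both a leaf (A/B.hi) and a branch (A/B/C.hi lives beneath it).
  putStrLn (moduleRelPath "hi" "A.B")
```

Passing "hs" or "lhs" instead of "hi" to moduleRelPath gives the corresponding source file names.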
For GHCi and --make, the search strategy for source files is exactly the same: just replace the .hi suffix in the above description with .hs or .lhs.
-ddump-hi: Dumps the new interface to standard output.
The compiler does not overwrite an existing .hi interface file if the new one is the same as the old one; this is friendly to make. When an interface does change, it is often enlightening to be informed. The -ddump-hi-diffs option will make GHC run diff on the old and new .hi files.
-ddump-minimal-imports: Dump to the file "M.imports" (where M is the module being compiled) a "minimal" set of import declarations. You can safely replace all the import declarations in "M.hs" with those found in "M.imports". Why would you want to do that? Because the "minimal" imports (a) import everything explicitly, by name, and (b) import nothing that is not required. It can be quite painful to maintain this property by hand, so this flag is intended to reduce the labour.
--show-iface file: Where file is the name of an interface file, dumps the contents of that interface in a human-readable (ish) format.
-no-recomp: Turn off recompilation checking (which is on by default). Recompilation checking normally stops compilation early, leaving an existing .o file in place, if it can be determined that the module does not need to be recompiled.
In the olden days, GHC compared the newly-generated .hi file with the previous version; if they were identical, it left the old one alone and didn't change its modification date. In consequence, importers of a module with an unchanged output .hi file were not recompiled.
This doesn't work any more. Suppose module C imports module B, and B imports module A. So changes to A.hi should force a recompilation of C. And some changes to A (changing the definition of a function that appears in an inlining of a function exported by B, say) may conceivably not change B.hi one jot. So now…
GHC keeps a version number on each interface file, and on each type signature within the interface file. It also keeps in every interface file a list of the version numbers of everything it used when it last compiled the file. If the source file's modification date is earlier than the .o file's date (i.e. the source hasn't changed since the file was last compiled), and recompilation checking is on, GHC will be clever. It compares the version numbers on the things it needs this time with the version numbers on the things it needed last time (gleaned from the interface file of the module being compiled); if they are all the same it stops compiling rather early in the process saying “Compilation IS NOT required”. What a beautiful sight!
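As a rough illustration of the version comparison, here is a toy model in Haskell. It is a sketch under simplifying assumptions, not GHC's actual data structures: the types Usages and Current, the function needsRecompile, and the names and version numbers in main are all invented for the example.

```haskell
type Name    = String
type Version = Int

-- | Versions of the imported things a module used when it was last compiled,
-- as recorded in that module's own interface file.
type Usages = [(Name, Version)]

-- | Versions currently found in the interface files of the imported modules.
type Current = [(Name, Version)]

-- | Recompilation is needed if the source is newer than the object file,
-- or if any recorded usage now has a different version (or has vanished).
needsRecompile :: Bool -> Usages -> Current -> Bool
needsRecompile sourceChanged used current =
  sourceChanged || any stale used
  where
    stale (name, oldVersion) = lookup name current /= Just oldVersion

main :: IO ()
main = do
  let used = [("A.f", 3), ("A.T", 1)]
  -- Nothing we used has changed; an unrelated new entry A.g does not matter.
  print (needsRecompile False used [("A.f", 3), ("A.T", 1), ("A.g", 7)])
  -- The version of A.f has been bumped, so the module must be recompiled.
  print (needsRecompile False used [("A.f", 4), ("A.T", 1)])
```

Note that only the things the module actually used are compared, which is why a change elsewhere in an imported interface need not trigger recompilation.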
Patrick Sansom had a workshop paper about how all this is done (though the details have changed quite a bit). Ask him if you want a copy.
It is reasonably straightforward to set up a Makefile to use with GHC, assuming you name your source files the same as your modules. Thus:
    HC      = ghc
    HC_OPTS = -cpp $(EXTRA_HC_OPTS)

    SRCS = Main.lhs Foo.lhs Bar.lhs
    OBJS = Main.o   Foo.o   Bar.o

    .SUFFIXES : .o .hs .hi .lhs .hc .s

    cool_pgm : $(OBJS)
            rm -f $@
            $(HC) -o $@ $(HC_OPTS) $(OBJS)

    # Standard suffix rules
    .o.hi:
            @:

    .lhs.o:
            $(HC) -c $< $(HC_OPTS)

    .hs.o:
            $(HC) -c $< $(HC_OPTS)

    # Inter-module dependencies
    Foo.o Foo.hc Foo.s    : Baz.hi          # Foo imports Baz
    Main.o Main.hc Main.s : Foo.hi Baz.hi   # Main imports Foo and Baz
(Sophisticated make variants may achieve some of the above more elegantly. Notably, gmake's pattern rules let you write the more comprehensible:
    %.o : %.lhs
            $(HC) -c $< $(HC_OPTS)
What we've shown should work with any make.)
Note the cheesy .o.hi rule: It records the dependency of the interface (.hi) file on the source. The rule says a .hi file can be made from a .o file by doing…nothing. Which is true.
Note the inter-module dependencies at the end of the Makefile, which take the form
    Foo.o Foo.hc Foo.s    : Baz.hi          # Foo imports Baz
They tell make that if any of Foo.o, Foo.hc or Foo.s have an earlier modification date than Baz.hi, then the out-of-date file must be brought up to date. To bring it up to date, make looks for a rule to do so; one of the preceding suffix rules does the job nicely.
Putting inter-dependencies of the form Foo.o : Bar.hi into your Makefile by hand is rather error-prone. Don't worry, GHC has support for automatically generating the required dependencies. Add the following to your Makefile:
    depend :
            ghc -M $(HC_OPTS) $(SRCS)
Now, before you start compiling, and any time you change the imports in your program, do make depend before you do make cool_pgm. ghc -M will append the needed dependencies to your Makefile.
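In general, if module A contains the line

    import B ...blah...

then ghc -M will generate a dependency line of the form:

    A.o : B.hi

If module A contains the line

    import {-# SOURCE #-} B ...blah...

then ghc -M will generate a dependency line of the form:

    A.o : B.hi-boot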
By default, ghc -M generates all the dependencies, and then concatenates them onto the end of makefile (or Makefile if makefile doesn't exist) bracketed by the lines "# DO NOT DELETE: Beginning of Haskell dependencies" and "# DO NOT DELETE: End of Haskell dependencies". If these lines already exist in the makefile, then the old dependencies are deleted first.
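The bracketing behaviour can be pictured with a small Haskell sketch that works on a makefile held as a list of lines. It is a simplified model, not what ghc -M actually runs: spliceDependencies is a hypothetical name, and the sketch assumes the old block is simply dropped and the new one appended at the end of the file.

```haskell
beginMarker, endMarker :: String
beginMarker = "# DO NOT DELETE: Beginning of Haskell dependencies"
endMarker   = "# DO NOT DELETE: End of Haskell dependencies"

-- | Remove any existing bracketed block, then append a fresh one at the end.
-- If the end marker is missing, everything after the begin marker is dropped.
spliceDependencies :: [String] -> [String] -> [String]
spliceDependencies deps makefileLines =
  stripOld makefileLines ++ [beginMarker] ++ deps ++ [endMarker]
  where
    stripOld ls = case break (== beginMarker) ls of
      (before, [])       -> before
      (before, _ : rest) -> before ++ drop 1 (dropWhile (/= endMarker) rest)

main :: IO ()
main = do
  let old = [ "cool_pgm : $(OBJS)"
            , beginMarker
            , "Foo.o : Old.hi"
            , endMarker
            ]
  -- The stale dependency on Old.hi disappears; the new ones take its place.
  mapM_ putStrLn (spliceDependencies ["Foo.o : Baz.hi", "Main.o : Foo.hi"] old)
```

Because the old block is removed before the new one is added, running the dependency step repeatedly does not accumulate duplicate entries.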
Don't forget to use the same -package options on the ghc -M command line as you would when compiling; this enables the dependency generator to locate any imported modules that come from packages. The package modules won't be included in the dependencies generated, though (but see the --include-prelude option below).
The dependency generation phase of GHC can take some additional options, which you may find useful. For historical reasons, each option passed to the dependency generator from the GHC command line must be preceded by -optdep. For example, to pass -f .depend to the dependency generator, you say
    ghc -M -optdep-f -optdep.depend ...
-w: Turn off warnings about interface file shadowing.
-f file: Use file as the makefile, rather than makefile or Makefile. If file doesn't exist, mkdependHS creates it. We often use -f .depend to put the dependencies in .depend and then include the file .depend into Makefile.
-o <osuf>: Use .<osuf> as the "target file" suffix (default: o). Multiple -o flags are permitted (GHC2.05 onwards). Thus "-o hc -o o" will generate dependencies for .hc and .o files.
-s <suf>: Make extra dependencies that declare that files with suffix .<suf>_<osuf> depend on interface files with suffix .<suf>_hi, or (for {-# SOURCE #-} imports) on .hi-boot. Multiple -s flags are permitted. For example, -o hc -s a -s b will make dependencies for .hc on .hi, .a_hc on .a_hi, and .b_hc on .b_hi. (Useful in conjunction with NoFib "ways".)
--exclude-module=<file>: Regard <file> as "stable"; i.e., exclude it from having dependencies on it.
-x: Same as --exclude-module.
--exclude-directory=<dirs>: Regard the colon-separated list of directories <dirs> as containing stable modules; don't generate any dependencies on modules therein.
--include-module=<file>: Regard <file> as not "stable"; i.e., generate dependencies on it (if any). This option is normally used in conjunction with the --exclude-directory option.
--include-prelude: Regard modules imported from packages as unstable, i.e., generate dependencies on the package modules used (including Prelude, and all other standard Haskell libraries). This option is normally only used by the various system libraries.
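To make the interaction of the target-suffix and extra-suffix options concrete, the following Haskell sketch generates the dependency lines they describe. It is an illustration only: Import, waySuffixes and dependencyLines are invented names, module names are assumed to be flat (no hierarchical paths), and the output format is just the "target : interface" shape shown earlier.

```haskell
-- | One import of the module being analysed; the flag records whether it
-- was written as import {-# SOURCE #-}.
data Import = Import { importedModule :: String, isSourceImport :: Bool }

-- | Target and interface suffixes for one extra suffix ("" means none).
waySuffixes :: String -> String -> (String, String)
waySuffixes osuf ""  = (osuf, "hi")
waySuffixes osuf suf = (suf ++ "_" ++ osuf, suf ++ "_hi")

-- | Dependency lines for a module, given the target suffixes (the -o
-- options), the extra suffixes (the -s options) and the module's imports.
dependencyLines :: [String] -> [String] -> String -> [Import] -> [String]
dependencyLines osufs sufs modName imports =
  [ modName ++ "." ++ target ++ " : " ++ importedModule imp ++ "." ++ iface
  | osuf <- osufs
  , suf  <- "" : sufs
  , let (target, hiSuf) = waySuffixes osuf suf
  , imp  <- imports
  , let iface = if isSourceImport imp then "hi-boot" else hiSuf
  ]

main :: IO ()
main = do
  -- Corresponds to: -o hc -s a -s b, for a module Foo that imports Baz.
  mapM_ putStrLn (dependencyLines ["hc"] ["a", "b"] "Foo" [Import "Baz" False])
  -- Default suffix, with a {-# SOURCE #-} import of B from module A.
  mapM_ putStrLn (dependencyLines ["o"] [] "A" [Import "B" True])
```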
Currently, the compiler does not have proper support for dealing with mutually recursive modules:
    module A where
    import B
    newtype TA = MkTA Int
    f :: TB -> TA
    f (MkTB x) = MkTA x
    --------
    module B where
    import A
    data TB = MkTB !Int
    g :: TA -> TB
    g (MkTA x) = MkTB x
When compiling either module A or B, the compiler will try (in vain) to look for the interface file of the other. So, to get mutually recursive modules off the ground, you need to hand write an interface file for A or B, so as to break the loop. These hand-written interface files are called hi-boot files, and are placed in a file called <module>.hi-boot. To import from an hi-boot file instead of the standard .hi file, use the following syntax in the importing module:
    import {-# SOURCE #-} A
The hand-written interface need only contain the bare minimum of information needed to get the bootstrapping process started. For example, it doesn't need to contain declarations for everything that module A exports, only the things required by the module that imports A recursively.
For the example at hand, the boot interface file for A would look like the following:
    module A where
    newtype TA = MkTA GHC.Base.Int
The syntax is similar to a normal Haskell source file, but with some important differences:
Non-local entities must be qualified with their original defining module. Qualifying by a module which just re-exports the entity won't do. In particular, most Prelude entities aren't actually defined in the Prelude (see for example GHC.Base.Int in the above example). HINT: to find out the fully-qualified name for entities in the Prelude (or anywhere for that matter), try using GHCi's :info command, eg.
    Prelude> :m -Prelude
    > :i IO.IO
    -- GHC.IOBase.IO is a type constructor
    newtype GHC.IOBase.IO a
    ...
Only data, type, newtype, class, and type signature declarations may be included. You cannot declare instances or derive them automatically.
Notice that we only put the declaration for the newtype TA in the hi-boot file, not the signature for f, since f isn't used by B.
If you want an hi-boot file to export a data type, but you don't want to give its constructors (because the constructors aren't used by the SOURCE-importing module), you can write simply:
    module A where
    data TA
(You must write all the type parameters, but leave out the '=' and everything that follows it.)
Haskell specifies that when compiling module M, any instance declaration in any module "below" M is visible. (Module A is "below" M if A is imported directly by M, or if A is below a module that M imports directly.) In principle, GHC must therefore read the interface files of every module below M, just in case they contain an instance declaration that matters to M. This would be a disaster in practice, so GHC tries to be clever.
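The "below" relation is just the transitive closure of the import relation. A minimal Haskell sketch, assuming the import graph is available as an association list (ImportGraph, directImports and modulesBelow are names made up for this example):

```haskell
import Data.Maybe (fromMaybe)

-- | Each module paired with the modules it imports directly.
type ImportGraph = [(String, [String])]

directImports :: ImportGraph -> String -> [String]
directImports graph m = fromMaybe [] (lookup m graph)

-- | All modules below m: its direct imports, plus everything below them.
-- The list of already-seen modules guards against import cycles.
modulesBelow :: ImportGraph -> String -> [String]
modulesBelow graph m = go [] (directImports graph m)
  where
    go seen []       = reverse seen
    go seen (x : xs)
      | x `elem` seen = go seen xs
      | otherwise     = go (x : seen) (directImports graph x ++ xs)

main :: IO ()
main = do
  let graph = [ ("M", ["A", "B"]), ("A", ["C"]), ("B", ["C"]), ("C", []) ]
  -- C is below M even though M never imports it directly.
  print (modulesBelow graph "M")
```

In a large program this set can be very big, which is why reading every one of these interface files for each compilation would be so costly.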
In particular, if an instance declaration is in the same module as the definition of any type or class mentioned in the head of the instance declaration, then GHC has to visit that interface file anyway. Example:
    module A where
      instance C a => D (T a) where ...
      data T a = ...
The instance declaration is only relevant if the type T is in use, and if so, GHC will have visited A's interface file to find T's definition.
The only problem comes when a module contains an instance declaration and GHC has no other reason for visiting the module. Example:
    module Orphan where
      instance C a => D (T a) where ...
      class C a where ...
An orphan module contains at least one orphan instance or at least one orphan rule.
An instance declaration in a module M is an orphan instance if none of the type constructors or classes mentioned in the instance head (the part after the "=>") are declared in M.
Only the instance head counts. In the example above, it is not good enough for C's declaration to be in module A; it must be the declaration of D or T.
A rewrite rule in a module M is an orphan rule if none of the variables, type constructors, or classes that are free in the left hand side of the rule are declared in M.
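The orphan-instance half of this definition can be written down as a small predicate. This is a sketch of the rule as stated here, not GHC's implementation: InstanceHead and isOrphanInstance are hypothetical names, and names are modelled as plain strings.

```haskell
-- | An instance head such as D (T a): the class, plus the type
-- constructors mentioned in its argument types.
data InstanceHead = InstanceHead
  { headClass  :: String
  , headTyCons :: [String]
  }

-- | An instance is an orphan in a module if none of the names in its head
-- are declared in that module. The context (the part before =>) is ignored.
isOrphanInstance :: [String] -> InstanceHead -> Bool
isOrphanInstance declaredHere (InstanceHead cls tycons) =
  not (any (`elem` declaredHere) (cls : tycons))

main :: IO ()
main = do
  let dOfT = InstanceHead "D" ["T"]   -- instance C a => D (T a)
  -- Module A declares T, so the instance is not an orphan there.
  print (isOrphanInstance ["T"] dOfT)
  -- Module Orphan declares only C, which appears in the context, not the head.
  print (isOrphanInstance ["C"] dOfT)
```

The same shape of test applies to rewrite rules, with the free names of the rule's left hand side taking the place of the names in the instance head.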
GHC identifies orphan modules, and visits the interface file of every orphan module below the module being compiled. This is usually wasted work, but there is no avoiding it. You should therefore do your best to have as few orphan modules as possible.
You can identify an orphan module by looking in its interface file, M.hi, using the --show-iface option. If there is a "!" on the first line, GHC considers it an orphan module.