3.7. Separate compilation

This section describes how GHC supports separate compilation.

3.7.1. Interface files

When GHC compiles a source file F which contains a module A, say, it generates an object F.o, and a companion interface file A.hi. The interface file is not intended for human consumption, as you'll see if you take a look at one. It's merely there to help the compiler compile other modules in the same program.

NOTE: Having the name of the interface file follow the module name rather than the file name makes working with tools such as make harder. make implicitly assumes that any output files produced by processing a translation unit will have file names that can be derived from the file name of the translation unit. For instance, pattern rules become unusable. For this reason, we recommend you stick to using the same file name as the module name.

The interface file for A contains information needed by the compiler when it compiles any module B that imports A, whether directly or indirectly. When compiling B, GHC will read A.hi to find the details that it needs to know about things defined in A.

Furthermore, when compiling module C which imports B, GHC may decide that it needs to know something about A—for example, B might export a function that involves a type defined in A. In this case, GHC will go and read A.hi even though C does not explicitly import A at all.

The interface file may contain all sorts of things that aren't explicitly exported from A by the programmer. For example, even though a data type is exported abstractly, A.hi will contain the full data type definition. For small function definitions, A.hi will contain the complete definition of the function. For bigger functions, A.hi will contain strictness information about the function. And so on. GHC puts much more information into .hi files when optimisation is turned on with the -O flag. Without -O it puts in just the minimum; with -O it lobs in a whole pile of stuff.

A.hi should really be thought of as a compiler-readable version of A.o. If you use a .hi file that wasn't generated by the same compilation run that generated the .o file, the compiler may assume all sorts of incorrect things about A, resulting in core dumps and other unpleasant happenings.

3.7.2. Finding interface files

In your program, you import a module Foo by saying import Foo. GHC goes looking for an interface file, Foo.hi. It has a built-in list of directories (notably including .) where it looks.

-i<dirs>

This flag prepends a colon-separated list of dirs to the “import directories” list. See also Section 3.7.4 for the significance of using relative and absolute pathnames in the -i list.

-i

Resets the “import directories” list back to nothing.

-fno-implicit-prelude

GHC normally imports Prelude.hi files for you. If you'd rather it didn't, then give it a -fno-implicit-prelude option. You are unlikely to get very far without a Prelude, but, hey, it's a free country.

-package <lib>

If you are using a system-supplied non-Prelude library (e.g., the POSIX library), just use a -package posix option (for example). The right interface files should then be available. The accompanying HsLibs document lists the libraries available by this mechanism.

-I<dir>

Once a Haskell module has been compiled to C (.hc file), you may wish to specify where GHC tells the C compiler to look for .h files. (Or, if you are using the -cpp option, where it tells the C pre-processor to look…) For this purpose, use a -I option in the usual C-ish way.
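The effect of the -i flags described above on the “import directories” list can be modelled in a few lines of Haskell. This is only an illustrative sketch, not GHC's own code: splitColons, applyIFlag and candidates are hypothetical names, and the starting list is assumed to contain just the current directory.

```haskell
-- Hypothetical model of the "import directories" list; not GHC's real code.

-- Split a colon-separated list such as "lib:../common" into its components.
splitColons :: String -> [FilePath]
splitColons s = case break (== ':') s of
  (dir, [])       -> [dir]
  (dir, _ : rest) -> dir : splitColons rest

-- Apply one -i flag (written here without the leading "-i") to the list.
-- A bare -i resets the list to nothing; -i<dirs> prepends the directories.
applyIFlag :: String -> [FilePath] -> [FilePath]
applyIFlag ""   _    = []
applyIFlag dirs path = splitColons dirs ++ path

-- The places where the interface file for a module would be sought, in order.
candidates :: [FilePath] -> String -> [FilePath]
candidates path modName = [dir ++ "/" ++ modName ++ ".hi" | dir <- path]
```

For instance, applyIFlag "lib:../common" ["."] yields ["lib", "../common", "."], so under this model a Foo.hi in lib would be found before one in the current directory.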

3.7.3. Other options related to interface files

The interface output may be directed to another file bar2/Wurble.iface with the option -ohi bar2/Wurble.iface (not recommended).

To avoid generating an interface file at all, use a -nohi option.

The compiler does not overwrite an existing .hi interface file if the new one is byte-for-byte the same as the old one; this is friendly to make. When an interface does change, it is often enlightening to be informed. The -hi-diffs option will make GHC run diff on the old and new .hi files. You can also record the difference in the interface file itself; the -keep-hi-diffs option takes care of that.

The .hi files from GHC contain “usage” information which changes often and uninterestingly. If you really want to see these changes reported, you need to use the -hi-diffs-with-usages option.

Interface files are normally jammed full of compiler-produced pragmas, which record arities, strictness info, etc. If you think these pragmas are messing you up (or you are doing some kind of weird experiment), you can tell GHC to ignore them with the -fignore-interface-pragmas option.

When compiling without optimisations on, the compiler is extra-careful about not slurping in data constructors and instance declarations that it will not need. If you believe it is getting it wrong and not importing stuff which you think it should, this optimisation can be turned off with -fno-prune-tydecls and -fno-prune-instdecls.

See also Section 3.9.3, which describes how the linker finds standard Haskell libraries.

3.7.4. The recompilation checker

-recomp

(On by default) Turn on recompilation checking. This will stop compilation early, leaving an existing .o file in place, if it can be determined that the module does not need to be recompiled.

-no-recomp

Turn off recompilation checking.

In the olden days, GHC compared the newly-generated .hi file with the previous version; if they were identical, it left the old one alone and didn't change its modification date. In consequence, importers of a module with an unchanged output .hi file were not recompiled.

This doesn't work any more. In our earlier example, module C does not import module A directly, yet changes to A.hi should force a recompilation of C. And some changes to A (changing the definition of a function that appears in an inlining of a function exported by B, say) may conceivably not change B.hi one jot. So now…

GHC keeps a version number on each interface file, and on each type signature within the interface file. It also keeps in every interface file a list of the version numbers of everything it used when it last compiled the file. If the source file's modification date is earlier than the .o file's date (i.e. the source hasn't changed since the file was last compiled), and the -recomp flag is given on the command line, GHC will be clever. It compares the version numbers on the things it needs this time with the version numbers on the things it needed last time (gleaned from the interface file of the module being compiled); if they are all the same it stops compiling rather early in the process saying “Compilation IS NOT required”. What a beautiful sight!
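The check described above can be pictured as a comparison of two tables of version numbers. The following Haskell sketch is a simplified model, not GHC's implementation: the names Usages and canSkip are hypothetical, entities are identified by plain strings, and the timestamp test is reduced to a Boolean supplied by the caller.

```haskell
import qualified Data.Map as Map

type Name    = String
type Version = Int

-- Version numbers of the things a module used when it was last compiled,
-- as recorded in that module's own interface file.
type Usages = Map.Map Name Version

-- True when compilation can stop early: the source is older than the object
-- file, and every recorded usage still has the same version number now.
canSkip :: Bool -> Usages -> Map.Map Name Version -> Bool
canSkip sourceOlderThanObject recorded current =
  sourceOlderThanObject && all unchanged (Map.toList recorded)
  where
    unchanged (name, v) = Map.lookup name current == Just v
```

In this model, a bump in the version of something the module never used does not force a recompilation, whereas a change to any recorded entry (or a newer source file) does.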

Patrick Sansom had a workshop paper about how all this is done (though the details have changed quite a bit). Ask him if you want a copy.

3.7.4.1. Packages

To simplify organisation and compilation, GHC keeps libraries in packages. Packages are also compiled into single libraries on Unix, and DLLs on Windows. The term “package” can be used pretty much synonymously with “library”, except that an application also forms a package, the Main package.

  • A package is a group of modules. It may span many directories, or many packages may exist in a single directory. Packages may not be mutually recursive.

  • A package has a name (e.g. std)

  • Each package is built into a single library (Unix; e.g. libHSfoo.a), or a single DLL (Windows; e.g. HSfoo.dll)

  • The -package-name foo flag tells GHC that the module being compiled is destined for package foo. If this is omitted, the default package, Main, is assumed.

  • The -package foo flag tells GHC to make available modules from package foo. It replaces -syslib foo, which is now deprecated.

  • GHC does not maintain detailed cross-package dependency information. It does remember which modules in other packages the current module depends on, but not which things within those imported things.

All of this tidies up the Prelude enormously. The Prelude and Standard Libraries are built into a single package called std. (This is a change; the library is now called libHSstd.a instead of libHS.a).

It is worth noting that on Windows, because each package is built as a DLL, and a reference to a DLL costs an extra indirection, intra-package references are cheaper than inter-package references. Of course, this applies to the Main package as well. This is not normally the case on most Unices.

3.7.5. Using make

It is reasonably straightforward to set up a Makefile to use with GHC, assuming you name your source files the same as your modules. Thus:

HC      = ghc
HC_OPTS = -cpp $(EXTRA_HC_OPTS)

SRCS = Main.lhs Foo.lhs Bar.lhs
OBJS = Main.o   Foo.o   Bar.o

.SUFFIXES : .o .hs .hi .lhs .hc .s

cool_pgm : $(OBJS)
        rm -f $@
        $(HC) -o $@ $(HC_OPTS) $(OBJS)

# Standard suffix rules
.o.hi:
        @:

.lhs.o:
        $(HC) -c $< $(HC_OPTS)

.hs.o:
        $(HC) -c $< $(HC_OPTS)

# Inter-module dependencies
Foo.o Foo.hc Foo.s    : Baz.hi          # Foo imports Baz
Main.o Main.hc Main.s : Foo.hi Baz.hi   # Main imports Foo and Baz

(Sophisticated make variants may achieve some of the above more elegantly. Notably, gmake's pattern rules let you write the more comprehensible:

%.o : %.lhs
        $(HC) -c $< $(HC_OPTS)

What we've shown should work with any make.)

Note the cheesy .o.hi rule: It records the dependency of the interface (.hi) file on the source. The rule says a .hi file can be made from a .o file by doing…nothing. Which is true.

Note the inter-module dependencies at the end of the Makefile, which take the form

Foo.o Foo.hc Foo.s    : Baz.hi          # Foo imports Baz

They tell make that if any of Foo.o, Foo.hc or Foo.s has an earlier modification date than Baz.hi, then the out-of-date file must be brought up to date. To bring it up to date, make looks for a rule to do so; one of the preceding suffix rules does the job nicely.

3.7.6. Dependency generation

Putting inter-dependencies of the form Foo.o : Bar.hi into your Makefile by hand is rather error-prone. Don't worry, GHC has support for automatically generating the required dependencies. Add the following to your Makefile:

depend :
        ghc -M $(HC_OPTS) $(SRCS)

Now, before you start compiling, and any time you change the imports in your program, do make depend before you do make cool_pgm. ghc -M will append the needed dependencies to your Makefile.

In general, if module A contains the line

import B ...blah...

then ghc -M will generate a dependency line of the form:

A.o : B.hi

If module A contains the line

import {-# SOURCE #-} B ...blah...

then ghc -M will generate a dependency line of the form:

A.o : B.hi-boot

(See Section 3.7.1 for details of interface files.) If A imports multiple modules, then there will be multiple lines with A.o as the target.
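The mapping from imports to dependency lines described above is mechanical, as the following Haskell sketch suggests. It is an illustration only: the Import type and the functions depLine and depLines are hypothetical and are not how ghc -M is implemented.

```haskell
-- One import of a module: ordinary, or marked {-# SOURCE #-}.
data Import = Plain String | Source String

-- The dependency line that one import in module a gives rise to.
depLine :: String -> Import -> String
depLine a (Plain b)  = a ++ ".o : " ++ b ++ ".hi"
depLine a (Source b) = a ++ ".o : " ++ b ++ ".hi-boot"

-- One line per import, each with the importing module's .o as the target.
depLines :: String -> [Import] -> [String]
depLines a = map (depLine a)
```

Under these assumptions, depLines "A" [Plain "B", Source "C"] produces the two lines "A.o : B.hi" and "A.o : C.hi-boot".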

By default, ghc -M generates all the dependencies, and then concatenates them onto the end of makefile (or Makefile if makefile doesn't exist) bracketed by the lines "# DO NOT DELETE: Beginning of Haskell dependencies" and "# DO NOT DELETE: End of Haskell dependencies". If these lines already exist in the makefile, then the old dependencies are deleted first.
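The bracketing scheme just described makes the update repeatable: running it twice leaves a single block of dependencies. A minimal Haskell sketch of that behaviour follows, treating the makefile as a list of lines. The function names stripOldDeps and updateMakefile are hypothetical, and the handling of a missing end marker is an assumption of this sketch rather than documented behaviour.

```haskell
beginMarker, endMarker :: String
beginMarker = "# DO NOT DELETE: Beginning of Haskell dependencies"
endMarker   = "# DO NOT DELETE: End of Haskell dependencies"

-- Remove a previously generated block, markers included.
-- Assumption: if the end marker is missing, everything after the
-- begin marker is dropped.
stripOldDeps :: [String] -> [String]
stripOldDeps ls = case break (== beginMarker) ls of
  (before, [])       -> before
  (before, _ : rest) -> before ++ drop 1 (dropWhile (/= endMarker) rest)

-- Delete any old dependencies, then append the new ones at the end of the
-- makefile, bracketed by the two marker lines.
updateMakefile :: [String] -> [String] -> [String]
updateMakefile deps ls =
  stripOldDeps ls ++ [beginMarker] ++ deps ++ [endMarker]
```

Applying updateMakefile to its own output with a new list of dependencies replaces the old block instead of accumulating a second one.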

Internally, GHC uses a script to generate the dependencies, called mkdependHS. This script has some options of its own, which you might find useful. Options can be passed directly to mkdependHS with GHC's -optdep option. For example, to generate the dependencies into a file called .depend instead of Makefile:

ghc -M -optdep-f -optdep.depend ...

The full list of options accepted by mkdependHS is:

-w

Turn off warnings about interface file shadowing.

-f blah

Use blah as the makefile, rather than makefile or Makefile. If blah doesn't exist, mkdependHS creates it. We often use -f .depend to put the dependencies in .depend and then include the file .depend into Makefile.

-o <osuf>

Use .<osuf> as the "target file" suffix (default: o). Multiple -o flags are permitted (GHC 2.05 onwards). Thus "-o hc -o o" will generate dependencies for .hc and .o files.

-s <suf>

Make extra dependencies that declare that files with suffix .<suf>_<osuf> depend on interface files with suffix .<suf>_hi, or (for {-# SOURCE #-} imports) on .hi-boot. Multiple -s flags are permitted. For example, -o hc -s a -s b will make dependencies for .hc on .hi, .a_hc on .a_hi, and .b_hc on .b_hi. (Useful in conjunction with NoFib "ways".)

--exclude-module=<file>

Regard <file> as "stable"; i.e., exclude it from having dependencies on it.

-x

Same as --exclude-module.

--exclude-directory=<dirs>

Regard the colon-separated list of directories <dirs> as containing stable modules; don't generate any dependencies on modules therein.

-xdirs

Same as --exclude-directory.

--include-module=<file>

Regard <file> as not "stable"; i.e., generate dependencies on it (if any). This option is normally used in conjunction with the --exclude-directory option.

--include-prelude

Regard prelude libraries as unstable, i.e., generate dependencies on the prelude modules used (including Prelude). This option is normally only used by the various system libraries. If a -package option is used, dependencies will also be generated on the library's interfaces.

3.7.7. How to compile mutually recursive modules

Currently, the compiler does not have proper support for dealing with mutually recursive modules:

module A where

import B

newtype TA = MkTA Int

f :: TB -> TA
f (MkTB x) = MkTA x
--------
module B where

import A

data TB = MkTB !Int

g :: TA -> TB
g (MkTA x) = MkTB x

When compiling either module A or B, the compiler will try (in vain) to look for the interface file of the other. So, to get mutually recursive modules off the ground, you need to hand write an interface file for A or B, so as to break the loop. These hand-written interface files are called hi-boot files, and are placed in a file called <module>.hi-boot. To import from an hi-boot file instead of the standard .hi file, use the following syntax in the importing module:

import {-# SOURCE #-} A

The hand-written interface need only contain the bare minimum of information needed to get the bootstrapping process started. For example, it doesn't need to contain declarations for everything that module A exports, only the things required by the module that imports A recursively.

For the example at hand, the boot interface file for A would look like the following:

__interface A 1 404 where
__export A TA{MkTA} ;
1 newtype TA = MkTA PrelBase.Int ;

The syntax is essentially the same as a normal .hi file (unfortunately), but you can usually tailor an existing .hi file to make a .hi-boot file.

Notice that we only put the declaration for the newtype TA in the hi-boot file, not the signature for f, since f isn't used by B.

The number “1” after “__interface A” gives the version number of module A; it is incremented whenever anything in A's interface file changes. The “404” is the version number of the interface file syntax; we change it when we change the syntax of interface files so that you get a better error message when you try to read an old-format file with a new-format compiler.

The number “1” at the beginning of a declaration is the version number of that declaration: for the purposes of .hi-boot files these can all be set to 1. All names must be fully qualified with the original module that an object comes from: for example, the reference to Int in the interface for A comes from PrelBase, which is a module internal to GHC's prelude. It's a pain, but that's the way it is.

If you want an hi-boot file to export a data type, but you don't want to give its constructors (because the constructors aren't used by the SOURCE-importing module), you can write simply:

__interface A 1 404 where
__export A TA;
1 data TA

(You must write all the type parameters, but leave out the '=' and everything that follows it.)

Note: This is all a temporary solution; a version of the compiler that handles mutually recursive modules properly, without the manual construction of interface files, is (allegedly) in the works.