This section describes what files GHC expects to find, what files it creates, where these files are stored, and what options affect this behaviour.
Note that this section is written with hierarchical modules in mind (see Section 7.3.4, “Hierarchical Modules”); hierarchical modules are an extension to Haskell 98 which extends the lexical syntax of module names to include a dot ‘.’. Non-hierarchical modules are thus a special case in which none of the module names contain dots.
Pathname conventions vary from system to system. In particular, the directory separator is ‘/’ on Unix systems and ‘\’ on Windows systems. In the sections that follow, we shall consistently use ‘/’ as the directory separator; substitute the appropriate character for your system.
Each Haskell source module should be placed in a file on its own. Usually, the file should be named after the module name, replacing dots in the module name by directory separators. For example, on a Unix system, the module A.B.C should be placed in the file A/B/C.hs, relative to some base directory. If the module is not going to be imported by another module (Main, for example), then you are free to use any filename for it.
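The naming convention can be written down as a small function. This is only an illustrative sketch: the name moduleSourcePath is invented here, and it assumes ‘/’ as the separator and the .hs extension.

```haskell
-- Map a hierarchical module name to its conventional source path,
-- relative to some base directory. Assumes '/' as the directory
-- separator and a plain (non-literate) .hs source file.
moduleSourcePath :: String -> FilePath
moduleSourcePath modName = map dotToSlash modName ++ ".hs"
  where
    dotToSlash '.' = '/'
    dotToSlash c   = c
```

For instance, moduleSourcePath "A.B.C" evaluates to "A/B/C.hs".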
GHC assumes that source files are ASCII or UTF-8 only; other encodings are not recognised. However, invalid UTF-8 sequences will be ignored in comments, so it is possible to use other encodings such as Latin-1, as long as the non-comment source code is ASCII only.
When asked to compile a source file, GHC normally generates two files: an object file, and an interface file.
The object file, which normally ends in a .o suffix, contains the compiled code for the module.
The interface file, which normally ends in a .hi suffix, contains the information that GHC needs in order to compile further modules that depend on this module. It contains things like the types of exported functions, definitions of data types, and so on. It is stored in a binary format, so don't try to read one; use the --show-iface option instead (see Section 4.7.7, “Other options related to interface files”).
You should think of the object file and the interface file as a pair, since the interface file is in a sense a compiler-readable description of the contents of the object file. If the interface file and object file get out of sync for any reason, then the compiler may end up making assumptions about the object file that aren't true; trouble will almost certainly follow. For this reason, we recommend keeping object files and interface files in the same place (GHC does this by default, but it is possible to override the defaults as we'll explain shortly).
Every module has a module name defined in its source code (module A.B.C where ...).
The name of the object file generated by GHC is derived according to the following rules, where osuf is the object-file suffix (this can be changed with the -osuf option).
If there is no -odir option (the default), then the object filename is derived from the source filename (ignoring the module name) by replacing the suffix with osuf.
If -odir dir has been specified, then the object filename is dir/mod.osuf, where mod is the module name with dots replaced by slashes. GHC will silently create the necessary directory structure underneath dir, if it does not already exist.
The name of the interface file is derived using the same rules, except that the suffix is hisuf (.hi by default) instead of osuf, and the relevant options are -hidir and -hisuf instead of -odir and -osuf respectively.
For example, if GHC compiles the module A.B.C in the file src/A/B/C.hs, with no -odir or -hidir flags, the interface file will be put in src/A/B/C.hi and the object file in src/A/B/C.o.
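The two naming rules can be modelled as a pure function. The following is a sketch of the rules as stated above, not GHC's own code; outputFile, replaceSuffix and modulePath are hypothetical names, and ‘/’ is assumed as the separator.

```haskell
-- Module name with dots replaced by slashes.
modulePath :: String -> FilePath
modulePath = map (\c -> if c == '.' then '/' else c)

-- Replace the suffix of the last path component (the text after its
-- final '.'); if there is no suffix, one is appended.
replaceSuffix :: String -> FilePath -> FilePath
replaceSuffix suf path =
  case break (== '.') revFile of
    (_, '.' : revBase) -> reverse revDir ++ reverse revBase ++ "." ++ suf
    _                  -> path ++ "." ++ suf
  where
    (revFile, revDir) = break (== '/') (reverse path)

-- Derive an object or interface file name.
--   outDir  : argument of -odir / -hidir, if one was given
--   suf     : osuf or hisuf, without the leading dot
--   modName : the module name from the source code
--   srcFile : the source file being compiled
outputFile :: Maybe FilePath -> String -> String -> FilePath -> FilePath
outputFile Nothing    suf _modName srcFile  = replaceSuffix suf srcFile
outputFile (Just dir) suf modName  _srcFile = dir ++ "/" ++ modulePath modName ++ "." ++ suf
```

With no directory option, outputFile Nothing "o" "A.B.C" "src/A/B/C.hs" gives "src/A/B/C.o"; with a directory, outputFile (Just "build") "hi" "A.B.C" "src/A/B/C.hs" gives "build/A/B/C.hi" (the directory name build is just an example).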
For any module that is imported, GHC requires that the name of the module in the import statement exactly matches the name of the module in the interface file (or source file) found using the strategy specified in Section 4.7.3, “The search path”. This means that for most modules, the source file name should match the module name.
However, note that it is reasonable to have a module Main in a file named foo.hs, but this only works because GHC never needs to search for the interface for module Main (because it is never imported). It is therefore possible to have several Main modules in separate source files in the same directory, and GHC will not get confused.
In batch compilation mode, the name of the object file can also be overridden using the -o option, and the name of the interface file can be specified directly using the -ohi option.
In your program, you import a module Foo by saying import Foo. In --make mode or GHCi, GHC will look for a source file for Foo and arrange to compile it first. Without --make, GHC will look for the interface file for Foo, which should have been created by an earlier compilation of Foo. GHC uses the same strategy in each of these cases for finding the appropriate file.
This strategy is as follows: GHC keeps a list of directories called the search path. For each of these directories, it tries appending basename.extension to the directory, and checks whether the file exists. The value of basename is the module name with dots replaced by the directory separator ('/' or '\', depending on the system), and extension is a source extension (hs, lhs) if we are in --make mode or GHCi, or hisuf otherwise.
For example, suppose the search path contains directories d1, d2, and d3, and we are in --make mode looking for the source file for a module A.B.C. GHC will look in d1/A/B/C.hs, d1/A/B/C.lhs, d2/A/B/C.hs, and so on.
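The order in which candidate files are tried can be sketched as a list comprehension. The function name candidateFiles is invented for illustration, and the sketch only enumerates names; it does not check whether the files exist.

```haskell
-- Enumerate the files tried for a module, in order: every directory
-- on the search path is tried with every extension before moving on
-- to the next directory. Assumes '/' as the directory separator.
candidateFiles :: [FilePath] -> [String] -> String -> [FilePath]
candidateFiles searchPath extensions modName =
  [ dir ++ "/" ++ basename ++ "." ++ ext
  | dir <- searchPath
  , ext <- extensions
  ]
  where
    basename = map (\c -> if c == '.' then '/' else c) modName
```

Under these assumptions, candidateFiles ["d1", "d2", "d3"] ["hs", "lhs"] "A.B.C" begins with "d1/A/B/C.hs", "d1/A/B/C.lhs", "d2/A/B/C.hs", matching the example above.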
The search path by default contains a single directory: “.” (i.e. the current directory). The following options can be used to add to or change the contents of the search path:
This isn't the whole story: GHC also looks for modules in pre-compiled libraries, known as packages. See the section on packages (Section 4.9, “ Packages ”) for details.
-o file
GHC's compiled output normally goes into a .hc, .o, etc., file, depending on the last-run compilation phase. The option -o file redirects the output of that last-run phase to the given file.
Note: this “feature” can be counterintuitive: ghc -C -o foo.o foo.hs will put the intermediate C code in the file foo.o, name notwithstanding!
This option is most often used when creating an executable file, to set the filename of the executable. For example:
ghc -o prog --make Main
will compile the program starting with module Main and put the executable in the file prog.
Note: on Windows, if the result is an executable file, the extension ".exe" is added if the specified filename does not already have an extension. Thus
ghc -o foo Main.hs
will compile and link the module in Main.hs, and put the resulting executable in foo.exe (not foo).
If you use ghc --make and you don't use the -o option, the name GHC will choose for the executable will be based on the name of the file containing the module Main. Note that with GHC the Main module doesn't have to be put in file Main.hs. Thus both
ghc --make Prog
and
ghc --make Prog.hs
will produce Prog (or Prog.exe if you are on Windows).
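The Windows rule for -o amounts to a one-line check. The sketch below is a model of the rule as described, with hypothetical names (executableName, hasExtension); it treats any dot in the last path component as an extension, which is an assumption about what "already has an extension" means.

```haskell
-- Does the last path component contain a '.'? Both '/' and '\' are
-- accepted as directory separators.
hasExtension :: FilePath -> Bool
hasExtension path = '.' `elem` takeWhile (`notElem` "/\\") (reverse path)

-- Name of the executable produced for "-o requested".
--   onWindows : whether we are compiling on Windows
executableName :: Bool -> FilePath -> FilePath
executableName onWindows requested
  | onWindows && not (hasExtension requested) = requested ++ ".exe"
  | otherwise                                 = requested
```

So executableName True "foo" is "foo.exe", while executableName False "foo" and executableName True "foo.bin" are returned unchanged.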
-odir dir
Redirects object files to directory dir. For example:
$ ghc -c parse/Foo.hs parse/Bar.hs gurgle/Bumble.hs -odir `uname -m`
The object files, Foo.o, Bar.o, and Bumble.o would be put into a subdirectory named after the architecture of the executing machine (x86, mips, etc).
Note that the -odir option does not affect where the interface files are put; use the -hidir option for that. In the above example, they would still be put in parse/Foo.hi, parse/Bar.hi, and gurgle/Bumble.hi.
-ohi file
The interface output may be directed to another file bar2/Wurble.iface with the option -ohi bar2/Wurble.iface (not recommended).
WARNING: if you redirect the interface file somewhere that GHC can't find it, then the recompilation checker may get confused (at the least, you won't get any recompilation avoidance). We recommend using a combination of -hidir and -hisuf options instead, if possible.
To avoid generating an interface at all, you could use this option to redirect the interface into the bit bucket: -ohi /dev/null, for example.
-hidir dir
Redirects all generated interface files into dir, instead of the default.
-stubdir dir
Redirects all generated FFI stub files into dir. Stub files are generated when the Haskell source contains a foreign export or foreign import "wrapper" declaration (see Section 8.2.1, “Using foreign export and foreign import ccall "wrapper" with GHC”). The -stubdir option behaves in exactly the same way as -odir and -hidir with respect to hierarchical modules.
-outputdir dir
The -outputdir option is shorthand for the combination of -odir, -hidir, and -stubdir.
-osuf suffix, -hisuf suffix, -hcsuf suffix
The -osuf suffix will change the .o file suffix for object files to whatever you specify. We use this when compiling libraries, so that objects for the profiling versions of the libraries don't clobber the normal ones.
Similarly, the -hisuf suffix will change the .hi file suffix for non-system interface files (see Section 4.7.7, “Other options related to interface files”).
Finally, the option -hcsuf suffix will change the .hc file suffix for compiler-generated intermediate C files.
The -hisuf/-osuf game is particularly useful if you want to compile a program both with and without profiling, in the same directory. You can say:
ghc ...
to get the ordinary version, and
ghc ... -osuf prof.o -hisuf prof.hi -prof -auto-all
to get the profiled version.
The following options are useful for keeping certain intermediate files around, when normally GHC would throw these away after compilation:
-keep-hc-file, -keep-hc-files
Keep intermediate .hc files when doing .hs-to-.o compilations via C (NOTE: .hc files aren't generated when using the native code generator; you may need to use -fvia-C to force them to be produced).
-keep-llvm-file, -keep-llvm-files
Keep intermediate .ll files when doing .hs-to-.o compilations via LLVM (NOTE: .ll files aren't generated when using the native code generator; you may need to use -fllvm to force them to be produced).
-keep-s-file, -keep-s-files
Keep intermediate .s files.
-keep-raw-s-file, -keep-raw-s-files
Keep intermediate .raw-s files. These are the direct output from the C compiler, before GHC does “assembly mangling” to produce the .s file. Again, these are not produced when using the native code generator.
-keep-tmp-files
Instructs the GHC driver not to delete any of its temporary files, which it normally keeps in /tmp (or possibly elsewhere; see Section 4.7.6, “Redirecting temporary files”). Running GHC with -v will show you what temporary files were generated along the way.
-tmpdir
If you have trouble because of running out of space in /tmp (or wherever your installation thinks temporary files should go), you may use the -tmpdir <dir> option to specify an alternate directory. For example, -tmpdir . says to put temporary files in the current working directory.
Alternatively, use your TMPDIR environment variable. Set it to the name of the directory where temporary files should be put. GCC and other programs will honour the TMPDIR variable as well.
Even better idea: Set the DEFAULT_TMPDIR make variable when building GHC, and never worry about TMPDIR again (see the build documentation).
-ddump-hi
Dumps the new interface to standard output.
-ddump-hi-diffs
The compiler does not overwrite an existing .hi interface file if the new one is the same as the old one; this is friendly to make. When an interface does change, it is often enlightening to be informed. The -ddump-hi-diffs option will make GHC report the differences between the old and new .hi files.
-ddump-minimal-imports
Dump to the file "M.imports" (where M is the module being compiled) a "minimal" set of import declarations. You can safely replace all the import declarations in "M.hs" with those found in "M.imports". Why would you want to do that? Because the "minimal" imports (a) import everything explicitly, by name, and (b) import nothing that is not required. It can be quite painful to maintain this property by hand, so this flag is intended to reduce the labour.
--show-iface file
Dumps the contents of the interface file named file in a human-readable (ish) format. See Section 4.5, “Modes of operation”.
In the olden days, GHC compared the newly-generated .hi file with the previous version; if they were identical, it left the old one alone and didn't change its modification date. In consequence, importers of a module with an unchanged output .hi file were not recompiled.
This doesn't work any more. Suppose module C imports module B, and B imports module A. So changes to module A might require module C to be recompiled, and hence when A.hi changes we should check whether C should be recompiled. However, the dependencies of C will only list B.hi, not A.hi, and some changes to A (changing the definition of a function that appears in an inlining of a function exported by B, say) may conceivably not change B.hi one jot. So now…
GHC calculates a fingerprint (in fact an MD5 hash) of each interface file, and of each declaration within the interface file. It also keeps in every interface file a list of the fingerprints of everything it used when it last compiled the file. If the source file's modification date is earlier than the .o file's date (i.e. the source hasn't changed since the file was last compiled), and the recompilation checking is on, GHC will be clever. It compares the fingerprints on the things it needs this time with the fingerprints on the things it needed last time (gleaned from the interface file of the module being compiled); if they are all the same it stops compiling early in the process saying “Compilation IS NOT required”. What a beautiful sight!
You can read about how all this works in the GHC commentary.
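The fingerprint comparison can be pictured with a deliberately simplified model. This is not GHC's implementation: fingerprints are represented as plain strings, usages as association lists, and all names (Fingerprint, usagesChanged, recompilationRequired) are invented for this sketch.

```haskell
-- A fingerprint, treated here as an opaque string (e.g. a hex digest).
type Fingerprint = String

-- True if anything recorded as used last time now has a different
-- fingerprint, or can no longer be found at all.
--   recorded : (name, fingerprint) pairs saved by the last compilation
--   current  : fingerprints of the same names as they are now
usagesChanged :: [(String, Fingerprint)] -> [(String, Fingerprint)] -> Bool
usagesChanged recorded current = any changed recorded
  where
    changed (name, fp) = lookup name current /= Just fp

-- Recompile if the source is newer than the object file, or if any
-- recorded usage has changed.
recompilationRequired
  :: Bool -> [(String, Fingerprint)] -> [(String, Fingerprint)] -> Bool
recompilationRequired sourceNewerThanObject recorded current =
  sourceNewerThanObject || usagesChanged recorded current
```

In this model a change to a declaration that the module never used leaves its recorded usages intact, so no recompilation is triggered.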
GHC supports the compilation of mutually recursive modules. This section explains how.
Every cycle in the module import graph must be broken by a hs-boot file. Suppose that modules A.hs and B.hs are Haskell source files, thus:
module A where
    import B( TB(..) )

    newtype TA = MkTA Int

    f :: TB -> TA
    f (MkTB x) = MkTA x

module B where
    import {-# SOURCE #-} A( TA(..) )

    data TB = MkTB !Int

    g :: TA -> TB
    g (MkTA x) = MkTB x
Here A imports B, but B imports A with a {-# SOURCE #-} pragma, which breaks the circular dependency. Every loop in the module import graph must be broken by a {-# SOURCE #-} import; or, equivalently, the module import graph must be acyclic if {-# SOURCE #-} imports are ignored.
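The acyclicity condition can be checked mechanically. The sketch below is one way to do so with a depth-first search over ordinary imports only; it is not how GHC performs the check, and Import, ModuleGraph and hasUnbrokenCycle are names made up for the illustration.

```haskell
import Data.Maybe (fromMaybe)

-- One import declaration: the imported module, and whether it carries
-- a {-# SOURCE #-} pragma.
data Import = Import
  { importedModule :: String
  , isSourceImport :: Bool
  }

-- Each module paired with its import declarations.
type ModuleGraph = [(String, [Import])]

-- True if some module can reach itself through ordinary imports alone,
-- i.e. there is a cycle that no {-# SOURCE #-} import breaks.
hasUnbrokenCycle :: ModuleGraph -> Bool
hasUnbrokenCycle graph = any onCycle (map fst graph)
  where
    -- Ordinary (non-SOURCE) imports of a module.
    ordinary m =
      [ importedModule i
      | i <- fromMaybe [] (lookup m graph)
      , not (isSourceImport i)
      ]
    onCycle m = m `elem` reach [] (ordinary m)
    -- Collect every module reachable from the worklist.
    reach seen [] = seen
    reach seen (x : xs)
      | x `elem` seen = reach seen xs
      | otherwise     = reach (x : seen) (ordinary x ++ xs)
```

For the A/B example above, the graph [("A", [Import "B" False]), ("B", [Import "A" True])] has no unbroken cycle; changing the second import to an ordinary one makes hasUnbrokenCycle return True.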
For every module A.hs that is {-# SOURCE #-}-imported in this way there must exist a source file A.hs-boot. This file contains an abbreviated version of A.hs, thus:
module A where
    newtype TA = MkTA Int
To compile these three files, issue the following commands:
  ghc -c A.hs-boot    -- Produces A.hi-boot, A.o-boot
  ghc -c B.hs         -- Consumes A.hi-boot, produces B.hi, B.o
  ghc -c A.hs         -- Consumes B.hi, produces A.hi, A.o
  ghc -o foo A.o B.o  -- Linking the program
There are several points to note here:
The file A.hs-boot is a programmer-written source file. It must live in the same directory as its parent source file A.hs. Currently, if you use a literate source file A.lhs you must also use a literate boot file, A.lhs-boot; and vice versa.
A hs-boot file is compiled by GHC, just like a hs file:
ghc -c A.hs-boot
When a hs-boot file A.hs-boot is compiled, it is checked for scope and type errors. When its parent module A.hs is compiled, the two are compared, and an error is reported if the two are inconsistent.
Just as compiling A.hs produces an interface file A.hi, and an object file A.o, so compiling A.hs-boot produces an interface file A.hi-boot, and a pseudo-object file A.o-boot:
The pseudo-object file A.o-boot is empty (don't link it!), but it is very useful when using a Makefile, to record when the A.hi-boot was last brought up to date (see Section 4.7.10, “Using make”).
The hi-boot generated by compiling a hs-boot file is in the same machine-generated binary format as any other GHC-generated interface file (e.g. B.hi). You can display its contents with ghc --show-iface. If you specify a directory for interface files with the -hidir flag, then that affects hi-boot files too.
If hs-boot files are considered distinct from their parent source files, and if a {-# SOURCE #-} import is considered to refer to the hs-boot file, then the module import graph must have no cycles. The command ghc -M will report an error if a cycle is found.
A module M that is {-# SOURCE #-}-imported in a program will usually also be ordinarily imported elsewhere. If not, ghc --make automatically adds M to the set of modules it tries to compile and link, to ensure that M's implementation is included in the final program.
A hs-boot file need only contain the bare minimum of information needed to get the bootstrapping process started. For example, it doesn't need to contain declarations for everything that module A exports, only the things required by the module(s) that import A recursively.
A hs-boot file is written in a subset of Haskell:
The module header (including the export list), and import statements, are exactly as in Haskell, and so are the scoping rules. Hence, to mention a non-Prelude type or class, you must import it.
There must be no value declarations, but there can be type signatures for values. For example:
double :: Int -> Int
Fixity declarations are exactly as in Haskell.
Type synonym declarations are exactly as in Haskell.
A data type declaration can either be given in full, exactly as in Haskell, or it can be given abstractly, by omitting the '=' sign and everything that follows. For example:
data T a b
In a source program this would declare T to have no constructors (a GHC extension: see Section 7.4.1, “Data types with no constructors”), but in an hs-boot file it means "I don't know or care what the constructors are". This is the most common form of data type declaration, because it's easy to get right. You can also write out the constructors but, if you do so, you must write them out precisely as in the real definition.
If you do not write out the constructors, you may need to give a kind annotation (Section 7.8.4, “Explicitly-kinded quantification”), to tell GHC the kind of the type variable, if it is not "*". (In source files, this is worked out from the way the type variable is used in the constructors.) For example:
data R (x :: * -> *) y
You cannot use deriving on a data type declaration; write an instance declaration instead.
Class declarations are exactly as in Haskell, except that you may not include default method declarations. You can also omit all the superclasses and class methods entirely; but you must either omit them all or put them all in.
You can include instance declarations just as in Haskell; but omit the "where" part.
It is reasonably straightforward to set up a Makefile to use with GHC, assuming you name your source files the same as your modules. Thus:
HC      = ghc
HC_OPTS = -cpp $(EXTRA_HC_OPTS)

SRCS = Main.lhs Foo.lhs Bar.lhs
OBJS = Main.o   Foo.o   Bar.o

.SUFFIXES : .o .hs .hi .lhs .hc .s

cool_pgm : $(OBJS)
	rm -f $@
	$(HC) -o $@ $(HC_OPTS) $(OBJS)

# Standard suffix rules
.o.hi:
	@:

.lhs.o:
	$(HC) -c $< $(HC_OPTS)

.hs.o:
	$(HC) -c $< $(HC_OPTS)

.o-boot.hi-boot:
	@:

.lhs-boot.o-boot:
	$(HC) -c $< $(HC_OPTS)

.hs-boot.o-boot:
	$(HC) -c $< $(HC_OPTS)

# Inter-module dependencies
Foo.o Foo.hc Foo.s    : Baz.hi          # Foo imports Baz
Main.o Main.hc Main.s : Foo.hi Baz.hi   # Main imports Foo and Baz
(Sophisticated make variants may achieve some of the above more elegantly. Notably, gmake's pattern rules let you write the more comprehensible:
%.o : %.lhs
	$(HC) -c $< $(HC_OPTS)
What we've shown should work with any make.)
Note the cheesy .o.hi rule: It records the dependency of the interface (.hi) file on the source. The rule says a .hi file can be made from a .o file by doing…nothing. Which is true.
Note that the suffix rules are all given twice, once for normal Haskell source files, and once for hs-boot files (see Section 4.7.9, “How to compile mutually recursive modules”).
Note also the inter-module dependencies at the end of the Makefile, which take the form
Foo.o Foo.hc Foo.s : Baz.hi # Foo imports Baz
They tell make that if any of Foo.o, Foo.hc or Foo.s have an earlier modification date than Baz.hi, then the out-of-date file must be brought up to date. To bring it up to date, make looks for a rule to do so; one of the preceding suffix rules does the job nicely. These dependencies can be generated automatically by ghc; see Section 4.7.11, “Dependency generation”.
Putting inter-dependencies of the form Foo.o : Bar.hi into your Makefile by hand is rather error-prone. Don't worry, GHC has support for automatically generating the required dependencies. Add the following to your Makefile:
depend :
	ghc -M $(HC_OPTS) $(SRCS)
Now, before you start compiling, and any time you change the imports in your program, do make depend before you do make cool_pgm. The command ghc -M will append the needed dependencies to your Makefile.
In general, ghc -M Foo does the following. For each module M in the set Foo plus all its imports (transitively), it adds to the Makefile:
A line recording the dependence of the object file on the source file.
M.o : M.hs
(or M.lhs if that is the filename you used).
For each import declaration import X in M, a line recording the dependence of M on X:
M.o : X.hi
For each import declaration import {-# SOURCE #-} X in M, a line recording the dependence of M on X:
M.o : X.hi-boot
(See Section 4.7.9, “How to compile mutually recursive modules” for details of hi-boot style interface files.)
If M imports multiple modules, then there will be multiple lines with M.o as the target.
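The lines listed above can be generated by a short function. This is a sketch of the rules as stated, not the code inside ghc -M; the names Import and dependencyLines are hypothetical, and it assumes non-hierarchical module names with all files in the current directory and default suffixes.

```haskell
-- One import declaration: the imported module, and whether it carries
-- a {-# SOURCE #-} pragma.
data Import = Import
  { importedModule :: String
  , isSourceImport :: Bool
  }

-- Makefile lines for one module: first the dependence of the object
-- file on the source file, then one line per import declaration.
--   modName : module name, e.g. "M"
--   srcFile : its source file, e.g. "M.hs" or "M.lhs"
dependencyLines :: String -> FilePath -> [Import] -> [String]
dependencyLines modName srcFile imports =
  (object ++ " : " ++ srcFile) : map importLine imports
  where
    object = modName ++ ".o"
    importLine imp =
      object ++ " : " ++ importedModule imp ++ interfaceSuffix imp
    -- SOURCE imports depend on the hi-boot file, ordinary ones on .hi.
    interfaceSuffix imp
      | isSourceImport imp = ".hi-boot"
      | otherwise          = ".hi"
```

For module B of the mutually recursive example, dependencyLines "B" "B.hs" [Import "A" True] yields the two lines "B.o : B.hs" and "B.o : A.hi-boot".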
There is no need to list all of the source files as arguments to the ghc -M command; ghc traces the dependencies, just like ghc --make (a new feature in GHC 6.4).
Note that ghc -M needs to find a source file for each module in the dependency graph, so that it can parse the import declarations and follow dependencies. Any pre-compiled modules without source files must therefore belong to a package.
By default, ghc -M generates all the dependencies, and then concatenates them onto the end of makefile (or Makefile if makefile doesn't exist) bracketed by the lines "# DO NOT DELETE: Beginning of Haskell dependencies" and "# DO NOT DELETE: End of Haskell dependencies". If these lines already exist in the makefile, then the old dependencies are deleted first.
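The effect on the makefile can be modelled on a list of lines. The sketch below is an illustration only: updateMakefile is an invented name, and it writes the regenerated block where the old one was (which is the end of the file when nothing has been added after it); that placement is an assumption rather than something stated above.

```haskell
beginMarker, endMarker :: String
beginMarker = "# DO NOT DELETE: Beginning of Haskell dependencies"
endMarker   = "# DO NOT DELETE: End of Haskell dependencies"

-- Replace the bracketed dependency block with freshly generated
-- dependencies, or append a new block if the markers are absent.
--   deps          : the newly generated dependency lines
--   makefileLines : the existing makefile, one element per line
updateMakefile :: [String] -> [String] -> [String]
updateMakefile deps makefileLines =
  before ++ [beginMarker] ++ deps ++ [endMarker] ++ after
  where
    -- Everything up to (not including) the old begin marker.
    (before, rest) = break (== beginMarker) makefileLines
    -- Everything after the old end marker; empty if there was no block.
    after = drop 1 (dropWhile (/= endMarker) rest)
```

Running it twice with the same dependencies gives the same result as running it once, which is the point of deleting the old block first.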
Don't forget to use the same -package options on the ghc -M command line as you would when compiling; this enables the dependency generator to locate any imported modules that come from packages. The package modules won't be included in the dependencies generated, though (but see the --include-pkg-deps option below).
The dependency generation phase of GHC can take some additional options, which you may find useful. The options which affect dependency generation are:
-ddump-mod-cycles
Display a list of the cycles in the module graph. This is useful when trying to eliminate such cycles.
-v2
Print a full list of the module dependencies to stdout. (This is the standard verbosity flag, so the list will also be displayed with -v3 and -v4; see Section 4.6, “Help and verbosity options”.)
-dep-makefile file
Use file as the makefile, rather than makefile or Makefile. If file doesn't exist, ghc -M creates it. We often use -dep-makefile .depend to put the dependencies in .depend and then include the file .depend into Makefile.
-dep-suffix <suf>
Make extra dependencies that declare that files with suffix .<suf>_<osuf> depend on interface files with suffix .<suf>_hi, or (for {-# SOURCE #-} imports) on .hi-boot. Multiple -dep-suffix flags are permitted. For example, -dep-suffix a -dep-suffix b will make dependencies for .o on .hi, .a_o on .a_hi, and .b_o on .b_hi. (Useful in conjunction with NoFib "ways".)
--exclude-module=<file>
Regard <file> as "stable"; i.e., exclude it from having dependencies on it.
--include-pkg-deps
Regard modules imported from packages as unstable, i.e., generate dependencies on any imported package modules (including Prelude, and all other standard Haskell libraries). Dependencies are not traced recursively into packages; dependencies are only generated for home-package modules on external-package modules directly imported by the home package module. This option is normally only used by the various system libraries.
Haskell specifies that when compiling module M, any instance declaration in any module "below" M is visible. (Module A is "below" M if A is imported directly by M, or if A is below a module that M imports directly.) In principle, GHC must therefore read the interface files of every module below M, just in case they contain an instance declaration that matters to M. This would be a disaster in practice, so GHC tries to be clever.
In particular, if an instance declaration is in the same module as the definition of any type or class mentioned in the head of the instance declaration (the part after the “=>”; see Section 7.6.3.2, “Relaxed rules for instance contexts”), then GHC has to visit that interface file anyway. Example:
module A where
    instance C a => D (T a) where ...
    data T a = ...
The instance declaration is only relevant if the type T is in use, and if so, GHC will have visited A's interface file to find T's definition.
The only problem comes when a module contains an instance declaration and GHC has no other reason for visiting the module. Example:
module Orphan where
    instance C a => D (T a) where ...
    class C a where ...
Here, neither D nor T is declared in module Orphan. We call such modules “orphan modules”. GHC identifies orphan modules, and visits the interface file of every orphan module below the module being compiled. This is usually wasted work, but there is no avoiding it. You should therefore do your best to have as few orphan modules as possible.
Functional dependencies complicate matters. Suppose we have:
module B where
    instance E T Int where ...
    data T = ...
Is this an orphan module? Apparently not, because T is declared in the same module. But suppose class E had a functional dependency:
module Lib where
    class E x y | y -> x where ...
Then in some importing module M, the constraint (E a Int) should be "improved" by setting a = T, even though there is no explicit mention of T in M.
An orphan module contains at least one orphan instance or at least one orphan rule.
An instance declaration in a module M is an orphan instance if
The class of the instance declaration is not declared in M, and
Either the class has no functional dependencies, and none of the type constructors in the instance head is declared in M; or there is a functional dependency for which none of the type constructors mentioned in the non-determined part of the instance head is defined in M.
Only the instance head counts. In the example above, it is not good enough for C's declaration to be in module A; it must be the declaration of D or T.
A rewrite rule in a module M is an orphan rule if none of the variables, type constructors, or classes that are free in the left hand side of the rule are declared in M.
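The definition of an orphan instance can be turned into a predicate. The sketch below models only the rule as worded above; ClassInfo, InstanceHead and isOrphanInstance are hypothetical names, and a functional dependency is represented simply by the argument positions it determines.

```haskell
-- A class, with one entry per functional dependency listing the
-- (zero-based) argument positions that the dependency determines.
data ClassInfo = ClassInfo
  { className  :: String
  , determined :: [[Int]]
  }

-- An instance head: the class, and for each class argument the type
-- constructors mentioned in it.
data InstanceHead = InstanceHead
  { headClass :: ClassInfo
  , headArgs  :: [[String]]
  }

-- Is the instance an orphan in a module that declares the given
-- classes and type constructors?
isOrphanInstance :: [String] -> InstanceHead -> Bool
isOrphanInstance declaredHere (InstanceHead cls args) =
  className cls `notElem` declaredHere && (noFundepCase || fundepCase)
  where
    noneLocal tyConLists = not (any (`elem` declaredHere) (concat tyConLists))
    -- No functional dependencies: no type constructor in the head is local.
    noFundepCase = null (determined cls) && noneLocal args
    -- Some functional dependency whose non-determined part mentions
    -- no locally declared type constructor.
    fundepCase = any nonDeterminedPartIsForeign (determined cls)
    nonDeterminedPartIsForeign det =
      noneLocal [ tyCons | (i, tyCons) <- zip [0 ..] args, i `notElem` det ]
```

For module B above, instance E T Int with the dependency y -> x is modelled as InstanceHead (ClassInfo "E" [[0]]) [["T"], ["Int"]] with declaredHere = ["T"]; the predicate returns True, whereas without the functional dependency it returns False.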
If you use the flag -fwarn-orphans, GHC will warn you if you are creating an orphan module. Like any warning, you can switch the warning off with -fno-warn-orphans, and -Werror will make the compilation fail if the warning is issued.
You can identify an orphan module by looking in its interface file, M.hi, using the --show-iface mode. If there is an [orphan module] on the first line, GHC considers it an orphan module.