This section describes how to port GHC to a currently unsupported platform. There are two distinct possibilities:
The hardware architecture for your system is already
supported by GHC, but you're running an OS that isn't
supported (or perhaps has been supported in the past, but
currently isn't). This is the easiest type of porting job,
but it still requires some careful bootstrapping. Proceed to
Section 10.1, “Booting/porting from C (.hc) files”.
Your system's hardware architecture isn't supported by GHC. This will be a more difficult port (though by comparison perhaps not as difficult as porting gcc). Proceed to Section 10.2, “Porting GHC to a new architecture”.
Bootstrapping GHC on a system without GHC already installed is achieved by taking the intermediate C files (known as HC files) from another GHC compilation and compiling them with gcc to get a working GHC.
NOTE: GHC versions 5.xx were hard to bootstrap from C. We recommend using GHC 6.0.1 or later.
HC files are platform-dependent, so you have to get a set that was generated on the same platform. There may be some supplied on the GHC download page; otherwise you'll have to compile some yourself, or start from unregisterised HC files (see Section 10.2, “Porting GHC to a new architecture”).
The following steps should result in a working GHC build with full libraries:
Unpack the HC files on top of a fresh source tree
(make sure the source tree version matches the version of
the HC files exactly!). This will place matching .hc files
next to the corresponding Haskell source (.hs or .lhs) in the
compiler subdirectory ghc/compiler and in the libraries
(subdirectories of hslibs and libraries).
The actual build process is fully automated by the hc-build
script located in the distrib directory. If you eventually
want to install GHC into the directory dir, the following
command will execute the whole build process (it won't
install yet):
$ distrib/hc-build --prefix=dir
By default, the installation directory is /usr/local. If that
is what you want, you may omit the argument to hc-build.
Generally, any option given to hc-build is passed through to
the configuration script configure. If hc-build successfully
completes the build process, you can install the resulting
system, as normal, with
$ make install
The first step in porting to a new architecture is to get an unregisterised build working. An unregisterised build is one that compiles via vanilla C only. By contrast, a registerised build uses the following architecture-specific hacks for speed:
Global register variables: certain abstract machine
“registers” are mapped to real machine registers, depending
on how many machine registers are available (see
ghc/includes/MachRegs.h).
Assembly-mangling: when compiling via C, we feed the
assembly generated by gcc through a Perl script known as the
mangler (see ghc/driver/mangler/ghc-asm.lprl). The mangler
rearranges the assembly to support tail-calls and various
other optimisations.
In an unregisterised build, neither of these hacks is used: the idea is that the C code generated by the compiler should compile using gcc only. The lack of these optimisations costs about a factor of two in performance, but since unregisterised compilation is usually just a step on the way to a full registerised port, we don't mind too much.
Notes on GHC portability in general: we've tried to stick
to writing portable code in most parts of the system, so it
should compile on any POSIXish system with gcc, but in our
experience most systems differ from the standards in one way or
another. Deal with any problems as they arise - if you get
stuck, ask the experts on <glasgow-haskell-users@haskell.org>.
Lots of useful information about the innards of GHC is available in the GHC Commentary, which might be helpful if you run into some code which needs tweaking for your system.
NOTE! These instructions apply to GHC 6.4 and (hopefully) later. If you need instructions for an earlier version of GHC, try to get hold of the version of this document that was current at the time. It should be available from the appropriate download page on the GHC homepage.
In this section, we explain how to bootstrap GHC on a new platform, using unregisterised intermediate C files. We haven't put a great deal of effort into automating this process, for two reasons: it is done very rarely, and the process usually requires human intervention to cope with minor porting issues anyway.
The following step-by-step instructions should result in a fully working, albeit unregisterised, GHC. Firstly, you need a machine that already has a working GHC (we'll call this the host machine), in order to cross-compile the intermediate C files that we will use to bootstrap the compiler on the target machine.
On the target machine:
Unpack a source tree (preferably a released version). We
will call the path to the root of this tree T.
$ cd T
$ ./configure --enable-hc-boot --enable-hc-boot-unregisterised
You might need to update configure.in to recognise the new
architecture, and re-generate configure with autoreconf.
$ cd T/ghc/includes
$ make
On the host machine:
Unpack a source tree (same released version). Call this
directory H.
$ cd H
$ ./configure
Create H/mk/build.mk, with the following contents:
GhcUnregisterised = YES
GhcLibHcOpts = -O -fvia-C -keep-hc-files
GhcRtsHcOpts = -keep-hc-files
GhcLibWays =
SplitObjs = NO
GhcWithNativeCodeGen = NO
GhcWithInterpreter = NO
GhcStage1HcOpts = -O -fasm
GhcStage2HcOpts = -O -fvia-C -keep-hc-files
SRC_HC_OPTS += -H32m
GhcBootLibs = YES
Edit H/mk/config.mk: change TARGETPLATFORM appropriately,
and set the variables involving TARGET to the correct values
for the target platform. This step is necessary because
currently configure doesn't cope with specifying different
values for the --host and --target flags.
Copy the LeadingUnderscore setting from the target.
Copy T/ghc/includes/ghcautoconf.h,
T/ghc/includes/DerivedConstants.h, and
T/ghc/includes/GHCConstants.h to H/ghc/includes.
Note that we are building on the host machine, using the
target machine's configuration files. This
is so that the intermediate C files generated here will
be suitable for compiling on the target system.
Touch the generated configuration files, just to make sure they don't get replaced during the build:
$ touch H/ghc/includes/{ghcautoconf.h,DerivedConstants.h,GHCConstants.h}
Now build the compiler:
$ cd H/glafp-utils && make boot && make
$ cd H/ghc && make boot && make
Don't worry if the build falls over in the RTS; we don't need the RTS yet.
$ cd H/libraries
$ make boot && make
$ cd H/ghc/compiler
$ make boot stage=2 && make stage=2
$ cd H/ghc/lib
$ make clean
$ make -k UseStage1=YES EXTRA_HC_OPTS='-O -fvia-C -keep-hc-files'
$ cd H/ghc/utils
$ make clean
$ make -k UseStage1=YES EXTRA_HC_OPTS='-O -fvia-C -keep-hc-files'
$ cd H
$ make hc-file-bundle Project=Ghc
Copy H/*-hc.tar.gz to T/.. (that is, the parent directory of T).
On the target machine:
At this stage we simply need to bootstrap a compiler
from the intermediate C files we generated above. The
process of bootstrapping from C files is automated by the
script in distrib/hc-build, and is described in
Section 10.1, “Booting/porting from C (.hc) files”.
$ ./distrib/hc-build --enable-hc-boot-unregisterised
However, since this is a bootstrap on a new machine,
the automated process might not run to completion the
first time. For that reason, you might want to treat the
hc-build script as a list of
instructions to follow, rather than as a fully automated
script. This way you'll be able to restart the process
part-way through if you need to fix anything on the
way.
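One way to make such a manual run restartable is to wrap each step so that a completed step is recorded and skipped on the next attempt. The following is only a sketch: run_step and the .bootstrap-stamps directory are names made up for this illustration, not something hc-build provides, and you would still have to read hc-build to find out what the actual steps are.

```shell
#!/bin/sh
# Sketch of a resumable step runner (hypothetical helper, not part of GHC).
# Each step that succeeds leaves a stamp file behind, so a rerun skips it.

# Directory holding one stamp file per completed step (an assumed name).
STAMP_DIR="${STAMP_DIR:-.bootstrap-stamps}"

# run_step NAME COMMAND [ARGS...]
# Runs COMMAND unless NAME already has a stamp. On failure, no stamp is
# written and COMMAND's exit status is returned.
run_step() {
    name=$1
    shift
    mkdir -p "$STAMP_DIR" || return 1
    if [ -e "$STAMP_DIR/$name" ]; then
        echo "skip: $name"
        return 0
    fi
    echo "run:  $name"
    if "$@"; then
        : > "$STAMP_DIR/$name"
    else
        status=$?
        echo "FAILED: $name (fix the problem, then rerun)" >&2
        return "$status"
    fi
}
```

Calling run_step with a step name followed by a command runs that command the first time and prints a "skip" line on later runs. Deleting the corresponding stamp file forces the step to run again.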
Don't bother with running make install in the newly
bootstrapped tree; just use the compiler in that tree to
build a fresh compiler from scratch, this time without
booting from C files. Before doing this, you might want
to check that the bootstrapped compiler is generating
working binaries:
$ cat >hello.hs
main = putStrLn "Hello World!\n"
^D
$ T/ghc/compiler/ghc-inplace hello.hs -o hello
$ ./hello
Hello World!
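If you expect to repeat this check several times while fixing porting problems, it may be worth scripting it. The helper below is a hedged sketch: check_output is a name invented here, not part of the GHC tree. It compares a program's standard output against the expected text.

```shell
#!/bin/sh
# Sketch: check that a freshly built program prints what we expect.
# check_output is a hypothetical helper, not part of the GHC tree.

# check_output EXPECTED COMMAND [ARGS...]
# Succeeds only if COMMAND exits with status 0 and its standard output
# (minus trailing newlines, which $(...) strips) equals EXPECTED.
check_output() {
    expected=$1
    shift
    actual=$("$@") || {
        echo "FAIL: '$*' exited with a non-zero status" >&2
        return 1
    }
    if [ "$actual" = "$expected" ]; then
        echo "ok: $*"
    else
        echo "FAIL: expected '$expected' but got '$actual'" >&2
        return 1
    fi
}

# Usage, after compiling hello.hs as shown above:
#   check_output "Hello World!" ./hello
```

Because command substitution drops trailing newlines, the extra newline in the string passed to putStrLn does not affect the comparison.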
Once you have the unregisterised compiler up and running, you can use it to start a registerised port. The following sections describe the various parts of the system that will need architecture-specific tweaks in order to get a registerised build going.
The following files need architecture-specific code for a registerised build:
ghc/includes/MachRegs.h
Defines the STG-register to machine-register mapping. You need to know your platform's C calling convention, and which registers are generally available for mapping to global register variables. There are plenty of useful comments in this file.
ghc/includes/TailCalls.h
Macros that cooperate with the mangler (see Section 10.2.3, “The mangler”) to make proper tail-calls work.
ghc/rts/Adjustor.c
Support for foreign import "wrapper" (aka foreign export
dynamic). Not essential for getting GHC bootstrapped, so
this file can be deferred until later if necessary.
ghc/rts/StgCRun.c
The little assembly layer between the C world and the Haskell world. See the comments and code for the other architectures in this file for pointers.
ghc/rts/MBlock.h, ghc/rts/MBlock.c
These files are really OS-specific rather than
architecture-specific. MBlock.h specifies the absolute
location at which the RTS should try to allocate memory on
your platform (try to find an area which doesn't conflict
with code or dynamic libraries). In MBlock.c you might need
to tweak the call to mmap() for your OS.
The mangler is an evil Perl script
(ghc/driver/mangler/ghc-asm.lprl) that rearranges the
assembly code output from gcc to do two main things:
Remove function prologues and epilogues, and all movement of the C stack pointer. This is to support tail-calls: every code block in Haskell code ends in an explicit jump, so we don't want the C-stack overflowing while we're jumping around between code blocks.
Move the info table for a closure next to the entry code for that closure. In unregisterised code, info tables contain a pointer to the entry code, but in registerised compilation we arrange that the info table is shoved right up against the entry code, and addressed backwards from the entry code pointer (this saves a word in the info table and an extra indirection when jumping to the closure entry code).
The mangler is abstracted to a certain extent over some architecture-specific things such as the particular assembler directives used to herald symbols. Take a look at the definitions for other architectures and use these as a starting point.
The splitter is another evil Perl script
(ghc/driver/split/ghc-split.lprl). It cooperates with the
mangler to support object splitting. Object splitting is
what happens when the -split-objs option is passed to GHC:
the object file is split into many smaller objects. This
feature is used when building libraries, so that a program
statically linked against the library will pull in less of
the library.
The splitter has some platform-specific stuff; take a look and tweak it for your system.
The native code generator isn't essential to getting a registerised build going, but it's a desirable thing to have because it can cut compilation times in half. The native code generator is described in some detail in the GHC commentary.
To support GHCi, you need to port the dynamic linker
(fptools/ghc/rts/Linker.c). The linker currently supports
the ELF and PEi386 object file formats; if your platform
uses one of these then things will be significantly easier.
The majority of Unix platforms use the ELF format these
days. Even so, there are some machine-specific parts of the
ELF linker: for example, the code for resolving particular
relocation types is machine-specific, so some porting of
this code to your architecture will probably be necessary.
If your system uses a different object file format, then you have to write a linker — good luck!
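To see which case applies to you, you can inspect the first few bytes of an object file produced by your platform's C compiler. The sketch below only recognises the ELF magic number (the byte 0x7f followed by the letters ELF) and reports everything else as unknown. object_format is a name invented for this illustration, and it assumes the standard od and tr utilities are available.

```shell
#!/bin/sh
# Sketch: guess an object file's format from its first bytes.
# object_format is a hypothetical helper, not part of the GHC tree.

# object_format FILE
# Prints "ELF" if FILE starts with the ELF magic number (0x7f 'E' 'L' 'F'),
# otherwise prints "unknown". It does not try to recognise PEi386 or any
# other format.
object_format() {
    [ -r "$1" ] || { echo "cannot read: $1" >&2; return 1; }
    # od prints the first four bytes as hex; tr removes all whitespace.
    magic=$(od -An -tx1 -N4 "$1" | tr -d '[:space:]')
    case $magic in
        7f454c46) echo "ELF" ;;
        *)        echo "unknown" ;;
    esac
}

# Usage: compile any C file to an object file, then run
#   object_format foo.o
```

An "unknown" result does not by itself mean the format is unsupported; it only means the file is not ELF.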