This section describes how to port GHC to a currently unsupported platform. There are two distinct possibilities:
The hardware architecture for your system is already supported by GHC, but you're running an OS that isn't supported (or perhaps has been supported in the past, but currently isn't). This is the easiest type of porting job, but it still requires some careful bootstrapping. Proceed to Section 10.1.
Your system's hardware architecture isn't supported by GHC. This will be a more difficult port (though by comparison perhaps not as difficult as porting gcc). Proceed to Section 10.2.
Bootstrapping GHC on a system without GHC already installed is achieved by taking the intermediate C files (known as HC files) from a GHC compilation on a supported system to the target machine, and compiling them using gcc to get a working GHC.
NOTE: GHC version 5.xx is significantly harder to bootstrap from C than previous versions. We recommend starting from version 4.08.2 if you need to bootstrap in this way.
HC files are architecture-dependent (but not OS-dependent), so you have to get a set that was generated on similar hardware. There may be some supplied on the GHC download page; otherwise you'll have to compile some yourself, or start from unregisterised HC files (see Section 10.2).
The following steps should result in a working GHC build with full libraries:
Unpack the HC files on top of a fresh source tree (make sure the source tree version matches the version of the HC files exactly!). This will place matching .hc files next to the corresponding Haskell source (.hs or .lhs) in the compiler subdirectory ghc/compiler and in the libraries (subdirectories of hslibs and libraries).
The actual build process is fully automated by the hc-build script located in the distrib directory. If you eventually want to install GHC into the directory dir, the following command will execute the whole build process (it won't install yet):
foo% distrib/hc-build --prefix=dir
By default, the installation directory is /usr/local. If that is what you want, you may omit the argument to hc-build. Generally, any option given to hc-build is passed through to the configuration script configure. If hc-build successfully completes the build process, you can install the resulting system, as normal, with
foo% make install
The first step in porting to a new architecture is to get an unregisterised build working. An unregisterised build is one that compiles via vanilla C only. By contrast, a registerised build uses the following architecture-specific hacks for speed:
Global register variables: certain abstract machine "registers" are mapped to real machine registers, depending on how many machine registers are available (see ghc/includes/MachRegs.h).
Assembly-mangling: when compiling via C, we feed the assembly generated by gcc through a Perl script known as the mangler (see ghc/driver/mangler/ghc-asm.lprl). The mangler rearranges the assembly to support tail-calls and various other optimisations.
In an unregisterised build, neither of these hacks is used: the idea is that the C code generated by the compiler should compile using gcc only. The lack of these optimisations costs about a factor of two in performance, but since unregisterised compilation is usually just a step on the way to a full registerised port, we don't mind too much.
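One well-known way to jump between code blocks in plain C without growing the C stack is a trampoline: each block returns the next block to run instead of calling it, and a small driver loop performs the jumps. The sketch below illustrates the idea only. It is not taken from GHC's sources, and every name in it (State, Next, jump, run_sum) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical machine state shared by all code blocks. */
typedef struct State { long long n; long long acc; } State;

/* A code block returns the next block to run. Wrapping the function
   pointer in a struct lets the type refer to itself in standard C. */
typedef struct Next Next;
typedef Next (*BlockFn)(State *);
struct Next { BlockFn fn; };

/* "Jump" to a block by returning it rather than calling it. */
static Next jump(BlockFn fn) { Next nx = { fn }; return nx; }

static Next sum_done(State *s) {
    (void)s;
    return jump(NULL);            /* NULL tells the driver to stop */
}

static Next sum_loop(State *s) {
    if (s->n == 0) return jump(sum_done);
    s->acc += s->n;
    s->n -= 1;
    return jump(sum_loop);        /* loop without recursion */
}

/* Driver loop: the C stack depth stays constant however many
   jumps are made. */
long long run_sum(long long n) {
    assert(n >= 0);
    State s = { n, 0 };
    Next nx = jump(sum_loop);
    while (nx.fn != NULL) nx = nx.fn(&s);
    return s.acc;
}
```

Calling run_sum(1000000) performs a million jumps between blocks without any deeper C stack than a single call. The price is an extra return and indirect call per jump, which is the kind of overhead that mangled, registerised code avoids.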
The first step is to get some unregisterised HC files. Either (a) download them from the GHC site (if there are some available for the right version of GHC), or (b) build them yourself on any machine with a working GHC. If at all possible this should be a machine with the same word size as the target.
A script that should automate the 2-stage bootstrap needed to produce the unregisterised HC files is available in fptools/distrib/cross-port in CVS.
Now take these unregisterised HC files to the target platform and bootstrap a compiler from them as per the instructions in Section 10.1. In build.mk, you need to tell the build system that the compiler you're building is (a) unregisterised itself, and (b) builds unregisterised binaries. This varies depending on the GHC version you're bootstrapping:
# build.mk for GHC 4.08.x
GhcWithRegisterised=NO

# build.mk for GHC 5.xx
GhcUnregisterised=YES
Version 5.xx only: use the option --enable-hc-boot-unregisterised instead of --enable-hc-boot when running ./configure.
The build may not go through cleanly. We've tried to stick to writing portable code in most parts of the compiler, so it should compile on any POSIXish system with gcc, but in our experience most systems differ from the standards in one way or another. Deal with any problems as they arise - if you get stuck, ask the experts on <firstname.lastname@example.org>.
Once you have the unregisterised compiler up and running, you can use it to start a registerised port. The following sections describe the various parts of the system that will need architecture-specific tweaks in order to get a registerised build going.
Lots of useful information about the innards of GHC is available in the GHC Commentary, which might be helpful if you run into some code which needs tweaking for your system.
The following files need architecture-specific code for a registerised build:
ghc/includes/MachRegs.h: Defines the STG-register to machine-register mapping. You need to know your platform's C calling convention, and which registers are generally available for mapping to global register variables. There are plenty of useful comments in this file.
Macros that cooperate with the mangler (see Section 10.2.3) to make proper tail-calls work.
Support for foreign import "wrapper" (aka foreign export dynamic). Not essential for getting GHC bootstrapped, so this file can be deferred until later if necessary.
The little assembly layer between the C world and the Haskell world. See the comments and code for the other architectures in this file for pointers.
These files are really OS-specific rather than architecture-specific. MBlock.h specifies the absolute location at which the RTS should try to allocate memory on your platform (try to find an area that doesn't conflict with code or dynamic libraries). In MBlock.c you might need to tweak the call to mmap() for your OS.
The mangler is an evil Perl script that rearranges the assembly code output from gcc to do two main things:
Remove function prologues and epilogues, and all movement of the C stack pointer. This is to support tail-calls: every code block in Haskell code ends in an explicit jump, so we don't want the C-stack overflowing while we're jumping around between code blocks.
Move the info table for a closure next to the entry code for that closure. In unregisterised code, info tables contain a pointer to the entry code, but in registerised compilation we arrange that the info table is shoved right up against the entry code, and addressed backwards from the entry code pointer (this saves a word in the info table and an extra indirection when jumping to the closure entry code).
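The two info-table layouts can be mimicked in portable C. Placing data directly in front of real machine code is exactly what plain C cannot express (hence the mangler), so the sketch below uses a byte array as a stand-in for the entry code. The struct and field names are hypothetical and much simpler than GHC's real info tables.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified info-table payload. */
typedef struct { unsigned layout; unsigned closure_type; } InfoFields;

/* Unregisterised style: the info table stores a pointer to the
   entry code, costing one word and one extra indirection. */
typedef struct {
    void (*entry)(void);
    InfoFields fields;
} InfoWithEntryPtr;

/* Registerised style, simulated: the info table sits immediately
   before the entry code. */
typedef struct {
    InfoFields fields;
    unsigned char code[8];   /* stand-in for machine code */
} InfoThenCode;

/* Recover the info table by addressing backwards from the
   entry-code pointer. */
const InfoFields *info_from_entry(const unsigned char *entry) {
    assert(entry != NULL);
    return (const InfoFields *)(const void *)(entry - sizeof(InfoFields));
}
```

With this layout a closure only needs to hold the entry pointer: jumping to the code is a single indirection, and the info table is still reachable at a fixed negative offset.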
The mangler is abstracted to a certain extent over some architecture-specific things such as the particular assembler directives used to herald symbols. Take a look at the definitions for other architectures and use these as a starting point.
The native code generator isn't essential to getting a registerised build going, but it's a desirable thing to have because it can cut compilation times in half. The native code generator is described in some detail in the GHC Commentary.
To support GHCi, you need to port the dynamic linker (fptools/ghc/rts/Linker.c). The linker currently supports the ELF and PEi386 object file formats - if your platform uses one of these then you probably don't have to do anything except fiddle with the #ifdefs at the top of Linker.c to tell it about your OS.
If your system uses a different object file format, then you have to write a linker — good luck!
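If you are unsure which object file format your toolchain emits, the first few bytes of a compiled .o file usually settle it. The sketch below is not how Linker.c selects a format (that is done with #ifdefs, as described above); it is a hypothetical helper that recognises the ELF magic number and the i386 machine field that begins a PEi386 (COFF) object file.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { OBJ_UNKNOWN, OBJ_ELF, OBJ_PEI386 } ObjFormat;

/* Classify an object file from its leading bytes. */
ObjFormat classify_object(const unsigned char *buf, size_t len) {
    assert(buf != NULL || len == 0);
    /* ELF files start with 0x7f 'E' 'L' 'F'. */
    if (len >= 4 && buf[0] == 0x7f && buf[1] == 'E' &&
        buf[2] == 'L' && buf[3] == 'F')
        return OBJ_ELF;
    /* COFF objects for i386 start with the machine field 0x014c,
       stored little-endian. */
    if (len >= 2 && buf[0] == 0x4c && buf[1] == 0x01)
        return OBJ_PEI386;
    return OBJ_UNKNOWN;
}
```

Anything that comes back as OBJ_UNKNOWN (a Mach-O object, for example) falls into the "write a linker" case.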