Chapter 12. Known bugs and infelicities

Table of Contents

12.1. Haskell 98 vs. Glasgow Haskell: language non-compliance
12.1.1. Divergence from Haskell 98
12.1.1.1. Lexical syntax
12.1.1.2. Context-free syntax
12.1.1.3. Expressions and patterns
12.1.1.4. Declarations and bindings
12.1.1.5. Module system and interface files
12.1.1.6. Numbers, basic types, and built-in classes
12.1.1.7. In Prelude support
12.1.2. GHC's interpretation of undefined behaviour in Haskell 98
12.2. Known bugs or infelicities
12.2.1. Bugs in GHC
12.2.2. Bugs in GHCi (the interactive GHC)

12.1. Haskell 98 vs. Glasgow Haskell: language non-compliance

This section lists Glasgow Haskell infelicities in its implementation of Haskell 98. See also the “when things go wrong” section (Chapter 9, What to do when something goes wrong) for information about crashes, space leaks, and other undesirable phenomena.

The limitations here are listed in Haskell Report order (roughly).

12.1.1. Divergence from Haskell 98

12.1.1.1. Lexical syntax

  • The Haskell report specifies that programs may be written using Unicode. GHC only accepts the ISO-8859-1 character set at the moment.

  • Certain lexical rules regarding qualified identifiers are slightly different in GHC compared to the Haskell report. When you have module.reservedop, such as M.\, GHC will interpret it as a single qualified operator rather than the two lexemes M and .\.

12.1.1.2. Context-free syntax

  • GHC is a little less strict about the layout rule when used in do expressions. Specifically, the restriction that "a nested context must be indented further to the right than the enclosing context" is relaxed to allow the nested context to be at the same level as the enclosing context, if the enclosing context is a do expression.

    For example, the following code is accepted by GHC:

    main = do args <- getArgs
              if null args then return [] else do
              ps <- mapM process args
              mapM print ps

  • GHC doesn't do fixity resolution in expressions during parsing. For example, according to the Haskell report, the following expression is legal Haskell:

        let x = 42 in x == 42 == True

    and parses as:

        (let x = 42 in x == 42) == True

    because, according to the report, the let expression “extends as far to the right as possible”. Since it can't extend past the second equals sign without causing a parse error (== is non-associative), the let expression must terminate there. GHC simply gobbles up the whole expression, parsing it like this:

        (let x = 42 in x == 42 == True)

    The Haskell report is arguably wrong here, but nevertheless it's a difference between GHC and Haskell 98.
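A minimal sketch of how to sidestep the difference: writing the parentheses explicitly selects the Report's reading, whichever way the implementation would otherwise parse the expression. The name reportReading is hypothetical, and the Int annotation is added only to avoid relying on defaulting.

```haskell
-- Explicit parentheses end the let expression before the second ==,
-- which is the parse the Haskell 98 Report describes.
reportReading :: Bool
reportReading = (let x = 42 :: Int in x == 42) == True

main :: IO ()
main = print reportReading
```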

12.1.1.3. Expressions and patterns

None known.

12.1.1.4. Declarations and bindings

None known.

12.1.1.5. Module system and interface files

None known.

12.1.1.6. Numbers, basic types, and built-in classes

Multiply-defined array elements—not checked:

This code fragment should elicit a fatal error, but it does not:

main = print (array (1,1) [(1,2), (1,3)])

GHC's implementation of array takes the value of an array slot from the last (index,value) pair in the list, and does no checking for duplicates. The reason for this is efficiency, pure and simple.
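The behaviour described above can be observed with a small program. This is a sketch that assumes the standard Data.Array module; the name duplicated is hypothetical.

```haskell
import Data.Array (Array, array, (!))

-- Two (index, value) pairs name index 1. No duplicate check is made,
-- and the slot takes its value from the last pair in the list.
duplicated :: Array Int Int
duplicated = array (1, 1) [(1, 2), (1, 3)]

main :: IO ()
main = print (duplicated ! 1)   -- 3, the value from the last pair
```

Code that needs duplicate indices to be combined deliberately can use accumArray instead, which takes an explicit combining function.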

12.1.1.7. In Prelude support

Arbitrary-sized tuples

Tuples are currently limited to size 100. However, standard instances for tuples (Eq, Ord, Bounded, Ix, Read, and Show) are available only up to 16-tuples.

This limitation is easily subvertible, so please ask if you get stuck on it.

Reading integers

GHC's implementation of the Read class for integral types accepts hexadecimal and octal literals (the code in the Haskell 98 report doesn't). So, for example,

read "0xf00" :: Int

works in GHC.

A possible reason for this is that readLitChar accepts hex and octal escapes, so it seems inconsistent not to do so for integers too.
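As an illustration, the sketch below reads one hexadecimal and one octal literal at type Int. The names hexValue and octValue are hypothetical, and the octal string uses Haskell's 0o prefix.

```haskell
-- Hexadecimal literal: 0xf00 = 15 * 256 = 3840.
hexValue :: Int
hexValue = read "0xf00"

-- Octal literal: 0o17 = 1 * 8 + 7 = 15.
octValue :: Int
octValue = read "0o17"

main :: IO ()
main = print (hexValue, octValue)
```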

isAlpha

The Haskell 98 definition of isAlpha is:

isAlpha c = isUpper c || isLower c

GHC's implementation diverges from the Haskell 98 definition in the sense that Unicode alphabetic characters which are neither upper nor lower case will still be identified as alphabetic by isAlpha.
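The difference can be seen with a character that is a letter but has no case. The sketch below uses U+4E2D (a CJK ideograph) as the sample, written as an escape so that the source file itself stays within ISO-8859-1; the names caseless and haskell98IsAlpha are hypothetical.

```haskell
import Data.Char (isAlpha, isLower, isUpper)

-- U+4E2D is alphabetic but is neither upper nor lower case.
caseless :: Char
caseless = '\x4e2d'

-- The Haskell 98 definition, reproduced for comparison.
haskell98IsAlpha :: Char -> Bool
haskell98IsAlpha c = isUpper c || isLower c

main :: IO ()
main = print (isAlpha caseless, haskell98IsAlpha caseless)   -- (True,False)
```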

12.1.2. GHC's interpretation of undefined behaviour in Haskell 98

This section documents GHC's take on various issues that are left undefined or implementation-specific in Haskell 98.

The Char type

Following the ISO-10646 standard, maxBound :: Char in GHC is 0x10FFFF.

Sized integral types

In GHC the Int type follows the size of an address on the host architecture; in other words it holds 32 bits on a 32-bit machine, and 64 bits on a 64-bit machine.

Arithmetic on Int is unchecked for overflow, so all operations on Int happen modulo 2^n where n is the size in bits of the Int type.
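A short sketch of the wraparound, written so that it does not depend on whether Int is 32 or 64 bits wide. The names intBits and wrapsAround are hypothetical.

```haskell
import Data.Bits (finiteBitSize)

-- Width of Int on the host: 32 or 64, depending on the architecture.
intBits :: Int
intBits = finiteBitSize (0 :: Int)

-- Adding 1 to the largest Int overflows silently and wraps to the smallest.
wrapsAround :: Bool
wrapsAround = (maxBound :: Int) + 1 == minBound

main :: IO ()
main = do
  print intBits
  print wrapsAround   -- True
```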

The fromInteger function (and hence also fromIntegral) is a special case when converting to Int. The value of fromIntegral x :: Int is given by taking the lower n bits of (abs x), multiplied by the sign of x (in 2's complement n-bit arithmetic). This behaviour was chosen so that for example writing 0xffffffff :: Int preserves the bit-pattern in the resulting Int.

Negative literals, such as -3, are specified by (a careful reading of) the Haskell Report as meaning Prelude.negate (Prelude.fromInteger 3). So -2147483648 means negate (fromInteger 2147483648). Since fromInteger takes the lower 32 bits of the representation (on a 32-bit machine), fromInteger (2147483648::Integer), computed at type Int, is -2147483648::Int. The negate operation then overflows, but it is unchecked, so negate (-2147483648::Int) is just -2147483648. In short, one can write minBound::Int as a literal with the expected meaning (but that is not in general guaranteed).
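The two steps of this argument can be checked independently of the word size by using 2^(n-1) in place of 2147483648, where n is the width of Int. This is a sketch only; the names n, truncated and negatedMin are hypothetical.

```haskell
import Data.Bits (finiteBitSize)

-- Width of Int in bits on the host architecture.
n :: Int
n = finiteBitSize (0 :: Int)

-- 2^(n-1) is one more than maxBound :: Int. Keeping only the lower
-- n bits of it yields the bit pattern of minBound.
truncated :: Int
truncated = fromInteger (2 ^ (n - 1))

-- Negating minBound overflows, unchecked, back to minBound.
negatedMin :: Int
negatedMin = negate minBound

main :: IO ()
main = print (truncated == minBound, negatedMin == minBound)   -- (True,True)
```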

The fromIntegral function also preserves bit-patterns when converting between the sized integral types (Int8, Int16, Int32, Int64 and the unsigned Word variants), see the modules Data.Int and Data.Word in the library documentation.
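For instance, the following sketch converts between 8-bit signed and unsigned types and narrows an Int to 8 bits. The names wordToInt, intToWord and narrowed are hypothetical.

```haskell
import Data.Int (Int8)
import Data.Word (Word8)

-- Bit pattern 0xFF read as a signed 8-bit value is -1.
wordToInt :: Int8
wordToInt = fromIntegral (255 :: Word8)

-- The same bit pattern read back as unsigned is 255.
intToWord :: Word8
intToWord = fromIntegral (-1 :: Int8)

-- Narrowing keeps the low 8 bits: 300 mod 256 = 44.
narrowed :: Word8
narrowed = fromIntegral (300 :: Int)

main :: IO ()
main = print (wordToInt, intToWord, narrowed)   -- (-1,255,44)
```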

Unchecked float arithmetic

Operations on Float and Double numbers are unchecked for overflow, underflow, and other sad occurrences. (Note, however, that some architectures trap floating-point overflow and loss-of-precision and report a floating-point exception, probably terminating the program.)
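A sketch of what "unchecked" means in practice, assuming an architecture that follows IEEE 754 semantics and does not trap: the operations quietly produce infinity, zero, or NaN instead of raising an error. The names overflowed, underflowed and notANumber are hypothetical.

```haskell
-- Overflow: the product exceeds the largest Double and becomes infinite.
overflowed :: Double
overflowed = 1e308 * 10

-- Underflow: the quotient is too small to represent and becomes zero.
underflowed :: Double
underflowed = 1e-308 / 1e308

-- An invalid operation yields NaN rather than an error.
notANumber :: Double
notANumber = 0 / 0

main :: IO ()
main = print (isInfinite overflowed, underflowed == 0, isNaN notANumber)
```

Programs that care about these cases can test results with isInfinite and isNaN from the Prelude after the fact.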