8.5. Regex: The low-level regex matching interface

(Sigbjorn Finne supplied the regular-expressions interface.)

The Regex library provides quite direct interface to the GNU regular-expression library, for doing manipulation on PackedStrings (Section 4.19). You probably need to see the GNU documentation if you are operating at this level. Alternatively, you can use the simpler and higher-level RegexString (Section 8.6) interface.

The datatypes and functions that Regex provides are:
data PatBuffer  # just a bunch of bytes (mutable)

data REmatch
 = REmatch (Array Int GroupBounds)  -- for $1, ... $n
           GroupBounds              -- for $` (everything before match)
           GroupBounds              -- for $& (entire matched string)
           GroupBounds              -- for $' (everything after)
           GroupBounds              -- for $+ (matched by last bracket)

-- GroupBounds hold the interval where a group
-- matched inside a string, e.g.
--
-- matching "reg(exp)" "a regexp" returns the pair (5,7) for the
-- (exp) group. (PackedString indices start from 0)

type GroupBounds = (Int, Int)

re_compile_pattern
        :: PackedString         -- pattern to compile
        -> Bool                 -- True <=> assume single-line mode
        -> Bool                 -- True <=> case-insensitive
        -> IO PatBuffer

re_match :: PatBuffer           -- compiled regexp
         -> PackedString        -- string to match
         -> Int                 -- start position
         -> Bool                -- True <=> record results in registers
         -> IO (Maybe REmatch)

-- Matching on 2 strings is useful when you're dealing with multiple
-- buffers, which is something that could prove useful for
-- PackedStrings, as we don't want to stuff the contents of a file
-- into one massive heap chunk, but load (smaller chunks) on demand.

re_match2 :: PatBuffer          -- 2-string version
          -> PackedString
          -> PackedString
          -> Int
          -> Int
          -> Bool
          -> IO (Maybe REmatch)

re_search :: PatBuffer          -- compiled regexp
          -> PackedString       -- string to search
          -> Int                -- start index
          -> Int                -- stop index
          -> Bool               -- True <=> record results in registers
          -> IO (Maybe REmatch)

re_search2 :: PatBuffer         -- Double buffer search
           -> PackedString
           -> PackedString
           -> Int               -- start index
           -> Int               -- range (?)
           -> Int               -- stop index
           -> Bool              -- True <=> results in registers
           -> IO (Maybe REmatch)