Go to the first, previous, next, last section, table of contents.
(Sigbjorn Finne supplied the regular-expressions interface.)
The `Regex' library provides quite direct interface to the GNU
regular-expression library, for doing manipulation on
`PackedString's. You probably need to see the GNU documentation
if you are operating at this level.
The datatypes and functions that `Regex' provides are:
data PatBuffer # just a bunch of bytes (mutable)
data REmatch
= REmatch (Array Int GroupBounds) -- for $1, ... $n
GroupBounds -- for $` (everything before match)
GroupBounds -- for $& (entire matched string)
GroupBounds -- for $' (everything after)
GroupBounds -- for $+ (matched by last bracket)
-- GroupBounds hold the interval where a group
-- matched inside a string, e.g.
--
-- matching "reg(exp)" "a regexp" returns the pair (5,7) for the
-- (exp) group. (PackedString indices start from 0)
type GroupBounds = (Int, Int)
re_compile_pattern
:: PackedString -- pattern to compile
-> Bool -- True <=> assume single-line mode
-> Bool -- True <=> case-insensitive
-> PrimIO PatBuffer
re_match :: PatBuffer -- compiled regexp
-> PackedString -- string to match
-> Int -- start position
-> Bool -- True <=> record results in registers
-> PrimIO (Maybe REmatch)
-- Matching on 2 strings is useful when you're dealing with multiple
-- buffers, which is something that could prove useful for
-- PackedStrings, as we don't want to stuff the contents of a file
-- into one massive heap chunk, but load (smaller chunks) on demand.
re_match2 :: PatBuffer -- 2-string version
-> PackedString
-> PackedString
-> Int
-> Int
-> Bool
-> PrimIO (Maybe REmatch)
re_search :: PatBuffer -- compiled regexp
-> PackedString -- string to search
-> Int -- start index
-> Int -- stop index
-> Bool -- True <=> record results in registers
-> PrimIO (Maybe REmatch)
re_search2 :: PatBuffer -- Double buffer search
-> PackedString
-> PackedString
-> Int -- start index
-> Int -- range (?)
-> Int -- stop index
-> Bool -- True <=> results in registers
-> PrimIO (Maybe REmatch)
The `MatchPS' module provides Perl-like "higher-level" facilities
to operate on `PackedStrings'. The regular expressions in
question are in Perl syntax. The "flags" on various functions can
include: `i' for case-insensitive, `s' for single-line mode, and
`g' for global. (It's probably worth your time to peruse the
source code...)
matchPS :: PackedString -- regexp
-> PackedString -- string to match
-> [Char] -- flags
-> Maybe REmatch -- info about what matched and where
searchPS :: PackedString -- regexp
-> PackedString -- string to match
-> [Char] -- flags
-> Maybe REmatch
-- Perl-like match-and-substitute:
substPS :: PackedString -- regexp
-> PackedString -- replacement
-> [Char] -- flags
-> PackedString -- string
-> PackedString
-- same as substPS, but no prefix and suffix:
replacePS :: PackedString -- regexp
-> PackedString -- replacement
-> [Char] -- flags
-> PackedString -- string
-> PackedString
match2PS :: PackedString -- regexp
-> PackedString -- string1 to match
-> PackedString -- string2 to match
-> [Char] -- flags
-> Maybe REmatch
search2PS :: PackedString -- regexp
-> PackedString -- string to match
-> PackedString -- string to match
-> [Char] -- flags
-> Maybe REmatch
-- functions to pull the matched pieces out of an REmatch:
getMatchesNo :: REmatch -> Int
getMatchedGroup :: REmatch -> Int -> PackedString -> PackedString
getWholeMatch :: REmatch -> PackedString -> PackedString
getLastMatch :: REmatch -> PackedString -> PackedString
getAfterMatch :: REmatch -> PackedString -> PackedString
-- (reverse) brute-force string matching;
-- Perl equivalent is index/rindex:
findPS, rfindPS :: PackedString -> PackedString -> Maybe Int
-- Equivalent to Perl "chop" (off the last character, if any):
chopPS :: PackedString -> PackedString
-- matchPrefixPS: tries to match as much as possible of strA starting
-- from the beginning of strB (handy when matching fancy literals in
-- parsers):
matchPrefixPS :: PackedString -> PackedString -> Int
Go to the first, previous, next, last section, table of contents.