Haskell Hierarchical Libraries (regex-posix package)Source codeContentsIndex
Text.Regex.Posix.Wrap
Portabilitynon-portable (regex-base needs MPTC+FD)
Stabilityexperimental
Maintainerlibraries@haskell.org, textregexlazy@personal.mightyreason.com
Contents
High-level API
Low-level API
Miscellaneous
Compilation options
Execution options
Return codes
Description

WrapPosix.hsc exports a wrapped version of the ffi imports. To increase type safety, the flags are newtype'd. The other important export is a Regex type that is specific to the Posix library backend. The flags are documented in Text.Regex.Posix. The defaultCompOpt is (compExtended .|. compNewline).

The Regex, CompOption, and ExecOption types and their RegexOptions instance is declared. The =~ and =~~ convenience functions are defined.

The exported symbols are the same whether HAVE_REGEX_H is defined, but when it is not defined then getVersion == Nothing and all other exported values will call error or fail.

This module will fail or error only if allocation fails or a nullPtr is passed in.

Synopsis
data Regex
type RegOffset = Int64
(=~) :: (RegexMaker Regex CompOption ExecOption source, RegexContext Regex source1 target) => source1 -> source -> target
(=~~) :: (RegexMaker Regex CompOption ExecOption source, RegexContext Regex source1 target, Monad m) => source1 -> source -> m target
type WrapError = (ReturnCode, String)
wrapCompile :: CompOption -> ExecOption -> CString -> IO (Either WrapError Regex)
wrapTest :: Regex -> CString -> IO (Either WrapError Bool)
wrapMatch :: Regex -> CString -> IO (Either WrapError (Maybe [(RegOffset, RegOffset)]))
wrapMatchAll :: Regex -> CString -> IO (Either WrapError [MatchArray])
wrapCount :: Regex -> CString -> IO (Either WrapError Int)
unusedRegOffset :: RegOffset
newtype CompOption = CompOption CInt
compBlank :: CompOption
compExtended :: CompOption
compIgnoreCase :: CompOption
compNoSub :: CompOption
compNewline :: CompOption
newtype ExecOption = ExecOption CInt
execBlank :: ExecOption
execNotBOL :: ExecOption
execNotEOL :: ExecOption
newtype ReturnCode = ReturnCode CInt
retBadbr :: ReturnCode
retBadpat :: ReturnCode
retBadrpt :: ReturnCode
retEcollate :: ReturnCode
retEctype :: ReturnCode
retEescape :: ReturnCode
retEsubreg :: ReturnCode
retEbrack :: ReturnCode
retEparen :: ReturnCode
retEbrace :: ReturnCode
retErange :: ReturnCode
retEspace :: ReturnCode
High-level API
data Regex
A compiled regular expression.
show/hide Instances
type RegOffset = Int64

RegOffset is typedef int regoff_t on Linux and ultimately typedef long long __int64_t on Max OS X. So rather than saying 2,147,483,647 is all the length you need, I'll take the larger: 9,223,372,036,854,775,807 should be enough bytes for anyone, no need for Integer. The alternative is to compile to different sizes in a platform dependent manner with type RegOffset = (#type regoff_t), which I do not want to do.

There is also a special value unusedRegOffset :: RegOffset which is (-1) and as a starting index means that the subgroup capture was unused. Otherwise the RegOffset indicates a character boundary that is before the character at that index offset, with the first character at index offset 0. So starting at 1 and ending at 2 means to take only the second character.

(=~) :: (RegexMaker Regex CompOption ExecOption source, RegexContext Regex source1 target) => source1 -> source -> target
(=~~) :: (RegexMaker Regex CompOption ExecOption source, RegexContext Regex source1 target, Monad m) => source1 -> source -> m target
Low-level API
type WrapError = (ReturnCode, String)
The return code will be retOk when it is the Haskell wrapper and not the underlying library generating the error message.
wrapCompile
:: CompOptionFlags (bitmapped)
-> ExecOptionFlags (bitmapped)
-> CStringThe regular expression to compile (ASCII only, no null bytes)
-> IO (Either WrapError Regex)Returns: the compiled regular expression
wrapTest :: Regex -> CString -> IO (Either WrapError Bool)
wrapMatch :: Regex -> CString -> IO (Either WrapError (Maybe [(RegOffset, RegOffset)]))
wrapMatch returns offsets for the begin and end of each capture. Unused captures have offsets of unusedRegOffset which is (-1)
wrapMatchAll :: Regex -> CString -> IO (Either WrapError [MatchArray])
wrapMatchAll returns the offset and length of each capture. Unused captures have an offset of unusedRegOffset which is (-1) and length of 0.
wrapCount :: Regex -> CString -> IO (Either WrapError Int)
Miscellaneous
unusedRegOffset :: RegOffset
Compilation options
newtype CompOption

A bitmapped CInt containing options for compilation of regular expressions. Option values (and their man 3 regcomp names) are

  • compBlank which is a completely zero value for all the flags. This is also the blankCompOpt value.
  • compExtended (REG_EXTENDED) which can be set to use extended instead of basic regular expressions. This is set in the defaultCompOpt value.
  • compNewline (REG_NEWLINE) turns on newline sensitivity: The dot (.) and inverted set [^ ] never match newline, and ^ and $ anchors do match after and before newlines. This is set in the defaultCompOpt value.
  • compIgnoreCase (REG_ICASE) which can be set to match ignoring upper and lower distinctions.
  • compNoSub (REG_NOSUB) which turns off all information from matching except whether a match exists.
Constructors
CompOption CInt
show/hide Instances
compBlank :: CompOption
A completely zero value for all the flags. This is also the blankCompOpt value.
compExtended :: CompOption
compIgnoreCase :: CompOption
compNoSub :: CompOption
compNewline :: CompOption
Execution options
newtype ExecOption

A bitmapped CInt containing options for execution of compiled regular expressions. Option values (and their man 3 regexec names) are

  • execBlank which is a complete zero value for all the flags. This is the blankExecOpt value.
  • execNotBOL (REG_NOTBOL) can be set to prevent ^ from matching at the start of the input.
  • execNotEOL (REG_NOTEOL) can be set to prevent $ from matching at the end of the input (before the terminating NUL).
Constructors
ExecOption CInt
show/hide Instances
execBlank :: ExecOption
A completely zero value for all the flags. This is also the blankExecOpt value.
execNotBOL :: ExecOption
execNotEOL :: ExecOption
Return codes
newtype ReturnCode

ReturnCode is an enumerated CInt, corresponding to the error codes from man 3 regex:

  • retBadbr (REG_BADBR) invalid repetition count(s) in { }
  • retBadpat (REG_BADPAT) invalid regular expression
  • retBadrpt (REG_BADRPT) ?, *, or + operand invalid
  • retEcollate (REG_ECOLLATE) invalid collating element
  • retEctype (REG_ECTYPE) invalid character class
  • retEescape (REG_EESCAPE) \ applied to unescapable character
  • retEsubreg (REG_ESUBREG) invalid backreference number
  • retEbrack (REG_EBRACK) brackets [ ] not balanced
  • retEparen (REG_EPAREN) parentheses ( ) not balanced
  • retEbrace (REG_EBRACE) braces { } not balanced
  • retErange (REG_ERANGE) invalid character range in [ ]
  • retEspace (REG_ESPACE) ran out of memory
  • retNoMatch (REG_NOMATCH) The regexec() function failed to match
Constructors
ReturnCode CInt
show/hide Instances
retBadbr :: ReturnCode
retBadpat :: ReturnCode
retBadrpt :: ReturnCode
retEcollate :: ReturnCode
retEctype :: ReturnCode
retEescape :: ReturnCode
retEsubreg :: ReturnCode
retEbrack :: ReturnCode
retEparen :: ReturnCode
retEbrace :: ReturnCode
retErange :: ReturnCode
retEspace :: ReturnCode
Produced by Haddock version 0.8