regex-posix-0.72.0.1: Replaces/Enhances Text.RegexContentsIndex
Text.Regex.Posix.Wrap
Portabilitynon-portable (regex-base needs MPTC+FD)
Stabilityexperimental
Maintainerlibraries@haskell.org, textregexlazy@personal.mightyreason.com
Contents
High-level API
Low-level API
Miscellaneous
Compilation options
Execution options
Return codes
Description

WrapPosix.hsc exports a wrapped version of the ffi imports. To increase type safety, the flags are newtype'd. The other important export is a Regex type that is specific to the Posix library backend. The flags are documented in Text.Regex.Posix. The defaultCompOpt is (compExtended .|. compNewline).

The Regex, CompOption, and ExecOption types and their RegexOptions instance is declared. The =~ and =~~ convenience functions are defined.

The exported symbols are the same whether 1 is defined, but when it is not defined then getVersion == Nothing and all other exported values will call error or fail.

This module will fail or error only if allocation fails or a nullPtr is passed in.

Synopsis
data Regex
type RegOffset = Int64
(=~) :: (RegexMaker Regex CompOption ExecOption source, RegexContext Regex source1 target) => source1 -> source -> target
(=~~) :: (RegexMaker Regex CompOption ExecOption source, RegexContext Regex source1 target, Monad m) => source1 -> source -> m target
type WrapError = (ReturnCode, String)
wrapCompile :: CompOption -> ExecOption -> CString -> IO (Either WrapError Regex)
wrapTest :: Regex -> CString -> IO (Either WrapError Bool)
wrapMatch :: Regex -> CString -> IO (Either WrapError (Maybe [(RegOffset, RegOffset)]))
wrapMatchAll :: Regex -> CString -> IO (Either WrapError [MatchArray])
wrapCount :: Regex -> CString -> IO (Either WrapError Int)
unusedRegOffset :: RegOffset
newtype CompOption = CompOption CInt
compBlank :: CompOption
compExtended :: CompOption
compIgnoreCase :: CompOption
compNoSub :: CompOption
compNewline :: CompOption
newtype ExecOption = ExecOption CInt
execBlank :: ExecOption
execNotBOL :: ExecOption
execNotEOL :: ExecOption
newtype ReturnCode = ReturnCode CInt
retBadbr :: ReturnCode
retBadpat :: ReturnCode
retBadrpt :: ReturnCode
retEcollate :: ReturnCode
retEctype :: ReturnCode
retEescape :: ReturnCode
retEsubreg :: ReturnCode
retEbrack :: ReturnCode
retEparen :: ReturnCode
retEbrace :: ReturnCode
retErange :: ReturnCode
retEspace :: ReturnCode
High-level API
data Regex
A compiled regular expression.
show/hide Instances
RegexLike Regex ByteString
RegexLike Regex String
RegexContext Regex ByteString ByteString
RegexContext Regex String String
RegexOptions Regex CompOption ExecOption
RegexMaker Regex CompOption ExecOption ByteString
RegexMaker Regex CompOption ExecOption String
type RegOffset = Int64

RegOffset is typedef int regoff_t on Linux and ultimately typedef long long __int64_t on Max OS X. So rather than saying 2,147,483,647 is all the length you need, I'll take the larger: 9,223,372,036,854,775,807 should be enough bytes for anyone, no need for Integer. The alternative is to compile to different sizes in a platform dependent manner with type RegOffset = (#type regoff_t), which I do not want to do.

There is also a special value unusedRegOffset :: RegOffset which is (-1) and as a starting index means that the subgroup capture was unused. Otherwise the RegOffset indicates a character boundary that is before the character at that index offset, with the first character at index offset 0. So starting at 1 and ending at 2 means to take only the second character.

(=~) :: (RegexMaker Regex CompOption ExecOption source, RegexContext Regex source1 target) => source1 -> source -> target
(=~~) :: (RegexMaker Regex CompOption ExecOption source, RegexContext Regex source1 target, Monad m) => source1 -> source -> m target
Low-level API
type WrapError = (ReturnCode, String)
The return code will be retOk when it is the Haskell wrapper and not the underlying library generating the error message.
wrapCompile
:: CompOptionFlags (bitmapped)
-> ExecOptionFlags (bitmapped)
-> CStringThe regular expression to compile (ASCII only, no null bytes)
-> IO (Either WrapError Regex)Returns: the compiled regular expression
wrapTest :: Regex -> CString -> IO (Either WrapError Bool)
wrapMatch :: Regex -> CString -> IO (Either WrapError (Maybe [(RegOffset, RegOffset)]))
wrapMatch returns offsets for the begin and end of each capture. Unused captures have offsets of unusedRegOffset which is (-1)
wrapMatchAll :: Regex -> CString -> IO (Either WrapError [MatchArray])
wrapMatchAll returns the offset and length of each capture. Unused captures have an offset of unusedRegOffset which is (-1) and length of 0.
wrapCount :: Regex -> CString -> IO (Either WrapError Int)
Miscellaneous
unusedRegOffset :: RegOffset
Compilation options
newtype CompOption

A bitmapped CInt containing options for compilation of regular expressions. Option values (and their man 3 regcomp names) are

  • compBlank which is a completely zero value for all the flags. This is also the blankCompOpt value.
  • compExtended (REG_EXTENDED) which can be set to use extended instead of basic regular expressions. This is set in the defaultCompOpt value.
  • compNewline (REG_NEWLINE) turns on newline sensitivity: The dot (.) and inverted set [^ ] never match newline, and ^ and $ anchors do match after and before newlines. This is set in the defaultCompOpt value.
  • compIgnoreCase (REG_ICASE) which can be set to match ignoring upper and lower distinctions.
  • compNoSub (REG_NOSUB) which turns off all information from matching except whether a match exists.
Constructors
CompOption CInt
show/hide Instances
compBlank :: CompOption
A completely zero value for all the flags. This is also the blankCompOpt value.
compExtended :: CompOption
compIgnoreCase :: CompOption
compNoSub :: CompOption
compNewline :: CompOption
Execution options
newtype ExecOption

A bitmapped CInt containing options for execution of compiled regular expressions. Option values (and their man 3 regexec names) are

  • execBlank which is a complete zero value for all the flags. This is the blankExecOpt value.
  • execNotBOL (REG_NOTBOL) can be set to prevent ^ from matching at the start of the input.
  • execNotEOL (REG_NOTEOL) can be set to prevent $ from matching at the end of the input (before the terminating NUL).
Constructors
ExecOption CInt
show/hide Instances
execBlank :: ExecOption
A completely zero value for all the flags. This is also the blankExecOpt value.
execNotBOL :: ExecOption
execNotEOL :: ExecOption
Return codes
newtype ReturnCode

ReturnCode is an enumerated CInt, corresponding to the error codes from man 3 regex:

  • retBadbr (REG_BADBR) invalid repetition count(s) in { }
  • retBadpat (REG_BADPAT) invalid regular expression
  • retBadrpt (REG_BADRPT) ?, *, or + operand invalid
  • retEcollate (REG_ECOLLATE) invalid collating element
  • retEctype (REG_ECTYPE) invalid character class
  • retEescape (REG_EESCAPE) \ applied to unescapable character
  • retEsubreg (REG_ESUBREG) invalid backreference number
  • retEbrack (REG_EBRACK) brackets [ ] not balanced
  • retEparen (REG_EPAREN) parentheses ( ) not balanced
  • retEbrace (REG_EBRACE) braces { } not balanced
  • retErange (REG_ERANGE) invalid character range in [ ]
  • retEspace (REG_ESPACE) ran out of memory
  • retNoMatch (REG_NOMATCH) The regexec() function failed to match
Constructors
ReturnCode CInt
show/hide Instances
retBadbr :: ReturnCode
retBadpat :: ReturnCode
retBadrpt :: ReturnCode
retEcollate :: ReturnCode
retEctype :: ReturnCode
retEescape :: ReturnCode
retEsubreg :: ReturnCode
retEbrack :: ReturnCode
retEparen :: ReturnCode
retEbrace :: ReturnCode
retErange :: ReturnCode
retEspace :: ReturnCode
Produced by Haddock version 0.8