|
Text.Regex.Posix.Wrap | Portability | non-portable (regex-base needs MPTC+FD) | Stability | experimental | Maintainer | libraries@haskell.org, textregexlazy@personal.mightyreason.com |
|
|
|
|
|
Description |
WrapPosix.hsc exports a wrapped version of the ffi imports. To
increase type safety, the flags are newtype'd. The other important
export is a Regex type that is specific to the Posix library
backend. The flags are documented in Text.Regex.Posix. The
defaultCompOpt is (compExtended .|. compNewline).
The Regex, CompOption, and ExecOption types and their RegexOptions
instance is declared. The =~ and =~~ convenience functions are
defined.
The exported symbols are the same whether HAVE_REGEX_H is defined, but
when it is not defined then getVersion == Nothing and all other
exported values will call error or fail.
This module will fail or error only if allocation fails or a nullPtr
is passed in.
|
|
Synopsis |
|
|
|
|
High-level API
|
|
data Regex |
A compiled regular expression.
| Instances | |
|
|
type RegOffset = Int64 |
RegOffset is typedef int regoff_t on Linux and ultimately typedef
long long __int64_t on Max OS X. So rather than saying
2,147,483,647 is all the length you need, I'll take the larger:
9,223,372,036,854,775,807 should be enough bytes for anyone, no
need for Integer. The alternative is to compile to different sizes
in a platform dependent manner with type RegOffset = (#type
regoff_t), which I do not want to do.
There is also a special value unusedRegOffset :: RegOffset which is
(-1) and as a starting index means that the subgroup capture was
unused. Otherwise the RegOffset indicates a character boundary that
is before the character at that index offset, with the first
character at index offset 0. So starting at 1 and ending at 2 means
to take only the second character.
|
|
(=~) :: (RegexMaker Regex CompOption ExecOption source, RegexContext Regex source1 target) => source1 -> source -> target |
|
(=~~) :: (RegexMaker Regex CompOption ExecOption source, RegexContext Regex source1 target, Monad m) => source1 -> source -> m target |
|
Low-level API
|
|
type WrapError = (ReturnCode, String) |
The return code will be retOk when it is the Haskell wrapper and
not the underlying library generating the error message.
|
|
wrapCompile |
|
|
wrapTest :: Regex -> CString -> IO (Either WrapError Bool) |
|
wrapMatch :: Regex -> CString -> IO (Either WrapError (Maybe [(RegOffset, RegOffset)])) |
wrapMatch returns offsets for the begin and end of each capture.
Unused captures have offsets of unusedRegOffset which is (-1)
|
|
wrapMatchAll :: Regex -> CString -> IO (Either WrapError [MatchArray]) |
wrapMatchAll returns the offset and length of each capture.
Unused captures have an offset of unusedRegOffset which is (-1) and
length of 0.
|
|
wrapCount :: Regex -> CString -> IO (Either WrapError Int) |
|
Miscellaneous
|
|
unusedRegOffset :: RegOffset |
|
Compilation options
|
|
newtype CompOption |
A bitmapped CInt containing options for compilation of regular
expressions. Option values (and their man 3 regcomp names) are
- compBlank which is a completely zero value for all the flags.
This is also the blankCompOpt value.
- compExtended (REG_EXTENDED) which can be set to use extended instead
of basic regular expressions.
This is set in the defaultCompOpt value.
- compNewline (REG_NEWLINE) turns on newline sensitivity: The dot (.)
and inverted set [^ ] never match newline, and ^ and $ anchors do
match after and before newlines.
This is set in the defaultCompOpt value.
- compIgnoreCase (REG_ICASE) which can be set to match ignoring upper
and lower distinctions.
- compNoSub (REG_NOSUB) which turns off all information from matching
except whether a match exists.
| Constructors | | Instances | |
|
|
compBlank :: CompOption |
A completely zero value for all the flags.
This is also the blankCompOpt value.
|
|
compExtended :: CompOption |
|
compIgnoreCase :: CompOption |
|
compNoSub :: CompOption |
|
compNewline :: CompOption |
|
Execution options
|
|
newtype ExecOption |
A bitmapped CInt containing options for execution of compiled
regular expressions. Option values (and their man 3 regexec names) are
- execBlank which is a complete zero value for all the flags. This is
the blankExecOpt value.
- execNotBOL (REG_NOTBOL) can be set to prevent ^ from matching at the
start of the input.
- execNotEOL (REG_NOTEOL) can be set to prevent $ from matching at the
end of the input (before the terminating NUL).
| Constructors | | Instances | |
|
|
execBlank :: ExecOption |
A completely zero value for all the flags.
This is also the blankExecOpt value.
|
|
execNotBOL :: ExecOption |
|
execNotEOL :: ExecOption |
|
Return codes
|
|
newtype ReturnCode |
ReturnCode is an enumerated CInt, corresponding to the error codes
from man 3 regex:
- retBadbr (REG_BADBR) invalid repetition count(s) in { }
- retBadpat (REG_BADPAT) invalid regular expression
- retBadrpt (REG_BADRPT) ?, *, or + operand invalid
- retEcollate (REG_ECOLLATE) invalid collating element
- retEctype (REG_ECTYPE) invalid character class
- retEescape (REG_EESCAPE) \ applied to unescapable character
- retEsubreg (REG_ESUBREG) invalid backreference number
- retEbrack (REG_EBRACK) brackets [ ] not balanced
- retEparen (REG_EPAREN) parentheses ( ) not balanced
- retEbrace (REG_EBRACE) braces { } not balanced
- retErange (REG_ERANGE) invalid character range in [ ]
- retEspace (REG_ESPACE) ran out of memory
- retNoMatch (REG_NOMATCH) The regexec() function failed to match
| Constructors | | Instances | |
|
|
retBadbr :: ReturnCode |
|
retBadpat :: ReturnCode |
|
retBadrpt :: ReturnCode |
|
retEcollate :: ReturnCode |
|
retEctype :: ReturnCode |
|
retEescape :: ReturnCode |
|
retEsubreg :: ReturnCode |
|
retEbrack :: ReturnCode |
|
retEparen :: ReturnCode |
|
retEbrace :: ReturnCode |
|
retErange :: ReturnCode |
|
retEspace :: ReturnCode |
|
Produced by Haddock version 0.8 |