|
Data.ByteString.Char8
Portability: portable
Stability: experimental
Maintainer: dons@cse.unsw.edu.au
|
|
|
|
|
Description |
Manipulate ByteStrings using Char operations. All Chars will be
truncated to 8 bits. It can be expected that these functions will run
at identical speeds to their Word8 equivalents in Data.ByteString.
More specifically these byte strings are taken to be in the
subset of Unicode covered by code points 0-255. This covers
Unicode Basic Latin, Latin-1 Supplement and C0+C1 Controls.
See:
This module is intended to be imported qualified, to avoid name
clashes with Prelude functions, e.g.
import qualified Data.ByteString.Char8 as B
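A minimal sketch of the qualified-import style, assuming the bytestring package is available. The names greeting and truncated are illustrative only. It also shows the 8-bit truncation: a Char above code point 255 keeps only its low byte.

```haskell
import qualified Data.ByteString.Char8 as B

-- Pack an ordinary ASCII String; B.length counts bytes.
greeting :: B.ByteString
greeting = B.pack "hello"

-- U+263A lies outside 0-255, so only its low byte (0x3A, ':') is kept.
truncated :: String
truncated = B.unpack (B.pack "\x263A")
```

Because of the truncation, text outside Latin-1 should not be round-tripped through this module.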
|
|
Synopsis |
|
data ByteString
empty :: ByteString
singleton :: Char -> ByteString
pack :: String -> ByteString
unpack :: ByteString -> [Char]
cons :: Char -> ByteString -> ByteString
snoc :: ByteString -> Char -> ByteString
append :: ByteString -> ByteString -> ByteString
head :: ByteString -> Char
uncons :: ByteString -> Maybe (Char, ByteString)
last :: ByteString -> Char
tail :: ByteString -> ByteString
init :: ByteString -> ByteString
null :: ByteString -> Bool
length :: ByteString -> Int
map :: (Char -> Char) -> ByteString -> ByteString
reverse :: ByteString -> ByteString
intersperse :: Char -> ByteString -> ByteString
intercalate :: ByteString -> [ByteString] -> ByteString
transpose :: [ByteString] -> [ByteString]
foldl :: (a -> Char -> a) -> a -> ByteString -> a
foldl' :: (a -> Char -> a) -> a -> ByteString -> a
foldl1 :: (Char -> Char -> Char) -> ByteString -> Char
foldl1' :: (Char -> Char -> Char) -> ByteString -> Char
foldr :: (Char -> a -> a) -> a -> ByteString -> a
foldr' :: (Char -> a -> a) -> a -> ByteString -> a
foldr1 :: (Char -> Char -> Char) -> ByteString -> Char
foldr1' :: (Char -> Char -> Char) -> ByteString -> Char
concat :: [ByteString] -> ByteString
concatMap :: (Char -> ByteString) -> ByteString -> ByteString
any :: (Char -> Bool) -> ByteString -> Bool
all :: (Char -> Bool) -> ByteString -> Bool
maximum :: ByteString -> Char
minimum :: ByteString -> Char
scanl :: (Char -> Char -> Char) -> Char -> ByteString -> ByteString
scanl1 :: (Char -> Char -> Char) -> ByteString -> ByteString
scanr :: (Char -> Char -> Char) -> Char -> ByteString -> ByteString
scanr1 :: (Char -> Char -> Char) -> ByteString -> ByteString
mapAccumL :: (acc -> Char -> (acc, Char)) -> acc -> ByteString -> (acc, ByteString)
mapAccumR :: (acc -> Char -> (acc, Char)) -> acc -> ByteString -> (acc, ByteString)
mapIndexed :: (Int -> Char -> Char) -> ByteString -> ByteString
replicate :: Int -> Char -> ByteString
unfoldr :: (a -> Maybe (Char, a)) -> a -> ByteString
unfoldrN :: Int -> (a -> Maybe (Char, a)) -> a -> (ByteString, Maybe a)
take :: Int -> ByteString -> ByteString
drop :: Int -> ByteString -> ByteString
splitAt :: Int -> ByteString -> (ByteString, ByteString)
takeWhile :: (Char -> Bool) -> ByteString -> ByteString
dropWhile :: (Char -> Bool) -> ByteString -> ByteString
span :: (Char -> Bool) -> ByteString -> (ByteString, ByteString)
spanEnd :: (Char -> Bool) -> ByteString -> (ByteString, ByteString)
break :: (Char -> Bool) -> ByteString -> (ByteString, ByteString)
breakEnd :: (Char -> Bool) -> ByteString -> (ByteString, ByteString)
group :: ByteString -> [ByteString]
groupBy :: (Char -> Char -> Bool) -> ByteString -> [ByteString]
inits :: ByteString -> [ByteString]
tails :: ByteString -> [ByteString]
split :: Char -> ByteString -> [ByteString]
splitWith :: (Char -> Bool) -> ByteString -> [ByteString]
lines :: ByteString -> [ByteString]
words :: ByteString -> [ByteString]
unlines :: [ByteString] -> ByteString
unwords :: [ByteString] -> ByteString
isPrefixOf :: ByteString -> ByteString -> Bool
isSuffixOf :: ByteString -> ByteString -> Bool
isInfixOf :: ByteString -> ByteString -> Bool
isSubstringOf :: ByteString -> ByteString -> Bool
findSubstring :: ByteString -> ByteString -> Maybe Int
findSubstrings :: ByteString -> ByteString -> [Int]
elem :: Char -> ByteString -> Bool
notElem :: Char -> ByteString -> Bool
find :: (Char -> Bool) -> ByteString -> Maybe Char
filter :: (Char -> Bool) -> ByteString -> ByteString
index :: ByteString -> Int -> Char
elemIndex :: Char -> ByteString -> Maybe Int
elemIndices :: Char -> ByteString -> [Int]
elemIndexEnd :: Char -> ByteString -> Maybe Int
findIndex :: (Char -> Bool) -> ByteString -> Maybe Int
findIndices :: (Char -> Bool) -> ByteString -> [Int]
count :: Char -> ByteString -> Int
zip :: ByteString -> ByteString -> [(Char, Char)]
zipWith :: (Char -> Char -> a) -> ByteString -> ByteString -> [a]
unzip :: [(Char, Char)] -> (ByteString, ByteString)
sort :: ByteString -> ByteString
readInt :: ByteString -> Maybe (Int, ByteString)
readInteger :: ByteString -> Maybe (Integer, ByteString)
copy :: ByteString -> ByteString
packCString :: CString -> IO ByteString
packCStringLen :: CStringLen -> IO ByteString
useAsCString :: ByteString -> (CString -> IO a) -> IO a
useAsCStringLen :: ByteString -> (CStringLen -> IO a) -> IO a
getLine :: IO ByteString
getContents :: IO ByteString
putStr :: ByteString -> IO ()
putStrLn :: ByteString -> IO ()
interact :: (ByteString -> ByteString) -> IO ()
readFile :: FilePath -> IO ByteString
writeFile :: FilePath -> ByteString -> IO ()
appendFile :: FilePath -> ByteString -> IO ()
hGetLine :: Handle -> IO ByteString
hGetContents :: Handle -> IO ByteString
hGet :: Handle -> Int -> IO ByteString
hGetNonBlocking :: Handle -> Int -> IO ByteString
hPut :: Handle -> ByteString -> IO ()
hPutStr :: Handle -> ByteString -> IO ()
hPutStrLn :: Handle -> ByteString -> IO ()
|
|
|
The ByteString type
|
|
data ByteString
A space-efficient representation of a Word8 vector, supporting many
efficient operations. A ByteString contains 8-bit characters only.
Instances of Eq, Ord, Read, Show, Data, Typeable
|
|
Introducing and eliminating ByteStrings
|
|
empty :: ByteString
O(1) The empty ByteString
|
|
singleton :: Char -> ByteString
O(1) Convert a Char into a ByteString
|
|
pack :: String -> ByteString
O(n) Convert a String into a ByteString
For applications with large numbers of string literals, pack can be a
bottleneck.
|
|
unpack :: ByteString -> [Char]
O(n) Converts a ByteString to a String.
|
|
Basic interface
|
|
cons :: Char -> ByteString -> ByteString
O(n) cons is analogous to (:) for lists, but of different
complexity, as it requires a memcpy.
|
|
snoc :: ByteString -> Char -> ByteString
O(n) Append a Char to the end of a ByteString. Similar to
cons, this function performs a memcpy.
|
|
append :: ByteString -> ByteString -> ByteString
O(n) Append two ByteStrings
|
|
head :: ByteString -> Char
O(1) Extract the first element of a ByteString, which must be non-empty.
|
|
uncons :: ByteString -> Maybe (Char, ByteString)
O(1) Extract the head and tail of a ByteString, returning Nothing
if it is empty.
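As a hedged illustration, uncons can back a total alternative to head. The helper name safeHead is hypothetical and not part of the module.

```haskell
import qualified Data.ByteString.Char8 as B

-- Total alternative to B.head: Nothing on empty input instead of an exception.
safeHead :: B.ByteString -> Maybe Char
safeHead bs = fmap fst (B.uncons bs)
```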
|
|
last :: ByteString -> Char
O(1) Extract the last element of a ByteString, which must be non-empty.
|
|
tail :: ByteString -> ByteString
O(1) Extract the elements after the head of a ByteString, which must be non-empty.
An exception will be thrown in the case of an empty ByteString.
|
|
init :: ByteString -> ByteString
O(1) Return all the elements of a ByteString except the last one.
An exception will be thrown in the case of an empty ByteString.
|
|
null :: ByteString -> Bool
O(1) Test whether a ByteString is empty.
|
|
length :: ByteString -> Int
O(1) length returns the length of a ByteString as an Int.
|
|
Transforming ByteStrings
|
|
map :: (Char -> Char) -> ByteString -> ByteString
O(n) map f xs is the ByteString obtained by applying f to each element of xs
|
|
reverse :: ByteString -> ByteString
O(n) reverse xs efficiently returns the elements of xs in reverse order.
|
|
intersperse :: Char -> ByteString -> ByteString
O(n) The intersperse function takes a Char and a ByteString
and `intersperses' that Char between the elements of the
ByteString. It is analogous to the intersperse function on Lists.
|
|
intercalate :: ByteString -> [ByteString] -> ByteString
O(n) The intercalate function takes a ByteString and a list of
ByteStrings and concatenates the list after interspersing the first
argument between each element of the list.
|
|
transpose :: [ByteString] -> [ByteString]
The transpose function transposes the rows and columns of its
ByteString argument.
|
|
Reducing ByteStrings (folds)
|
|
foldl :: (a -> Char -> a) -> a -> ByteString -> a
foldl, applied to a binary operator, a starting value (typically
the left-identity of the operator), and a ByteString, reduces the
ByteString using the binary operator, from left to right.
|
|
foldl' :: (a -> Char -> a) -> a -> ByteString -> a
foldl' is like foldl, but strict in the accumulator.
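A small illustration of the strict left fold, assuming ASCII digits in the input. The function name digitSum is hypothetical.

```haskell
import Data.Char (digitToInt, isDigit)
import qualified Data.ByteString.Char8 as B

-- Sum the decimal digits in a ByteString, ignoring every other character.
-- The strict fold keeps the running total evaluated at each step.
digitSum :: B.ByteString -> Int
digitSum = B.foldl' step 0
  where
    step acc c
      | isDigit c = acc + digitToInt c
      | otherwise = acc
```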
|
|
foldl1 :: (Char -> Char -> Char) -> ByteString -> Char
foldl1 is a variant of foldl that has no starting value
argument, and thus must be applied to non-empty ByteStrings.
|
|
foldl1' :: (Char -> Char -> Char) -> ByteString -> Char
A strict version of foldl1
|
|
foldr :: (Char -> a -> a) -> a -> ByteString -> a
foldr, applied to a binary operator, a starting value
(typically the right-identity of the operator), and a packed string,
reduces the packed string using the binary operator, from right to left.
|
|
foldr' :: (Char -> a -> a) -> a -> ByteString -> a
foldr' is a strict variant of foldr.
|
|
foldr1 :: (Char -> Char -> Char) -> ByteString -> Char
foldr1 is a variant of foldr that has no starting value argument,
and thus must be applied to non-empty ByteStrings
|
|
foldr1' :: (Char -> Char -> Char) -> ByteString -> Char
A strict variant of foldr1
|
|
Special folds
|
|
concat :: [ByteString] -> ByteString
O(n) Concatenate a list of ByteStrings.
|
|
concatMap :: (Char -> ByteString) -> ByteString -> ByteString
Map a function over a ByteString and concatenate the results
|
|
any :: (Char -> Bool) -> ByteString -> Bool
Applied to a predicate and a ByteString, any determines if
any element of the ByteString satisfies the predicate.
|
|
all :: (Char -> Bool) -> ByteString -> Bool
Applied to a predicate and a ByteString, all determines if
all elements of the ByteString satisfy the predicate.
|
|
maximum :: ByteString -> Char
maximum returns the maximum value from a ByteString
|
|
minimum :: ByteString -> Char
minimum returns the minimum value from a ByteString
|
|
Building ByteStrings
|
|
Scans
|
|
scanl :: (Char -> Char -> Char) -> Char -> ByteString -> ByteString
scanl is similar to foldl, but returns a ByteString of successive
reduced values from the left:
scanl f z [x1, x2, ...] == [z, z `f` x1, (z `f` x1) `f` x2, ...]
Note that
last (scanl f z xs) == foldl f z xs.
|
|
scanl1 :: (Char -> Char -> Char) -> ByteString -> ByteString
scanl1 is a variant of scanl that has no starting value argument:
scanl1 f [x1, x2, ...] == [x1, x1 `f` x2, ...]
|
|
scanr :: (Char -> Char -> Char) -> Char -> ByteString -> ByteString
scanr is the right-to-left dual of scanl.
|
|
scanr1 :: (Char -> Char -> Char) -> ByteString -> ByteString
scanr1 is a variant of scanr that has no starting value argument.
|
|
Accumulating maps
|
|
mapAccumL :: (acc -> Char -> (acc, Char)) -> acc -> ByteString -> (acc, ByteString)
The mapAccumL function behaves like a combination of map and
foldl; it applies a function to each element of a ByteString,
passing an accumulating parameter from left to right, and returning a
final value of this accumulator together with the new ByteString.
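A short sketch of the accumulating map, assuming ASCII input. The name upperWithCount is illustrative only. The accumulator counts characters while each one is upper-cased.

```haskell
import Data.Char (toUpper)
import qualified Data.ByteString.Char8 as B

-- Upper-case every character while counting how many were seen.
upperWithCount :: B.ByteString -> (Int, B.ByteString)
upperWithCount = B.mapAccumL (\n c -> (n + 1, toUpper c)) 0
```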
|
|
mapAccumR :: (acc -> Char -> (acc, Char)) -> acc -> ByteString -> (acc, ByteString)
The mapAccumR function behaves like a combination of map and
foldr; it applies a function to each element of a ByteString,
passing an accumulating parameter from right to left, and returning a
final value of this accumulator together with the new ByteString.
|
|
mapIndexed :: (Int -> Char -> Char) -> ByteString -> ByteString
O(n) Map a Char function over a ByteString, supplying the index of each position.
|
|
Generating and unfolding ByteStrings
|
|
replicate :: Int -> Char -> ByteString
O(n) replicate n x is a ByteString of length n with x
the value of every element. The following holds:
replicate w c = fst (unfoldrN w (\u -> Just (u,u)) c)
This implementation uses memset(3).
|
|
unfoldr :: (a -> Maybe (Char, a)) -> a -> ByteString
O(n), where n is the length of the result. The unfoldr
function is analogous to the List 'unfoldr'. unfoldr builds a
ByteString from a seed value. The function takes the current seed and
returns Nothing if it is done producing the ByteString, or returns
Just (a,b), in which case a is the next character in the string
and b is the seed value for further production.
Examples:
unfoldr (\x -> if x <= '9' then Just (x, succ x) else Nothing) '0' == "0123456789"
|
|
unfoldrN :: Int -> (a -> Maybe (Char, a)) -> a -> (ByteString, Maybe a)
O(n) Like unfoldr, unfoldrN builds a ByteString from a seed
value. However, the length of the result is limited by the first
argument to unfoldrN. This function is more efficient than unfoldr
when the maximum length of the result is known.
The following equation relates unfoldrN and unfoldr:
fst (unfoldrN n f s) == take n (unfoldr f s)
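The relation between the two unfolds can be sketched as follows. Note that unfoldrN returns a pair, so the comparison is made on its first component. The names digitStep and firstFive are hypothetical.

```haskell
import qualified Data.ByteString.Char8 as B

-- Step function: emit the digits '0'..'9', then stop.
digitStep :: Char -> Maybe (Char, Char)
digitStep c
  | c <= '9'  = Just (c, succ c)
  | otherwise = Nothing

-- At most five characters; the second component carries a remaining
-- seed, if any, from which production could be resumed.
firstFive :: (B.ByteString, Maybe Char)
firstFive = B.unfoldrN 5 digitStep '0'
```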
|
|
Substrings
|
|
Breaking strings
|
|
take :: Int -> ByteString -> ByteString
O(1) take n, applied to a ByteString xs, returns the prefix
of xs of length n, or xs itself if n > length xs.
|
|
drop :: Int -> ByteString -> ByteString
O(1) drop n xs returns the suffix of xs after the first n
elements, or empty if n > length xs.
|
|
splitAt :: Int -> ByteString -> (ByteString, ByteString)
O(1) splitAt n xs is equivalent to (take n xs, drop n xs).
|
|
takeWhile :: (Char -> Bool) -> ByteString -> ByteString
takeWhile, applied to a predicate p and a ByteString xs,
returns the longest prefix (possibly empty) of xs of elements that
satisfy p.
|
|
dropWhile :: (Char -> Bool) -> ByteString -> ByteString
dropWhile p xs returns the suffix remaining after takeWhile p xs.
|
|
span :: (Char -> Bool) -> ByteString -> (ByteString, ByteString)
span p xs breaks the ByteString into two segments. It is
equivalent to (takeWhile p xs, dropWhile p xs)
|
|
spanEnd :: (Char -> Bool) -> ByteString -> (ByteString, ByteString)
spanEnd behaves like span but from the end of the ByteString.
We have
spanEnd (not.isSpace) "x y z" == ("x y ","z")
and
spanEnd (not . isSpace) ps == let (x,y) = span (not . isSpace) (reverse ps) in (reverse y, reverse x)
|
|
break :: (Char -> Bool) -> ByteString -> (ByteString, ByteString)
break p is equivalent to span (not . p).
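A hedged example of break in use: splitting a line at its first '='. The function name keyValue and the key=value input format are assumptions made for illustration.

```haskell
import qualified Data.ByteString.Char8 as B

-- Split "key=value" at the first '='; Nothing if there is no '='.
keyValue :: B.ByteString -> Maybe (B.ByteString, B.ByteString)
keyValue line =
  case B.break (== '=') line of
    (key, rest)
      | B.null rest -> Nothing
      | otherwise   -> Just (key, B.drop 1 rest)  -- drop the '=' itself
```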
|
|
breakEnd :: (Char -> Bool) -> ByteString -> (ByteString, ByteString)
breakEnd behaves like break but from the end of the ByteString:
breakEnd p == spanEnd (not.p)
|
|
group :: ByteString -> [ByteString]
The group function takes a ByteString and returns a list of
ByteStrings such that the concatenation of the result is equal to the
argument. Moreover, each ByteString in the result contains only equal
elements. For example,
group "Mississippi" = ["M","i","ss","i","ss","i","pp","i"]
It is a special case of groupBy, which allows the programmer to
supply their own equality test. It is about 40% faster than
groupBy (==)
|
|
groupBy :: (Char -> Char -> Bool) -> ByteString -> [ByteString]
The groupBy function is the non-overloaded version of group.
|
|
inits :: ByteString -> [ByteString]
O(n) Return all initial segments of the given ByteString, shortest first.
|
|
tails :: ByteString -> [ByteString]
O(n) Return all final segments of the given ByteString, longest first.
|
|
Breaking into many substrings
|
|
split :: Char -> ByteString -> [ByteString]
O(n) Break a ByteString into pieces separated by the byte
argument, consuming the delimiter. I.e.
split '\n' "a\nb\nd\ne" == ["a","b","d","e"]
split 'a' "aXaXaXa" == ["","X","X","X",""]
split 'x' "x" == ["",""]
and
intercalate (singleton c) . split c == id
split == splitWith . (==)
As for all splitting functions in this library, this function does
not copy the substrings, it just constructs new ByteStrings that
are slices of the original.
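The splitting and rejoining behaviour can be sketched for a comma delimiter. The names fields and unfields are hypothetical, and the delimiter is passed to intercalate as a one-character ByteString.

```haskell
import qualified Data.ByteString.Char8 as B

-- Split a comma-separated record; adjacent commas yield empty fields.
fields :: B.ByteString -> [B.ByteString]
fields = B.split ','

-- Rejoin the fields with the same delimiter.
unfields :: [B.ByteString] -> B.ByteString
unfields = B.intercalate (B.singleton ',')
```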
|
|
splitWith :: (Char -> Bool) -> ByteString -> [ByteString]
O(n) Splits a ByteString into components delimited by
separators, where the predicate returns True for a separator element.
The resulting components do not contain the separators. Two adjacent
separators result in an empty component in the output, e.g.
splitWith (=='a') "aabbaca" == ["","","bb","c",""]
|
|
Breaking into lines and words
|
|
lines :: ByteString -> [ByteString]
lines breaks a ByteString up into a list of ByteStrings at
newline Chars. The resulting strings do not contain newlines.
|
|
words :: ByteString -> [ByteString]
words breaks a ByteString up into a list of words, which
were delimited by Chars representing white space.
|
|
unlines :: [ByteString] -> ByteString
unlines is an inverse operation to lines. It joins lines,
after appending a terminating newline to each.
|
|
unwords :: [ByteString] -> ByteString
The unwords function is analogous to the unlines function, on words.
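As a sketch of how the four functions combine, the following normalises whitespace in a block of text. The names wordsPerLine and normalise are illustrative only.

```haskell
import qualified Data.ByteString.Char8 as B

-- Break text into lines, then each line into whitespace-separated words.
wordsPerLine :: B.ByteString -> [[B.ByteString]]
wordsPerLine = map B.words . B.lines

-- Normalise whitespace: single spaces between words, and a terminating
-- newline after every line.
normalise :: B.ByteString -> B.ByteString
normalise = B.unlines . map B.unwords . wordsPerLine
```

Because unlines appends a newline to every line, the result always ends with a newline even if the input did not.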
|
|
Predicates
|
|
isPrefixOf :: ByteString -> ByteString -> Bool
O(n) The isPrefixOf function takes two ByteStrings and returns True
iff the first is a prefix of the second.
|
|
isSuffixOf :: ByteString -> ByteString -> Bool
O(n) The isSuffixOf function takes two ByteStrings and returns True
iff the first is a suffix of the second.
The following holds:
isSuffixOf x y == reverse x `isPrefixOf` reverse y
However, the real implementation uses memcmp to compare the end of the
string only, with no reverse required.
|
|
isInfixOf :: ByteString -> ByteString -> Bool
Alias of isSubstringOf
|
|
|
isSubstringOf
:: ByteString -- String to search for.
-> ByteString -- String to search in.
-> Bool
Check whether one string is a substring of another. isSubstringOf
p s is equivalent to not (null (findSubstrings p s)).
|
|
|
Search for arbitrary substrings
|
|
|
findSubstring
:: ByteString -- String to search for.
-> ByteString -- String to search in.
-> Maybe Int
Get the first index of a substring in another string,
or Nothing if the string is not found.
findSubstring p s is equivalent to listToMaybe (findSubstrings p s).
|
|
|
|
findSubstrings
:: ByteString -- String to search for.
-> ByteString -- String to search in.
-> [Int]
Find the indices of all (possibly overlapping) occurrences of a
substring in a string. This function uses the Knuth-Morris-Pratt
string matching algorithm.
|
|
|
Searching ByteStrings
|
|
Searching by equality
|
|
elem :: Char -> ByteString -> Bool
O(n) elem is the ByteString membership predicate. This
implementation uses memchr(3).
|
|
notElem :: Char -> ByteString -> Bool
O(n) notElem is the inverse of elem
|
|
Searching with a predicate
|
|
find :: (Char -> Bool) -> ByteString -> Maybe Char
O(n) The find function takes a predicate and a ByteString,
and returns the first element matching the predicate, or Nothing
if there is no such element.
|
|
filter :: (Char -> Bool) -> ByteString -> ByteString
O(n) filter, applied to a predicate and a ByteString,
returns a ByteString containing those characters that satisfy the
predicate.
|
|
Indexing ByteStrings
|
|
index :: ByteString -> Int -> Char
O(1) ByteString index (subscript) operator, starting from 0.
|
|
elemIndex :: Char -> ByteString -> Maybe Int
O(n) The elemIndex function returns the index of the first
element in the given ByteString which is equal (by memchr) to the
query element, or Nothing if there is no such element.
|
|
elemIndices :: Char -> ByteString -> [Int]
O(n) The elemIndices function extends elemIndex, by returning
the indices of all elements equal to the query element, in ascending order.
|
|
elemIndexEnd :: Char -> ByteString -> Maybe Int
O(n) The elemIndexEnd function returns the last index of the
element in the given ByteString which is equal to the query
element, or Nothing if there is no such element. The following
holds:
elemIndexEnd c xs ==
(-) (length xs - 1) `fmap` elemIndex c (reverse xs)
|
|
findIndex :: (Char -> Bool) -> ByteString -> Maybe Int
The findIndex function takes a predicate and a ByteString and
returns the index of the first element in the ByteString satisfying the predicate,
or Nothing if there is no such element.
|
|
findIndices :: (Char -> Bool) -> ByteString -> [Int]
The findIndices function extends findIndex, by returning the
indices of all elements satisfying the predicate, in ascending order.
|
|
count :: Char -> ByteString -> Int
count returns the number of times its argument appears in the ByteString:
count c = length . elemIndices c
Also, for input in which every line ends with a newline,
count '\n' == length . lines
but count is more efficient than using length on the intermediate list.
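A brief sketch comparing the two ways of counting lines. The names newlineCount and lineCount are hypothetical. The two agree when the input's final line is newline-terminated, and differ by one when it is not.

```haskell
import qualified Data.ByteString.Char8 as B

-- Count newline characters directly, without building a list of lines.
newlineCount :: B.ByteString -> Int
newlineCount = B.count '\n'

-- The formulation via the intermediate list, for comparison.
lineCount :: B.ByteString -> Int
lineCount = length . B.lines
```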
|
|
Zipping and unzipping ByteStrings
|
|
zip :: ByteString -> ByteString -> [(Char, Char)]
O(n) zip takes two ByteStrings and returns a list of
corresponding pairs of Chars. If one input ByteString is short,
excess elements of the longer ByteString are discarded. This is
equivalent to a pair of unpack operations, and so space
usage may be large for multi-megabyte ByteStrings
|
|
zipWith :: (Char -> Char -> a) -> ByteString -> ByteString -> [a]
zipWith generalises zip by zipping with the function given as
the first argument, instead of a tupling function. For example,
zipWith (+) is applied to two ByteStrings to produce the list
of corresponding sums.
|
|
unzip :: [(Char, Char)] -> (ByteString, ByteString)
unzip transforms a list of pairs of Chars into a pair of
ByteStrings. Note that this performs two pack operations.
|
|
Ordered ByteStrings
|
|
sort :: ByteString -> ByteString
O(n) Sort a ByteString efficiently, using counting sort.
|
|
Reading from ByteStrings
|
|
readInt :: ByteString -> Maybe (Int, ByteString)
readInt reads an Int from the beginning of the ByteString. If there is no
integer at the beginning of the string, it returns Nothing; otherwise
it returns Just the Int read, paired with the rest of the string.
|
|
readInteger :: ByteString -> Maybe (Integer, ByteString)
readInteger reads an Integer from the beginning of the ByteString. If
there is no integer at the beginning of the string, it returns Nothing;
otherwise it returns Just the Integer read, paired with the rest of the string.
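A minimal sketch of parsing with readInt, assuming whitespace-separated input. The function name readInts is hypothetical.

```haskell
import Data.Maybe (mapMaybe)
import qualified Data.ByteString.Char8 as B

-- Parse every whitespace-separated token that starts with an integer;
-- tokens with no leading integer are skipped.
readInts :: B.ByteString -> [Int]
readInts = mapMaybe (fmap fst . B.readInt) . B.words
```

Since readInt also returns the unconsumed remainder, a stricter parser could reject tokens whose remainder is non-empty.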
|
|
Low level CString conversions
|
|
Copying ByteStrings
|
|
copy :: ByteString -> ByteString
O(n) Make a copy of the ByteString with its own storage.
This is mainly useful to allow the rest of the data pointed
to by the ByteString to be garbage collected, for example
if a large string has been read in, and only a small part of it
is needed in the rest of the program.
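The use case described above might look like this. The name headerOnly is hypothetical; it keeps a short prefix of a large input in its own storage.

```haskell
import qualified Data.ByteString.Char8 as B

-- Keep only the first n characters, detached from the original buffer so
-- that the large input can be garbage collected.
headerOnly :: Int -> B.ByteString -> B.ByteString
headerOnly n = B.copy . B.take n
```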
|
|
Packing CStrings and pointers
|
|
packCString :: CString -> IO ByteString
O(n). Construct a new ByteString from a CString. The
resulting ByteString is an immutable copy of the original
CString, and is managed on the Haskell heap. The original
CString must be null terminated.
|
|
packCStringLen :: CStringLen -> IO ByteString
O(n). Construct a new ByteString from a CStringLen. The
resulting ByteString is an immutable copy of the original CStringLen.
The ByteString is a normal Haskell value and will be managed on the
Haskell heap.
|
|
Using ByteStrings as CStrings
|
|
useAsCString :: ByteString -> (CString -> IO a) -> IO a
O(n) construction. Use a ByteString with a function requiring a
null-terminated CString. The CString is a copy of the ByteString,
made with memcpy(3), and will be freed automatically.
|
|
useAsCStringLen :: ByteString -> (CStringLen -> IO a) -> IO a
O(n) construction Use a ByteString with a function requiring a CStringLen.
As for useAsCString this function makes a copy of the original ByteString.
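A hedged sketch of the CString conversions, assuming ASCII content so that the locale encoding used by withCString (from Foreign.C.String in base) does not matter. The names fromCString and lengthSeenByC are illustrative only.

```haskell
import Foreign.C.String (withCString)
import qualified Data.ByteString.Char8 as B

-- Marshal a String to a temporary C string and copy it into a ByteString.
fromCString :: String -> IO B.ByteString
fromCString s = withCString s B.packCString

-- Hand the bytes to a C-style consumer; here the consumer only reports
-- the length it was given.
lengthSeenByC :: B.ByteString -> IO Int
lengthSeenByC bs = B.useAsCStringLen bs (\(_ptr, len) -> return len)
```

In real code the callback would pass the pointer to a foreign function. The pointer must not be used after the callback returns.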
|
|
I/O with ByteStrings
|
|
Standard input and output
|
|
getLine :: IO ByteString
Read a line from stdin.
|
|
getContents :: IO ByteString
getContents. Equivalent to hGetContents stdin
|
|
putStr :: ByteString -> IO ()
Write a ByteString to stdout
|
|
putStrLn :: ByteString -> IO ()
Write a ByteString to stdout, appending a newline byte
|
|
interact :: (ByteString -> ByteString) -> IO ()
The interact function takes a function of type ByteString -> ByteString
as its argument. The entire input from the standard input device is passed
to this function as its argument, and the resulting string is output on the
standard output device. It's great for writing one-line programs!
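For instance, a filter that upper-cases its input could be sketched as below. The names shout and runShout are hypothetical; a real program would bind runShout to main.

```haskell
import Data.Char (toUpper)
import qualified Data.ByteString.Char8 as B

-- Pure transformation: upper-case every character.
shout :: B.ByteString -> B.ByteString
shout = B.map toUpper

-- Read all of stdin, transform it, and write the result to stdout.
runShout :: IO ()
runShout = B.interact shout
```

Keeping the transformation separate from the IO action makes it testable without touching stdin.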
|
|
Files
|
|
readFile :: FilePath -> IO ByteString
Read an entire file strictly into a ByteString. This is far more
efficient than reading the characters into a String and then using
pack. It also may be more efficient than opening the file and
reading it using hGet.
|
|
writeFile :: FilePath -> ByteString -> IO ()
Write a ByteString to a file.
|
|
appendFile :: FilePath -> ByteString -> IO ()
Append a ByteString to a file.
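The three file operations can be combined as in this sketch. The name writeThenRead is hypothetical, and the path is whatever writable location the caller supplies.

```haskell
import qualified Data.ByteString.Char8 as B

-- Write a first line, append a second, then read the whole file back.
-- readFile is strict, so the file is fully read before the result is returned.
writeThenRead :: FilePath -> IO B.ByteString
writeThenRead path = do
  B.writeFile path (B.pack "first\n")
  B.appendFile path (B.pack "second\n")
  B.readFile path
```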
|
|
I/O with Handles
|
|
hGetLine :: Handle -> IO ByteString
Read a line from a handle
|
|
hGetContents :: Handle -> IO ByteString
Read entire handle contents into a ByteString.
This function reads chunks at a time, doubling the chunksize on each
read. The final buffer is then realloced to the appropriate size. For
files > half of available memory, this may lead to memory exhaustion.
Consider using readFile in this case.
As with hGet, the string representation in the file is assumed to
be ISO-8859-1.
|
|
hGet :: Handle -> Int -> IO ByteString
Read a ByteString directly from the specified Handle. This
is far more efficient than reading the characters into a String
and then using pack.
|
|
hGetNonBlocking :: Handle -> Int -> IO ByteString
hGetNonBlocking is identical to hGet, except that it never blocks
waiting for data to become available; instead, it returns only whatever data
is available.
|
|
hPut :: Handle -> ByteString -> IO ()
Outputs a ByteString to the specified Handle.
|
|
hPutStr :: Handle -> ByteString -> IO ()
A synonym for hPut, for compatibility
|
|
hPutStrLn :: Handle -> ByteString -> IO ()
Write a ByteString to a handle, appending a newline byte
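A small sketch of Handle-based I/O, using withFile and IOMode from System.IO in base. The names writeLines and firstLine are illustrative only.

```haskell
import System.IO (IOMode (..), withFile)
import qualified Data.ByteString.Char8 as B

-- Write each ByteString on its own line; the handle is closed on exit.
writeLines :: FilePath -> [B.ByteString] -> IO ()
writeLines path ls = withFile path WriteMode (\h -> mapM_ (B.hPutStrLn h) ls)

-- Read back only the first line, without its terminating newline.
firstLine :: FilePath -> IO B.ByteString
firstLine path = withFile path ReadMode B.hGetLine
```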
|
|
Produced by Haddock version 0.8 |