Safe Haskell	None
Language	GHC2021

GHC.Data.StringBuffer

Contents

Creation/destruction
Inspection
Moving and comparison
Conversion
Parsing integers
Checking for bi-directional format characters

Synopsis

data StringBuffer = StringBuffer {
- buf :: !(ForeignPtr Word8)
- len :: !Int
- cur :: !Int
}
hGetStringBuffer :: FilePath -> IO StringBuffer
hGetStringBufferBlock :: Handle -> Int -> IO StringBuffer
hPutStringBuffer :: Handle -> StringBuffer -> IO ()
appendStringBuffers :: StringBuffer -> StringBuffer -> IO StringBuffer
stringToStringBuffer :: String -> StringBuffer
stringBufferFromByteString :: ByteString -> StringBuffer
nextChar :: StringBuffer -> (Char, StringBuffer)
currentChar :: StringBuffer -> Char
prevChar :: StringBuffer -> Char -> Char
atEnd :: StringBuffer -> Bool
fingerprintStringBuffer :: StringBuffer -> Fingerprint
stepOn :: StringBuffer -> StringBuffer
offsetBytes :: Int -> StringBuffer -> StringBuffer
byteDiff :: StringBuffer -> StringBuffer -> Int
atLine :: Int -> StringBuffer -> Maybe StringBuffer
lexemeToString :: StringBuffer -> Int -> String
lexemeToFastString :: StringBuffer -> Int -> FastString
decodePrevNChars :: Int -> StringBuffer -> String
parseUnsignedInteger :: StringBuffer -> Int -> Integer -> (Char -> Int) -> Integer
findHashOffset :: StringBuffer -> Int
containsBidirectionalFormatChar :: StringBuffer -> Bool
bidirectionalFormatChars :: [(Char, String)]

Documentation

data StringBuffer Source #

A StringBuffer is an internal pointer to a sized chunk of bytes. The bytes are intended to be *immutable*. There are pure operations to read the contents of a StringBuffer.

A StringBuffer may have a finalizer, depending on how it was obtained.

Constructors

StringBuffer
Fields
- buf :: !(ForeignPtr Word8)
- len :: !Int
- cur :: !Int

Instances

Show StringBuffer Source #
Defined in GHC.Data.StringBuffer
Methods
- showsPrec :: Int -> StringBuffer -> ShowS
- show :: StringBuffer -> String
- showList :: [StringBuffer] -> ShowS

Creation/destruction

hGetStringBuffer :: FilePath -> IO StringBuffer Source #

Read a file into a StringBuffer. The resulting buffer is automatically managed by the garbage collector.

hGetStringBufferBlock :: Handle -> Int -> IO StringBuffer Source #

hPutStringBuffer :: Handle -> StringBuffer -> IO () Source #

appendStringBuffers :: StringBuffer -> StringBuffer -> IO StringBuffer Source #

stringToStringBuffer :: String -> StringBuffer Source #

Encode a String into a StringBuffer as UTF-8. The resulting buffer is automatically managed by the garbage collector.

stringBufferFromByteString :: ByteString -> StringBuffer Source #

Convert a UTF-8 encoded ByteString into a StringBuffer. This relies on the internals of both ByteString and StringBuffer.

O(n) (but optimized into a memcpy by bytestring under the hood)
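
The creation functions above can be combined as in the following minimal sketch. It assumes the `ghc` and `bytestring` packages are exposed to the compiler (for example via `-package ghc`), and that appendStringBuffers concatenates its arguments in order, which the name suggests but this page does not state. The names `header`, `body` and `main` are illustrative only.

```haskell
import qualified Data.ByteString.Char8 as BS8
import GHC.Data.StringBuffer
import System.IO (stdout)

-- A buffer built from a String; the text is encoded as UTF-8.
header :: StringBuffer
header = stringToStringBuffer "module Main where\n"

-- A buffer built from a ByteString that already holds UTF-8 (here: ASCII).
body :: StringBuffer
body = stringBufferFromByteString (BS8.pack "main = pure ()\n")

main :: IO ()
main = do
  -- Assumption: the result holds the bytes of 'header' followed by 'body'.
  combined <- appendStringBuffers header body
  -- Write the raw bytes of the combined buffer to standard output.
  hPutStringBuffer stdout combined
```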

Inspection

nextChar :: StringBuffer -> (Char, StringBuffer) Source #

Return the first UTF-8 character of a nonempty StringBuffer together with the remaining portion (analogous to uncons). Warning: The behavior is undefined if the StringBuffer is empty. The result shares the same buffer as the original. Similar to utf8DecodeChar, if the character cannot be decoded as UTF-8, '\0' is returned.

currentChar :: StringBuffer -> Char Source #

Return the first UTF-8 character of a nonempty StringBuffer (analogous to head). Warning: The behavior is undefined if the StringBuffer is empty. Similar to utf8DecodeChar, if the character cannot be decoded as UTF-8, '\0' is returned.

prevChar :: StringBuffer -> Char -> Char Source #

atEnd :: StringBuffer -> Bool Source #

Check whether a StringBuffer is empty (analogous to null).
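
Because nextChar and currentChar are undefined on an empty buffer, a traversal normally guards each step with atEnd. The sketch below shows one way to do that; `bufferToString` is a hypothetical helper, not part of this module, and the example assumes the `ghc` package is exposed (for example via `-package ghc`).

```haskell
import GHC.Data.StringBuffer
  (StringBuffer, atEnd, nextChar, stringToStringBuffer)

-- | Decode every remaining character of a buffer.
-- 'atEnd' is checked before each 'nextChar', which is undefined on an
-- empty buffer.
bufferToString :: StringBuffer -> String
bufferToString sb
  | atEnd sb  = []
  | otherwise =
      let (c, rest) = nextChar sb
      in  c : bufferToString rest

main :: IO ()
main =
  -- 'print' escapes non-ASCII characters, so this is locale independent.
  print (bufferToString (stringToStringBuffer "f = \\x -> x"))
```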

fingerprintStringBuffer :: StringBuffer -> Fingerprint Source #

Computes a hash of the contents of a StringBuffer.
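
One possible use of the fingerprint is a cheap equality check on source text, sketched below. `sameContents` is a hypothetical helper; the example assumes the `ghc` package is exposed (for example via `-package ghc`) and relies on the Eq instance of Fingerprint from base.

```haskell
import GHC.Data.StringBuffer (fingerprintStringBuffer, stringToStringBuffer)

-- | True when the two texts produce the same fingerprint.
sameContents :: String -> String -> Bool
sameContents a b =
  fingerprintStringBuffer (stringToStringBuffer a)
    == fingerprintStringBuffer (stringToStringBuffer b)

main :: IO ()
main = do
  print (sameContents "x = 1" "x = 1")
  print (sameContents "x = 1" "x = 2")
```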

Moving and comparison

stepOn :: StringBuffer -> StringBuffer Source #

Return a StringBuffer with the first UTF-8 character removed (analogous to tail). Warning: The behavior is undefined if the StringBuffer is empty. The result shares the same buffer as the original.

offsetBytes Source #

Arguments

:: Int	`n`, the number of bytes
-> StringBuffer
-> StringBuffer

Return a StringBuffer with the first n bytes removed. Warning: If there aren't enough bytes, the returned StringBuffer will be invalid and any use of it may lead to undefined behavior. The result shares the same buffer as the original.

byteDiff :: StringBuffer -> StringBuffer -> Int Source #

Compute the difference in offset between two StringBuffers that share the same buffer. Warning: The behavior is undefined if the StringBuffers use separate buffers.
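
The following sketch illustrates how stepOn, offsetBytes and byteDiff relate, using a string whose first character takes two bytes in UTF-8. It assumes the `ghc` package is exposed (for example via `-package ghc`) and that `byteDiff a b` is the offset of `b` minus the offset of `a`; the argument order is not stated on this page.

```haskell
import GHC.Data.StringBuffer
  (byteDiff, currentChar, offsetBytes, stepOn, stringToStringBuffer)

main :: IO ()
main = do
  let start     = stringToStringBuffer "\955 = lambda"  -- "\955" is U+03BB
      -- U+03BB occupies two bytes in UTF-8, so stepOn advances by two bytes.
      afterChar = stepOn start
      -- Skip the 2-byte first character plus " = " (3 bytes).
      atWord    = offsetBytes 5 start
  -- Assumption: byteDiff reports (offset of second) - (offset of first).
  print (byteDiff start afterChar)
  print (byteDiff start atWord)
  print (currentChar atWord)
```

All three buffers share the same underlying memory, so byteDiff is well defined between any pair of them.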

atLine :: Int -> StringBuffer -> Maybe StringBuffer Source #

Computes a StringBuffer which points to the first character of the wanted line. Lines begin at 1.
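
As a small illustration, the sketch below looks up the first character of a given line. `firstCharOfLine` is a hypothetical helper; the example assumes the `ghc` package is exposed (for example via `-package ghc`) and that atLine returns Nothing when the requested line does not exist, which the Maybe result suggests.

```haskell
import GHC.Data.StringBuffer (atLine, currentChar, stringToStringBuffer)

-- | First character of the given 1-based line, if atLine finds that line.
firstCharOfLine :: Int -> String -> Maybe Char
firstCharOfLine n src = currentChar <$> atLine n (stringToStringBuffer src)

main :: IO ()
main = do
  let src = "alpha\nbeta\ngamma\n"
  print (firstCharOfLine 1 src)
  print (firstCharOfLine 3 src)
  -- Assumption: a line number past the end of the input yields Nothing.
  print (firstCharOfLine 10 src)
```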

Conversion

lexemeToString Source #

Arguments

:: StringBuffer
-> Int	`n`, the number of bytes
-> String

Decode the first n bytes of a StringBuffer as UTF-8 into a String. Similar to utf8DecodeChar, any characters that cannot be decoded as UTF-8 are replaced with '\0'.
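
A typical pattern is to scan forward, measure the distance in bytes, and then decode exactly that many bytes. The sketch below does this with a hypothetical helper `spanLexeme`; it assumes the `ghc` package is exposed (for example via `-package ghc`) and that `byteDiff a b` is the offset of `b` minus the offset of `a`.

```haskell
import Data.Char (isAlphaNum)
import GHC.Data.StringBuffer
  ( StringBuffer, atEnd, byteDiff, currentChar
  , lexemeToString, stepOn, stringToStringBuffer )

-- | Split off the longest prefix whose characters satisfy the predicate,
-- returning the decoded prefix and the buffer positioned after it.
spanLexeme :: (Char -> Bool) -> StringBuffer -> (String, StringBuffer)
spanLexeme p start = (lexemeToString start (byteDiff start end), end)
  where
    end = go start
    go sb
      | atEnd sb || not (p (currentChar sb)) = sb
      | otherwise                            = go (stepOn sb)

main :: IO ()
main = do
  let (word, rest) = spanLexeme isAlphaNum (stringToStringBuffer "foo42 + bar")
  print word
  print (currentChar rest)
```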

lexemeToFastString Source #

Arguments

:: StringBuffer
-> Int	`n`, the number of bytes
-> FastString

decodePrevNChars :: Int -> StringBuffer -> String Source #

Return the previous n characters (or fewer if we are less than n characters into the buffer).

Parsing integers

parseUnsignedInteger :: StringBuffer -> Int -> Integer -> (Char -> Int) -> Integer Source #
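
This page gives no description for parseUnsignedInteger, so the sketch below rests on assumptions read off the type: the Int is taken to be the length of the lexeme to parse, the Integer the radix, and the function the mapping from a digit character to its value. `parseDigits` is a hypothetical helper, and the example assumes the `ghc` package is exposed (for example via `-package ghc`).

```haskell
import Data.Char (digitToInt)
import GHC.Data.StringBuffer (parseUnsignedInteger, stringToStringBuffer)

-- | Parse an ASCII digit string in the given radix.
-- For ASCII input the byte length equals the number of characters, so
-- 'length' is a safe choice whichever of the two the Int argument means.
parseDigits :: Integer -> String -> Integer
parseDigits radix digits =
  parseUnsignedInteger
    (stringToStringBuffer digits)  -- buffer positioned at the first digit
    (length digits)                -- assumed: length of the lexeme
    radix                          -- assumed: numeric base
    digitToInt                     -- assumed: value of each digit character

main :: IO ()
main = do
  print (parseDigits 10 "1234")
  print (parseDigits 16 "ff")
```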

findHashOffset :: StringBuffer -> Int Source #

Find the offset of the # character in the StringBuffer.

Make sure that it contains one before calling this function!

Checking for bi-directional format characters

containsBidirectionalFormatChar :: StringBuffer -> Bool Source #

Returns True if the buffer contains Unicode bi-directional formatting characters.

https://www.unicode.org/reports/tr9/#Bidirectional_Character_Types

Bidirectional format characters are one of:
- '\x202a' : "U+202A LEFT-TO-RIGHT EMBEDDING (LRE)"
- '\x202b' : "U+202B RIGHT-TO-LEFT EMBEDDING (RLE)"
- '\x202c' : "U+202C POP DIRECTIONAL FORMATTING (PDF)"
- '\x202d' : "U+202D LEFT-TO-RIGHT OVERRIDE (LRO)"
- '\x202e' : "U+202E RIGHT-TO-LEFT OVERRIDE (RLO)"
- '\x2066' : "U+2066 LEFT-TO-RIGHT ISOLATE (LRI)"
- '\x2067' : "U+2067 RIGHT-TO-LEFT ISOLATE (RLI)"
- '\x2068' : "U+2068 FIRST STRONG ISOLATE (FSI)"
- '\x2069' : "U+2069 POP DIRECTIONAL ISOLATE (PDI)"

This list is encoded in bidirectionalFormatChars
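
A minimal sketch of how the check and the table might be used together, for instance to report which format characters occur in a piece of source text. `bidiCharsIn` is a hypothetical helper, and the example assumes the `ghc` package is exposed (for example via `-package ghc`).

```haskell
import GHC.Data.StringBuffer
  ( bidirectionalFormatChars, containsBidirectionalFormatChar
  , stringToStringBuffer )

-- | Descriptions of the bidirectional format characters found in a string,
-- looked up in 'bidirectionalFormatChars'.
bidiCharsIn :: String -> [String]
bidiCharsIn src =
  [name | (c, name) <- bidirectionalFormatChars, c `elem` src]

main :: IO ()
main = do
  let plain  = "let x = 1"
      -- The override character is spliced in separately so the hex escape
      -- cannot absorb the digit that follows it.
      tricky = "let x = " ++ "\x202e" ++ "1"
  print (containsBidirectionalFormatChar (stringToStringBuffer plain))
  print (containsBidirectionalFormatChar (stringToStringBuffer tricky))
  mapM_ putStrLn (bidiCharsIn tricky)
```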

bidirectionalFormatChars :: [(Char, String)] Source #