text-2.0: An efficient packed Unicode text type.
Copyright: (c) 2009, 2010 Bryan O'Sullivan
License: BSD-style
Maintainer: bos@serpentine.com
Portability: portable
Safe Haskell: Trustworthy
Language: Haskell2010

Data.Text.Lazy.Encoding

Description

Functions for converting lazy Text values to and from lazy ByteString, using several standard encodings.

To gain access to a much larger family of encodings, use the text-icu package.

Decoding ByteStrings to Text

All of the single-parameter functions for decoding bytestrings encoded in one of the Unicode Transformation Formats (UTF) operate in a strict mode: each will throw an exception if given invalid input.

Each function has a variant, whose name is suffixed with -With, that gives greater control over the handling of decoding errors. For instance, decodeUtf8 will throw an exception, but decodeUtf8With allows the programmer to determine what to do on a decoding error.
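
The difference between the strict functions and their -With variants can be sketched as follows. This is a minimal example, assuming the lenientDecode handler from Data.Text.Encoding.Error (which substitutes U+FFFD for invalid input); the names validInput, invalidInput and lenient are illustrative only.

```haskell
import qualified Data.ByteString.Lazy as BL
import qualified Data.Text.Lazy as TL
import Data.Text.Encoding.Error (lenientDecode)
import Data.Text.Lazy.Encoding (decodeUtf8, decodeUtf8With)

-- Valid UTF-8 for "café": the last character takes two bytes.
validInput :: BL.ByteString
validInput = BL.pack [0x63, 0x61, 0x66, 0xC3, 0xA9]

-- "caf" followed by a lone 0xFF byte, which is never valid in UTF-8.
invalidInput :: BL.ByteString
invalidInput = BL.pack [0x63, 0x61, 0x66, 0xFF]

-- Lenient decoding replaces the invalid byte instead of throwing.
lenient :: TL.Text
lenient = decodeUtf8With lenientDecode invalidInput

main :: IO ()
main = do
  print (decodeUtf8 validInput)  -- strict mode is fine for valid input
  print lenient                  -- invalid byte replaced by U+FFFD
```

Calling decodeUtf8 on invalidInput instead would throw once the result is forced.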

decodeASCII :: ByteString -> Text

Decode a ByteString containing 7-bit ASCII encoded text.

decodeLatin1 :: ByteString -> Text

Decode a ByteString containing Latin-1 (aka ISO-8859-1) encoded text.

decodeUtf8 :: ByteString -> Text

Decode a ByteString containing UTF-8 encoded text that is known to be valid.

If the input contains any invalid UTF-8 data, an exception will be thrown that cannot be caught in pure code. For more control over the handling of invalid data, use decodeUtf8' or decodeUtf8With.
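
Because the exception cannot be caught in pure code, one option is to force the result inside IO. The following is a sketch under that assumption; safeDecode and forceText are hypothetical helper names, not part of this module.

```haskell
import Control.Exception (evaluate, try)
import qualified Data.ByteString.Lazy as BL
import qualified Data.Text.Lazy as TL
import Data.Text.Encoding.Error (UnicodeException)
import Data.Text.Lazy.Encoding (decodeUtf8)

-- Force the whole decoded text inside IO so that any decoding
-- exception surfaces here, where 'try' can intercept it.
safeDecode :: BL.ByteString -> IO (Either UnicodeException TL.Text)
safeDecode bytes = try (evaluate (forceText (decodeUtf8 bytes)))
  where
    -- Lazy Text is produced chunk by chunk; computing the length
    -- walks every chunk and therefore decodes the entire input.
    forceText t = TL.length t `seq` t

main :: IO ()
main = do
  ok  <- safeDecode (BL.pack [0x68, 0x69])        -- "hi"
  bad <- safeDecode (BL.pack [0x68, 0xFF, 0x69])  -- contains an invalid byte
  print ok
  putStrLn (either (const "decoding failed") (const "decoded") bad)
```

Forcing the full text gives up lazy streaming, which is the same trade-off that decodeUtf8' makes.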

decodeUtf16LE :: ByteString -> Text

Decode text from little endian UTF-16 encoding.

If the input contains any invalid little endian UTF-16 data, an exception will be thrown. For more control over the handling of invalid data, use decodeUtf16LEWith.

decodeUtf16BE :: ByteString -> Text

Decode text from big endian UTF-16 encoding.

If the input contains any invalid big endian UTF-16 data, an exception will be thrown. For more control over the handling of invalid data, use decodeUtf16BEWith.
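
To illustrate why the byte order matters, here is a small sketch that decodes the same code unit in both orders. The names littleEndianA and bigEndianA are illustrative only.

```haskell
import qualified Data.ByteString.Lazy as BL
import qualified Data.Text.Lazy as TL
import Data.Text.Lazy.Encoding (decodeUtf16BE, decodeUtf16LE)

-- U+0041 ('A') as a single UTF-16 code unit in each byte order.
littleEndianA, bigEndianA :: BL.ByteString
littleEndianA = BL.pack [0x41, 0x00]
bigEndianA    = BL.pack [0x00, 0x41]

main :: IO ()
main = do
  print (decodeUtf16LE littleEndianA)
  print (decodeUtf16BE bigEndianA)
  -- Decoding with the wrong byte order does not fail for this input;
  -- the bytes are simply read as the code unit 0x4100.
  print (decodeUtf16BE littleEndianA == TL.singleton '\x4100')
```

Note that a mismatched byte order is not necessarily reported as invalid data, so the strict functions cannot be relied on to detect it.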

decodeUtf32LE :: ByteString -> Text

Decode text from little endian UTF-32 encoding.

If the input contains any invalid little endian UTF-32 data, an exception will be thrown. For more control over the handling of invalid data, use decodeUtf32LEWith.

decodeUtf32BE :: ByteString -> Text

Decode text from big endian UTF-32 encoding.

If the input contains any invalid big endian UTF-32 data, an exception will be thrown. For more control over the handling of invalid data, use decodeUtf32BEWith.

Catchable failure

decodeUtf8' :: ByteString -> Either UnicodeException Text

Decode a ByteString containing UTF-8 encoded text.

If the input contains any invalid UTF-8 data, the relevant exception is returned; otherwise, the decoded text is returned.

Note: this function is not lazy, as it must decode its entire input before it can return a result. If you need lazy (streaming) decoding, use decodeUtf8With in lenient mode.
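
A minimal sketch of handling both outcomes in pure code; describe is a hypothetical helper used only for illustration.

```haskell
import qualified Data.ByteString.Lazy as BL
import qualified Data.Text.Lazy as TL
import Data.Text.Lazy.Encoding (decodeUtf8')

-- Summarise the result of decoding without leaving pure code.
describe :: BL.ByteString -> String
describe bytes = case decodeUtf8' bytes of
  Left err   -> "invalid UTF-8: " ++ show err
  Right text -> "decoded " ++ show (TL.length text) ++ " characters"

main :: IO ()
main = do
  putStrLn (describe (BL.pack [0xE2, 0x82, 0xAC]))  -- U+20AC EURO SIGN
  putStrLn (describe (BL.pack [0xE2, 0x82]))        -- truncated sequence
```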

Controllable error handling

decodeUtf8With :: OnDecodeError -> ByteString -> Text

Decode a ByteString containing UTF-8 encoded text.
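
The handler argument can be sketched as follows, assuming the ignore and replace handlers and the OnDecodeError type from Data.Text.Encoding.Error. The questionMark handler is a hypothetical example of writing one by hand.

```haskell
import qualified Data.ByteString.Lazy as BL
import qualified Data.Text.Lazy as TL
import Data.Text.Encoding.Error (OnDecodeError, ignore, replace)
import Data.Text.Lazy.Encoding (decodeUtf8With)

-- 'a', an invalid 0xFF byte, then 'b'.
input :: BL.ByteString
input = BL.pack [0x61, 0xFF, 0x62]

-- Hypothetical handler: substitute '?' for any byte that fails to decode.
-- The first argument describes the error, the second is the offending
-- byte, if one is available.
questionMark :: OnDecodeError
questionMark _description _badByte = Just '?'

main :: IO ()
main = do
  print (decodeUtf8With ignore input)         -- drops the invalid byte
  print (decodeUtf8With (replace '?') input)  -- library equivalent of questionMark
  print (decodeUtf8With questionMark input)
```

Returning Nothing from a handler skips the invalid input, and returning Just c substitutes c for it.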

decodeUtf16LEWith :: OnDecodeError -> ByteString -> Text

Decode text from little endian UTF-16 encoding.

decodeUtf16BEWith :: OnDecodeError -> ByteString -> Text

Decode text from big endian UTF-16 encoding.

decodeUtf32LEWith :: OnDecodeError -> ByteString -> Text

Decode text from little endian UTF-32 encoding.

decodeUtf32BEWith :: OnDecodeError -> ByteString -> Text

Decode text from big endian UTF-32 encoding.

Encoding Text to ByteStrings

encodeUtf8 :: Text -> ByteString

Encode text using UTF-8 encoding.

encodeUtf16LE :: Text -> ByteString

Encode text using little endian UTF-16 encoding.

encodeUtf16BE :: Text -> ByteString

Encode text using big endian UTF-16 encoding.

encodeUtf32LE :: Text -> ByteString

Encode text using little endian UTF-32 encoding.

encodeUtf32BE :: Text -> ByteString

Encode text using big endian UTF-32 encoding.
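
As an illustration of how the encodings differ, the sketch below encodes one short text three ways and compares the byte counts. The names sample and sizes are illustrative only.

```haskell
import qualified Data.ByteString.Lazy as BL
import qualified Data.Text.Lazy as TL
import Data.Text.Lazy.Encoding
  ( decodeUtf8, encodeUtf8, encodeUtf16LE, encodeUtf32BE )

-- "héllo": five characters, one of them (U+00E9) outside ASCII.
sample :: TL.Text
sample = TL.pack "h\233llo"

-- Encoded size in bytes for each encoding.
sizes :: (Int, Int, Int)
sizes =
  ( fromIntegral (BL.length (encodeUtf8 sample))     -- 4 ASCII bytes + 2 bytes for U+00E9
  , fromIntegral (BL.length (encodeUtf16LE sample))  -- 5 code units * 2 bytes
  , fromIntegral (BL.length (encodeUtf32BE sample))  -- 5 code points * 4 bytes
  )

main :: IO ()
main = do
  print sizes
  -- Encoding followed by decoding with the matching function is lossless.
  print (decodeUtf8 (encodeUtf8 sample) == sample)
```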

Encoding Text using ByteString Builders

encodeUtf8Builder :: Text -> Builder

Encode text to a ByteString Builder using UTF-8 encoding.

Since: text-1.1.0.0
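
A Builder is useful when encoded text is combined with other output before being materialised as a ByteString. The following sketch assumes a made-up "key=value" line format; renderLine and render are hypothetical names.

```haskell
import qualified Data.ByteString.Builder as B
import qualified Data.ByteString.Lazy as BL
import qualified Data.Text.Lazy as TL
import Data.Text.Lazy.Encoding (encodeUtf8Builder)

-- One "key=value" line, terminated by a newline (hypothetical format).
renderLine :: TL.Text -> TL.Text -> B.Builder
renderLine key value =
  encodeUtf8Builder key <> B.char7 '=' <> encodeUtf8Builder value <> B.char7 '\n'

-- Concatenate all lines and run the Builder once at the end.
render :: [(TL.Text, TL.Text)] -> BL.ByteString
render = B.toLazyByteString . foldMap (uncurry renderLine)

main :: IO ()
main = BL.putStr (render [(TL.pack "name", TL.pack "text"), (TL.pack "lang", TL.pack "haskell")])
```

Composing Builders with <> avoids creating an intermediate ByteString for each piece.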

encodeUtf8BuilderEscaped :: BoundedPrim Word8 -> Text -> Builder

Encode text using UTF-8 encoding and escape the ASCII characters using a BoundedPrim.

Use this function to implement efficient encoders for text-based formats like JSON or HTML.

Since: text-1.1.0.0
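
A minimal sketch of an escaping encoder, assuming the condB, liftFixedToBounded, word8, >$< and >*< combinators from Data.ByteString.Builder.Prim in the bytestring package. The escaping rule shown is deliberately incomplete and is not a full HTML escaper; escapeAngles and escapeText are hypothetical names.

```haskell
import qualified Data.ByteString.Builder as B
import qualified Data.ByteString.Builder.Prim as P
import qualified Data.ByteString.Lazy as BL
import qualified Data.Text.Lazy as TL
import Data.Text.Lazy.Encoding (encodeUtf8BuilderEscaped)
import Data.Word (Word8)

-- Illustrative rule: '<' becomes "&lt;", '>' becomes "&gt;",
-- and every other ASCII byte passes through unchanged.
escapeAngles :: P.BoundedPrim Word8
escapeAngles =
  P.condB (== 0x3C) (fixed4 (0x26, (0x6C, (0x74, 0x3B)))) $  -- "&lt;"
  P.condB (== 0x3E) (fixed4 (0x26, (0x67, (0x74, 0x3B)))) $  -- "&gt;"
  P.liftFixedToBounded P.word8
  where
    -- Emit a constant four-byte sequence regardless of the input byte.
    fixed4 x = P.liftFixedToBounded
      (const x P.>$< P.word8 P.>*< P.word8 P.>*< P.word8 P.>*< P.word8)

-- Encode as UTF-8, applying the escaping rule to ASCII characters.
escapeText :: TL.Text -> BL.ByteString
escapeText = B.toLazyByteString . encodeUtf8BuilderEscaped escapeAngles

main :: IO ()
main = BL.putStr (escapeText (TL.pack "a<b>c\n"))
```

The escaping primitive only sees ASCII bytes; non-ASCII characters are encoded as UTF-8 without passing through it.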