text-1.2.4.1: An efficient packed Unicode text type.
An efficient packed, immutable Unicode text type (both strict and lazy), with a powerful loop fusion optimization framework.
The Text type represents Unicode character strings in a time- and space-efficient manner. This package provides text processing capabilities that are optimized for performance-critical use, both in terms of large data quantities and high speed.
The Text type provides character-encoding, type-safe case conversion via whole-string case conversion functions (see Data.Text). It also provides a range of functions for converting Text values to and from ByteStrings, using several standard encodings (see Data.Text.Encoding).
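As a minimal sketch of the two features described above, the following combines whole-string case conversion with a UTF-8 round trip through a ByteString. The helper names shout and roundTrip are hypothetical and chosen only for illustration; T.toUpper, TE.encodeUtf8 and TE.decodeUtf8' are the library functions being exercised.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- | Upper-case a whole string at once (hypothetical helper name).
-- Case conversion works on the whole string, so the result may be
-- longer than the input: the German eszett maps to "SS".
shout :: T.Text -> T.Text
shout = T.toUpper

-- | Encode to UTF-8 and decode again (hypothetical helper name).
-- decodeUtf8' reports invalid input as a Left value instead of throwing.
roundTrip :: T.Text -> Either String T.Text
roundTrip t = case TE.decodeUtf8' (TE.encodeUtf8 t) of
  Left err -> Left (show err)
  Right t' -> Right t'

main :: IO ()
main = do
  print (T.length "straße", T.length (shout "straße"))
  print (B.length (TE.encodeUtf8 "straße"))
  print (roundTrip "straße" == Right "straße")
```

Because conversion is done per string rather than per character, the character count can change across toUpper, which is why a Char-by-Char map would not be equivalent.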
Efficient locale-sensitive text IO is also supported (see Data.Text.IO).
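A small sketch of text IO, assuming input arrives on standard input: TIO.interact reads all input as Text using the handle's current encoding, applies a pure function, and writes the result. The function name tidy is hypothetical.

```haskell
import qualified Data.Text as T
import qualified Data.Text.IO as TIO

-- | Strip leading and trailing whitespace from every line
-- (hypothetical helper name).
tidy :: T.Text -> T.Text
tidy = T.unlines . map T.strip . T.lines

-- Read stdin as Text, tidy it, and write the result to stdout.
main :: IO ()
main = TIO.interact tidy
```

Keeping the transformation pure and confining IO to a single call makes the text-processing part easy to test on its own.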
These modules are intended to be imported qualified, to avoid name clashes with Prelude functions, e.g.
import qualified Data.Text as T
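To illustrate why the qualified import matters, here is a short sketch in which the Prelude's list functions and their Text counterparts are used side by side. The name wordLengths is hypothetical.

```haskell
import qualified Data.Text as T

-- | Length of each whitespace-separated word (hypothetical helper name).
-- T.length and T.words operate on Text; the unqualified map comes
-- from the Prelude and operates on the resulting list.
wordLengths :: T.Text -> [Int]
wordLengths = map T.length . T.words

main :: IO ()
main = do
  let ws = wordLengths (T.pack "packed unicode text")
  print ws           -- [6,7,4]
  print (length ws)  -- Prelude's length on a list: 3
```

Without the qualifier, names such as length, words, and map would be ambiguous between Data.Text and the Prelude.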
ICU Support
To use an extended and very rich family of functions for working with Unicode text (including normalization, regular expressions, non-standard encodings, text breaking, and locales), see the text-icu package based on the well-respected and liberally licensed ICU library.
Internal Representation: UTF-16 vs. UTF-8
Currently, the text library uses UTF-16 as its internal representation, which is neither a fixed-width nor always the most dense representation for Unicode text. We are investigating the feasibility of changing Text's internal representation to UTF-8. If you need such a Text type right now, you might be interested in the spin-off packages text-utf8 and text-short.
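The density trade-off can be made concrete with a rough sketch that compares the serialized size of the same strings in UTF-8 and UTF-16. Note the assumption: this measures the output of TE.encodeUtf8 and TE.encodeUtf16LE, not the library's internal buffer, and the name encodedSizes is hypothetical.

```haskell
import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- | Bytes needed to serialize a Text as (UTF-8, UTF-16), without a
-- byte order mark (hypothetical helper name).
encodedSizes :: T.Text -> (Int, Int)
encodedSizes t =
  ( B.length (TE.encodeUtf8 t)
  , B.length (TE.encodeUtf16LE t)
  )

main :: IO ()
main = do
  -- ASCII: UTF-8 is denser.
  print (encodedSizes (T.pack "hello"))               -- (5,10)
  -- Three CJK characters (U+65E5 U+672C U+8A9E): UTF-16 is denser.
  print (encodedSizes (T.pack "\x65E5\x672C\x8A9E"))  -- (9,6)
  -- One character outside the BMP: UTF-16 needs a surrogate pair.
  print (encodedSizes (T.pack "\x1F600"))             -- (4,4)
```

The last case shows that UTF-16 is not fixed-width (one character takes two 16-bit code units), and the first shows that it is not always the densest choice.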