Data.Char

The character type Char is an enumeration whose values represent Unicode (or equivalently ISO/IEC 10646) characters (see http://www.unicode.org/ for details). This set extends the ISO 8859-1 (Latin-1) character set (the first 256 charachers), which is itself an extension of the ASCII character set (the first 128 characters). A character literal in Haskell has type Char.

To convert a Char to or from the corresponding Int value defined by Unicode, use toEnum and fromEnum from the Enum class respectively (or equivalently ord and chr).

Instances

type String = [Char]

Source

A String is a list of characters. String constants in Haskell are values of type String.

Character classification

Unicode characters are divided into letters, numbers, marks, punctuation, symbols, separators (including spaces) and others (including control characters).

isControl :: Char -> Bool

Source

Selects control characters, which are the non-printing characters of the Latin-1 subset of Unicode.

isSpace :: Char -> Bool

Source

Selects white-space characters in the Latin-1 range. (In Unicode terms, this includes spaces and some control characters.)

isLower :: Char -> Bool

Source

Selects lower-case alphabetic Unicode characters (letters).

isUpper :: Char -> Bool

Source

Selects upper-case or title-case alphabetic Unicode characters (letters). Title case is used by a small number of letter ligatures like the single-character form of Lj.

isAlpha :: Char -> Bool

Source

Selects alphabetic Unicode characters (lower-case, upper-case and title-case letters, plus letters of caseless scripts and modifiers letters). This function is equivalent to isLetter.

isAlphaNum :: Char -> Bool

Source

Selects alphabetic or numeric digit Unicode characters.

Note that numeric digits outside the ASCII range are selected by this function but not by isDigit. Such digits may be part of identifiers but are not used by the printer and reader to represent numbers.

isPrint :: Char -> Bool

Source

Selects printable Unicode characters (letters, numbers, marks, punctuation, symbols and spaces).

isDigit :: Char -> Bool

Source

Selects ASCII digits, i.e. '0'..'9'.

isOctDigit :: Char -> Bool

Source

Selects ASCII octal digits, i.e. '0'..'7'.

isHexDigit :: Char -> Bool

Source

Selects ASCII hexadecimal digits, i.e. '0'..'9', 'a'..'f', 'A'..'F'.

isLetter :: Char -> Bool

Source

Selects alphabetic Unicode characters (lower-case, upper-case and title-case letters, plus letters of caseless scripts and modifiers letters). This function is equivalent to isAlpha.

isMark :: Char -> Bool

Source

Selects Unicode mark characters, e.g. accents and the like, which combine with preceding letters.

isNumber :: Char -> Bool

Source

Selects Unicode numeric characters, including digits from various scripts, Roman numerals, etc.

isPunctuation :: Char -> Bool

Source

Selects Unicode punctuation characters, including various kinds of connectors, brackets and quotes.

isSymbol :: Char -> Bool

Source

Selects Unicode symbol characters, including mathematical and currency symbols.

isSeparator :: Char -> Bool

Source

Selects Unicode space and separator characters.

Subranges

isAscii :: Char -> Bool

Source

Selects the first 128 characters of the Unicode character set, corresponding to the ASCII character set.

isLatin1 :: Char -> Bool

Source

Selects the first 256 characters of the Unicode character set, corresponding to the ISO 8859-1 (Latin-1) character set.

isAsciiUpper :: Char -> Bool

Source

Selects ASCII upper-case letters, i.e. characters satisfying both isAscii and isUpper.

isAsciiLower :: Char -> Bool

Source

Selects ASCII lower-case letters, i.e. characters satisfying both isAscii and isLower.

Unicode general categories

data GeneralCategory

Source

Unicode General Categories (column 2 of the UnicodeData table) in the order they are listed in the Unicode standard.

Constructors

UppercaseLetter	Lu: Letter, Uppercase
LowercaseLetter	Ll: Letter, Lowercase
TitlecaseLetter	Lt: Letter, Titlecase
ModifierLetter	Lm: Letter, Modifier
OtherLetter	Lo: Letter, Other
NonSpacingMark	Mn: Mark, Non-Spacing
SpacingCombiningMark	Mc: Mark, Spacing Combining
EnclosingMark	Me: Mark, Enclosing
DecimalNumber	Nd: Number, Decimal
LetterNumber	Nl: Number, Letter
OtherNumber	No: Number, Other
ConnectorPunctuation	Pc: Punctuation, Connector
DashPunctuation	Pd: Punctuation, Dash
OpenPunctuation	Ps: Punctuation, Open
ClosePunctuation	Pe: Punctuation, Close
InitialQuote	Pi: Punctuation, Initial quote
FinalQuote	Pf: Punctuation, Final quote
OtherPunctuation	Po: Punctuation, Other
MathSymbol	Sm: Symbol, Math
CurrencySymbol	Sc: Symbol, Currency
ModifierSymbol	Sk: Symbol, Modifier
OtherSymbol	So: Symbol, Other
Space	Zs: Separator, Space
LineSeparator	Zl: Separator, Line
ParagraphSeparator	Zp: Separator, Paragraph
Control	Cc: Other, Control
Format	Cf: Other, Format
Surrogate	Cs: Other, Surrogate
PrivateUse	Co: Other, Private Use
NotAssigned	Cn: Other, Not Assigned

Instances

Bounded GeneralCategory

generalCategory :: Char -> GeneralCategory

Source

The Unicode general category of the character.

Case conversion

toUpper :: Char -> Char

Source

Convert a letter to the corresponding upper-case letter, if any. Any other character is returned unchanged.

toLower :: Char -> Char

Source

Convert a letter to the corresponding lower-case letter, if any. Any other character is returned unchanged.

toTitle :: Char -> Char

Source

Convert a letter to the corresponding title-case or upper-case letter, if any. (Title case differs from upper case only for a small number of ligature letters.) Any other character is returned unchanged.

Single digit characters

digitToInt :: Char -> Int

Source

Convert a single digit Char to the corresponding Int. This function fails unless its argument satisfies isHexDigit, but recognises both upper and lower-case hexadecimal digits (i.e. '0'..'9', 'a'..'f', 'A'..'F').

intToDigit :: Int -> Char

Source

Convert an Int in the range 0..15 to the corresponding single digit Char. This function fails on other inputs, and generates lower-case hexadecimal digits.

Numeric representations

ord :: Char -> Int

Source

The fromEnum method restricted to the type Char.

chr :: Int -> Char

Source

The toEnum method restricted to the type Char.

String representations

showLitChar :: Char -> ShowS

Source

Convert a character to a string using only printable characters, using Haskell source-language escape conventions. For example:

 showLitChar '\n' s  =  "\\n" ++ s

lexLitChar :: ReadS String

Source

Read a string representation of a character, using Haskell source-language escape conventions. For example:

 lexLitChar  "\\nHello"  =  [("\\n", "Hello")]

readLitChar :: ReadS Char

Source

Read a string representation of a character, using Haskell source-language escape conventions, and convert it to the character that it encodes. For example:

 readLitChar "\\nHello"  =  [('\n', "Hello")]

Produced by Haddock version 0.9