base-3.0.0.0: Basic libraries
Data.Char
Portability: portable
Stability: stable
Maintainer: libraries@haskell.org
Contents
Character classification
Subranges
Unicode general categories
Case conversion
Single digit characters
Numeric representations
String representations
Description
The Char type and associated operations.
Synopsis
data Char
type String = [Char]
isControl :: Char -> Bool
isSpace :: Char -> Bool
isLower :: Char -> Bool
isUpper :: Char -> Bool
isAlpha :: Char -> Bool
isAlphaNum :: Char -> Bool
isPrint :: Char -> Bool
isDigit :: Char -> Bool
isOctDigit :: Char -> Bool
isHexDigit :: Char -> Bool
isLetter :: Char -> Bool
isMark :: Char -> Bool
isNumber :: Char -> Bool
isPunctuation :: Char -> Bool
isSymbol :: Char -> Bool
isSeparator :: Char -> Bool
isAscii :: Char -> Bool
isLatin1 :: Char -> Bool
isAsciiUpper :: Char -> Bool
isAsciiLower :: Char -> Bool
data GeneralCategory
  = UppercaseLetter
  | LowercaseLetter
  | TitlecaseLetter
  | ModifierLetter
  | OtherLetter
  | NonSpacingMark
  | SpacingCombiningMark
  | EnclosingMark
  | DecimalNumber
  | LetterNumber
  | OtherNumber
  | ConnectorPunctuation
  | DashPunctuation
  | OpenPunctuation
  | ClosePunctuation
  | InitialQuote
  | FinalQuote
  | OtherPunctuation
  | MathSymbol
  | CurrencySymbol
  | ModifierSymbol
  | OtherSymbol
  | Space
  | LineSeparator
  | ParagraphSeparator
  | Control
  | Format
  | Surrogate
  | PrivateUse
  | NotAssigned
generalCategory :: Char -> GeneralCategory
toUpper :: Char -> Char
toLower :: Char -> Char
toTitle :: Char -> Char
digitToInt :: Char -> Int
intToDigit :: Int -> Char
ord :: Char -> Int
chr :: Int -> Char
showLitChar :: Char -> ShowS
lexLitChar :: ReadS String
readLitChar :: ReadS Char
Documentation
data Char

The character type Char is an enumeration whose values represent Unicode (or equivalently ISO/IEC 10646) characters (see http://www.unicode.org/ for details). This set extends the ISO 8859-1 (Latin-1) character set (the first 256 characters), which is itself an extension of the ASCII character set (the first 128 characters). A character literal in Haskell has type Char.

To convert a Char to or from the corresponding Int value defined by Unicode, use fromEnum and toEnum from the Enum class respectively (or equivalently ord and chr).
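As a small illustration of these conversions, the sketch below wraps ord and chr and checks that they agree with the Enum methods at type Char. The names codePoint, fromCodePoint and agreesWithEnum are hypothetical helpers introduced here, not part of Data.Char.

```haskell
import Data.Char (chr, ord)

-- Hypothetical helper: the Unicode code point of a character.
codePoint :: Char -> Int
codePoint = ord

-- Hypothetical helper: the character for a code point.
-- Like chr, this is an error outside the valid Unicode range.
fromCodePoint :: Int -> Char
fromCodePoint = chr

-- ord and chr agree with fromEnum and toEnum at type Char.
agreesWithEnum :: Char -> Bool
agreesWithEnum c =
  codePoint c == fromEnum c
    && fromCodePoint (codePoint c) == (toEnum (fromEnum c) :: Char)
```

For instance, codePoint 'A' is 65, and fromCodePoint 65 gives back 'A'.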

type String = [Char]
A String is a list of characters. String constants in Haskell are values of type String.
Character classification
Unicode characters are divided into letters, numbers, marks, punctuation, symbols, separators (including spaces) and others (including control characters).
isControl :: Char -> Bool
Selects control characters, which are the non-printing characters of the Latin-1 subset of Unicode.
isSpace :: Char -> Bool
Selects white-space characters in the Latin-1 range. (In Unicode terms, this includes spaces and some control characters.)
isLower :: Char -> Bool
Selects lower-case alphabetic Unicode characters (letters).
isUpper :: Char -> Bool
Selects upper-case or title-case alphabetic Unicode characters (letters). Title case is used by a small number of letter ligatures like the single-character form of Lj.
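To make the title-case remark concrete, the following sketch looks at U+01C8, the single-character title-case form of Lj. The helpers isTitleCase and upperNotLower are hypothetical names used only for illustration.

```haskell
import Data.Char (GeneralCategory (..), generalCategory, isLower, isUpper)

-- U+01C8 is the title-case digraph "Lj" encoded as a single character.
titleCaseLj :: Char
titleCaseLj = '\x01C8'

-- Hypothetical helper: True only for title-case letters (category Lt).
isTitleCase :: Char -> Bool
isTitleCase c = generalCategory c == TitlecaseLetter

-- Hypothetical helper: selected by isUpper and not by isLower.
upperNotLower :: Char -> Bool
upperNotLower c = isUpper c && not (isLower c)
```

Both 'L' and titleCaseLj satisfy upperNotLower, but only titleCaseLj satisfies isTitleCase.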
isAlpha :: Char -> Bool
Selects alphabetic Unicode characters (lower-case, upper-case and title-case letters, plus letters of caseless scripts and modifier letters). This function is equivalent to isLetter.
isAlphaNum :: Char -> Bool

Selects alphabetic or numeric digit Unicode characters.

Note that numeric digits outside the ASCII range are selected by this function but not by isDigit. Such digits may be part of identifiers but are not used by the printer and reader to represent numbers.
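The difference between isAlphaNum and isDigit can be seen with a decimal digit from outside the ASCII range. The sketch below uses U+0663 (ARABIC-INDIC DIGIT THREE) as the example character; alphaNumButNotDigit is a hypothetical helper.

```haskell
import Data.Char (isAlphaNum, isDigit)

-- U+0663 ARABIC-INDIC DIGIT THREE: a decimal digit outside ASCII.
arabicIndicThree :: Char
arabicIndicThree = '\x0663'

-- Hypothetical helper: characters accepted by isAlphaNum but not by isDigit.
-- This covers letters as well as non-ASCII numeric characters.
alphaNumButNotDigit :: Char -> Bool
alphaNumButNotDigit c = isAlphaNum c && not (isDigit c)
```

Here alphaNumButNotDigit holds for arabicIndicThree and for 'x', but not for '3'.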

isPrint :: Char -> Bool
Selects printable Unicode characters (letters, numbers, marks, punctuation, symbols and spaces).
isDigit :: Char -> Bool
Selects ASCII digits, i.e. '0'..'9'.
isOctDigit :: Char -> Bool
Selects ASCII octal digits, i.e. '0'..'7'.
isHexDigit :: Char -> Bool
Selects ASCII hexadecimal digits, i.e. '0'..'9', 'a'..'f', 'A'..'F'.
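The three ASCII digit predicates nest: every octal digit is a decimal digit, and every decimal digit is a hexadecimal digit. As a sketch, the hypothetical function validBases below reports the bases in which a digit string could be read; the Base type is likewise introduced only for this example.

```haskell
import Data.Char (isDigit, isHexDigit, isOctDigit)

-- Hypothetical type naming the three digit alphabets.
data Base = Octal | Decimal | Hexadecimal
  deriving (Eq, Show)

-- Hypothetical helper: the bases whose digit predicate accepts
-- every character of a non-empty string.
validBases :: String -> [Base]
validBases s =
  [ base
  | (base, ok) <- [(Octal, isOctDigit), (Decimal, isDigit), (Hexadecimal, isHexDigit)]
  , not (null s)
  , all ok s
  ]
```

For example, validBases "19" yields [Decimal, Hexadecimal] because '9' is not an octal digit.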
isLetter :: Char -> Bool
Selects alphabetic Unicode characters (lower-case, upper-case and title-case letters, plus letters of caseless scripts and modifier letters). This function is equivalent to isAlpha.
isMark :: Char -> Bool
Selects Unicode mark characters, e.g. accents and the like, which combine with preceding letters.
isNumber :: Char -> Bool
Selects Unicode numeric characters, including digits from various scripts, Roman numerals, etc.
isPunctuation :: Char -> Bool
Selects Unicode punctuation characters, including various kinds of connectors, brackets and quotes.
isSymbol :: Char -> Bool
Selects Unicode symbol characters, including mathematical and currency symbols.
isSeparator :: Char -> Bool
Selects Unicode space and separator characters.
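The punctuation, symbol and separator predicates can be combined into a coarse classifier, sketched below. The Kind type and the functions kind and spaceButNotSeparator are hypothetical. Note that a newline is a control character, so it satisfies isSpace but not isSeparator.

```haskell
import Data.Char (isPunctuation, isSeparator, isSpace, isSymbol)

-- Hypothetical coarse classification built from the predicates above.
data Kind = Punct | Sym | Sep | Other
  deriving (Eq, Show)

kind :: Char -> Kind
kind c
  | isPunctuation c = Punct
  | isSymbol c      = Sym
  | isSeparator c   = Sep
  | otherwise       = Other

-- Hypothetical helper: white space that Unicode does not class as a separator.
spaceButNotSeparator :: Char -> Bool
spaceButNotSeparator c = isSpace c && not (isSeparator c)
```

With these definitions, kind '!' is Punct, kind '+' is Sym and kind ' ' is Sep.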
Subranges
isAscii :: Char -> Bool
Selects the first 128 characters of the Unicode character set, corresponding to the ASCII character set.
isLatin1 :: Char -> Bool
Selects the first 256 characters of the Unicode character set, corresponding to the ISO 8859-1 (Latin-1) character set.
isAsciiUpper :: Char -> Bool
Selects ASCII upper-case letters, i.e. characters satisfying both isAscii and isUpper.
isAsciiLower :: Char -> Bool
Selects ASCII lower-case letters, i.e. characters satisfying both isAscii and isLower.
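The definitions of isAsciiUpper and isAsciiLower can be written out as checkable properties, as in this sketch. The names asciiUpperSpec, asciiLowerSpec and latin1Only are hypothetical.

```haskell
import Data.Char (isAscii, isAsciiLower, isAsciiUpper, isLatin1, isLower, isUpper)

-- isAsciiUpper should coincide with "isAscii and isUpper".
asciiUpperSpec :: Char -> Bool
asciiUpperSpec c = isAsciiUpper c == (isAscii c && isUpper c)

-- isAsciiLower should coincide with "isAscii and isLower".
asciiLowerSpec :: Char -> Bool
asciiLowerSpec c = isAsciiLower c == (isAscii c && isLower c)

-- Hypothetical helper: Latin-1 characters that lie outside ASCII.
latin1Only :: Char -> Bool
latin1Only c = isLatin1 c && not (isAscii c)
```

For example, '\xC9' (É) is an upper-case Latin-1 letter, so it satisfies isUpper and latin1Only but not isAsciiUpper.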
Unicode general categories
data GeneralCategory
Unicode General Categories (column 2 of the UnicodeData table) in the order they are listed in the Unicode standard.
Constructors
UppercaseLetter       Lu: Letter, Uppercase
LowercaseLetter       Ll: Letter, Lowercase
TitlecaseLetter       Lt: Letter, Titlecase
ModifierLetter        Lm: Letter, Modifier
OtherLetter           Lo: Letter, Other
NonSpacingMark        Mn: Mark, Non-Spacing
SpacingCombiningMark  Mc: Mark, Spacing Combining
EnclosingMark         Me: Mark, Enclosing
DecimalNumber         Nd: Number, Decimal
LetterNumber          Nl: Number, Letter
OtherNumber           No: Number, Other
ConnectorPunctuation  Pc: Punctuation, Connector
DashPunctuation       Pd: Punctuation, Dash
OpenPunctuation       Ps: Punctuation, Open
ClosePunctuation      Pe: Punctuation, Close
InitialQuote          Pi: Punctuation, Initial quote
FinalQuote            Pf: Punctuation, Final quote
OtherPunctuation      Po: Punctuation, Other
MathSymbol            Sm: Symbol, Math
CurrencySymbol        Sc: Symbol, Currency
ModifierSymbol        Sk: Symbol, Modifier
OtherSymbol           So: Symbol, Other
Space                 Zs: Separator, Space
LineSeparator         Zl: Separator, Line
ParagraphSeparator    Zp: Separator, Paragraph
Control               Cc: Other, Control
Format                Cf: Other, Format
Surrogate             Cs: Other, Surrogate
PrivateUse            Co: Other, Private Use
NotAssigned           Cn: Other, Not Assigned
generalCategory :: Char -> GeneralCategory
The Unicode general category of the character.
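As a sketch of how generalCategory might be used, the hypothetical function majorClass below collapses the thirty categories into the seven top-level groups. It assumes the Ord instance of GeneralCategory follows the constructor order listed above; MajorClass is a type introduced only for this example.

```haskell
import Data.Char (GeneralCategory (..), generalCategory)

-- Hypothetical type for the top-level groups of general categories.
data MajorClass = Letter | Mark | Number | Punctuation | Symbol | Separator | OtherClass
  deriving (Eq, Show)

-- Hypothetical helper: uses constructor order to find the group.
majorClass :: Char -> MajorClass
majorClass c
  | cat <= OtherLetter        = Letter
  | cat <= EnclosingMark      = Mark
  | cat <= OtherNumber        = Number
  | cat <= OtherPunctuation   = Punctuation
  | cat <= OtherSymbol        = Symbol
  | cat <= ParagraphSeparator = Separator
  | otherwise                 = OtherClass
  where
    cat = generalCategory c
```

For example, generalCategory 'a' is LowercaseLetter, so majorClass 'a' is Letter, while majorClass '\n' is OtherClass because a newline has category Control.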
Case conversion
toUpper :: Char -> Char
Convert a letter to the corresponding upper-case letter, if any. Any other character is returned unchanged.
toLower :: Char -> Char
Convert a letter to the corresponding lower-case letter, if any. Any other character is returned unchanged.
toTitle :: Char -> Char
Convert a letter to the corresponding title-case or upper-case letter, if any. (Title case differs from upper case only for a small number of ligature letters.) Any other character is returned unchanged.
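The case-conversion functions work one character at a time, so converting a String means mapping over it. The sketch below defines a hypothetical capitalise helper and records the upper-case and title-case forms of the ligature U+01C6, one of the few characters for which the two differ.

```haskell
import Data.Char (toLower, toTitle, toUpper)

-- Hypothetical helper: title-case the first character, lower-case the rest.
capitalise :: String -> String
capitalise ""       = ""
capitalise (c : cs) = toTitle c : map toLower cs

-- For the ligature U+01C6 the upper-case and title-case forms differ.
upperAndTitleOfDz :: (Char, Char)
upperAndTitleOfDz = (toUpper '\x01C6', toTitle '\x01C6')
```

For example, capitalise "hASKELL" gives "Haskell", and toUpper '1' returns '1' unchanged because '1' is not a letter.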
Single digit characters
digitToInt :: Char -> Int
Convert a single digit Char to the corresponding Int. This function fails unless its argument satisfies isHexDigit; it recognises both upper- and lower-case hexadecimal digits (i.e. '0'..'9', 'a'..'f', 'A'..'F').
intToDigit :: Int -> Char
Convert an Int in the range 0..15 to the corresponding single digit Char. This function fails on other inputs, and generates lower-case hexadecimal digits.
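Because both functions fail outside their stated domains, callers may want to guard their inputs. The sketch below builds two hypothetical helpers on top of them: parseHex validates with isHexDigit before calling digitToInt, and renderHex only passes values in 0..15 to intToDigit (it assumes a non-negative argument).

```haskell
import Data.Char (digitToInt, intToDigit, isHexDigit)

-- Hypothetical helper: parse an unsigned hexadecimal string.
-- isHexDigit is checked first so digitToInt is never applied to a
-- character on which it would fail.
parseHex :: String -> Maybe Int
parseHex s
  | null s || not (all isHexDigit s) = Nothing
  | otherwise = Just (foldl (\acc c -> 16 * acc + digitToInt c) 0 s)

-- Hypothetical helper: render a non-negative Int in lower-case hexadecimal.
renderHex :: Int -> String
renderHex n
  | n < 16    = [intToDigit n]
  | otherwise = renderHex (n `div` 16) ++ [intToDigit (n `mod` 16)]
```

For example, parseHex "FF" and parseHex "ff" both give Just 255, and renderHex 255 gives "ff".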
Numeric representations
ord :: Char -> Int
The fromEnum method restricted to the type Char.
chr :: Int -> Char
The toEnum method restricted to the type Char.
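A common use of ord and chr is arithmetic on code points. As a sketch, the hypothetical function rotateLower below shifts ASCII lower-case letters around the alphabet and leaves all other characters alone.

```haskell
import Data.Char (chr, isAsciiLower, ord)

-- Hypothetical helper: rotate ASCII lower-case letters by n places,
-- leaving every other character unchanged. Haskell's mod returns a
-- non-negative result for a positive divisor, so negative n also works.
rotateLower :: Int -> Char -> Char
rotateLower n c
  | isAsciiLower c = chr (ord 'a' + (ord c - ord 'a' + n) `mod` 26)
  | otherwise      = c
```

For example, map (rotateLower 13) "hello, world" gives "uryyb, jbeyq".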
String representations
showLitChar :: Char -> ShowS

Convert a character to a string using only printable characters, using Haskell source-language escape conventions. For example:

 showLitChar '\n' s  =  "\\n" ++ s
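Since showLitChar returns a ShowS, it composes naturally with foldr to escape a whole string. The sketch below defines a hypothetical escape helper in that way; unlike show on a String, it adds no surrounding quotes.

```haskell
import Data.Char (showLitChar)

-- Hypothetical helper: escape every character of a string using
-- Haskell source-language conventions, without surrounding quotes.
escape :: String -> String
escape s = foldr showLitChar "" s
```

For example, escape "a\nb" gives the five-character string a\nb (written "a\\nb" as a Haskell literal).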
lexLitChar :: ReadS String

Read a string representation of a character, using Haskell source-language escape conventions. For example:

 lexLitChar  "\\nHello"  =  [("\\n", "Hello")]
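Applying lexLitChar repeatedly splits an escaped string into the source text of its individual character literals. The hypothetical litTokens helper below is a sketch of that loop; it stops at the end of the input or at the first position where lexLitChar does not return exactly one result.

```haskell
import Data.Char (lexLitChar)

-- Hypothetical helper: split a string into the source text of each
-- character literal it contains, stopping when no single lexeme is found.
litTokens :: String -> [String]
litTokens s = case lexLitChar s of
  [(tok, rest)] -> tok : litTokens rest
  _             -> []
```

For example, litTokens "a\\nb" gives ["a", "\\n", "b"].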
readLitChar :: ReadS Char

Read a string representation of a character, using Haskell source-language escape conventions, and convert it to the character that it encodes. For example:

 readLitChar "\\nHello"  =  [('\n', "Hello")]
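In the same way, readLitChar can be iterated to decode a whole escaped string. The hypothetical unescape helper below is a minimal sketch that returns Nothing if any position fails to parse as exactly one character.

```haskell
import Data.Char (readLitChar)

-- Hypothetical helper: decode a string written with Haskell escape
-- conventions (without surrounding quotes). Returns Nothing on the
-- first position that does not parse as exactly one character.
unescape :: String -> Maybe String
unescape "" = Just ""
unescape s = case readLitChar s of
  [(c, rest)] -> fmap (c :) (unescape rest)
  _           -> Nothing
```

For example, unescape "a\\nb" gives Just "a\nb", while unescape "\\q" gives Nothing because \q is not a valid escape.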