Copyright | (c) The University of Glasgow 2001 |
---|---|
License | BSD-style (see the file libraries/base/LICENSE) |
Maintainer | libraries@haskell.org |
Stability | stable |
Portability | portable |
Safe Haskell | Trustworthy |
Language | Haskell2010 |
The standard IO library.
- data IO a :: * -> *
- fixIO :: (a -> IO a) -> IO a
- type FilePath = String
- data Handle
- stdin :: Handle
- stdout :: Handle
- stderr :: Handle
- withFile :: FilePath -> IOMode -> (Handle -> IO r) -> IO r
- openFile :: FilePath -> IOMode -> IO Handle
- data IOMode
- hClose :: Handle -> IO ()
- readFile :: FilePath -> IO String
- writeFile :: FilePath -> String -> IO ()
- appendFile :: FilePath -> String -> IO ()
- hFileSize :: Handle -> IO Integer
- hSetFileSize :: Handle -> Integer -> IO ()
- hIsEOF :: Handle -> IO Bool
- isEOF :: IO Bool
- data BufferMode
- hSetBuffering :: Handle -> BufferMode -> IO ()
- hGetBuffering :: Handle -> IO BufferMode
- hFlush :: Handle -> IO ()
- hGetPosn :: Handle -> IO HandlePosn
- hSetPosn :: HandlePosn -> IO ()
- data HandlePosn
- hSeek :: Handle -> SeekMode -> Integer -> IO ()
- data SeekMode
- hTell :: Handle -> IO Integer
- hIsOpen :: Handle -> IO Bool
- hIsClosed :: Handle -> IO Bool
- hIsReadable :: Handle -> IO Bool
- hIsWritable :: Handle -> IO Bool
- hIsSeekable :: Handle -> IO Bool
- hIsTerminalDevice :: Handle -> IO Bool
- hSetEcho :: Handle -> Bool -> IO ()
- hGetEcho :: Handle -> IO Bool
- hShow :: Handle -> IO String
- hWaitForInput :: Handle -> Int -> IO Bool
- hReady :: Handle -> IO Bool
- hGetChar :: Handle -> IO Char
- hGetLine :: Handle -> IO String
- hLookAhead :: Handle -> IO Char
- hGetContents :: Handle -> IO String
- hPutChar :: Handle -> Char -> IO ()
- hPutStr :: Handle -> String -> IO ()
- hPutStrLn :: Handle -> String -> IO ()
- hPrint :: Show a => Handle -> a -> IO ()
- interact :: (String -> String) -> IO ()
- putChar :: Char -> IO ()
- putStr :: String -> IO ()
- putStrLn :: String -> IO ()
- print :: Show a => a -> IO ()
- getChar :: IO Char
- getLine :: IO String
- getContents :: IO String
- readIO :: Read a => String -> IO a
- readLn :: Read a => IO a
- withBinaryFile :: FilePath -> IOMode -> (Handle -> IO r) -> IO r
- openBinaryFile :: FilePath -> IOMode -> IO Handle
- hSetBinaryMode :: Handle -> Bool -> IO ()
- hPutBuf :: Handle -> Ptr a -> Int -> IO ()
- hGetBuf :: Handle -> Ptr a -> Int -> IO Int
- hGetBufSome :: Handle -> Ptr a -> Int -> IO Int
- hPutBufNonBlocking :: Handle -> Ptr a -> Int -> IO Int
- hGetBufNonBlocking :: Handle -> Ptr a -> Int -> IO Int
- openTempFile :: FilePath -> String -> IO (FilePath, Handle)
- openBinaryTempFile :: FilePath -> String -> IO (FilePath, Handle)
- openTempFileWithDefaultPermissions :: FilePath -> String -> IO (FilePath, Handle)
- openBinaryTempFileWithDefaultPermissions :: FilePath -> String -> IO (FilePath, Handle)
- hSetEncoding :: Handle -> TextEncoding -> IO ()
- hGetEncoding :: Handle -> IO (Maybe TextEncoding)
- data TextEncoding
- latin1 :: TextEncoding
- utf8 :: TextEncoding
- utf8_bom :: TextEncoding
- utf16 :: TextEncoding
- utf16le :: TextEncoding
- utf16be :: TextEncoding
- utf32 :: TextEncoding
- utf32le :: TextEncoding
- utf32be :: TextEncoding
- localeEncoding :: TextEncoding
- char8 :: TextEncoding
- mkTextEncoding :: String -> IO TextEncoding
- hSetNewlineMode :: Handle -> NewlineMode -> IO ()
- data Newline
- nativeNewline :: Newline
- data NewlineMode = NewlineMode { inputNL :: Newline, outputNL :: Newline }
- noNewlineTranslation :: NewlineMode
- universalNewlineMode :: NewlineMode
- nativeNewlineMode :: NewlineMode
The IO monad
A value of type IO a is a computation which, when performed,
does some I/O before returning a value of type a.
There is really only one way to "perform" an I/O action: bind it to
Main.main
in your program. When your program is run, the I/O will
be performed. It isn't possible to perform I/O from an arbitrary
function, unless that function is itself in the IO
monad and called
at some point, directly or indirectly, from Main.main
.
IO
is a monad, so IO
actions can be combined using either the do-notation
or the >>
and >>=
operations from the Monad
class.
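As a minimal sketch of the two styles just described, the following combines actions with >>, with >>=, and with do-notation. The names greet, describe and describeDo are hypothetical and exist only for illustration.

```haskell
-- Sequence two actions with >>, discarding the result of the first.
greet :: IO ()
greet = putStr "Hello, " >> putStrLn "world"

-- Feed the result of one action into the next with >>=.
describe :: IO Int -> IO String
describe action = action >>= \n -> return ("got " ++ show n)

-- The same computation written with do-notation.
describeDo :: IO Int -> IO String
describeDo action = do
  n <- action
  return ("got " ++ show n)

-- Nothing above runs until it is reached from main.
main :: IO ()
main = do
  greet
  describe (return 42) >>= putStrLn
  describeDo (return 42) >>= putStrLn
```

Both describe and describeDo build the same action; do-notation is only a more readable way of writing the >>= chain.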
Files and handles
File and directory names are values of type String
, whose precise
meaning is operating system dependent. Files can be opened, yielding a
handle which can then be used to operate on the contents of that file.
Haskell defines operations to read and write characters from and to files,
represented by values of type Handle
. Each value of this type is a
handle: a record used by the Haskell run-time system to manage I/O
with file system objects. A handle has at least the following properties:
- whether it manages input or output or both;
- whether it is open, closed or semi-closed;
- whether the object is seekable;
- whether buffering is disabled, or enabled on a line or block basis;
- a buffer (whose length may be zero).
Most handles will also have a current I/O position indicating where the next
input or output operation will occur. A handle is readable if it
manages only input or both input and output; likewise, it is writable if
it manages only output or both input and output. A handle is open when
first allocated.
Once it is closed it can no longer be used for either input or output,
though an implementation cannot re-use its storage while references
remain to it. Handles are in the Show
and Eq
classes. The string
produced by showing a handle is system dependent; it should include
enough information to identify the handle for debugging. A handle is
equal according to ==
only to itself; no attempt
is made to compare the internal state of different handles for equality.
GHC note: a Handle will be automatically closed when the garbage
collector detects that it has become unreferenced by the program.
However, relying on this behaviour is not generally recommended:
the garbage collector is unpredictable. If possible, use
an explicit hClose to close Handles when they are no longer
required. GHC does not currently attempt to free up file
descriptors when they have run out; it is your responsibility to
ensure that this doesn't happen.
Standard handles
Three handles are allocated during program initialisation, and are initially open.
Opening and closing files
Opening files
withFile :: FilePath -> IOMode -> (Handle -> IO r) -> IO r Source
withFile name mode act opens a file using openFile and passes
the resulting handle to the computation act. The handle will be
closed on exit from withFile, whether by normal termination or by
raising an exception. If closing the handle raises an exception, then
this exception will be raised by withFile rather than any exception
raised by act.
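A short sketch of the withFile pattern, assuming a writable current directory. The function roundTrip and the file name "withfile-demo.txt" are hypothetical. Each handle is closed when its withFile call returns, so no explicit hClose is needed.

```haskell
import System.IO

-- Write two lines to a file, then read the first line back.
roundTrip :: FilePath -> IO String
roundTrip path = do
  -- The handle is closed as soon as this block finishes.
  withFile path WriteMode $ \h -> do
    hPutStrLn h "first line"
    hPutStrLn h "second line"
  -- hGetLine reads strictly, so the result is complete before the close.
  withFile path ReadMode hGetLine

main :: IO ()
main = roundTrip "withfile-demo.txt" >>= putStrLn
```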
openFile :: FilePath -> IOMode -> IO Handle Source
Computation openFile file mode allocates and returns a new, open
handle to manage the file file. It manages input if mode is ReadMode,
output if mode is WriteMode or AppendMode,
and both input and output if mode is ReadWriteMode.
If the file does not exist and it is opened for output, it should be
created as a new file. If mode
is WriteMode
and the file
already exists, then it should be truncated to zero length.
Some operating systems delete empty files, so there is no guarantee
that the file will exist following an openFile
with mode
WriteMode
unless it is subsequently written to successfully.
The handle is positioned at the end of the file if mode
is
AppendMode
, and otherwise at the beginning (in which case its
internal position is 0).
The initial buffer mode is implementation-dependent.
This operation may fail with:
- isAlreadyInUseError if the file is already open and cannot be reopened;
- isDoesNotExistError if the file does not exist; or
- isPermissionError if the user does not have permission to open the file.
Note: if you will be working with files containing binary data, you'll want to
be using openBinaryFile
.
data IOMode
See openFile. The modes are ReadMode, WriteMode, AppendMode and ReadWriteMode.
Closing files
hClose :: Handle -> IO () Source
Computation hClose
hdl
makes handle hdl
closed. Before the
computation finishes, if hdl
is writable its buffer is flushed as
for hFlush
.
Performing hClose
on a handle that has already been closed has no effect;
doing so is not an error. All other operations on a closed handle will fail.
If hClose
fails for any reason, any further operations (apart from
hClose
) on the handle will still fail as if hdl
had been successfully
closed.
Special cases
These functions are also exported by the Prelude.
readFile :: FilePath -> IO String Source
The readFile
function reads a file and
returns the contents of the file as a string.
The file is read lazily, on demand, as with getContents
.
writeFile :: FilePath -> String -> IO () Source
The computation writeFile file str writes the string str
to the file file.
appendFile :: FilePath -> String -> IO () Source
The computation appendFile file str appends the string str
to the file file.
Note that writeFile
and appendFile
write a literal string
to a file. To write a value of any printable type, as with print
,
use the show
function to convert the value to a string first.
main = appendFile "squares" (show [(x,x*x) | x <- [0,0.1..2]])
File locking
Implementations should enforce as far as possible, at least locally to the Haskell process, multiple-reader single-writer locking on files. That is, there may either be many handles on the same file which manage input, or just one handle on the file which manages output. If any open or semi-closed handle is managing a file for output, no new handle can be allocated for that file. If any open or semi-closed handle is managing a file for input, new handles can only be allocated if they do not manage output. Whether two files are the same is implementation-dependent, but they should normally be the same if they have the same absolute path name and neither has been renamed, for example.
Warning: the readFile
operation holds a semi-closed handle on
the file until the entire contents of the file have been consumed.
It follows that an attempt to write to a file (using writeFile
, for
example) that was earlier opened by readFile
will usually result in
failure with isAlreadyInUseError
.
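One common way around the readFile warning is to consume the whole string before writing, since a semi-closed handle becomes closed once its entire contents have been read. The sketch below assumes that behaviour. The function appendLineViaRewrite and the file name "locking-demo.txt" are hypothetical.

```haskell
-- Rewrite a file with one extra line appended.
-- Forcing the string with length makes readFile reach end of file,
-- which closes its semi-closed handle before writeFile opens the file.
appendLineViaRewrite :: FilePath -> String -> IO ()
appendLineViaRewrite path line = do
  contents <- readFile path
  length contents `seq` writeFile path (contents ++ line ++ "\n")

main :: IO ()
main = do
  writeFile "locking-demo.txt" "one\n"
  appendLineViaRewrite "locking-demo.txt" "two"
  readFile "locking-demo.txt" >>= putStr
```

Without the call to length, the writeFile would typically run while the lazy read still holds the file, and fail as described above.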
Operations on handles
Determining and changing the size of a file
hFileSize :: Handle -> IO Integer Source
For a handle hdl which is attached to a physical file,
hFileSize hdl returns the size of that file in 8-bit bytes.
hSetFileSize :: Handle -> Integer -> IO () Source
hSetFileSize
hdl
size
truncates the physical file with handle hdl
to size
bytes.
Detecting the end of input
hIsEOF :: Handle -> IO Bool Source
For a readable handle hdl
, hIsEOF
hdl
returns
True
if no further input can be taken from hdl
or for a
physical file, if the current I/O position is equal to the length of
the file. Otherwise, it returns False
.
NOTE: hIsEOF
may block, because it has to attempt to read from
the stream to determine whether there is any more data to be read.
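A typical use of hIsEOF is to guard a read loop, as in this sketch. The function readAllLines and the file name "eof-demo.txt" are hypothetical.

```haskell
import System.IO

-- Collect every remaining line from a handle, stopping at end of input.
readAllLines :: Handle -> IO [String]
readAllLines h = do
  eof <- hIsEOF h
  if eof
    then return []
    else do
      line <- hGetLine h
      rest <- readAllLines h
      return (line : rest)

main :: IO ()
main = do
  writeFile "eof-demo.txt" "a\nb\nc\n"
  ls <- withFile "eof-demo.txt" ReadMode readAllLines
  print ls
```

Checking hIsEOF before each hGetLine avoids the isEOFError that hGetLine raises when no characters remain.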
Buffering operations
data BufferMode Source
Three kinds of buffering are supported: line-buffering, block-buffering or no-buffering. These modes have the following effects. For output, items are written out, or flushed, from the internal buffer according to the buffer mode:
- line-buffering: the entire output buffer is flushed whenever a newline is output, the buffer overflows, a hFlush is issued, or the handle is closed.
- block-buffering: the entire buffer is written out whenever it overflows, a hFlush is issued, or the handle is closed.
- no-buffering: output is written immediately, and never stored in the buffer.
An implementation is free to flush the buffer more frequently, but not less frequently, than specified above. The output buffer is emptied as soon as it has been written out.
Similarly, input occurs according to the buffer mode for the handle:
- line-buffering: when the buffer for the handle is not empty, the next item is obtained from the buffer; otherwise, when the buffer is empty, characters up to and including the next newline character are read into the buffer. No characters are available until the newline character is available or the buffer is full.
- block-buffering: when the buffer for the handle becomes empty, the next block of data is read into the buffer.
- no-buffering: the next input item is read and returned.
The hLookAhead operation implies that even a no-buffered handle may require a one-character buffer.
The default buffering mode when a handle is opened is implementation-dependent and may depend on the file system object which is attached to that handle. For most implementations, physical files will normally be block-buffered and terminals will normally be line-buffered.
NoBuffering | buffering is disabled if possible. |
LineBuffering | line-buffering should be enabled if possible. |
BlockBuffering (Maybe Int) | block-buffering should be enabled if possible. The size of the buffer is n items if the argument is Just n and is otherwise implementation-dependent. |
hSetBuffering :: Handle -> BufferMode -> IO () Source
Computation hSetBuffering hdl mode sets the mode of buffering for
handle hdl on subsequent reads and writes.
If the buffer mode is changed from BlockBuffering or
LineBuffering to NoBuffering, then
- if hdl is writable, the buffer is flushed as for hFlush;
- if hdl is not writable, the contents of the buffer are discarded.
This operation may fail with:
- isPermissionError if the handle has already been used for reading or writing and the implementation does not allow the buffering mode to be changed.
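The sketch below shows two common uses of hSetBuffering: turning off buffering on stdout so that a prompt without a trailing newline appears at once, and requesting a specific block size on a file handle. The function setAndReport and the file name "buffering-demo.txt" are hypothetical, and the 4096-item buffer size is an arbitrary choice.

```haskell
import System.IO

-- Switch a handle to a new buffering mode and report the mode now in effect.
setAndReport :: Handle -> BufferMode -> IO BufferMode
setAndReport h mode = do
  hSetBuffering h mode
  hGetBuffering h

main :: IO ()
main = do
  -- With NoBuffering, the text below is written without waiting for a newline.
  hSetBuffering stdout NoBuffering
  putStr "working... "
  mode <- withFile "buffering-demo.txt" WriteMode $ \h ->
    setAndReport h (BlockBuffering (Just 4096))
  print mode
```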
hGetBuffering :: Handle -> IO BufferMode Source
Computation hGetBuffering
hdl
returns the current buffering mode
for hdl
.
hFlush :: Handle -> IO () Source
The action hFlush
hdl
causes any items buffered for output
in handle hdl
to be sent immediately to the operating system.
This operation may fail with:
- isFullError if the device is full;
- isPermissionError if a system resource limit would be exceeded. It is unspecified whether the characters in the buffer are discarded or retained under these circumstances.
Repositioning handles
hGetPosn :: Handle -> IO HandlePosn Source
Computation hGetPosn
hdl
returns the current I/O position of
hdl
as a value of the abstract type HandlePosn
.
hSetPosn :: HandlePosn -> IO () Source
hSeek :: Handle -> SeekMode -> Integer -> IO () Source
Computation hSeek
hdl mode i
sets the position of handle
hdl
depending on mode
.
The offset i
is given in terms of 8-bit bytes.
If hdl
is block- or line-buffered, then seeking to a position which is not
in the current buffer will first cause any items in the output buffer to be
written to the device, and then cause the input buffer to be discarded.
Some handles may not be seekable (see hIsSeekable
), or only support a
subset of the possible positioning operations (for instance, it may only
be possible to seek to the end of a tape, or to a positive offset from
the beginning or current position).
It is not possible to set a negative I/O position, or for
a physical file, an I/O position beyond the current end-of-file.
This operation may fail with:
- isIllegalOperationError if the Handle is not seekable, or does not support the requested seek mode.
- isPermissionError if a system resource limit would be exceeded.
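data SeekMode
A mode that determines the effect of hSeek hdl mode i.
AbsoluteSeek | the position of hdl is set to i. |
RelativeSeek | the position of hdl is set to offset i from the current position. |
SeekFromEnd | the position of hdl is set to offset i from the end of the file. |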
hTell :: Handle -> IO Integer Source
Computation hTell
hdl
returns the current position of the
handle hdl
, as the number of bytes from the beginning of
the file. The value returned may be subsequently passed to
hSeek
to reposition the handle to the current position.
This operation may fail with:
isIllegalOperationError
if the Handle is not seekable.
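To illustrate hSeek and hTell together, this sketch positions a handle a given number of bytes before the end of a file and reads from there. The function readTail and the file name "seek-demo.txt" are hypothetical, and the example assumes single-byte (ASCII) content so that byte offsets and character counts coincide.

```haskell
import System.IO

-- Seek to n bytes before the end of the file, report the resulting
-- position, and read the rest of the current line from there.
-- Assumes n does not exceed the file size; otherwise hSeek fails.
readTail :: FilePath -> Integer -> IO (Integer, String)
readTail path n =
  withBinaryFile path ReadMode $ \h -> do
    hSeek h SeekFromEnd (negate n)
    pos <- hTell h
    rest <- hGetLine h
    return (pos, rest)

main :: IO ()
main = do
  writeFile "seek-demo.txt" "hello world"
  readTail "seek-demo.txt" 5 >>= print
```

The same position could be reached with hSeek h AbsoluteSeek 6 for this 11-byte file; SeekFromEnd avoids needing to know the size in advance.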
Handle properties
hIsReadable :: Handle -> IO Bool Source
hIsWritable :: Handle -> IO Bool Source
hIsSeekable :: Handle -> IO Bool Source
Terminal operations (not portable: GHC only)
hIsTerminalDevice :: Handle -> IO Bool Source
Is the handle connected to a terminal?
hSetEcho :: Handle -> Bool -> IO () Source
Set the echoing status of a handle connected to a terminal.
Showing handle state (not portable: GHC only)
Text input and output
Text input
hWaitForInput :: Handle -> Int -> IO Bool Source
Computation hWaitForInput
hdl t
waits until input is available on handle hdl
.
It returns True
as soon as input is available on hdl
,
or False
if no input is available within t
milliseconds. Note that
hWaitForInput
waits until one or more full characters are available,
which means that it needs to do decoding, and hence may fail
with a decoding error.
If t
is less than zero, then hWaitForInput
waits indefinitely.
This operation may fail with:
- isEOFError if the end of file has been reached.
- a decoding error, if the input begins with an invalid byte sequence in this Handle's encoding.
NOTE for GHC users: unless you use the -threaded
flag,
hWaitForInput hdl t
where t >= 0
will block all other Haskell
threads for the duration of the call. It behaves like a
safe
foreign call in this respect.
hReady :: Handle -> IO Bool Source
Computation hReady
hdl
indicates whether at least one item is
available for input from handle hdl
.
This operation may fail with:
isEOFError
if the end of file has been reached.
hGetChar :: Handle -> IO Char Source
Computation hGetChar
hdl
reads a character from the file or
channel managed by hdl
, blocking until a character is available.
This operation may fail with:
isEOFError
if the end of file has been reached.
hGetLine :: Handle -> IO String Source
Computation hGetLine
hdl
reads a line from the file or
channel managed by hdl
.
This operation may fail with:
isEOFError
if the end of file is encountered when reading the first character of the line.
If hGetLine
encounters end-of-file at any other point while reading
in a line, it is treated as a line terminator and the (partial)
line is returned.
hLookAhead :: Handle -> IO Char Source
Computation hLookAhead
returns the next character from the handle
without removing it from the input buffer, blocking until a character
is available.
This operation may fail with:
isEOFError
if the end of file has been reached.
hGetContents :: Handle -> IO String Source
Computation hGetContents
hdl
returns the list of characters
corresponding to the unread portion of the channel or file managed
by hdl
, which is put into an intermediate state, semi-closed.
In this state, hdl
is effectively closed,
but items are read from hdl
on demand and accumulated in a special
list returned by hGetContents
hdl
.
Any operation that fails because a handle is closed,
also fails if a handle is semi-closed. The only exception is hClose
.
A semi-closed handle becomes closed:
- if hClose is applied to it;
- if an I/O error occurs when reading an item from the handle;
- or once the entire contents of the handle has been read.
Once a semi-closed handle becomes closed, the contents of the associated list becomes fixed. The contents of this final list is only partially specified: it will contain at least all the items of the stream that were evaluated prior to the handle becoming closed.
Any I/O errors encountered while a handle is semi-closed are simply discarded.
This operation may fail with:
isEOFError
if the end of file has been reached.
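Because the list returned by hGetContents is filled in on demand, closing the handle too early cuts the read short. A minimal sketch of one way to avoid this, by forcing the whole list while the handle is still open, follows. The function readFileStrict and the file name "contents-demo.txt" are hypothetical.

```haskell
import System.IO

-- Return the whole file as a String, fully read before the handle is closed.
readFileStrict :: FilePath -> IO String
readFileStrict path =
  withFile path ReadMode $ \h -> do
    contents <- hGetContents h
    -- Force the entire list here; otherwise withFile would close the
    -- handle before the lazy string had been read.
    length contents `seq` return contents

main :: IO ()
main = do
  writeFile "contents-demo.txt" "lazy input, read eagerly\n"
  readFileStrict "contents-demo.txt" >>= putStr
```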
Text output
hPutChar :: Handle -> Char -> IO () Source
Computation hPutChar
hdl ch
writes the character ch
to the
file or channel managed by hdl
. Characters may be buffered if
buffering is enabled for hdl
.
This operation may fail with:
- isFullError if the device is full; or
- isPermissionError if another system resource limit would be exceeded.
hPutStr :: Handle -> String -> IO () Source
Computation hPutStr
hdl s
writes the string
s
to the file or channel managed by hdl
.
This operation may fail with:
- isFullError if the device is full; or
- isPermissionError if another system resource limit would be exceeded.
hPrint :: Show a => Handle -> a -> IO () Source
Computation hPrint
hdl t
writes the string representation of t
given by the shows
function to the file or channel managed by hdl
and appends a newline.
This operation may fail with:
- isFullError if the device is full; or
- isPermissionError if another system resource limit would be exceeded.
Special cases for standard input and output
These functions are also exported by the Prelude.
interact :: (String -> String) -> IO () Source
The interact
function takes a function of type String->String
as its argument. The entire input from the standard input device is
passed to this function as its argument, and the resulting string is
output on the standard output device.
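For instance, a filter that upper-cases its input can be sketched as follows. The name shout is hypothetical.

```haskell
import Data.Char (toUpper)

-- A pure transformation applied to the whole of standard input.
shout :: String -> String
shout = map toUpper

-- Reads standard input lazily and writes the transformed text to standard output.
main :: IO ()
main = interact shout
```

Keeping the transformation as a separate pure function makes it easy to test without touching standard input.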
print :: Show a => a -> IO () Source
The print
function outputs a value of any printable type to the
standard output device.
Printable types are those that are instances of class Show
; print
converts values to strings for output using the show
operation and
adds a newline.
For example, a program to print the first 20 integers and their powers of 2 could be written as:
main = print ([(n, 2^n) | n <- [0..19]])
getContents :: IO String Source
The getContents
operation returns all user input as a single string,
which is read lazily as it is needed
(same as hGetContents
stdin
).
Binary input and output
withBinaryFile :: FilePath -> IOMode -> (Handle -> IO r) -> IO r Source
withBinaryFile name mode act opens a file using openBinaryFile
and passes the resulting handle to the computation act. The handle
will be closed on exit from withBinaryFile, whether by normal
termination or by raising an exception.
openBinaryFile :: FilePath -> IOMode -> IO Handle Source
Like openFile
, but open the file in binary mode.
On Windows, reading a file in text mode (which is the default)
will translate CRLF to LF, and writing will translate LF to CRLF.
This is usually what you want with text files. With binary files
this is undesirable; also, as usual under Microsoft operating systems,
text mode treats control-Z as EOF. Binary mode turns off all special
treatment of end-of-line and end-of-file characters.
(See also hSetBinaryMode
.)
hSetBinaryMode :: Handle -> Bool -> IO () Source
Select binary mode (True) or text mode (False) on an open handle.
(See also openBinaryFile.)
This has the same effect as calling hSetEncoding
with char8
, together
with hSetNewlineMode
with noNewlineTranslation
.
hPutBuf :: Handle -> Ptr a -> Int -> IO () Source
hPutBuf hdl buf count writes count 8-bit bytes from the
buffer buf to the handle hdl. It returns ().
hPutBuf ignores the prevailing TextEncoding and NewlineMode
on the Handle, and writes the bytes directly to the underlying file or device.
This operation may fail with:
ResourceVanished
if the handle is a pipe or socket, and the reading end is closed. (If this is a POSIX system, and the program has not asked to ignore SIGPIPE, then a SIGPIPE may be delivered instead, whose default action is to terminate the program).
hGetBuf :: Handle -> Ptr a -> Int -> IO Int Source
hGetBuf
hdl buf count
reads data from the handle hdl
into the buffer buf
until either EOF is reached or
count
8-bit bytes have been read.
It returns the number of bytes actually read. This may be zero if
EOF was reached before any data was read (or if count
is zero).
hGetBuf
never raises an EOF exception, instead it returns a value
smaller than count
.
If the handle is a pipe or socket, and the writing end
is closed, hGetBuf
will behave as if EOF was reached.
hGetBuf
ignores the prevailing TextEncoding
and NewlineMode
on the Handle
, and reads bytes directly.
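A sketch of raw byte I/O with hPutBuf and hGetBuf, using the marshalling helpers from Foreign.Marshal.Array to obtain the Ptr arguments. The functions writeBytes and readBytes and the file name "buf-demo.bin" are hypothetical.

```haskell
import Data.Word (Word8)
import Foreign.Marshal.Array (allocaArray, peekArray, withArrayLen)
import System.IO

-- Copy a list of bytes into a temporary array and write it to a file.
writeBytes :: FilePath -> [Word8] -> IO ()
writeBytes path bytes =
  withBinaryFile path WriteMode $ \h ->
    withArrayLen bytes $ \n ptr -> hPutBuf h ptr n

-- Read up to count bytes; fewer are returned if EOF is reached first.
readBytes :: FilePath -> Int -> IO [Word8]
readBytes path count =
  withBinaryFile path ReadMode $ \h ->
    allocaArray count $ \ptr -> do
      n <- hGetBuf h ptr count
      -- Only the first n bytes of the buffer hold data from the file.
      peekArray n ptr

main :: IO ()
main = do
  writeBytes "buf-demo.bin" [72, 105, 33]
  readBytes "buf-demo.bin" 16 >>= print
```

Since the elements are Word8, the element count passed to allocaArray equals the byte count passed to hGetBuf.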
hGetBufSome :: Handle -> Ptr a -> Int -> IO Int Source
hGetBufSome
hdl buf count
reads data from the handle hdl
into the buffer buf
. If there is any data available to read,
then hGetBufSome
returns it immediately; it only blocks if there
is no data to be read.
It returns the number of bytes actually read. This may be zero if
EOF was reached before any data was read (or if count
is zero).
hGetBufSome
never raises an EOF exception, instead it returns a value
smaller than count
.
If the handle is a pipe or socket, and the writing end
is closed, hGetBufSome
will behave as if EOF was reached.
hGetBufSome
ignores the prevailing TextEncoding
and NewlineMode
on the Handle
, and reads bytes directly.
hGetBufNonBlocking :: Handle -> Ptr a -> Int -> IO Int Source
hGetBufNonBlocking
hdl buf count
reads data from the handle hdl
into the buffer buf
until either EOF is reached, or
count
8-bit bytes have been read, or there is no more data available
to read immediately.
hGetBufNonBlocking
is identical to hGetBuf
, except that it will
never block waiting for data to become available, instead it returns
only whatever data is available. To wait for data to arrive before
calling hGetBufNonBlocking
, use hWaitForInput
.
If the handle is a pipe or socket, and the writing end
is closed, hGetBufNonBlocking
will behave as if EOF was reached.
hGetBufNonBlocking
ignores the prevailing TextEncoding
and
NewlineMode
on the Handle
, and reads bytes directly.
NOTE: on Windows, this function does not work correctly; it
behaves identically to hGetBuf
.
Temporary files
openTempFile
:: FilePath | Directory in which to create the file |
-> String | File name template. If the template is "foo.ext" then the created file will be "fooXXX.ext" where XXX is some random number. |
-> IO (FilePath, Handle) |
The function creates a temporary file in ReadWrite mode. The created file isn't deleted automatically, so you need to delete it manually.
The file is created with permissions such that only the current user can read/write it.
With some exceptions (see below), the file will be created securely
in the sense that an attacker should not be able to cause
openTempFile to overwrite another file on the filesystem using your
credentials, by putting symbolic links (on Unix) in the place where
the temporary file is to be created. On Unix the O_CREAT
and
O_EXCL
flags are used to prevent this attack, but note that
O_EXCL
is sometimes not supported on NFS filesystems, so if you
rely on this behaviour it is best to use local filesystems only.
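Since the temporary file is not deleted automatically, a caller usually pairs openTempFile with an explicit removal. The sketch below assumes that removeFile from System.Directory (in the directory package, which is separate from base) is available. The function withScratchFile and the template "scratch.txt" are hypothetical.

```haskell
import System.Directory (removeFile)
import System.IO

-- Create a temporary file in dir, write some text, read it back, then delete it.
withScratchFile :: FilePath -> String -> IO String
withScratchFile dir text = do
  (path, h) <- openTempFile dir "scratch.txt"
  hPutStr h text
  hClose h
  contents <- readFile path
  -- Force the contents so that readFile releases its handle before deletion.
  length contents `seq` removeFile path
  return contents

main :: IO ()
main = withScratchFile "." "temporary data" >>= putStrLn
```

This sketch does not clean up if an exception is raised between creation and removal; production code would typically wrap the steps with bracket or finally from Control.Exception.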
openBinaryTempFile :: FilePath -> String -> IO (FilePath, Handle) Source
Like openTempFile
, but opens the file in binary mode. See openBinaryFile
for more comments.
openTempFileWithDefaultPermissions :: FilePath -> String -> IO (FilePath, Handle) Source
Like openTempFile, but uses the default file permissions.
openBinaryTempFileWithDefaultPermissions :: FilePath -> String -> IO (FilePath, Handle) Source
Like openBinaryTempFile, but uses the default file permissions.
Unicode encoding/decoding
A text-mode Handle
has an associated TextEncoding
, which
is used to decode bytes into Unicode characters when reading,
and encode Unicode characters into bytes when writing.
The default TextEncoding
is the same as the default encoding
on your system, which is also available as localeEncoding
.
(GHC note: on Windows, we currently do not support double-byte
encodings; if the console's code page is unsupported, then
localeEncoding
will be latin1
.)
Encoding and decoding errors are always detected and reported,
except during lazy I/O (hGetContents
, getContents
, and
readFile
), where a decoding error merely results in
termination of the character stream, as with other I/O errors.
hSetEncoding :: Handle -> TextEncoding -> IO () Source
The action hSetEncoding
hdl
encoding
changes the text encoding
for the handle hdl
to encoding
. The default encoding when a Handle
is
created is localeEncoding
, namely the default encoding for the current
locale.
To create a Handle
with no encoding at all, use openBinaryFile
. To
stop further encoding or decoding on an existing Handle
, use
hSetBinaryMode
.
hSetEncoding
may need to flush buffered data in order to change
the encoding.
hGetEncoding :: Handle -> IO (Maybe TextEncoding) Source
Return the current TextEncoding
for the specified Handle
, or
Nothing
if the Handle
is in binary mode.
Note that the TextEncoding
remembers nothing about the state of
the encoder/decoder in use on this Handle
. For example, if the
encoding in use is UTF-16, then using hGetEncoding
and
hSetEncoding
to save and restore the encoding may result in an
extra byte-order-mark being written to the file.
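As a sketch of hSetEncoding in use, the following writes and reads a file as UTF-8 regardless of the locale encoding. The functions writeUtf8 and readUtf8 and the file name "encoding-demo.txt" are hypothetical. The character '\233' (e with an acute accent) occupies two bytes in UTF-8, which the reported file size reflects.

```haskell
import System.IO

-- Write text as UTF-8 and return the resulting file size in bytes.
writeUtf8 :: FilePath -> String -> IO Integer
writeUtf8 path text = do
  withFile path WriteMode $ \h -> do
    hSetEncoding h utf8
    hPutStr h text
  withFile path ReadMode hFileSize

-- Read a whole file, decoding it as UTF-8.
readUtf8 :: FilePath -> IO String
readUtf8 path =
  withFile path ReadMode $ \h -> do
    hSetEncoding h utf8
    s <- hGetContents h
    -- Force the string before withFile closes the handle.
    length s `seq` return s

main :: IO ()
main = do
  size <- writeUtf8 "encoding-demo.txt" "h\233llo"
  text <- readUtf8 "encoding-demo.txt"
  print (size, text == "h\233llo")
```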
Unicode encodings
data TextEncoding Source
A TextEncoding
is a specification of a conversion scheme
between sequences of bytes and sequences of Unicode characters.
For example, UTF-8 is an encoding of Unicode characters into a sequence
of bytes. The TextEncoding
for UTF-8 is utf8
.
latin1 :: TextEncoding
The Latin1 (ISO8859-1) encoding. This encoding maps bytes
directly to the first 256 Unicode code points, and is thus not a
complete Unicode encoding. An attempt to write a character greater than
'\255' to a Handle using the latin1 encoding will result in an error.
utf8 :: TextEncoding
The UTF-8 Unicode encoding.
utf8_bom :: TextEncoding Source
The UTF-8 Unicode encoding, with a byte-order-mark (BOM; the byte
sequence 0xEF 0xBB 0xBF). This encoding behaves like utf8
,
except that on input, the BOM sequence is ignored at the beginning
of the stream, and on output, the BOM sequence is prepended.
The byte-order-mark is strictly unnecessary in UTF-8, but is sometimes used to identify the encoding of a file.
utf16 :: TextEncoding
The UTF-16 Unicode encoding (a byte-order-mark should be used to indicate endianness).
utf16le :: TextEncoding Source
The UTF-16 Unicode encoding (little-endian)
utf16be :: TextEncoding Source
The UTF-16 Unicode encoding (big-endian)
utf32 :: TextEncoding
The UTF-32 Unicode encoding (a byte-order-mark should be used to indicate endianness).
utf32le :: TextEncoding Source
The UTF-32 Unicode encoding (little-endian)
utf32be :: TextEncoding Source
The UTF-32 Unicode encoding (big-endian)
localeEncoding :: TextEncoding Source
The Unicode encoding of the current locale.
This is the initial locale encoding: if it has been subsequently changed by
setLocaleEncoding this value will not reflect that change.
char8 :: TextEncoding
An encoding in which Unicode code points are translated to bytes by taking the code point modulo 256. When decoding, bytes are translated directly into the equivalent code point.
This encoding never fails in either direction. However, encoding discards information, so encode followed by decode is not the identity.
Since: 4.4.0.0
mkTextEncoding :: String -> IO TextEncoding Source
Look up the named Unicode encoding. May fail with
isDoesNotExistError if the encoding is unknown.
The set of known encodings is system-dependent, but includes at least:
- UTF-8
- UTF-16, UTF-16BE, UTF-16LE
- UTF-32, UTF-32BE, UTF-32LE
There is additional notation (borrowed from GNU iconv) for specifying how illegal characters are handled:
- a suffix of //IGNORE, e.g. UTF-8//IGNORE, will cause all illegal sequences on input to be ignored, and on output will drop all code points that have no representation in the target encoding.
- a suffix of //TRANSLIT will choose a replacement character for illegal sequences or code points.
- a suffix of //ROUNDTRIP will use a PEP383-style escape mechanism to represent any invalid bytes in the input as Unicode codepoints (specifically, as lone surrogates, which are normally invalid in UTF-32). Upon output, these special codepoints are detected and turned back into the corresponding original byte.
  In theory, this mechanism allows arbitrary data to be roundtripped via a String with no loss of data. In practice, there are two limitations to be aware of:
  - This only stands a chance of working for an encoding which is an ASCII superset, as for security reasons we refuse to escape any bytes smaller than 128. Many encodings of interest are ASCII supersets (in particular, you can assume that the locale encoding is an ASCII superset) but many (such as UTF-16) are not.
  - If the underlying encoding is not itself roundtrippable, this mechanism can fail. Roundtrippable encodings are those which have an injective mapping into Unicode. Almost all encodings meet this criterion, but some do not. Notably, Shift-JIS (CP932) and Big5 contain several different encodings of the same Unicode codepoint.
On Windows, you can access supported code pages with the prefix CP; for example, "CP1250".
Newline conversion
In Haskell, a newline is always represented by the character '\n'. However, in files and external character streams, a newline may be represented by another character sequence, such as '\r\n'.
A text-mode Handle has an associated NewlineMode that
specifies how to translate newline characters. The
NewlineMode specifies the input and output translation
separately, so that for instance you can translate '\r\n'
to '\n' on input, but leave newlines as '\n' on output.
The default NewlineMode for a Handle is
nativeNewlineMode, which does no translation on Unix systems,
but translates '\r\n' to '\n' and back on Windows.
Binary-mode Handles do no newline translation at all.
hSetNewlineMode :: Handle -> NewlineMode -> IO () Source
Set the NewlineMode
on the specified Handle
. All buffered
data is flushed first.
data Newline
The representation of a newline in the external file or stream. The constructors LF and CRLF denote '\n' and '\r\n' respectively.
data NewlineMode Source
Specifies the translation, if any, of newline characters between internal Strings and the external file or stream. Haskell Strings are assumed to represent newlines with the '\n' character; the newline mode specifies how to translate '\n' on output, and what to translate into '\n' on input.
noNewlineTranslation :: NewlineMode Source
Do no newline translation at all.
noNewlineTranslation = NewlineMode { inputNL = LF, outputNL = LF }
universalNewlineMode :: NewlineMode Source
Map '\r\n' into '\n' on input, and '\n' to the native newline
representation on output. This mode can be used on any platform, and
works with text files using any newline convention. The downside is
that readFile >>= writeFile might yield a different file.
universalNewlineMode = NewlineMode { inputNL = CRLF, outputNL = nativeNewline }
nativeNewlineMode :: NewlineMode Source
Use the native newline representation on both input and output
nativeNewlineMode = NewlineMode { inputNL = nativeNewline, outputNL = nativeNewline }
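As an illustration of newline translation, this sketch writes a file with mixed line endings in binary mode and then reads it back with universalNewlineMode, so that both '\r\n' and '\n' arrive as '\n'. The functions writeRaw and readUniversal and the file name "newline-demo.txt" are hypothetical.

```haskell
import System.IO

-- Write in binary mode so that "\r\n" reaches the file untranslated.
writeRaw :: FilePath -> String -> IO ()
writeRaw path s = withBinaryFile path WriteMode $ \h -> hPutStr h s

-- Read a text file, accepting both "\n" and "\r\n" line endings.
readUniversal :: FilePath -> IO [String]
readUniversal path =
  withFile path ReadMode $ \h -> do
    hSetNewlineMode h universalNewlineMode
    s <- hGetContents h
    -- Force the string before withFile closes the handle.
    length s `seq` return (lines s)

main :: IO ()
main = do
  writeRaw "newline-demo.txt" "a\r\nb\nc\r\n"
  readUniversal "newline-demo.txt" >>= print
```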