- encode :: Buffer from -> Buffer to -> IO (CodingProgress, Buffer from, Buffer to)
The encode function translates elements of the from buffer
into the to buffer.
It should translate as many elements as possible
given the sizes of the buffers, including translating zero elements
if there is either not enough room in the to buffer,
or the from buffer does not
contain a complete multibyte sequence.
The fact that as many elements as possible are translated is used by the IO
library in order to report translation errors at the point they
actually occur, rather than when the buffer is translated.
To allow us to use iconv as a BufferCodec efficiently, character buffers are
defined to contain lone surrogates instead of those private use characters that
are used for roundtripping. Thus, Chars poked to and peeked from a character buffer
must undergo surrogatifyRoundtripCharacter and desurrogatifyRoundtripCharacter,
respectively.
For more information on this, see Note [Roundtripping] in GHC.IO.Encoding.Failure.
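The behaviour of encode on an incomplete multibyte sequence can be sketched as follows. This is a minimal illustration rather than how the IO library drives a codec: byteBufferFrom and decodePrefix are hypothetical helper names, the 16-element character buffer is an arbitrary choice, and the sketch assumes the UTF-8 decoder obtained from the utf8 TextEncoding in base.

```haskell
import Data.Word (Word8)
import GHC.IO.Buffer
  ( Buffer(..), BufferState(..), bufferAdd, bufferElems
  , newByteBuffer, newCharBuffer, writeWord8Buf )
import GHC.IO.Encoding (utf8)
import GHC.IO.Encoding.Types
  ( BufferCodec(..), CodingProgress(..), TextEncoding(..) )

-- Hypothetical helper: allocate a byte buffer and fill it with the given bytes.
byteBufferFrom :: [Word8] -> IO (Buffer Word8)
byteBufferFrom bytes = do
  buf <- newByteBuffer (max 1 (length bytes)) ReadBuffer
  mapM_ (\(i, b) -> writeWord8Buf (bufRaw buf) i b) (zip [0 ..] bytes)
  -- bufferAdd advances the write pointer past the bytes just stored.
  pure (bufferAdd (length bytes) buf)

-- Hypothetical helper: run the decoder's encode function once and report
-- the progress code, the number of unconsumed input bytes, and the number
-- of characters written to the output buffer.
decodePrefix :: [Word8] -> IO (CodingProgress, Int, Int)
decodePrefix bytes =
  case utf8 of
    -- The codec state type is existential, so the decoder is obtained by
    -- pattern matching rather than by using the field as a function.
    TextEncoding { mkTextDecoder = mkDecoder } -> do
      decoder <- mkDecoder
      from <- byteBufferFrom bytes
      to <- newCharBuffer 16 WriteBuffer
      (progress, from', to') <- encode decoder from to
      close decoder
      pure (progress, bufferElems from', bufferElems to')
```

For the input [0x68, 0xC3, 0xA9, 0xE2] (the bytes of "hé" followed by the first byte of a three-byte sequence), decodePrefix should report InputUnderflow with one byte left in the from buffer and two characters in the to buffer: everything that can be translated is translated, and the incomplete tail is left in place.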
- recover :: Buffer from -> Buffer to -> IO (Buffer from, Buffer to)
The recover function is used to continue decoding
in the presence of invalid or unrepresentable sequences. This includes
both those detected by
encode returning InvalidSequence and those
that occur because the input byte sequence appears to be truncated.
Progress will usually be made by skipping the first element of the
from buffer. This function should only be called if you are certain that you
wish to do this skipping and if the
to buffer has at least one element
of free space. Because this function deals with decoding failure, it assumes
that the from buffer has at least one element.
recover may raise an exception rather than skipping anything.
Currently, some implementations of
recover may mutate the input buffer.
In particular, this feature is used to implement transliteration.
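One way encode and recover might be combined is sketched below. It is an illustration under stated assumptions, not the IO library's actual driver loop: decodeLenient, byteBufferFrom and readChars are hypothetical helpers, and the codec comes from mkTextEncoding "UTF-8//IGNORE", whose recover skips the offending byte instead of raising an exception. Leftover input after InputUnderflow is treated as a truncated sequence because no more bytes will arrive.

```haskell
import Data.Word (Word8)
import GHC.IO.Buffer
  ( Buffer(..), BufferState(..), CharBuffer, bufferAdd, isEmptyBuffer
  , newByteBuffer, newCharBuffer, readCharBuf, writeWord8Buf )
import GHC.IO.Encoding.Types
  ( BufferCodec(..), CodingProgress(..), TextEncoding(..) )
import System.IO (mkTextEncoding)

-- Hypothetical helper: allocate a byte buffer and fill it with the given bytes.
byteBufferFrom :: [Word8] -> IO (Buffer Word8)
byteBufferFrom bytes = do
  buf <- newByteBuffer (max 1 (length bytes)) ReadBuffer
  mapM_ (\(i, b) -> writeWord8Buf (bufRaw buf) i b) (zip [0 ..] bytes)
  pure (bufferAdd (length bytes) buf)

-- Hypothetical helper: read every character currently held in the buffer.
-- No desurrogatify step is applied; this sketch assumes the IGNORE mode
-- never stores roundtrip surrogates.
readChars :: CharBuffer -> IO String
readChars buf = go (bufL buf)
  where
    go i
      | i >= bufR buf = pure []
      | otherwise = do
          (c, i') <- readCharBuf (bufRaw buf) i
          cs <- go i'
          pure (c : cs)

-- Decode a complete input, calling recover whenever encode stops on an
-- invalid sequence or on a truncated sequence at the end of the input.
decodeLenient :: [Word8] -> IO String
decodeLenient bytes = do
  enc <- mkTextEncoding "UTF-8//IGNORE"
  case enc of
    TextEncoding { mkTextDecoder = mkDecoder } -> do
      decoder <- mkDecoder
      from0 <- byteBufferFrom bytes
      -- One output slot per input byte is always enough for UTF-8 decoding.
      to0 <- newCharBuffer (max 1 (length bytes)) WriteBuffer
      let loop from to = do
            (progress, from', to') <- encode decoder from to
            let skip = recover decoder from' to' >>= uncurry loop
            case progress of
              InvalidSequence -> skip
              InputUnderflow | not (isEmptyBuffer from') -> skip
              _ -> pure to'
      out <- loop from0 to0
      close decoder
      readChars out
```

With this sketch, decodeLenient [0x61, 0xFF, 0x62, 0xE2] should yield "ab": the invalid byte 0xFF and the truncated trailing byte 0xE2 are each skipped by one call to recover. A codec created without the //IGNORE suffix would instead raise an exception from recover.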
- close :: IO ()
Resources associated with the encoding may now be released.
The encode function may not be called again after calling
close.
- getState :: IO state
Return the current state of the codec.
Many codecs are not stateful, and in these cases the state can be
represented as '()'. Other codecs maintain a state. For
example, UTF-16 recognises a BOM (byte-order-mark) character at
the beginning of the input, and remembers thereafter whether to
use big-endian or little-endian mode. In this case, the state
of the codec would include two pieces of information: whether we
are at the beginning of the stream (the BOM only occurs at the
beginning), and if not, whether to use the big-endian or little-endian
encoding.
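The two pieces of information described above can be modelled with a small state type. The following is a self-contained sketch and not the representation used by the UTF-16 codec in base: Utf16State, StateOps, newStateOps and feed are hypothetical names, and falling back to big-endian when no BOM is present is an assumption of this model.

```haskell
import Data.IORef (newIORef, readIORef, writeIORef)
import Data.Word (Word8)

data Endianness = BigEndian | LittleEndian
  deriving (Eq, Show)

-- Either we are still at the beginning of the stream, or the byte order
-- has already been decided.
data Utf16State = AtStart | Decided Endianness
  deriving (Eq, Show)

-- Hypothetical stand-in for the stateful part of a codec.
data StateOps = StateOps
  { getSt :: IO Utf16State          -- plays the role of getState
  , setSt :: Utf16State -> IO ()    -- plays the role of setState
  , feed  :: [Word8] -> IO ()       -- look at incoming bytes, update state
  }

-- Decide the byte order from the first two bytes, but only at stream start.
step :: Utf16State -> [Word8] -> Utf16State
step AtStart (0xFE : 0xFF : _) = Decided BigEndian
step AtStart (0xFF : 0xFE : _) = Decided LittleEndian
step AtStart (_ : _ : _)       = Decided BigEndian  -- assumed default, no BOM
step st _                      = st

newStateOps :: IO StateOps
newStateOps = do
  ref <- newIORef AtStart
  pure StateOps
    { getSt = readIORef ref
    , setSt = writeIORef ref
    , feed  = \bytes -> readIORef ref >>= \st -> writeIORef ref (step st bytes)
    }

-- Snapshot the state, consume a little-endian BOM, then restore the snapshot.
demo :: IO (Utf16State, Utf16State, Utf16State)
demo = do
  ops <- newStateOps
  saved <- getSt ops
  feed ops [0xFF, 0xFE, 0x41, 0x00]
  afterBom <- getSt ops
  setSt ops saved
  restored <- getSt ops
  pure (saved, afterBom, restored)
```

In this model demo returns (AtStart, Decided LittleEndian, AtStart). Taking a snapshot before decoding and writing it back later returns the codec to its earlier position in the stream, which is the kind of use a getState and setState pair is suited to.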
- setState :: state -> IO ()