Safe Haskell is an extension to the Haskell language that is implemented in GHC as of version 7.2. It allows for unsafe code to be securely included in a trusted code base by restricting the features of GHC Haskell the code is allowed to use. Put simply, it makes the types of programs trustable. Safe Haskell is aimed to be as minimal as possible while still providing strong enough guarantees about compiled Haskell code for more advance secure systems to be built on top of it.
While this is the use case that Safe Haskell was motivated by it is important to understand that what Safe Haskell is tracking and enforcing is a stricter form of type safety than is usually guaranteed in Haskell. As part of this, Safe Haskell is run during every compilation of GHC, tracking safety and inferring it even for modules that don't explicitly use Safe Haskell. Please refer to section Section 7.27.5, “Safe Haskell Inference” for more details of this. This also means that there are some design choices that from a security point of view may seem strange but when thought of from the angle of tracking type safety are logical. Feedback on the current design and this tension between the security and type safety view points is welcome.
The design of Safe Haskell covers the following aspects:
Safe Haskell, however, does not offer compilation safety. During compilation time it is possible for arbitrary processes to be launched, using for example the custom pre-processor flag. This can be manipulated to either compromise a users system at compilation time, or to modify the source code just before compilation to try to alter set Safe Haskell flags. This is discussed further in section Section 7.27.7, “Safe Compilation”.
Safe Haskell has been designed with two use cases in mind:
IO
monad. There are
several loop holes in the type system though, the most obvious offender
being the unsafePerformIO :: IO a -> a
function. The
safe language dialect of Safe Haskell disallows the use of such
functions. This can be useful for a variety of purposes as it makes
Haskell code easier to analyse and reason about. It also codifies an
existing culture in the Haskell community of trying to avoid using such
unsafe functions unless absolutely necessary. As such using the safe
language (through the -XSafe
flag) can be thought of as
a way of enforcing good style, similar to the function of
-Wall
.
Systems such as information flow control security, capability based
security systems and DSLs for working with encrypted data.. etc can be
built in the Haskell language simply as a library. However they require
guarantees about the properties of the Haskell language that aren't true
in the general case where uses of functions like unsafePerformIO
are allowed. Safe Haskell is designed to give users enough
guarantees about the safety properties of compiled code so that such
secure systems can be built.
As an example lets define an interface for a plugin system where the
plugin authors are untrusted, possibly malicious third-parties. We do
this by restricting the plugin interface to pure functions or to a
restricted IO
monad that we have defined that only
allows a safe subset of IO
actions to be executed. We
define the plugin interface here so that it requires the plugin module,
Danger
, to export a single computation,
Danger.runMe
, of type RIO ()
, where
RIO
is a new monad defined as follows:
-- Either of the following Safe Haskell pragmas would do {-# LANGUAGE Trustworthy #-} {-# LANGUAGE Safe #-} module RIO (RIO(), runRIO, rioReadFile, rioWriteFile) where -- Notice that symbol UnsafeRIO is not exported from this module! newtype RIO a = UnsafeRIO { runRIO :: IO a } instance Monad RIO where return = UnsafeRIO . return (UnsafeRIO m) >>= k = UnsafeRIO $ m >>= runRIO . k -- Returns True iff access is allowed to file name pathOK :: FilePath -> IO Bool pathOK file = {- Implement some policy based on file name -} rioReadFile :: FilePath -> RIO String rioReadFile file = UnsafeRIO $ do ok <- pathOK file if ok then readFile file else return "" rioWriteFile :: FilePath -> String -> RIO () rioWriteFile file contents = UnsafeRIO $ do ok <- pathOK file if ok then writeFile file contents else return ()We compile Danger using the new Safe Haskell
-XSafe
flag:
{-# LANGUAGE Safe #-} module Danger ( runMe ) where runMe :: RIO () runMe = ...
Before going into the Safe Haskell details, lets point out some of the reasons this design would fail without Safe Haskell:
RIO
type wrapper around IO
. The author of Danger can
subvert this though by simply writing arbitrary
IO
actions and using unsafePerformIO ::
IO a -> a
to execute them as pure functions.
UnsafeRIO
constructor.
Unfortunately Template Haskell can be used to subvert module
boundaries and so could be used to gain access to this constructor.
To stop these attacks Safe Haskell can be used. This is done by compiling
the RIO module with the -XTrustworthy
flag and compiling
the Danger module with the -XSafe
flag.
The use of the -XSafe
flag to compile the
Danger module restricts the features of Haskell that can be used
to a safe subset. This
includes disallowing unsafePerformIO
,
Template Haskell, pure FFI functions, RULES and restricting the
operation of Overlapping Instances. The -XSafe
flag also restricts the modules can be imported by Danger to
only those that are considered trusted. Trusted modules are
those compiled with -XSafe
, where GHC provides
a mechanical guarantee that the code is safe. Or those modules
compiled with -XTrustworthy
, where the module
author claims that the module is Safe.
This is why the RIO module is compiled with
-XTrustworthy
, to allow the Danger module to import it.
The -XTrustworthy
flag doesn't place any restrictions on
the module like -XSafe
does. Instead the module author
claims that while code may use unsafe features internally, it only
exposes an API that can used in a safe manner. The use of
-XTrustworthy
by itself marks the module as trusted.
There is an issue here as -XTrustworthy
may be used by
an arbitrary module and module author. To control the use of trustworthy
modules it is recommended to use the -fpackage-trust
flag. This flag adds an extra requirement to the trust check for
trustworthy modules, such that for trustworthy modules to be considered
trusted, and allowed to be used in -XSafe
compiled
code, the client C compiling the code must tell GHC that they trust the
package the trustworthy module resides in. This is essentially a way of
for C to say, while this package contains trustworthy modules that can be
used by untrusted modules compiled with -XSafe
, I trust
the author(s) of this package and trust the modules only expose a safe
API. The trust of a package can be changed at any time, so if a
vulnerability found in a package, C can declare that package untrusted so
that any future compilation against that package would fail. For a more
detailed overview of this mechanism see Section 7.27.4, “Trust and Safe Haskell Modes”.
In the example, Danger can import module RIO because RIO is marked trustworthy. Thus, Danger can make use of the rioReadFile and rioWriteFile functions to access permitted file names. The main application then imports both RIO and Danger. To run the plugin, it calls RIO.runRIO Danger.runMe within the IO monad. The application is safe in the knowledge that the only IO to ensue will be to files whose paths were approved by the pathOK test.
IO
monad
are still allowed and behave as usual. Any pure function though, as
according to its type, is guaranteed to indeed be pure. This property
allows a user of the safe language to trust the types. This means,
for example, that the unsafePerformIO :: IO a -> a
function is disallowed in the safe language.
These three properties guarantee that in the safe language you can trust the types, can trust that module export lists are respected and can trust that code that successfully compiles has the same meaning as it normally would.
Lets now look at the details of the safe language. In the safe language dialect (enabled by-XSafe
) we disable completely the
following features:
-XSafe
are
dropped. RULES defined in trustworthy modules that M imports are still
valid and will fire as usual.-XSafe
but restricted. While M
can define overlapping instance declarations, they can only overlap
other instance declaration defined in M. If in a module N that imports
M, at a call site that uses a type-class function there is a choice of
which instance to use (i.e. an overlap) and the most specific instances
is from M, then all the other choices must also be from M. If not, a
compilation error will occur. A simple way to think of this is a
same origin policy for overlapping instances
defined in Safe compiled modules.-XDeriveDataTypeable
extension). Hand crafted instances of the Typeable type class
are not allowed in Safe Haskell as this can easily be abused to
unsafely coerce between types.impdecl -> import [safe] [qualified] modid [as modid] [impspec]When used, the module being imported with the safe keyword must be a trusted module, otherwise a compilation error will occur. The safe import extension is enabled by either of the
-XSafe
,
-XTrustworthy
, or -XUnsafe
flags and corresponding PRAGMA's. When the -XSafe
flag
is used, the safe keyword is allowed but meaningless, every import
is required to be safe regardless.
-XSafe
can't import it.
The procedure to check if a module is trusted or not depends on if the
-fpackage-trust
flag is present. The check is very similar
in both cases with the presence of the -fpackage-trust
flag simply enabling an extra requirement for trustworthy modules to be
regarded as trusted.
A module M in a package P is trusted by a client C if and only if:
-XSafe
-XTrustworthy
The above definition of trust has an issue. Any module can be compiled
with -XTrustworthy and it will be trusted regardless of what it does. To
control this there is an additional definition of package trust (enabled
with the -fpackage-trust
flag). The point of package
trusts is to require that the client C explicitly say which packages are
allowed to contain trustworthy modules. That is, C establishes that it
trusts a package P and its author and so trust the modules in P that use
-XTrustworthy
. When package trust is enabled, any
modules that are considered trustworthy but reside in a package that
isn't trusted are not considered trusted. A more formal definition is
given in the next section.
When the -fpackage-trust
flag is enabled, whether or not
a module is trusted depends on a notion of trust for packages, which is
determined by the client C invoking GHC (i.e. you). A package P
is trusted when one of these hold:
In either case, C is the only authority on package trust. It is up to the client to decide which packages they trust.
When the -fpackage-trust
flag is used a module M from
package P is trusted by a client C if and only if:
-XSafe
-XTrustworthy
For the first trust definition the trust guarantee is provided by GHC
through the restrictions imposed by the safe language. For the second
definition of trust, the guarantee is provided initially by the
module author. The client C then establishes that they trust the
module author by indicating they trust the package the module resides
in. This trust chain is required as GHC provides no guarantee for
-XTrustworthy
compiled modules.
The reason there are two modes of checking trust is that the extra
requirement enabled by -fpackage-trust
causes the design
of Safe Haskell to be invasive. Packages using Safe Haskell when the flag
is enabled may or may not compile depending on the state of trusted
packages on a users machine. A maintainer of a package
foo
that uses Safe Haskell so that security conscious
Haskellers can use foo
now may have other users of
foo
who don't know or care about Safe Haskell
complaining about compilation problems they are having with
foo
because a package bar
that foo
requires, isn't trusted on their machine. In this sense, the
-fpackage-trust
flag can be thought of as a flag to
properly turn on Safe Haskell while without it, it's operating in a
covert fashion.
Having the -fpackage-trust
flag also nicely unifies the
semantics of how Safe Haskell works when used explicitly and how modules
are inferred as safe.
Package Wuggle: {-# LANGUAGE Safe #-} module Buggle where import Prelude f x = ...blah... Package P: {-# LANGUAGE Trustworthy #-} module M where import System.IO.Unsafe import safe Buggle
Suppose a client C decides to trust package P. Then does C trust module
M? To decide, GHC must check M's imports — M imports
System.IO.Unsafe. M was compiled with -XTrustworthy
, so
P's author takes responsibility for that import. C trusts P's author, so
C trusts M to only use its unsafe imports in a safe and consistent
manner with respect to the API M exposes. M also has a safe import of
Buggle, so for this import P's author takes no responsibility for the
safety, so GHC must check whether Buggle is trusted by C. Is it? Well,
it is compiled with -XSafe
, so the code in Buggle
itself is machine-checked to be OK, but again under the assumption that
all of Buggle's imports are trusted by C. Prelude comes from base, which
C trusts, and is compiled with -XTrustworthy
(While
Prelude is typically imported implicitly, it still obeys the same rules
outlined here). So Buggle is considered trusted.
Notice that C didn't need to trust package Wuggle; the machine checking
is enough. C only needs to trust packages that contain
-XTrustworthy
modules.
-XTrustworthy
language
extension for a module M should ensure that M's public API (the symbols
exposed by its export list) can't be used in an unsafe manner. This mean
that symbols exported should respect type safety and referential
transparency.
In the case where a module is compiled without one of
-XSafe
, -XTrustworthy
or
-XUnsafe
being used, GHC will try to figure out itself if
the module can be considered safe. This safety inference will never mark a
module as trustworthy, only as either unsafe or as safe. GHC uses a simple
method to determine this for a module M: If M would compile without error
under the -XSafe
flag, then M is marked as safe. If M
would fail to compile under the -XSafe
flag, then it is
marked as unsafe.
When should you use Safe Haskell inference and when should you use an
explicit -XSafe
flag? The later case should be used when
you have a hard requirement that the module be safe. That is, the
use cases outlined and the purpose
for which Safe Haskell is intended: compiling untrusted code. Safe
inference is meant to be used by ordinary Haskell programmers. Users who
probably don't care about Safe Haskell.
Say you are writing a Haskell library. Then you probably just want to use Safe inference. Assuming you avoid any unsafe features of the language then your modules will be marked safe. This is a benefit as now a user of your library who may want to use it as part of an API exposed to untrusted code can use the library without change. If there wasn't safety inference then either the writer of the library would have to explicitly use Safe Haskell, which is an unreasonable expectation of the whole Haskell community. Or the user of the library would have to wrap it in a shim that simply re-exported your API through a trustworthy module, an annoying practice.
-XTrustworthy
has no effect on the accepted range
of Haskell programs or their semantics, except that they allow the
safe import keyword.
-fpackage-trust
enabled) — Yes but only if the package the module
resides in is also trusted.-XSafe
. Also enable the Safe Import
extension so that a module can require a dependency to be trusted.
GHC includes a variety of flags that allow arbitrary processes to be run at compilation time. One such example is the custom pre-processor flag. Another is the ability of Template Haskell to execute Haskell code at compilation time, including IO actions. Safe Haskell does not address this danger (although, Template Haskell is a disallowed feature).
Due to this, it is suggested that when compiling untrusted source code that has had no manual inspection done, the following precautions be taken:
-XSafe
flag
being specified on the command line. This will ensure that modifications
to the source being compiled can't disable the use of the Safe Language
as the command line flag takes precedence over a source level
pragma.-fpackage-trust
flag is used with packages from untrusted sources being marked as
untrusted.
There is a more detailed discussion of the issues involved in compilation safety and some potential solutions on the GHC Wiki.