Haskell’s State
monad has a shameful secret. It’s not a monad
[State monads don’t respect the monad laws in Haskell, Simon Marlow]
.
Look:
type State s a = s → (a, s)
return x = λs → (x, s)
f >>= g = λs → case f s of (x, s') → g x s'
Remember the monad laws: x >>= return
must be equal to x
.
In particular, ⊥ >>= return
must be ⊥
.
State
breaks this law:
⊥ >>= return
= {- definition of >>= -}
λs → case ⊥ s of (x, s') → return x s'
= {- divergence -}
λs → ⊥
In Haskell, ⊥ `seq` x
is ⊥
, but (λs → ⊥) `seq` x
is x
. Hence λs → ⊥
is not the same as ⊥
, and State
is
not a monad
[The use of case
in the definition of >>=
makes this the strict state monad. The lazy state monad gives us the
even worse λs → (⊥, ⊥)
.]
.
An obvious question: is there an implementation of the state monad
that does respect the monad laws? I will prove that no, there isn’t,
there is no state monad in Haskell, and seq
is the culprit.
Eta equality
What is it about seq
that makes it break eta
equality — why can we not equate ⊥
with λx → ⊥
?
After all, seq
's defining equations seem perfectly innocent:
x `seq` y = y, if x ≠ ⊥
⊥ `seq` y = ⊥
How can adding more equations make fewer terms equal? The trouble
comes when we pass a function as the first argument to seq
.
When is a function ⊥? A natural answer is: if for all inputs it
returns ⊥. “But,” says the compiler writer, “I can’t check if a
function always returns ⊥. How can I implement seq
?”
“Never mind,” we say, “if x
is a function then have x `seq` y
return y
.”
“But,” the compiler writer objects, “I would like to erase types at
runtime, so until I’ve evaluated x
I don’t know if it’s a function.”
“Fine,” we cry, “then return y
as soon as you know x
is a
function.”
We really have no choice. If x
is a function, x `seq` y
must
reduce x
until it becomes a λ-abstraction — until then, it
doesn’t know whether x
is a function or not. Then there is nothing
more it can do to evaluate x
, and it must return y
. If our poor
compiler writer is to implement seq
, we must define:
(λx → _) `seq` y = y.
But now seq
must distinguish ⊥
from λx → ⊥
, so they cannot be
equal. But they are eta-equivalent. We have lost eta equality!
As it turns out, the state monad laws only make sense if we have eta equality.
We will find a consequence of the state monad laws that can only hold
up to eta equivalence and then use seq
to derive a lot of nonsense
from it.
Unintended consequences of the state monad laws
We will now prove that there is no state monad in
Haskell. By a state monad, we mean a monad satisfying Gibbons and
Hinze’s
[Just
do
It: Simple Monadic Equational Reasoning, Jeremy Gibbons and Ralf Hinze]
four laws:
put s >> put s' = put s'
put s >> get = put s >> return s
get >>= put = return ()
get >>= λs → get >>= k s = get >>= λs → k s s
The reader may object that there is a monad satisfying these four laws:
type State s a = ⊥
return = ⊥
(>>=) = ⊥
get = ⊥
put = ⊥
This monad lacks a certain something, though. We will disqualify it by
assuming that return () ≠ ⊥
.
One obstacle for our proof is that we may not eta-expand or eta-reduce
anything without justification. For example, the state monad laws say
that get >>= put = return ()
, but how do we know that get >>= λx
→ put x
is also equal to return ()
? Yikes! Thankfully, we can
freely eta-expand or eta-reduce the argument to >>=
:
m >>= f
= {- return is a unit for >>= -}
(m >>= f) >>= return
= {- associativity of >>= -}
m >>= (λx → f x >>= return)
= {- return is a unit for >>= -}
m >>= λx → f x
With that proved, we can apply the monad laws in the carefree way we
are used to. We start by observing that the fourth law gives us
get >> get >>= k = get >> k
(just expand the definition of
>>
). Now we can show that get >> return () = return ()
, for:
get >> return ()
= {- get >>= put = return () -}
get >> get >>= put
= {- get >> get >>= k = get >>= k -}
get >>= put
= {- get >>= put = return () -}
return ()
From the above equation, we get for any x
that
get >> return () >> x = return () >> x
or, simplifying,
get >> x = x
. Finally, letting x
be ⊥, we get get >> ⊥ =
⊥
, or, expanding the definition of >>
:
get >>= λx → ⊥ = ⊥.
This equation is the problem. Notice that the following similar-looking expression oughtn’t be ⊥:
get >>= λx → if x == "Hello" then ⊥ else return () ≠ ⊥
After all, if we run it in the state "Hello"
, it returns ()
,
while ⊥ diverges. In general, we have:
get >>= f = ⊥, if f x = ⊥ for all x
get >>= f ≠ ⊥, if f x = return () for some x
The first statement is just our problem equation restated; we will
prove the second one in the next section. From these two properties we
will derive the promised lot of nonsense. The idea is that applying
seq
to get >>= f
tells us whether f
is defined anywhere — something we cannot compute in Haskell.
We define a predicate over type a
to mean a function of type a
→ ()
. The idea is that a predicate returns ()
if its argument
satisfies the predicate, and otherwise diverges.
An existential quantifier for the type a
is a higher-order
predicate — a function of type (a → ()) → ()
— which holds of
a predicate p
if there exists a value satisfying p
.
In other words, if exists
is an existential quantifier, then
exists p = ()
if there exists an x
for which p x = ()
,
otherwise exists p = ⊥
.
Notice that in pure Haskell we cannot define an existential quantifier
for the vast majority of types. For example, we might try to define
one on booleans as exists p = p False || p True
, but the ||
here
must be “parallel or”
[LCF considered as a programming language, Gordon Plotkin]
, which we cannot write in Haskell.
We can only define an existential quantifier for a domain that has a top element,
in which case exists p = p ⊤
.
But, if we have a genuine state monad, we can define an existential quantifier for any type!
exists :: (a → ()) → ()
exists p = (get >>= f) `seq` ()
where
f x = case p x of () → return ()
We should check that this is indeed an existential quantifier.
-
If
p
is ⊥ everywhere, thenf = λx → ⊥
. The first of our two laws above then tells us thatget >>= f = ⊥
and thereforeexists p
is ⊥. -
On the other hand, if
p x
is()
for a certainx
, thenf x = return ()
, and from above we know thatget >>= f ≠ ⊥
, soexists p = ()
.
It checks out — exists p = ()
if and only if there is an x
for which p x = ()
.
Since exists
exists only for types that have a top element,
a state monad exists only when the state type has a top element.
Not only is there no polymorphic state monad, we can not even make one for a fixed state type like Bool
!
The icky bits
We still need to prove that if there is an x
for which f x =
return ()
, then get >>= f ≠ ⊥
. This seems obvious — as we
remarked above, evaluating get >>= f
in the state x
does not
diverge, while ⊥ always diverges — but is a bit tricky to prove
because the only inequality we have is return () ≠ ⊥
.
To apply our inequality, we had better find a context C[m]
such
that C[get >>= f] = return ()
and C[⊥] = ⊥
.
To stop C[get >>= f]
from diverging we will need to put
our
x
as the state before we execute m
, but since the whole thing
should evaluate to return ()
we will need to save the state
beforehand and restore it afterwards. Finally, to make sure that
C[⊥] = ⊥
, we should not restore the state if m
diverges.
Thus, a suitable context is
C[m] = get >>= λs → put x >> m >>= λ() → put s.
It is easy to see that C[get >>= f] = return ()
:
get >>= λs → put x >> get >>= f >>= λ() → put s
= {- put/get law -}
get >>= λs → put x >> f x >>= λ() → put s
= {- definition of f -}
get >>= λs → put x >> return () >>= λ() → put s
= {- monad laws -}
get >>= λs → put x >> put s
= {- put/put law -}
get >>= put
= {- get/put law -}
return ()
Proving that C[⊥] = ⊥
is trickier and involves some awkward
manipulation of ⊥
:
get >>= λs → put x >> ⊥ >>= λ() → put s
⊑ {- ⊥ ⊑ return ⊥ -}
get >>= λs → put x >> return ⊥ >>= λ() → put s
= {- monad laws; () doesn’t match ⊥ -}
get >>= λs → put x >> ⊥
⊑ {- ⊥ ⊑ put s >> ⊥ -}
get >>= λs → put x >> put s >> ⊥
= {- put/put law -}
get >>= λs → put s >> ⊥
= {- monad laws -}
(get >>= put) >> ⊥
= {- get/put law -}
return () >> ⊥
= {- monad laws -}
⊥
To summarise: assuming f x = return ()
, we have C[get >>= f] =
return ()
, C[⊥] = ⊥
, and return () ≠ ⊥
; therefore,
get >>= f ≠ ⊥
, the last piece of our proof.
(The end.)