Haskell has no state monad

Haskell’s State monad has a shameful secret. It’s not a monad
[State monads don’t respect the monad laws in Haskell, Simon Marlow]
. Look:

type State s a = s → (a, s)
return x = λs → (x, s)
f >>= g  = λs → case f s of (x, s') → g x s'

Remember the monad laws: x >>= return must be equal to x. In particular, ⊥ >>= return must be ⊥. State breaks this law:

  ⊥ >>= return
= {- definition of >>= -}
  λs → case ⊥ s of (x, s') → return x s'
= {- divergence -}
  λs → ⊥

In Haskell, ⊥ `seq` x is ⊥, but (λs → ⊥) `seq` x is x. Hence λs → ⊥ is not the same as ⊥, and State is not a monad
[The use of case in the definition of >>= makes this the strict state monad. The lazy state monad gives us the even worse λs → (⊥, ⊥).]
.

An obvious question: is there an implementation of the state monad that does respect the monad laws? I will prove that no, there isn’t, there is no state monad in Haskell, and seq is the culprit.

Eta equality

What is it about seq that makes it break eta equality — why can we not equate ⊥ with λx → ⊥? After all, seq's defining equations seem perfectly innocent:

x `seq` y = y, if x ≠ ⊥
⊥ `seq` y = ⊥

How can adding more equations make fewer terms equal? The trouble comes when we pass a function as the first argument to seq. When is a function ⊥? A natural answer is: if for all inputs it returns ⊥. “But,” says the compiler writer, “I can’t check if a function always returns ⊥. How can I implement seq?”

“Never mind,” we say, “if x is a function then have x `seq` y return y.”

“But,” the compiler writer objects, “I would like to erase types at runtime, so until I’ve evaluated x I don’t know if it’s a function.”

“Fine,” we cry, “then return y as soon as you know x is a function.”

We really have no choice. If x is a function, x `seq` y must reduce x until it becomes a λ-abstraction — until then, it doesn’t know whether x is a function or not. Then there is nothing more it can do to evaluate x, and it must return y. If our poor compiler writer is to implement seq, we must define:

(λx → _) `seq` y = y.

But now seq must distinguish ⊥ from λx → ⊥, so they cannot be equal. But they are eta-equivalent. We have lost eta equality!

As it turns out, the state monad laws only make sense if we have eta equality. We will find a consequence of the state monad laws that can only hold up to eta equivalence and then use seq to derive a lot of nonsense from it.

Unintended consequences of the state monad laws

We will now prove that there is no state monad in Haskell. By a state monad, we mean a monad satisfying Gibbons and Hinze’s
[Just do It: Simple Monadic Equational Reasoning, Jeremy Gibbons and Ralf Hinze]
four laws:

put s >> put s'            = put s'
put s >> get               = put s >> return s
get  >>= put               = return ()
get  >>= λs → get >>= k s = get >>= λs → k s s

The reader may object that there is a monad satisfying these four laws:

type State s a = ⊥
return = ⊥
(>>=)  = ⊥
get    = ⊥
put    = ⊥

This monad lacks a certain something, though. We will disqualify it by assuming that return () ≠ ⊥.

One obstacle for our proof is that we may not eta-expand or eta-reduce anything without justification. For example, the state monad laws say that get >>= put = return (), but how do we know that get >>= λx → put x is also equal to return ()? Yikes! Thankfully, we can freely eta-expand or eta-reduce the argument to >>=:

  m >>= f
= {- return is a unit for >>= -}
  (m >>= f) >>= return
= {- associativity of >>= -}
  m >>= (λx → f x >>= return)
= {- return is a unit for >>= -}
  m >>= λx → f x

With that proved, we can apply the monad laws in the carefree way we are used to. We start by observing that the fourth law gives us get >> get >>= k = get >> k (just expand the definition of >>). Now we can show that get >> return () = return (), for:

  get >> return ()
= {- get >>= put = return () -}
  get >> get >>= put
= {- get >> get >>= k = get >>= k -}
  get >>= put
= {- get >>= put = return () -}
  return ()

From the above equation, we get for any x that get >> return () >> x = return () >> x or, simplifying, get >> x = x. Finally, letting x be ⊥, we get get >> ⊥ = ⊥, or, expanding the definition of >>:

get >>= λx → ⊥ = ⊥.

This equation is the problem. Notice that the following similar-looking expression oughtn’t be ⊥:

get >>= λx → if x == "Hello" then ⊥ else return () ≠ ⊥

After all, if we run it in the state "Hello", it returns (), while ⊥ diverges. In general, we have:

get >>= f = ⊥, if f x = ⊥ for all x
get >>= f ≠ ⊥, if f x = return () for some x

The first statement is just our problem equation restated; we will prove the second one in the next section. From these two properties we will derive the promised lot of nonsense. The idea is that applying seq to get >>= f tells us whether f is defined anywhere — something we cannot compute in Haskell.

We define a predicate over type a to mean a function of type a → (). The idea is that a predicate returns () if its argument satisfies the predicate, and otherwise diverges. An existential quantifier for the type a is a higher-order predicate — a function of type (a → ()) → () — which holds of a predicate p if there exists a value satisfying p. In other words, if exists is an existential quantifier, then exists p = () if there exists an x for which p x = (), otherwise exists p = ⊥.

Notice that in pure Haskell we cannot define an existential quantifier for the vast majority of types. For example, we might try to define one on booleans as exists p = p False || p True, but the || here must be “parallel or”
[LCF considered as a programming language, Gordon Plotkin]
, which we cannot write in Haskell. We can only define an existential quantifier for a domain that has a top element, in which case exists p = p ⊤.

But, if we have a genuine state monad, we can define an existential quantifier for any type!

exists :: (a → ()) → ()
exists p = (get >>= f) `seq` ()
  where
    f x = case p x of () → return ()

We should check that this is indeed an existential quantifier.

If p is ⊥ everywhere, then f = λx → ⊥. The first of our two laws above then tells us that get >>= f = ⊥ and therefore exists p is ⊥.
On the other hand, if p x is () for a certain x, then f x = return (), and from above we know that get >>= f ≠ ⊥, so exists p = ().

It checks out — exists p = () if and only if there is an x for which p x = (). Since exists exists only for types that have a top element, a state monad exists only when the state type has a top element. Not only is there no polymorphic state monad, we can not even make one for a fixed state type like Bool!

The icky bits

We still need to prove that if there is an x for which f x = return (), then get >>= f ≠ ⊥. This seems obvious — as we remarked above, evaluating get >>= f in the state x does not diverge, while ⊥ always diverges — but is a bit tricky to prove because the only inequality we have is return () ≠ ⊥.

To apply our inequality, we had better find a context C[m] such that C[get >>= f] = return () and C[⊥] = ⊥. To stop C[get >>= f] from diverging we will need to put our x as the state before we execute m, but since the whole thing should evaluate to return () we will need to save the state beforehand and restore it afterwards. Finally, to make sure that C[⊥] = ⊥, we should not restore the state if m diverges. Thus, a suitable context is

C[m] = get >>= λs → put x >> m >>= λ() → put s.

It is easy to see that C[get >>= f] = return ():

  get >>= λs → put x >> get >>= f >>= λ() → put s
= {- put/get law -}
  get >>= λs → put x >> f x >>= λ() → put s
= {- definition of f -}
  get >>= λs → put x >> return () >>= λ() → put s
= {- monad laws -}
  get >>= λs → put x >> put s
= {- put/put law -}
  get >>= put
= {- get/put law -}
  return ()

Proving that C[⊥] = ⊥ is trickier and involves some awkward manipulation of ⊥:

  get >>= λs → put x >> ⊥ >>= λ() → put s
⊑ {- ⊥ ⊑ return ⊥ -}
  get >>= λs → put x >> return ⊥ >>= λ() → put s
= {- monad laws; () doesn’t match ⊥ -}
  get >>= λs → put x >> ⊥
⊑ {- ⊥ ⊑ put s >> ⊥ -}
  get >>= λs → put x >> put s >> ⊥
= {- put/put law -}
  get >>= λs → put s >> ⊥
= {- monad laws -}
  (get >>= put) >> ⊥
= {- get/put law -}
  return () >> ⊥
= {- monad laws -}
  ⊥

To summarise: assuming f x = return (), we have C[get >>= f] = return (), C[⊥] = ⊥, and return () ≠ ⊥; therefore, get >>= f ≠ ⊥, the last piece of our proof.

(The end.)