Saturday, May 10, 2008

Bad strictness

Haskell has one primitive construct for enforcing strictness, seq :: a -> b -> b. The idea is that the first argument is evaluated to weak-head normal form (WHNF), then the second argument is evaluated to WHNF and returned. WHNF is reduction until the outermost bit is available - either a function value, or an outer constructor.

You can model the behaviour by introducing an evaluate function, in a lower-level language, and showing how to perform reduction:

evaluate (seq a b) = do
a' <- evaluate a
b' <- evaluate b
return b'

The evaluate function must return an evaluated argument, and it wants to return b which is not already evaluated, so it must make a recursive call. The evaluate function for id, which simply returns its argument, is:

evaluate (id a) = do
a' <- evaluate a
return a'

Notice that even though id does "nothing", it still has to evaluate its argument. Of course, evaluate (id x) is the same as evaluate x, so id performs no additional work.

Haskell is lazy, so if an expression has already been evaluated, then the evaluate call will be incredibly cheap, and just return the previous result. So let's consider the result of calling seq with the same argument twice.

evaluate (seq a a) = do
a' <- evaluate a
return a'

This time the second evaluation of a is skipped, as a is already evaluated. We can easily see that evaluation of seq a a is exactly equivalent to a. This means that any code which writes a `seq` a is wrong. There is plenty of this code around, and one example (which prompted me to write this) is on slide 15 of this talk.

The other classic seq related mistake is id $! x. The ($!) operator is for strict application, and is defined:

f $! x = x `seq` f x

For the particular instance of id $! x, we obtain x `seq` id x. Of course, all that id x does is evaluate x, so again there is no change from just writing x.

There are valid uses of seq, but any time you see either of the following constructs, you know the programmer got it wrong:

x `seq` x
id $! x


Anonymous said...

Check out slide 15 for an example:

Unknown said...

Ah, thank you for your PSA. Now I have a url to send folks instead of needing to explain this myself :)

Unknown said...

The source of all this confusion is that `seq` and ($!) are in fact lazy about being strict. Whenever they're used, they create a thunk just like anything else, it's just that when someone pulls on that thunk, the `seq` will pull on something else first before giving the answer. For folks who find reasoning about laziness difficult, this lazy strictness is doubly hard to grasp.

Chris Done said...

Thanks! I'm implementing a lazy language interpreter in Haskell and got stuck on how to force order when using impure functions (e.g. print). I suspected seq might do the trick so I googled it and found your article. I've made a simple built-in function

seq' [v1,v2] = return v2

(All builtins receive their arguments fully evaluated to normal form at the moment.)

letBuiltin "seq" 2 seq' -- 2 is arity

So I can use thusly:

Haskell> prettyEval "seq (print 1) (print 2)"

Or maybe define a `then':

Haskell> prettyEval "(╬╗then·then (print 1) (print 2)) seq"

Of course, a proper simple type system with Monads or uniqueness types would be better. I'm already familiar with monads, so I might try implementing uniqueness types because they seem quite interesting.

Neil Mitchell said...

Chris: Cool, glad it was useful. Good luck with your interpreter!

Jimvibe said...

What does your "evaluate a" function do exactly? Does it reduce to WHNF? It's a bit hard to see that seq a a does nothing because we still call evaluate from a and get a reduction a' and return that. What am I missing?

Neil Mitchell said...

Jimvibe: evaluate is from Control.Exception, and has type a -> IO a. It guarantees that after the IO has been executed the value will be reduced to WHNF. The point to note about let a' = seq a a is that when a' is evaluated it will evaluate a. But that would be equally true if you wrote let a' = a, so the seq hasn't added more strictness.

Anonymous said...

Please note that seq does specifically not guarantee the evaluation order described here ("..., then the second argument") -- pseq from the "parallel" package does this. The documentation of the Prelude coming with GHC 8.6.5 states this.
With seq, you can be sure that both sides will be evaluated before it returns - but the right side could be evaluated first.
With pseq, you can be sure that the left side is evaluated first, regardless of the expressions used.
I guess $! is defined using seq because in that special case the compiler would always evaluate x first - but the general case isn't guaranteed.