Saturday, May 10, 2008

Bad strictness

Haskell has one primitive construct for enforcing strictness, seq :: a -> b -> b. The idea is that the first argument is evaluated to weak-head normal form (WHNF), then the second argument is evaluated to WHNF and returned. WHNF is reduction until the outermost bit is available - either a function value, or an outer constructor.

You can model the behaviour by introducing an evaluate function, in a lower-level language, and showing how to perform reduction:


evaluate (seq a b) = do
a' <- evaluate a
b' <- evaluate b
return b'


The evaluate function must return an evaluated argument, and it wants to return b which is not already evaluated, so it must make a recursive call. The evaluate function for id, which simply returns its argument, is:


evaluate (id a) = do
a' <- evaluate a
return a'


Notice that even though id does "nothing", it still has to evaluate its argument. Of course, evaluate (id x) is the same as evaluate x, so id performs no additional work.

Haskell is lazy, so if an expression has already been evaluated, then the evaluate call will be incredibly cheap, and just return the previous result. So let's consider the result of calling seq with the same argument twice.


evaluate (seq a a) = do
a' <- evaluate a
return a'


This time the second evaluation of a is skipped, as a is already evaluated. We can easily see that evaluation of seq a a is exactly equivalent to a. This means that any code which writes a `seq` a is wrong. There is plenty of this code around, and one example (which prompted me to write this) is on slide 15 of this talk.

The other classic seq related mistake is id $! x. The ($!) operator is for strict application, and is defined:


f $! x = x `seq` f x


For the particular instance of id $! x, we obtain x `seq` id x. Of course, all that id x does is evaluate x, so again there is no change from just writing x.

There are valid uses of seq, but any time you see either of the following constructs, you know the programmer got it wrong:

x `seq` x
id $! x

5 comments:

Anonymous said...

Check out slide 15 for an example:

http://www.red-bean.com/~bos/slides/BayFP-2008-05-08.pdf
--
aycan

wren said...

Ah, thank you for your PSA. Now I have a url to send folks instead of needing to explain this myself :)

wren said...

The source of all this confusion is that `seq` and ($!) are in fact lazy about being strict. Whenever they're used, they create a thunk just like anything else, it's just that when someone pulls on that thunk, the `seq` will pull on something else first before giving the answer. For folks who find reasoning about laziness difficult, this lazy strictness is doubly hard to grasp.

Chris Done said...

Thanks! I'm implementing a lazy language interpreter in Haskell and got stuck on how to force order when using impure functions (e.g. print). I suspected seq might do the trick so I googled it and found your article. I've made a simple built-in function

seq' [v1,v2] = return v2

(All builtins receive their arguments fully evaluated to normal form at the moment.)
..

letBuiltin "seq" 2 seq' -- 2 is arity

So I can use thusly:

Haskell> prettyEval "seq (print 1) (print 2)"
1
2
()

Or maybe define a `then':

Haskell> prettyEval "(╬╗then·then (print 1) (print 2)) seq"
1
2
()

Of course, a proper simple type system with Monads or uniqueness types would be better. I'm already familiar with monads, so I might try implementing uniqueness types because they seem quite interesting.

Neil Mitchell said...

Chris: Cool, glad it was useful. Good luck with your interpreter!