Summary: The foldr
function seems simple, but is actually very complex, with lots of layers. This post dives through the layers.
The foldr
function takes a list and replaces all :
(cons) and []
(nil) values with functions and a final value. It's available in the Haskell Prelude
and described on Wikipedia. As some examples:
sum = foldr (+) 0
map f = foldr (\x xs -> f x : xs) []
But the simple foldr
described on Wikipedia is many steps away from the one in the Haskell Prelude
. In this post we'll peel back the layers, learning why foldr
is a lot more complicated under the hood.
Layer 1: Wikipedia definition
The definition on Wikipedia is:
foldr :: (a -> b -> b) -> b -> [a] -> b
foldr f z [] = z
foldr f z (x:xs) = f x (foldr f z xs)
This recursive definition directly describes what foldr
does. Given a list [1,2,3]
we get f 1 (f 2 (f 3 z))
.
Layer 2: Static argument transformation
The problem with this definition is that it is recursive, and GHC doesn't like to inline recursive functions, which prevents a lot of optimisation. Taking a look at sum
, it's a real shame that operations like (+)
are passed as opaque higher-order functions, rather than specialised to the machine instruction ADD
. To solve that problem, GHC defines foldr
as:
foldr f z = go
where go [] = z
go (x:xs) = f x (go xs)
The arguments f
and z
are constant in all sucessive calls, so they are lifted out with a manually applied static argument transformation.
Now the function foldr
is no longer recursive (it merely has a where
that is recursive), so foldr
can be inlined, and now +
can meet up with go
and everything can be nicely optimised.
Layer 3: Inline later
We now have foldr
that can be inlined. However, inlining foldr
is not always a good idea. In particular, GHC has an optimisation called list fusion based on the idea that combinations of foldr
and build
can be merged, sometimes known as short-cut deforestation. The basic idea is that if we see foldr
applied to build
we can get rid of both (see this post for details). We remove foldr
using the GHC rewrite rule:
{-# RULES "my foldr/build" forall g k z. foldr k z (build g) = g k z #-}
The most interesting thing about this rule (for this post at least!) is that it matches foldr
by name. Once we've inlined foldr
we have thrown away the name, and the rule can't fire anymore. Since this rule gives significant speedups, we really want it to fire, so GHC adds an extra pragma to foldr
:
{-# INLINE [0] foldr #-}
This INLINE
pragma says don't try and inline foldr
until the final stage of the compiler, but in that final stage, be very keen to inline it.
Layer 4: More polymorphism
However, the foldr
function in the Prelude
is not the one from GHC.List
, but actually a more general one that works for anything Foldable
. Why limit yourself to folding over lists, when you can fold over other types like Set
. So now foldr
is generailsed from []
to t
with:
foldr :: Foldable t => (a -> b -> b) -> b -> t a -> b
Where foldr
on []
is GHC.List.foldr
.
Layer 5: A default implementation
But foldr
is actually in the type class Foldable
, not just defined on the outside. Users defining Foldable
can define only foldr
and have all the other methods defined for them. But they can equally define only foldMap
, and have an implicit version of foldr
defined as:
foldr :: (a -> b -> b) -> b -> t a -> b
foldr f z t = appEndo (foldMap (Endo . f) t) z
Where Endo
is defined as:
newtype Endo = Endo {appEndo :: a -> a}
instance Monoid (Endo a) where
mempty = Endo id
Endo a <> Endo b = Endo (a . b)
The function foldMap f
is equivalent to mconcat . map f
, so given a list [1,2,3]
the steps are:
- First apply
map (Endo . f)
to each element to get[Endo (f 1), Endo (f 2), Endo (f 3)]
. - Next apply
mconcat
to the list to getEndo (f 1) <> Endo (f 2) <> Endo (f 3)
. - Inline all the
<>
definitions to getEndo (f 1 . f 2 . f 3)
. - Apply the
appEndo
at the beginning andz
at the end for(f 1 . f 2 . f 3) z
. - Inline all the
.
to givef 1 (f 2 (f 3 z))
, which is what we had at layer 1.
Layer 6: Optimising the default implementation
The real default implementation of foldr
is:
foldr f z t = appEndo (foldMap (Endo #. f) t) z
Note that the .
after Endo
has become #.
. Let's first explain why it's correct, then why it might be beneficial. The definition of #.
is:
(#.) :: Coercible b c => (b -> c) -> (a -> b) -> (a -> c)
(#.) _ = coerce
Note that it has the same type as .
(plus a Coercible
constraint), but ignores it's first argument entirely. The coerce
function transforms a value of type a
into a value of type b
with zero runtime cost, provided they have the same underlying representation. Since Endo
is a newtype
, that means Endo (f 1)
and f 1
are implemented identically in the runtime, so coerce
switches representation "for free". Note that the first argument to #.
only serves to pin down the types, so if we'd passed an interesting function as the first argument it would have been ignored.
Of course, in normal circumstances, a newtype
is free anyway, with no runtime cost. However, in this case we don't have a newtype
, but a function application with a newtype
. You can see the gory details in GHC ticket 7542, but at one point this impeeded other optimisations.
I tried a simulated version of foldr
and found that if GHC can't tell that []
is the Foldable
the code looks pretty bad, but if it can, at -O1
and above, the two implementations are 100% equivalent (to the point that common subexpression elimination makes them actually the same). It's possible this final layer is a vestigial optimisation, or perhaps it's still important in some circumstances.