Friday, June 29, 2007

Functional composition

Haskell has great support for abstraction: small functions can be written which do one general thing. Haskell also has great support for composition: putting small functions together is easy.

On IRC today someone asked for a function that computes the mode of a list. This is the same as finding the most common element in a list. One simple solution is a nice display of composition of smaller abstraction functions:


mostCommon :: Ord a => [a] -> a
mostCommon = head . maximumBy (comparing length) . group . sort


And a single example:


; mostCommon "Haskell"
'l'


To read the function mostCommon, it is best to read from right to help:


  1. sort: put the elements in order, i.e. "Haeklls".

  2. group: group consecutive elemnts, i.e. ["H","a","e","k","ll","s"]. It would be entirely reasonable to have a groupSet function that did grouping of non-consecutive elements, and I often use such a function.

  3. maximumBy (comparing length): find the maximum value, looking only at the number of elements in each list, i.e. "ll".

  4. head: take a representative value from the list, i.e. 'l'.



If asked to write the same function in Javascript (say), I would have started with a Hash table that looped counting the number of occurrences of each letter, then done a loop through this to find the maximum. The code would have been longer, and more prone to bugs.

It is easy to see that by combining functions that have already been written, tested and documented, it is possible to come up with very succinct operations.

9 comments:

Anonymous said...

Um ... you should add that the fact that many other languages don't have this feature (or, at least, lambda forms) renders them un-usable, when one is well-Haskellised.

These nifty features are what should carry a warning label: `ADDICTIVE! >8<'

Anonymous said...

Nice :) I like these kind of functions too! To be complete, here's a definition of comparing:

comparing f l r = compare (f l) (f r)

Neil Mitchell said...

lecamarade: that's for others to learn once they've been seduced by Haskell :)

chris: Thanks for including that. comparing is in GHC 6.6.1 in Data.Ord, but not in 6.4.

Anonymous said...

I made a small test in Javascript. Once you have the functions defined, you can also call it rather nicely (although it's not nearly as cool as the haskell version).

You can call it like this:

alert("haskell".toArray().sort().group().maximumByLength()[0]);

mostCommon.html

Anonymous said...

I'm not sure that I understand why this feature should be considered so "addictive". In fact, I'm not sure why it is even in the language at all. The code above can be trivially expressed without the . operator simply by specifying a name for an argument. In fact, if maximumBy took its arguments in the opposite order, the above composition would not even have been possible. Perhaps I'm missing some critical functionality that this provides, but it seems to be one more unneeded way to express the same thing without imparting much additional clarity.

Anonymous said...

@anonymous :
> In fact, if maximumBy took its
> arguments in the opposite order,
> the above composition would not
> even have been possible.

then you'd do :

flip maximumBy

Justin said...

@anonymous :
> I'm not sure that I understand why
> this feature should be considered so
> "addictive".

It's not just the composition, but the currying that makes this so nice. A brand new function that does what you want, built out of smaller pieces of polymorphic, general purpose functions.

I think the "maximumBy (comparing length)" bit is the sweetest of the bunch. maximumBy lets you hand in your own "value" function, appropriate for the list you wish to find a maximum in. "comparing" is itself a general purpose function that takes a list and compares all elements together, by whatever criteria you choose to give.


Finally, notice the function will also work over any list type that can be "ordered".

Sure, without composition the function definition has more paranetheses, but the other features at play - currying, polymorphism, and even "point-free" style - conspire to give a very readable and concise definition of the function "mostCommon".

Anonymous said...

groupSet wouldn't really be in the lazy-evaluation spirit, I think. It would have to digest the whole list before emitting any output. "sort" has no choice but groupBy works even on infinite lists, which is cool.

Anonymous said...

I'd have used
last . transpose . group . sort

instead. It handles the empty list and multiple mode cases better (by providing a list of results instead of a single one).