Sunday, March 20, 2011

Experience Report: Functional Programming through Deep Time

My wife has just completed a draft of the experience report she's intending to submit to ICFP 2011. It's called Functional Programming through Deep Time:

This experience report describes how Haskell was used to model the beginnings of complex life on Earth. My work combines ecological modeling in Haskell with statistical analysis in R, to answer some long-standing paleontological questions. For my work, I found that neither Haskell nor R was sufficient - statistical analysis in Haskell is overly burdensome, while R lacks the structure to express complex algorithms in a maintainable manner. The reaction from my colleagues has ranged from indifferent to excited - but I have yet to tempt any of them over to the pure side!

I initially persuaded my wife to switch to Haskell, but since then, I have had little involvement with her code. If you have any feedback for her (ideally before Wednesday!) please leave it in the comments.


Unknown said...

I look forward to hearing any comments.

Many thanks, Emily

Heinrich Apfelmus said...

I was somewhat confused by the statement "statistical analysis in Haskell is overly burdensome" because strictly speaking, that is a property of the available libraries, not of the language proper. Indeed, Emily is thinking about writing her own libraries at the end of the paper. I understand where the notion comes from, but perhaps a small change in wording might make it clearer.

In that light, Emily offers an interesting design suggestion for a plotting library: offer one function, called plot, which tries to guess everything and does a reasonable plot of the data. Additional options or functions can be used for customization, but the rapid experimentation made possible by a dead-simple plot function is extremely valuable.
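The design described above can be sketched in Haskell roughly as follows. This is only an illustration under stated assumptions: apart from the name plot, every name here (PlotOptions, guessOptions, plotWith, describe) is hypothetical and not taken from any existing Haskell plotting library, the 50-point threshold is arbitrary, and nothing is actually drawn. The functions return a textual description of the plot that would be produced.

```haskell
-- Hypothetical sketch of a "guess everything" plotting API.
-- No real plotting library is used; all names are illustrative.

data Style = Points | Lines
  deriving (Show, Eq)

data PlotOptions = PlotOptions
  { title  :: String
  , style  :: Style
  , xRange :: (Double, Double)
  , yRange :: (Double, Double)
  } deriving (Show, Eq)

-- Guess reasonable defaults from the data alone.
guessOptions :: [(Double, Double)] -> PlotOptions
guessOptions pts = PlotOptions
  { title  = ""
  , style  = if length pts > 50 then Lines else Points  -- arbitrary threshold
  , xRange = range (map fst pts)
  , yRange = range (map snd pts)
  }
  where
    range [] = (0, 1)                      -- fallback for empty data
    range vs = (minimum vs, maximum vs)

-- Stand-in for real rendering: describe what would be drawn.
describe :: PlotOptions -> [(Double, Double)] -> String
describe opts pts =
  show (style opts) ++ " plot " ++ show (title opts)
    ++ ", " ++ show (length pts) ++ " points"
    ++ ", x in " ++ show (xRange opts)
    ++ ", y in " ++ show (yRange opts)

-- Customization: adjust the guessed options before rendering.
plotWith :: (PlotOptions -> PlotOptions) -> [(Double, Double)] -> String
plotWith tweak pts = describe (tweak (guessOptions pts)) pts

-- The dead-simple entry point: no options, everything guessed.
plot :: [(Double, Double)] -> String
plot = plotWith id
```

Loaded into GHCi, plot [(0,1),(2,5),(4,3)] uses the guessed defaults, while plotWith (\o -> o { title = "Diversity" }) overrides only the title and leaves the other guesses in place.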

Ketil said...

What he (Heinrich) said.

Although I only have a little experience with R, the functional aspects of it appeal to me, but I found myself wishing it was a bit more like Haskell.

I think that with a good library, Haskell would make a great statistics environment, especially for more programming-oriented folks.

Colin said...

Emily, I found your points of comparison between Haskell and R spot on.

Last year, I had the task of writing an Eiffel (the language we use at work - I use Haskell at home, when I am doing any programming, though that has been over a year now) class to do logistic regression. Not knowing anything about the technique, I decided to convert an R package to Eiffel.

But I found all the automatic coercions made R impossible to read. In the end, I had to step through every stage in the debugger, to inspect the arguments that were actually being passed, to see what shape they were. Only that way was I able to understand the program!

P.S. At one point in your program you mention Strict Types. I suspect this is a typo for Static Types.

Vinod G said...

I use R a lot, mostly for graphing, data analysis and performance visualization of the compilers I work on.

I am also very interested in functional languages and programming languages in general.

Recently I was pleasantly shocked to discover that R supports higher order functions (or at least closures)!

e.g. one can write the following fragment in R:

> closure <- function(a) { return (function(c) {return (a+c)})}
> tmp <- closure(21)
> tmp(45)
[1] 66

Unknown said...

Heinrich: Thanks for the comment, I will clarify where I am talking about the libraries vs the language. A plot function would be great :)

Ketil: A good statistics library would be a start, but R has thousands of different libraries, it would take a while to catch up!

Colin: Typo fixed, thanks :) Nice to hear someone else has similar problems.

Vinod: I used a few higher order functions in R, but without types they are much harder to get right - I am just spoilt by the ease of Haskell's higher order functions.

Thanks for all the comments, they are very much appreciated. Emily
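For comparison with the R fragment in Vinod's comment, here is a minimal sketch of the same closure in Haskell. The names addTo and tmp are illustrative, and fixing the type to Int is an assumption made to keep the example small.

```haskell
-- Haskell counterpart of the R closure; names are illustrative.
-- The signature states that addTo takes an Int and returns a function.
addTo :: Int -> (Int -> Int)
addTo a = \c -> a + c

-- Partially applied, like tmp <- closure(21) in the R session.
tmp :: Int -> Int
tmp = addTo 21
```

In GHCi, tmp 45 evaluates to 66, matching the R session. The difference shows up on misuse: addTo "21" is rejected by the type checker before anything runs, whereas in R closure("21") returns a function without complaint and the error only appears later, when that function is called with a number.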

sclv said...

This was an enjoyable read. One small point -- if possible, it would be very nice if the code was made available on some website (with appropriate caveats about being in development, etc.) or if the paper stated plans in that direction. Whether or not this is done, it would be nice to name the program just so it doesn't have to be referred to throughout the paper as "my program".

Also, in 2nd para, 2nd col, 1st page, you write "I perform Baysian.." which I think should be "To perform"