Neil Mitchell's Blog (Haskell etc): r

Summary: This post describes how to write a function in Haskell, and then call it from R.

R is a very popular language for statistics, particular with biologists (and computational paleobiologists). For writing high performance code, the R developers recommend the use of C or Fortran - not languages that are particularly easy for beginners. However, you can instead write a Haskell function that can be called directly from R. The basic idea is to create a C compatible library using Haskell (as described in the GHC users manual) and then call that library from R (as described in this document). As a simple example, let's write a function that adds up the square roots of a list of numbers.

Create an R-compatible Haskell library

In normal Haskell, we would define the function to add up the square roots of a list as:

sumRoots :: [Double] -> Double
sumRoots xs = sum (map sqrt xs)

However, to make a function that is compatible with R, we have to follow two rules:

Every argument must be a Ptr to a C compatible type, typically Int, Double or CString. (To be pedantic, we should probably use CInt or CDouble, but using GHC on Windows these types are equivalent - keeping things simpler.)
The result must be IO ()

Obeying these restrictions, we need to use the type:

sumRootsR :: Ptr Int -> Ptr Double -> Ptr Double -> IO ()
sumRootsR n xs result = ...

Instead of passing in the list xs, we now pass in:

n, the length of the list xs
xs, the elements of the list
result, a space to put the result

We can implement sumRootsR by using the functions available in the Foreign module:

sumRootsR :: Ptr Int -> Ptr Double -> Ptr Double -> IO ()
sumRootsR n xs result = do
n <- peek n
xs <- peekArray n xs
poke result $ sumRoots xs

This function first gets the value for n, then for each element in 0..n-1 gets the element out of the pointer array xs and puts it in a nice list. We then call the original sumRoots, and store the value in the space provided by result. As a general rule, you should put all the logic in one function (sumRoots), and the wrapping in another (sumRootsR). We can then export this function with the definition:

foreign export ccall sumRootsR :: Ptr Int -> Ptr Double -> Ptr Double -> IO ()

Putting everything together, we end up with the Haskell file:

-- SumRoots.hs
{-# LANGUAGE ForeignFunctionInterface #-}
module SumRoots where

import Foreign

foreign export ccall sumRootsR :: Ptr Int -> Ptr Double -> Ptr Double -> IO ()

sumRootsR :: Ptr Int -> Ptr Double -> Ptr Double -> IO ()
sumRootsR n xs result = do
n <- peek n
xs <- peekArray n xs
poke result $ sumRoots xs

sumRoots :: [Double] -> Double
sumRoots xs = sum (map sqrt xs)

We also need a C stub file. The one described in the GHC users guide works well:

// StartEnd.c
#include <Rts.h>

void HsStart()
{
int argc = 1;
char* argv[] = {"ghcDll", NULL}; // argv must end with NULL

// Initialize Haskell runtime
char** args = argv;
hs_init(&argc, &args);
}

void HsEnd()
{
hs_exit();
}

We can now compile our library with the commands:

ghc -c SumRoots.hs
ghc -c StartEnd.c
ghc -shared -o SumRoots.dll SumRoots.o StartEnd.o

This creates the library SumRoots.dll.

Calling Haskell from R

At the R command prompt, we can load the library with:

dyn.load("C:/SumRoots.dll") # use the full path to the SumRoots library
.C("HsStart")

We can now invoke our function:

input = c(9,3.5,5.58,64.1,12.54)
.C("sumRootsR", n=as.integer(length(input)), xs=as.double(input), result=as.double(0))$result

This prints out the answer 18.78046.

We can make this function easier to use on the R side by writing a wrapper, for example:

sumRoots <- function(input)
{
return(.C("sumRootsR", n=as.integer(length(input)), xs=as.double(input), result=as.double(0))$result)
}

Now we can write:

sumRoots(c(12,444.34))

And get back the answer 24.54348. With a small amount of glue code, it's easy to call Haskell libraries from R programs.

Update: See the comments below from Alex Davis for how to do such things on newer versions of Mac OS.

Neil Mitchell's Blog (Haskell etc)

Sunday, October 30, 2011

Calling Haskell from R

Blog Archive

About Me