Wednesday, January 31, 2007

Writing code by types

In the past few days I've found that while I'm writing the code, the type checker is the one deciding what the code should be. To take an example, I was recently writing some SYB code using gfoldl:

> gfoldl :: (forall a b . Data a => c (a -> b) -> a -> c b) -> (forall g . g -> c g) -> a -> c a

I decided what c had to be:

> newtype C x a = C ([x], [x] -> (a, [x]))

(where "C x" was the instantiated type of c)

From there I didn't have much of a clue what I was doing, but by the time you've got that far, there is really virtually only one type correct program that can possibly work! Finding what that program was took a while, but once you have got this far you can start getting the type checker to write the code for you.

The first step is to write the basic definition, in such a way that Haskell will give you error messages if you don't get it right:

> myexpr :: a -> C x a
> myexpr = gfoldl cont base
>
> cont = undefined
> base = undefined

At this point the Haskell type checker can already check that you haven't made a mistake anywhere else. Once all is good you can start trying to "fill in the blanks". The first step is to tackle base:

> base :: a -> C x a
> base = undefined

Note that no code has been written, but you've checked that the type is valid. A :ttype on base would have given you the same information, assuming base was at the top level of the program, but this step is useful to check.

Next you can start filling in the undefined bits:

> base :: a -> C x a
> base a = C ([], \y -> (a,y))

By working this way it is relatively easy to build up a correct program, at each stage using small steps and regular type feedback to check you are going in the right direction.

The second time I was using this technique was when I was replacing one data structure with another, with some similar operations and some completely different ones. It wasn't as simple as changing the imports, but thanks to regular feedback from the type checker this went from being an exercise in thought to just going around mopping up the errors.

Programming like this is very powerful, giving lots of hints about what the user needs to do next, but would really benefit from some intelligent development tools. Epigram has some similar ideas built into the development environment, but Haskell doesn't.

6 comments:

Thomas Schilling said...

I've been thinking about similar tool-support. I wonder how Visual Haskell is doing with this kind of stuff. Since, I use Emacs and have no intention to use any Windows machine at all in the near future, that doesn't play a major role, either. Hacking this into the Emacs Haskell mode would be a huge effort, too, so I guess we'll just have to wait for The Haskell Editor (Yi or Qi or something ..)

Jim Apple said...

On 1/3/07, Neil Mitchell <ndmitchell@gmail.com> wrote:

As for beginner issues with rank-2 types, I've been learning Haskell for years now, and have never felt the need for a rank-2 type. If the interface for some feature requires rank-2 types I'd call that an abstraction leak in most cases.
--------------------------------
A change of heart? :-)

Pepe Iborra said...

Thomas, take a look at Shim, a project to integrate the ghc-api in an Emacs mode based on Slime:
http://shim.haskellco.de/trac/shim

Neil Mitchell said...

Thomas: The Epigram editor is based on Emacs, so it can be done.

Jim: I was writing an abstraction/alternative to Data.Generics which doesn't use rank-2 types, and giving an implementation in terms of gfoldl because it has native compiler support. I see it as removing some rank-2 types from a conceptually useful API.

Anonymous said...

Rank-2 types are mostly useful when you need to prove certain things about a program.

For example, both the foldr/build and destroy/unfoldr optimizations and runST would be dangerously incorrect if you could apply them to functions with non-polymorphic types. But thanks to the rank-2 types in the signatures of build, destroy and runST, this is impossible.

All a rank-2 type says, in essence, is that "this argument must be a genuinely polymorphic function". This corresponds to the notion of a "natural transformation" in category theory, i.e., a mapping which doesn't know anything about the types it manipulates, and which therefore can't go poking around inside them.

Wadler's "free theorems" paper (useful when optimizing) also relies on the same guarantees.

Anonymous said...

I've hacked the emacs haskell-mode a bit to give more support for quick type lookup / display / insertion and jumping around definitions. I definitely work like this sometimes, leaving "undefined" stubs around to fill in as it becomes clear. haskell-mode really can do a lot more than it is currently though, maybe sometime I'll get around to it.