Sunday, December 07, 2008

F# from a Haskell perspective

I've recently started a full-time job at Standard Chartered. Before that I was doing an internship with Credit Suisse, where I spent a reasonable amount of time doing F# programming. Before I started F# I had 6 years of Haskell experience, plenty of C# experience, but little exposure to ML. I've now had 3 months to experiment with F#, using an old version (the one before the latest Community Technology Preview) and here are my impressions.

F# is a functional language from Microsoft, previously a Microsoft Research language, which is moving towards a fully supported language. F# is based on ML, and some (perhaps many) ML programs will compile with F#. At the same time, F# has complete access to the .NET framework and can interoperate with languages such as C#. F# is a hybrid language - at one extreme you can write purely functional ML, and at the other extreme you can write imperative C#, just using a different syntax. F# seems to be designed as a practical language - it isn't elegant or small, but does interoperate very nicely with every .NET feature.

The Language

The F# language is based on ML, with the addition of indentation based layout, and many of the weaknesses in F# come from ML. F# is certainly more verbose than Haskell: in some places you need an extra keyword (often a let); pattern matching is not nearly as complete; the indentation isn't as natural as Haskell. However there are some nice syntactic features in F# that are not in Haskell, including generalised list/array/sequence comprehensions and active patterns.

The type checker in F# is powerful, but unpredictable. I often get surprised by where type annotations need to go, particularly when working with .NET object types. The tuple type is treated specially in many cases, and this also leads to surprise - inserting or removing a pair of brackets can effect the type checker. Much of this complexity is necessary to manage the interaction with .NET, but it does complicate the language. Unfortunately, even with the advanced type features in F#, there are no type classes. The lack of type classes precludes the standard implementation of things such as Uniplate and QuickCheck. However, F# does have some nice reflection capabilities, and often entirely generic implementations can be given using reflection. There is certainly an interesting design trade-off between reflection based operations and type classes, something I have looked at in the past and hope to explore again in future.

F# is an impure language, which offers some practical benefits from Haskell, but also encourages a less functional style. In Haskell I sometimes work within a localised state monad - F# makes this much more natural. The impurity also allows simple interaction with .NET. Having programmed with an impure language I did find myself reaching for localised state much more often - and was occasionally tempted into using global state. In most cases, this state became problematic later. Before using F# I thought purity was a good thing, now I'm convinced that purity is a good thing but that impurity is often very useful!

Haskell could learn some things from F#. Every statement in F# must either be bound to a value or evaluate to (). In Haskell it is possible to write do mapM f xs; return 1. Any experienced Haskell programmer should spot that the mapM is a space leak (it should be mapM_), but the type system doesn't enforce it. In F# the type system does. The change in Haskell is simple, and in my opinion, desirable.

The Platform

F# is a fully fledged member of the .NET platform. You can write a class in VB.NET, write a derived class in F#, and then derive from that class in C#. All the languages can produce and consume the same libraries. This integration with .NET allows companies that already use the Microsoft tools to easily migrate - even on a function by function basis. However, the combination of both an imperative framework and a functional language at some times leads to confusing choices. All of the standard .NET libraries work with arrays, but for a functional program the list is a more natural type. F# provides both, and it was never clear which I should use where, leading to lots of conversions. The .NET libraries are very powerful, but often are overly imperative. For example, the XSD libraries (Xml Schema Description) are very imperative - you have to create objects, mutate properties, then make calls. However, in Haskell, I probably wouldn't have had any XSD support, certainly nothing as well-supported as in .NET.

The Tool Chain

The most impressive aspect of F# is the integration with the Visual Studio environment. F# contains a debugger, profiler, auto-completion, identifier lookup and many other tools. While other functional languages have some of these tools, the Visual Studio environment tends to have very refined and polished implementations. The integration with F# is sometimes a little fragile, or at least was in the version I was using, but the tools are already very powerful and are likely to continue to improve.

Overall

The F# language isn't the most beautiful language ever, but it's not bad. The integration with .NET is incredible, and while this requires compromises in the language, the benefits are considerable. I still prefer Haskell as a language, but for many users the tool chain is a more important consideration, and here F# excels.

This post was brought to you by Ctrl and v, as the computer I am currently using doesn't have a # key!

Update: I'd recommend reading Vesa Karvonen's comment below - he has additional perspectives on F# from a more ML perspective.

25 comments:

  1. Could you explain why exactly "do mapM f xs; return 1" is a space leak?

    ReplyDelete
  2. @Shin

    It isn't a space leak. It constructs a result value, only to throw it away. So it is just more inefficient. No memory actually "leaks". So I think ndm mispoke.

    The one question with this strategy for always binding to value is the interaction with laziness. I like to be able to not evaluate values I don't care about. I like to be able to write things like:

    foo >>= \_ -> bar

    So perhaps it is really more of an opimisation problem than anything else.

    ReplyDelete
  3. Anonymous10:05 PM

    I was wondering the same. I can see that it will cause unnecessary use of space during evaluation, but once done I think it would be GCed.

    ReplyDelete
  4. I too have been using F# for a few months. Before doing any programming
    in F# I've programmed in both SML and (somewhat less in) O'Caml.

    > many of the weaknesses in F# come from ML

    Depends on which weaknesses you are referring to and what do you mean by
    ML. Personally, I think that F# is an ML wannabee language that has been
    intentionally bastardized for both legacy .NET and legacy O'Caml
    interoperability for obvious reasons.

    > The type checker in F# is powerful, but unpredictable.

    I'm not sure whether I would call F#'s type checker powerful. I find F#'s
    type system overall less expressive than O'Caml's or Alice ML's, for
    example (although the type systems are likely to be technically
    incomparable). But, particularly compared to SML, I'd say that F#'s type
    system feels very fragile and rough. Calling it unpredictable would be a
    misnomer, because it is quite deterministic and after a while you learn
    the rough spots.

    There are many idioms that work in both SML and O'Caml (and,
    unsurprisingly, even in Haskell due to the H-M heritage), but fail
    spectacularly in F# due to the fragility of the type system. For example,
    in both SML and O'Caml, one can define the auxiliary function (here in SML
    syntax)

    fun undefined _ = raise Failure "undefined"

    with the spec

    val undefined : 'a -> 'b

    which is quite useful, among other things, as a dummy value for functions
    that you've specified in a module signature, but haven't yet implemented
    in the corresponding module (again in SML syntax):

    signature FOO = sig
    val bar : int -> int
    end
    structure Foo :> FOO = struct
    val bar = undefined
    (* Fine, as 'a -> 'b can be instantiated to int -> int. *)
    end

    Unfortunately, in F#, this doesn't type check.

    Another thing I took note of recently, is that F#'s type system is rather
    unprincipal, so to say. Suppose you have a variable, ls, which is a list
    of lists. One might assume that one could then write

    List.map (fun l -> l.Length) ls

    But this doesn't type check (without further annotations). On the other
    hand, the following does

    ls |> List.map (fun l -> l.length)

    This just basically follows from the fact that F#, with those pesky
    nominal .NET object types, has no notion of principal types or typings to
    speak of.

    These are just two isolated examples that I could recall immediately.
    While using F# for a few months, I've run into *many* rough corners like
    these. Compared to SML, it feels like I have to constantly spoon feed
    F#'s crybaby type checker.

    > The tuple type is treated specially in many cases, and this also leads
    > to surprise - inserting or removing a pair of brackets can effect the
    > type checker.

    This collection of deficiencies is inherited from both O'Caml and .NET.
    First of all, O'Caml's syntax allows one to omit parentheses from tuples
    (in both patterns and expressions) in a number of places, which can be
    quite confusing. In addition, what looks like a tuple type in a data
    constructor isn't actually a tuple type, even if it looks like one.
    (Neither of these warts exist in SML.) The same goes for multiple
    argument .NET functions.

    > In most cases, this state became problematic later.

    Details please. Where and how did you use state and when and how did it
    become problematic?

    > All of the standard .NET libraries work with arrays, but for a
    > functional program the list is a more natural type. F# provides both,
    > and it was never clear which I should use where, leading to lots of
    > conversions.

    I think that answer to that is to use sequences (IEnumerable) rather than
    arrays or lists in interfaces intended for interoperable code. Both
    arrays and lists can be trivially converted to sequences. Inside the
    abstraction boundary, you can then use whatever sequence type (arrays,
    lists, both, or something else) that has the desired properties.

    ReplyDelete
  5. @ Vesa:

    I'm a rank ML family beginner, so please pardon me if my question is silly.

    Whilst I realise you are making a valid point about the F# type system with your nested list example, would you really use a lambda in real life here? Can't one just use:

    ls |> List.map List.length

    Am I missing something here?

    ReplyDelete
  6. @Genji

    It is just an example that reveals a fundamental weakness of F#'s type system. In fact, the issue has nothing to do with any of the operations (|> and List.map) or the specific member (.Length) used in the example. You can try googling for "principal types" and "principal typings", but I'm afraid the papers you'll find that way will require a background in PLT to appreciate.

    ReplyDelete
  7. Anonymous7:55 AM

    I think 'mapM' vs 'mapM_' problem whould be solved by giving 'mapM_' more suitable and readable name like 'forEach'.

    ReplyDelete
  8. Phew. No problem with it being way over my head... just wanted to make sure I wasn't missing something trivial. Thanks for clarification.

    ReplyDelete
  9. @Shin, Dons, Kamal:

    main = mapM putChar (replicate 10000 'a') >> return ()

    This has a maximum residency of 226Kb.

    main = mapM_ putChar (replicate 10000 'a') >> return ()

    This has a maximum residency of 2Kb.

    I mean space leak as in it uses much more space than necessary, and particularly holds on to that memory at GC time. I don't mean that that memory will be overallocated and never reclaimed - I'd term that a memory leak.

    @Dons I don't mind if people bind to a, or _, I really just mean that >> should be m () -> m b -> m b.

    ReplyDelete
  10. mapM can impact execution time as well; see GHC Ticket #2236.

    ReplyDelete
  11. Hi Neil, which programming language(s) are you using at Standard Chartered?

    ReplyDelete
  12. felixmar: C++ and Haskell seem to be the prevailing tools.

    ReplyDelete
  13. Thanks for the blog, Neil. Was interesting to see your perspective.

    I'm kind of worried by your definitive statement that "the lack of type classes precludes the standard implementation of things such as Uniplate and QuickCheck" as this is precisely what I have been trying to do for a couple of weekends now :) I find that the generics/reflection do substitute typeclasses to a great degree; the sad part is that you cannot express monadic idioms with generics such as 'm<'x>, but that does not seem to be a problem specifically for Uniplate and QuickCheck.

    I wonder if there are any theoretical reasons why F# .NET integration cannot be combined with Haskell's strengths such as the type checker? Is a .NET Haskell compiler anywhere on the horizon? A YHC backend? :)

    @Vesa

    Absolutely agree about spoon-feeding the type checker! Just a nitpick about the undefined function - what's exactly the problem?

    let undefined<'a,'b> (x: 'a) : 'b = failwith "Not imlpemented";;

    ReplyDelete
  14. @Anton Tayanovskyy

    Try putting

    module Foo
    val bar : int -> int

    into a foo.fsi file and

    module Foo
    let undefined _ = failwith "undefined"
    let bar = undefined

    into a foo.fs file. Then compile both with fsc and see what you get.

    ReplyDelete
  15. @Anton:

    >Is a .NET Haskell compiler anywhere on
    >the horizon? A YHC backend? :)

    Is anybody interested to "revive" the existing Yhc backend for .NET (whatever is in http://darcs.haskell.org/yhc/src/compiler98/DotNet) with Hugs as Yhc Core producer? The Hugs -> Yhc Core part seems to be ready for somebody to experiment with ;)

    ReplyDelete
  16. @Anton: By "the lack of type classes precludes the standard implementation of things such as Uniplate and QuickCheck" I just mean you have to be a bit creative - and I think the right direction probably does rely on using generics/reflection.

    ReplyDelete
  17. Hi

    I tested the lazyness of F#, and found something interesting, which i can not understand.
    First declare an infinite seq:
    let infinite_seq = Seq.init_infinite (fun i -> i+1)

    After create a span, and groupBy function's in Haskell's style.
    let rec span p l =
    (seq { if (Seq.is_empty l) then
    ()
    else
    let x = Seq.hd l
    let xs = Seq.skip 1 l
    let ys,zs = span p xs
    if (p x) then
    yield x
    else
    ()
    }, l)

    let rec groupBy eq l =
    (seq{
    if Seq.is_empty l then
    ()
    else
    let x = Seq.hd l
    let xs = Seq.skip 1 l
    let (ys,zs) = (span (eq x) xs)
    yield! seq{yield (Seq.append (Seq.singleton x) ys) }
    yield! (groupBy eq zs)
    })

    And iter the grouped infinite seq.
    Seq.iter (fun a -> printfn "%A" a) (groupBy (fun a b -> (a-b)%2=0) infinite_seq)

    It is lazy, but to reach the nth element of the seq is not linear, and looks more worse like exponential.

    Why? I messed up something in gropuby, or the F# seq is messed up?

    P.S.: Oh, it looks like my code snippets lost their indentation:(

    ReplyDelete
  18. Anonymous12:02 AM

    Hi Daniel,

    F# sequences are not the same as Haskell's lists. For example, Seq.skip returns a sequence that restarts at the start of the list each time it is iterated. That causes the result you see. It's something you have to know to use "seq" effectively, like List.nth being O(n).

    You might use a "LazyList<_>" for this (from the F# Power Pack - ignore the bit about it being deprecated - we're not deprecating it any more)

    ReplyDelete
  19. @vesa

    For your example, I hope you are aware of the way to spoonfeed the typechecker the missing info, but for the benefits of others, the following compiles:

    (* foo.fsi *)
    module Foo
    val bar : int -> int

    (* foo.fs *)
    module Foo
    let undefined = failwith "undefined"
    let bar (_ : int) : int = undefined


    So the problem (one of many) really is that the typechecker does not use signature files to infer the types of procedures in the implementation files.

    ReplyDelete
  20. Hi Neil,

    I'm currently trying to learn Haskell (GHCI & the book "Real World Haskell"). And I'm a bit lost... I come from the OO/MSFT/VisualStudio world, therefore I'm a lazy programmer... I love to have a single IDE with all the development tools integrated in a single place.
    Question: what is the best set of tools to work efficiently in Haskell (IDE, syntax highlighting, debugger, auto.completion maybe, ...).

    Thanks,

    Juanjo

    ReplyDelete
  21. Juanjo: I'm sorry but Haskell lacks the kind of integrated environment you describe - I just use a plain text editor (Textpad.com)

    ReplyDelete
  22. @Juan

    "I come from the OO/MSFT/VisualStudio world, therefore I'm a lazy programmer"

    There are many types of lazy programmers, including Perl scripters, non-strict programmers and IDE users. ;)

    Anyway, a *nix shell always felt to me like a "single IDE with all the development tools integrated in a single place": you want something, just call it. Couple that with a truly remarkable text editor like vim or emacs and you're done.

    ReplyDelete
  23. Anonymous11:04 AM

    Vesa, the following works in my copy of F#. (F# Version 1.9.7.4)


    let undefined _ = raise (exn "undefined")
    let (bar: int->int) = undefined

    -M

    ReplyDelete
  24. SkyTreeBird4:08 AM

    I'm coming to F# from Haskell and to be honest I'm finding it bloody awful. I am using the book 'Programming F#' by Chris Smith. Given that Haskell is my first and only language, I am finding that to learn F# I am also having to learn C# as the programs are littered with C# methods and C# syntax and code that does not fit with F#. They call it 'interoperability' but I call it schizophrenia. I also get the feeling feels like you couldn't write an reasonably complex F# program without it being littered with C# methods and code. To me thats an incomplete language. Therefore, I'm just going to learn C#, and maybe even this Nemerle. At least they're complete and consistent and I don't have to start flip over to writing Python or Lisp in the middle. I also find the syntax inconsistent. This is usually the hallmark of a comittee language, but didn't Don Syme design F#? It's probably cos they were constrained by OCAML but wanted to add some new stuff. It really is a frankenstiens monster.

    ReplyDelete
  25. SkyTreeBid: Even though Haskell is by far my favourite language, it's still worth learning C# - just to communicate with Java/C# devs. Once you know C# and Haskell, then F# is pretty much midway between, and it should be relatively easy to understand. However, I can see your point about F# - being two different styles in one, it does sound like a bigger learning curve is to be expected.

    I guess people coming to F# know Haskell, C#, both or neither. This review was very much coming from the both perspective, but I think it would be interesting to see what each group thought.

    ReplyDelete