Saturday, September 18, 2010

Three Closed GHC Bugs I Wish Were Open

Summary: I want three changes to the Haskell standard libraries. System.Info.isWindows should be added, Control.Monad.concatMapM should be added, and Control.Concurrent.QSem should work with negative quantities.

Over the last few hours I've been going through my inbox, trying to deal with some of my older emails. In that process, I've had to admit defeat on three GHC bugs that I'd left in my inbox to come back to. All three propose changes to the Haskell standard libraries; each was opened as a bug and resolved as closed/wontfix. I will never get time to tackle these bugs, but perhaps someone will? The bugs are:

Add System.Info.isWindows - bug 1590

module System.Info where

-- | Check if the operating system is a Windows derivative. Returns True on
-- all Windows systems (Win95, Win98 ... Vista, Win7), and False on all others
isWindows :: Bool
isWindows = os == "mingw32"

Currently the recognised way to test at runtime if your application is being run on Windows is:

import System.Info

.... = os == "mingw32"

This is wrong for many reasons:

  • The return result of os is not an operating system, but a ported toolchain.

  • The result "mingw32" does not imply that MinGW is installed on the computer.

  • String comparisons are unsafe and unchecked; a simple typo breaks this code.

  • In GHC this comparison will take place at runtime, even though the result is a constant.

The Haskell abstractions and command line tools for almost all non-Windows operating systems have converged to the point where most programs with operating system specific behaviour have two cases - one for Windows and one for everything else. It makes sense to directly support what is probably the most common usage of the os function, and to encourage people away from the C preprocessor where possible.
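Until something like this exists in the library, the two-case pattern can be kept in one place per project. The following is only a sketch: it assumes that GHC's System.Info.os reports "mingw32" on Windows builds, and exeName is a hypothetical helper invented for illustration.

```haskell
import System.Info (os)

-- | Local stand-in for the proposed System.Info.isWindows.
-- Assumption: GHC on Windows reports os as "mingw32".
isWindows :: Bool
isWindows = os == "mingw32"

-- | Hypothetical helper: add the ".exe" suffix only on Windows.
exeName :: String -> String
exeName name = if isWindows then name ++ ".exe" else name
```

Keeping the string comparison in a single definition means a typo can only be made once, and every call site reads as a plain Bool test.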

Add Control.Monad.concatMapM - bug 2042

module Control.Monad where

-- | The 'concatMapM' function generalizes 'concatMap' to arbitrary monads.
concatMapM :: (Monad m) => (a -> m [b]) -> [a] -> m [b]
concatMapM f xs = liftM concat (mapM f xs)

I've personally defined this function in binarydefer, catch, derive, hlint, hoogle, my website generator and yhc. There's even a copy in GHC. If a function has been defined identically that many times, it clearly deserves to be in the standard library. We have mapM, filterM, zipWithM, but concatMapM is suspiciously absent.
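As a small illustration of how the function behaves, here is the definition above used in the Maybe monad. The expand function and the two example values are made up for this sketch and are not taken from any of the projects listed.

```haskell
import Control.Monad (liftM)

-- | Same definition as proposed above.
concatMapM :: Monad m => (a -> m [b]) -> [a] -> m [b]
concatMapM f xs = liftM concat (mapM f xs)

-- | Hypothetical helper: a number expands to itself and its negation,
-- and zero is treated as a failure.
expand :: Int -> Maybe [Int]
expand 0 = Nothing
expand n = Just [n, negate n]

-- Every element succeeds, so the per-element lists are concatenated.
exampleOk :: Maybe [Int]
exampleOk = concatMapM expand [1, 2]    -- Just [1,-1,2,-2]

-- One element fails, so the whole computation fails.
exampleFail :: Maybe [Int]
exampleFail = concatMapM expand [1, 0]  -- Nothing
```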

Make Control.Concurrent.QSem work with negatives - bug 3159

The QSem module defines a quantity semaphore, where the quantity must be a natural number. Attempts to construct a semaphore with a negative number raise an error. There is, however, a perfectly sensible and obvious interpretation if negative numbers are allowed. It is a shame that this module could provide total functions, which never raise an error, but does not. In addition, for some problems the use of negative quantity semaphores is more natural.
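To make that interpretation concrete, here is a rough sketch of a semaphore that accepts any Int, where a waiter blocks whenever the quantity is zero or below. This is not the QSem implementation: NegSem, newNegSem, waitNegSem and signalNegSem are hypothetical names, and the code ignores asynchronous exceptions, which a real library version would have to handle.

```haskell
import Control.Concurrent.MVar

-- | Hypothetical semaphore: the current quantity (possibly negative)
-- plus a FIFO queue of blocked waiters.
newtype NegSem = NegSem (MVar (Int, [MVar ()]))

-- | Total: any initial quantity is accepted, including negative ones.
newNegSem :: Int -> IO NegSem
newNegSem n = NegSem <$> newMVar (n, [])

-- | Take one unit, blocking while the quantity is <= 0.
waitNegSem :: NegSem -> IO ()
waitNegSem (NegSem var) = do
  blocker <- modifyMVar var $ \(n, waiters) ->
    if n > 0
      then return ((n - 1, waiters), Nothing)
      else do
        b <- newEmptyMVar
        return ((n, waiters ++ [b]), Just b)
  -- Block outside the state lock until a signal hands us a unit.
  maybe (return ()) takeMVar blocker

-- | Release one unit. If that would make the quantity positive and
-- someone is waiting, hand the unit straight to the oldest waiter.
signalNegSem :: NegSem -> IO ()
signalNegSem (NegSem var) =
  modifyMVar_ var $ \(n, waiters) ->
    case waiters of
      (b : rest) | n >= 0 -> putMVar b () >> return (n, rest)
      _                   -> return (n + 1, waiters)
```

With this sketch, a thread that waits on newNegSem (-2) is released by the third signalNegSem, because the first two signals only raise the quantity from -2 to 0.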

What Now?

I've removed all these bugs from my inbox, and invite someone else to take up the cause - I just don't have the time. Until these issues are resolved, I will test for Windows in a horrible way, define concatMapM whenever I start a new project, and lament the lack of generality in QSem. None of the issues is particularly serious, but all are slightly annoying.

Email etiquette: Today I've cleared about 50 emails from my inbox. Unfortunately my inbox remains big and unwieldy. If you ever email me, and I don't get back to you, email me again a week later. As long as you reply to the first message, Gmail will collapse the reply into the original conversation, and there won't be any additional load on my inbox - it will just remind me that I should have dealt with your email. I apologise for any emails that have fallen through the cracks.


Sjoerd Visscher said...

In the FMList package I've added foldMapA, which is concatMapM but a bit more general.

Josef said...

I'd be interested to hear about use cases for semaphores with negative quantities. Do you have any pointers?

sclv said...

foldMap should be a general purpose replacement for concatMapM as is, I think?

Anonymous said...

Negative semaphores can be used if your main thread has, say, dispatched some job out to three other threads and wants to wake up when they are done. Create a semaphore with -2, lock on it (bringing it to -3), and wait for the other three threads to unlock the semaphore, bringing it back up to zero and unlocking the original thread, which now knows the task is done. (Tune numbers as appropriate for the way the library triggers on semaphores.)

Whether this is ever the best solution in Haskell specifically I can't attest to, but I've used this trick in non-Haskell situations with an impoverished concurrency story to wring some relatively sophisticated control-inversion behaviors out of environments where people thought this wasn't possible.

Neil Mitchell said...

Sjoerd and sclv: I don't want foldMap or similar, I want concatMapM. I want the type to be as restrictive as concatMap (but generalised to monads) and I want the symmetry of having concatMapM. I realise lots of functions generalise it (over monoids, functors etc), but there is an advantage to having the simple one as well (e.g. map and fmap).

Josef: I used it in the implementation of a thread pool, to trigger when all the threads had finished - very similarly to thejerfb.

Duncan said...

In Cabal we have an enumeration for the OS which is clearly better than a string. Asking for isWindows is too much though, for purely political reasons, it elevates one over all others (though it is true that it's the most different). But with an enumeration it's still safe and easy to check for windows or other OSs.

saynte said...

I think you really have to make a case why having a negative quantity is both obvious and sensible, which you haven't done in your bug report. You say it is, but provide no argument for why it is obvious and sensible. I think the reverse argument would be that it's better to have a partial function that behaves as it is supposed to, than a total one that may not meet the general definition of semaphore.

In any case, wouldn't QSemN fit your use-case (with positive initial resources)? This was also mentioned in the bug report by igloo.

Neil Mitchell said...

Duncan: I'm perfectly happy with an enumeration, whatever people are happy with.

saynte: I disagree. The semantics of negative semaphores are obvious - you wait if there are <= 0 resources available, instead of == 0. I think you should have to argue to make a function partially undefined (i.e. crash) instead of just work. There are always alternatives to any concurrency problem - this is more about aiming for total functions over partial ones. In actual fact, I solved my particular problem in an entirely different way, without semaphores, in the end.

saynte said...

Neil: You make a good point for the obvious extension, but not sensibility (likely the harder of the two).

For example, it is a general interpretation of semaphores that they restrict usage of a resource up to a maximum: you now allow that maximum to be negative. Under your proposed extension, this fairly common explanation of semaphores is now nonsensical.

Neil Mitchell said...

saynte: Since you're replacing a crash with defined behaviour you can still explain the semaphore the old way, it will just work in more cases. I find it quite natural that semaphores can be negative, since I programmed using them as if they could be without even thinking they might not be!