Sunday, May 26, 2013

Three types of build-system dependency

Summary: There are three types of dependencies you might want to express in a build system, all of which are supported by Shake.

A build system, at its heart, is a system which runs commands in an order satisfying user-specified dependencies. But what kind of dependencies can be expressed? This post describes three different types of dependency, only one of which is available in Make, but all of which are available in both Shake and Ninja.

Feature 1: Static dependencies (available in every build system)

The most basic form of dependency is a static dependency, where a rule produces an output from some inputs:

-- In Make --
result.tar : file1 file2
    tar -cf result.tar file1 file2

-- In Shake --
"result.tar" *> \out -> do
    let deps = ["file1","file2"]
    need deps
    cmd "tar -cf" [out] deps

This rule says that the file result.tar depends on the inputs file1 and file2, and provides a command to build result.tar. Whenever file1 or file2 change, the command will be run, and result.tar will be built.

Static dependencies occur in almost every build rule, and are supported by all build tools, including Make and Shake.

Feature 2: Dynamic dependencies (available in Shake, Ninja, Redo and tup)

A more advanced dependency is where the list of dependencies itself depends on the results of previous dependencies. Imagine we want to build result.tar from the list of files stored in list.txt. The dependencies of result.tar cannot be specified statically, but depend on the contents of list.txt. In Shake we can write:

"result.tar" *> \out -> do
    need ["list.txt"]
    contents <- readFileLines "list.txt"
    need contents
    cmd "tar -cf" [out] contents

This rule describes how to build result.tar. We depend on (need) the file list.txt. We read each line from list.txt into the variable contents, giving a list of the files that should go into result.tar. Next, we depend on all the files in contents, and finally call the tar program. If either list.txt changes, or any of the files listed by list.txt change, then result.tar will be rebuilt.

This feature is necessary in almost every build system, yet is shockingly lacking from most build tools - I am only aware of it being available in Shake, Ninja, Redo and tup. As a common example, in Make you might write:

result.o : result.c result_header1.h result_header2.h
    gcc ...

The file result.o depends on both the C source file result.c and all headers that file includes. But listing the headers both in result.c with #include directives, and in the Makefile, is a brittle form of duplication. A better approach is for the build system to run gcc -MM result.c and extract the includes from there. In Shake we can write:

"result.o" *> \out -> do
    let src = "result.c"
    Stdout stdout <- cmd "gcc -MM" [src]
    need $ src : drop 2 (words stdout)
    cmd "gcc -o" [out] "-c" [src]

My experience is that about a quarter of rules require some kind of additional dependency based on previous dependencies. While you can hack round some of the issues in Make, and people have become disturbingly adept at doing so, the result often only approximates the dependencies - building either too much or too little.

Feature 3: Multiple outputs from one rule (available in Shake and Ninja)

The final feature is producing multiple outputs from one command, and is used far more rarely (perhaps one or two rules in a complex build system) - but when needed, is essential. Some programs, such as GHC, can produce two outputs with one command - compiling Foo.hs produces both Foo.o and Foo.hi. As a first approximation, the .o file depends on the entire contents of the source file, while the .hi file depends only on the type signatures. A single ghc invocation needs to do all the work to produce both, but often the .hi file will be left unchanged. In Shake we can write:

["Foo.hi","Foo.o"] *>> \_ -> do
    need ["Foo.hs"]
    cmd "gcc -c Foo.hs"

While it is often possible to construct a series of dependencies to approximate a single rule producing multiple outputs, it only works in some cases, and is brittle. The only build systems I am aware of which support multiple outputs are Shake and Ninja.
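As an illustration of such an approximation (a sketch of a common Make idiom, not taken from any particular build system), the .hi file can be declared to depend on the .o file with an empty command, so a single compiler invocation serves both:

-- In Make --
Foo.o : Foo.hs
    ghc -c Foo.hs
Foo.hi : Foo.o ;

The empty command tells Make that building Foo.o also produced Foo.hi - which works in common cases, but fails if, for example, Foo.hi is deleted on its own.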

Essential features

My standard advice when people ask about writing a build system is "don't". If some existing build system (e.g. ghc --make or Cabal) is capable of building your project, use that instead. Custom build systems are necessary for many complex projects, but many projects are not complex. If you have decided your project is complex, you should use a build tool that can express complex dependencies, both for writing the initial system and to provide the flexibility to make the inevitable changes required.

Looking only at dependency features, I would consider it unwise to start a complex build system using a tool other than Shake or Ninja, or perhaps either Redo or tup (if you accept the absence of multiple outputs from one rule).

Weak dependency specification in build tools, particularly Make, has left its mark on many programs. I recently talked to some OCaml hackers complaining that their tools were not "Make friendly" because they produced multiple output files. I wonder what lengths other tools have gone to in order to cope with weak dependency specification...

Update: The relative power of tup was reported in a comment; it appears to have the necessary power, but I haven't yet checked. Following further research into Ninja I suspect it may not be as powerful as originally stated, and may not have Feature 2, but I am not yet sure. Make gained support for multiple outputs after this post was published.

Wednesday, April 17, 2013

Buffer smashing in NSIS

Summary: I've identified two fixed-size buffer errors in the NSIS installer generator, one of which is a buffer overflow leading to a segfault. My nsis library works round them to some extent.

My nsis Haskell library provides a layer over the NSIS installer generator. NSIS scripts can be viewed as assembly code for a virtual machine with 16 general purpose registers, the ability to define new registers, and a stack - where all locations store strings and instructions are register-to-register. My nsis library abstracts over these details to provide something with more traditional flow control (e.g. while, if), compound expressions and type safety, using NSIS as an assembler.

A high-level language tends to tickle areas of an assembler that a human would not. In particular, NSIS has two fixed-size buffer errors that nsis has triggered.

Bug 1: String literals >= 4096 bytes cause segfaults

If you write a string literal in the NSIS source which is 4096 characters or longer, the NSIS generator segfaults. As an example (ignoring lots of NSIS boilerplate):

Var foo
StrCpy $foo "XXX...XXX"

If the string has 4095 X characters it works. As soon as you have 4096 X characters or more the NSIS generator segfaults. My guess is that the NSIS lexer has a 4096 character buffer that is overflowed.

As a workaround, you can do:

Var foo
Var foo1
Var foo2
StrCpy $foo1 "XXX...XXX"
StrCpy $foo2 "XXX...XXX"
StrCpy $foo $foo1$foo2

As long as both $foo1 and $foo2 are less than 4096 characters, you can combine them to produce $foo without error (as far as I can tell).

Bug 2: FileWrite truncates its output at 1023 bytes

When writing a file, all FileWrite calls are truncated to 1023 bytes. As an example:

FileOpen $h "output.txt" w
FileWrite $h "XXX...XXX"

If there are more than 1023 X characters, only the first 1023 will be written. My guess is that FileWrite has a 1024 character buffer for output.

As a workaround, you can write the file in smaller chunks, using multiple FileWrite instructions.
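The same idea in plain Haskell terms (a minimal sketch, independent of the nsis library) is to split the string into pieces below the buffer limit:

-- Split a string into chunks of at most n characters, each small
-- enough for a single FileWrite (a sketch; n = 1023 for this bug)
chunksOf :: Int -> String -> [String]
chunksOf _ [] = []
chunksOf n xs = a : chunksOf n b
    where (a, b) = splitAt n xs

Each chunk can then be emitted as its own FileWrite instruction.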

Workarounds in nsis-0.2.2

Manipulating long strings in NSIS is not that common. The example that caused me to look at buffer sizes was writing out a configuration file line-by-line, for example:

writeFileLines "$INSTDIR/config.ini"
    ["[config1]"
    ,"InstallDir=$INSTDIR"
    ,...
    ]

In nsis-0.2.1 writeFileLines was defined as:

writeFileLines a b = writeFile' a $ strUnlines b

This function merges all lines together then writes them to the file in one go, which truncates if the whole output is longer than 1023 characters. In addition, the nsis optimiser will often perform strUnlines at compile time, so the NSIS assembler gets a single literal, potentially exceeding 4095 characters. To avoid both these problems, in nsis-0.2.2 I have defined:

writeFileLines a b = withFile' ModeWrite a $ \hdl ->
    forM_ b $ \s -> fileWrite hdl $ s & "\r\n"

This revised definition writes the lines one by one. If any line is longer than 1023 characters it will still be truncated, but that is less likely than before. I will be reporting these issues to the NSIS team, so hopefully they can be fixed at source.

It would be possible to apply the workarounds directly in the nsis library, but I have not yet done so. Using long strings in an installer is rare, so hopefully the problem will not impact anyone. In addition, while investigating the file truncation bug I found the string literal bug, so I wonder what other buffer bugs might lurk in NSIS.

Tuesday, February 26, 2013

Shake Links

Summary: Shake is now hosted on GitHub and there is a mailing list.

I've just converted Shake over to Git, hosted on GitHub, and created a mailing list. The full set of Shake related links is now:

  • Download the Haskell package from Hackage and install it using Cabal.
  • Documentation, including examples and a list of all the available functions.
  • Mailing list for any questions/thoughts on Shake.
    • Questions may also be asked on StackOverflow with the tag shake-build-system.
    • Bugs may also be reported on the GitHub issue tracker, but if you aren't sure, use the mailing list.
  • Source code in a git repo, stored at GitHub.

If you want to talk to me about Shake you can now write an email to the mailing list, create a question on StackOverflow, or open a GitHub issue (or still contact me by personal email). I'm available on any means of communication, so pick what suits you best - I'm expecting most users will start with a StackOverflow question if it's a "how do I..." question, or the mailing list if it's more of a discussion/suggestion.

I'm very interested in hearing tales of people who have successfully used Shake, so if Shake works for you, perhaps write it up on a blog, or just send an email to the mailing list. I'm also interested in what doesn't work, what was difficult, what documentation was lacking etc.

Why the sudden Shake activism? Some readers may have noticed a sudden surge in Shake related activity. The reason is that I was reminded of the problems with bad build systems, after using only shake/ghc for many years. I spent several hours working with a team whose build system was written in SCons. The "null build" (with nothing to do) took 38 seconds. The build was structured as three separate SCons build systems, one which compiled a code generator, one which ran the code generator, and another that compiled the result. The build could only run single threaded because it made use of the current directory. A project that was not particularly complex was suffering at the hands of its build system. Some of these problems are limitations in SCons, some are weaknesses in their use of SCons, but the build system was hindering them, not helping them. I wasted a lot of my time waiting for rebuilds that should have taken seconds. Developers deserve better.

Monday, February 25, 2013

Chasing a Space Leak in Shake

Summary: Shake v0.3 had a serious space leak that went undiagnosed for over a year; this post describes how I tracked it down.

Introduction to Space Leaks

One of the downsides of a lazy language is the possibility of space leaks. A space leak is when your program uses more memory than it should, typically because lazy evaluation is holding on to things that if evaluated would disappear. As an example, let's build up a set:

myset = Set.delete dead $ Set.fromList [alive, dead]

Consider the value myset. We first create a set with alive and dead, then delete dead. Assuming the values dead and alive are used nowhere else in the program, which values does the garbage collector consider to be alive?

In a strict language only alive would be considered alive. But in a lazy language, as we have not yet evaluated myset, both alive and dead are kept alive. You can think of myset being represented in memory as Thunk (Set.delete dead) (Value (Set.fromList [alive,dead])), instead of just Value (Set.fromList [alive]). You are storing the list of operations that will be applied to the set, rather than the result of those operations. These operations reference dead, so we cannot garbage collect it. Once we have identified the problem, we can apply strictness operations to force the value, for example:

evaluate myset

The evaluate function forces evaluation of the first part of myset. If our set data type is strict, that will force evaluation of the entire myset value. After evaluate there will be no references to dead, and it will be garbage collected. In other situations bang patterns, strict fields or seq can be used for eliminating space leaks.
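As a minimal, self-contained sketch of the fix (with Int values standing in for alive and dead), a bang pattern forces the set at the point it is bound:

{-# LANGUAGE BangPatterns #-}
import qualified Data.Set as Set

main :: IO ()
main = do
    let alive = 1 :: Int
        dead = 2
        -- the bang forces myset immediately; Set is spine-strict, so
        -- this evaluates the whole set and dead becomes garbage
        !myset = Set.delete dead $ Set.fromList [alive, dead]
    print myset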

Finding the Space Leak in Shake

The space leak in Shake was found over a period of a few weeks, ultimately with me waking up at 4am on New Years Day with knowledge of where an exclamation mark should be placed. However, I will present the account somewhat more methodically...

Step 1: Admitting you have a problem

The first step to fixing a space leak is the realisation that a space leak exists. My suspicion is that many (most?) large Haskell programs have space leaks, but they often go unnoticed. The first report I received of "memory issues" with Shake came a few months before I tracked the leak down. Shake is a library that makes use of user supplied rules, and in this example, the user supplied rules were far from trivial. Complicating the problem is the fact that Shake stores a list of all rules it has run in memory, which naturally grows over time. What finally convinced me that there was a problem was when several clean builds failed by exhausting memory, taking greater than 2Gb on a 32bit system.

Step 2: Seeing the problem with your own eyes

After suspecting a space leak, the first thing to do is measure memory usage, which I did using Process Explorer. The graph to the left shows a clear increase in memory. The memory doesn't grow all at once, but in steps - typical of a garbage collected language. The time between memory growth is not evenly spaced, which is reasonable for a program that runs different user rules all of which take different amounts of time. I reran the program several times, and while memory always increased, the shape of the graph varied quite considerably - as expected when running a non-deterministic program such as Shake. While I still didn't have any real information on what caused the memory leak, I could at least observe the memory leak myself.

Step 3: Hit it with your tools

Before thinking too hard, it is worth applying whatever tools are to hand. In the case of space leaks, the GHC manual describes how to use the heap profiling tools. My standard pattern is to compile with -rtsopts -prof -auto-all -caf-all, run with +RTS -h and view with hp2ps -c. Since the test case was huge (over an hour), I always terminated it early with Ctrl-C, which produces an invalid .hp file. Fortunately, you can delete everything after the last END_SAMPLE to recover a profile that hp2ps can understand.
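Concretely, that workflow looks something like this (program name hypothetical):

$ ghc --make Main.hs -rtsopts -prof -auto-all -caf-all
$ ./Main +RTS -h
$ hp2ps -c Main.hp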

The results on the right show steadily increasing memory allocated to PINNED values, of type ARR_WORDS, which I suspect are bytestrings. However, the total memory usage as reported by GHC and Process Explorer far exceeded that shown in the profile. It seemed like there was something missing from the profile. I upgraded to the latest GHC version but memory still seemed to be missing. The tools were not giving me any insight - perhaps due to the high degree of parallelism or other complexities in the build system.

Step 4: Simplify

The next step was to produce a simpler example - one I could run to completion in a feasible amount of time. My standard approach is to take the full test case and remove things, checking the problem has not gone away, until nothing more can be removed - leaving a minimal test case. There are two reasons why that was infeasible in this instance: 1) With a program which naturally consumes memory over time, it is not clear if the space leak has disappeared or merely become smaller as things are removed. 2) The first reduction steps would have each taken over an hour.

After deciding reduction was infeasible, I decided to try and produce a test case from scratch (which is usually a terrible idea). The original example used many different types of build rule, in many different ways - it was entirely possible that only one particular variant led to the memory leak. I decided to start with a small example, then (if necessary) try adding features until the space leak returned. Since I had been meaning to produce a benchmark for Shake at some point, I figured I could write a benchmark test which would hopefully show the bug, and even if not, be useful elsewhere.

For benchmarking, generating a random build tree is not so useful, so I attempted to define a regular shaped but interesting dependency pattern. The pattern I settled on was parameterised by breadth and depth, where every node at depth n depended on every node at depth n+1, down to a limit of the given depth, with breadth nodes at each level. To accurately benchmark I used the file rules, as these are by far the most common.

I found that with breadth=1000 depth=1000 I was able to exhaust the memory. To simplify the test case I added flags to turn off certain features and make the problem easier to investigate. I was able to turn off multithreading (-j1), profile reports (--no-report), always build from scratch (--clean) and build rules in a deterministic order (--deterministic). The final command was:

$ shake benchmark breadth=1000 depth=1000 --deterministic -j1 --clean --no-report

This command works using a released version of Shake if you install the test program (install with --flags=testprog).

With the knowledge that 1000x1000 exceeded available memory, I started reducing the numbers so the benchmark would complete in a reasonable timeframe (< 15s), but use an unreasonable amount of memory (> 500Mb). I found that 100x100 gave a reasonable balance.

Step 5: Hit it with your tools (harder)

With a simple example, I once again turned to the heap profiling tools available in GHC. Using +RTS -h I still saw memory unaccounted for, as before. The two obvious guesses are that GHC knows about the memory, but has decided to not show it in the heap profile, or that GHC does not know about the memory (for example, allocated on a different heap). Using +RTS -s I saw that GHC was aware of the additional memory, suggesting GHC had at least some knowledge of the memory. With nothing else to try, I ran through the list of heap profiling flags trying each in turn.

The magic flags turned out to be -xt -hy, producing the graph on the left. The profile shows STACK takes up the majority of the space, and starts to give memory usage in about the right ballpark. I concluded that the space leak must include stack values.

Step 6: Hypothesize and test

A stack in GHC is created by a thread, and a thread is typically created by forkIO. There are three reasons stacks could take too much memory - too large, too many, too long:

  • The stacks might be becoming too large. I added -K1K to limit all threads to 1Kb of stack. The small test case ran to completion, without any threads exceeding their stack, so the stacks were not growing too large.
  • There might be too many threads active in the program. I added logging of every call to forkIO, and every time a thread finished. I found I was churning through 1000's of threads, but at most 60 were alive at any one time.
  • The threads might be keeping their stacks alive for too long after they had finished. Having eliminated the other possibilities, this seemed likely.

Looking for places threads were referenced, and thus potentially kept alive, I found a set I was building up containing ThreadIds. The set is used so that if one Shake thread throws an exception all other threads are killed immediately. When a thread started it was added to the set. When a thread finished it was removed. Under most circumstances the set was never evaluated, but simply thrown away when the build finished. This situation corresponds to the example at the start of this post, and was causing a space leak of ThreadId.
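A simplified sketch of that pattern (invented for illustration - not Shake's actual code) shows how the inserts and deletes pile up as thunks:

import Control.Concurrent
import Control.Monad
import qualified Data.HashSet as Set

main :: IO ()
main = do
    var <- newMVar Set.empty
    forM_ [1 .. 1000 :: Int] $ \_ -> do
        done <- newEmptyMVar
        forkIO $ do
            tid <- myThreadId
            modifyMVar_ var (return . Set.insert tid)  -- unevaluated thunk
            -- ... run the build rule ...
            modifyMVar_ var (return . Set.delete tid)  -- another thunk
            putMVar done ()
        takeMVar done
    -- nothing ever evaluates the set, so every thunk, every ThreadId,
    -- and hence every stack remains reachable until this point
    set <- readMVar var
    print (Set.size set)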

I guessed that if the ThreadId is kept alive then the stack is also. As a result, leaking a small amount of memory for a thread identifier was amplified by also leaking the stack for that thread. GHC HQ could certainly modify the runtime system so that the ThreadId did not keep the stack alive, but holding on to completed ThreadIds is rarely a good idea, so I imagine it is not worth it. For more details, including a profile showing that leaking ThreadIds does leak stacks, see this blog post on the GHC scheduler.

Step 7: Fix the problem

Having identified the likely culprit, I simply needed to force evaluation of the set at each step. The set was held within a larger data structure:

data S = S {threads :: Set.HashSet ThreadId, ...}

To always force the set, I added an exclamation mark, turning it into a strict record:

data S = S {threads :: !(Set.HashSet ThreadId), ...}

Since I had worked for days to find somewhere to insert a single character, I also added a comment, just in case I ever thought about removing the exclamation mark.

Step 8: Check the problem went away

To check the problem had gone away I reran my reduced test case, producing the graph on the right, and showing far less memory usage in Process Explorer. I then reran the full build system and saw a typical reduction in memory of around 1Gb - a most welcome improvement!

Shake-0.4 and later contain that important exclamation mark, so an upgrade is strongly advised.

Does your Haskell program need an additional exclamation mark? Here are some you can use: !!!!!!!!!!!!!!

Sunday, February 17, 2013

Finite Resources in Shake

Summary: Management of finite resources is an important part of any modern build system, only properly available in Shake and Ninja.

I've just released Shake 0.9, a build system library, with a few bug fixes and a bunch of new features (the change log has a complete list). This release contains an incompatible change which makes the Resource feature easier to use, so I thought I'd describe the motivation and use of Resources in Shake. A full upgrade guide is at the bottom of this post.

What are Resources for?

When you run -j10 (shakeThreads=10) you are asking the build system to limit computation so it uses no more than ten CPU resources at a time. The CPU is certainly a precious resource, but there are other resource limitations a build system may need to obey:

  • Some APIs are global in nature, if you run two programs that access the Excel API at the same time things start to fail.
  • Many people have large numbers of CPUs, but only one slow rotating hard drive. If you run ten hard-drive thrashing linkers simultaneously the computer is likely to grind to a halt.
  • Some proprietary software requires licenses, a fixed number of which can be purchased and managed using a license manager. As an example, the Kansas Lava team only have access to 48 licenses for modelsim.

Resources in other build systems

I know of two approaches used by other build systems to obey resource constraints:

  • Limit the number of CPUs to hit your target - for example, the Lava build system could cap the number of CPUs to the number of licenses. People with 24 CPUs might ask the build system to use only 8, so the linkers do not make their machines unusable (and even then, a link heavy rebuild may still harm interactive performance). This solution wastes CPU resources, leaving CPUs that could be building your code idling.
  • Add locks to suspend jobs that are competing for the shared resource. For example any rule using Excel could take the Excel lock, either a mutex/MVar in some build systems, or a file created to serve as the lock in make based build systems. Locking can be made to work, but is tricky if you have to fake locks using the file system, and still squanders CPU resources - instead of blocking, the CPU should be running another rule.

The one exception is the Ninja build system which has a concept of "pools", which properly model finite resources.

Resources in Shake

In Shake the Resource type represents a finite resource, which multiple build rules can use. Resource values are created with newResource and used with withResource. As an example, only one set of calls to the Excel API can occur at one time, therefore Excel is a finite resource of quantity 1. You can write:

shake shakeOptions{shakeThreads=2} $ do
    want ["a.xls","b.xls"]
    excel <- newResource "Excel" 1
    "*.xls" *> \out ->
        withResource excel 1 $
            system' "excel" [out,...]

Now we will never run two copies of excel simultaneously. Moreover, it will never block waiting for excel if there are other rules that could be run.

For most programming languages the compiler is CPU bound but the linker is disk bound. Running 8 linkers will often cause an 8 CPU system to grind to a halt. We can limit ourselves to 4 linkers with:

disk <- newResource "Disk" 4
want [show i <.> "exe" | i <- [1..100]]
"*.exe" *> \out ->
    withResource disk 1 $
        system' "ld" ["-o",out,...]
"*.o" *> \out ->
    system' "cl" ["-o",out,...]

Now we can use 7 or 8 CPUs while still leaving the computer responsive enough to browse the web.

Software licenses are another finite resource and can be managed in the same way. For a complete example see the Kansas Lava test program, which uses Shake.
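A sketch of what that might look like (the file names and command line here are invented for illustration):

shake shakeOptions{shakeThreads=8} $ do
    want ["design" ++ show i ++ ".sim" | i <- [1..100 :: Int]]
    license <- newResource "modelsim" 48
    "*.sim" *> \out ->
        withResource license 1 $    -- hold one of the 48 licenses
            system' "modelsim" [out]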

Porting from Shake 0.8

In Shake 0.9 the newResource function has been renamed to newResourceIO - rename newResource to newResourceIO everywhere and your code will work again.

However, you may have noticed that newResourceIO (as it is now called) forces you to create the resource before calling the shake function, meaning that often the creation and use of the resource are far apart. I have introduced a function newResource which runs in the Rules monad, allowing you to create a resource and then use it nearby. Moving the creation and use of resources closer together makes it much easier to check your resource constraints are met.
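As a sketch of the two styles side by side (using the Excel example from above):

-- Shake 0.8: the resource was created in IO, before calling shake
mainOld :: IO ()
mainOld = do
    excel <- newResourceIO "Excel" 1    -- this was newResource in 0.8
    shake shakeOptions $ do
        want ["a.xls"]
        "*.xls" *> \out ->
            withResource excel 1 $
                system' "excel" [out]

-- Shake 0.9: create the resource in the Rules monad, next to its use
mainNew :: IO ()
mainNew = shake shakeOptions $ do
    want ["a.xls"]
    excel <- newResource "Excel" 1
    "*.xls" *> \out ->
        withResource excel 1 $
            system' "excel" [out]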

The only other breaking change is that shakeVersion has become a String rather than an Int, allowing you to store more precise information about the version (for example, your build system might want to encode the GHC version and the version of the build system in the string).

Updated 18 Feb 2013: Ninja also supports finite resources.

Thursday, February 07, 2013

A Nofib build system using Shake

Last February I had a few discussions with David Terei about replacing the Nofib benchmark suite build system with something based on Shake. Naturally, like all Make to Shake conversions, I thought this was an excellent idea. In further discussions with Simon Peyton Jones and Simon Marlow I refined the build system to match the useful features of the old Make based system.

I have now put the build system online: Shake build system for Nofib.

Unfortunately, I don't think much has happened with the build system in the last year. I invite people to play with it and use it for whatever purpose they want. The rest of this post is divided into two halves - the first is how to use the build system (for people running nofib), the second is how it works and how it was written (for people writing build systems with Shake).

Running the nofib suite

Grab the nofib suite, take the above file and put it in the root directory. You can then run:

$ runhaskell Nofib -j2 imaginary --way="-O2 -fvia-C" --run
... build related output ...
Build completed
Running imaginary/bernouilli...501ms
Running imaginary/digits-of-e1...513ms
... more tests ...
Running imaginary/wheel-sieve2...238ms
Running imaginary/x2n1...27ms

The -j2 is the parallelism to use when building (not when testing), the --way is which flags to pass to the compiler (so it should be trivial to experiment with LLVM vs not), and "imaginary" is which section to run - naming an individual test such as "x2n1", or leaving it blank for all tests, also works. The --run says run the tests afterwards, defaulting to norm, but --run=fast/--run=slow also works.

The system uses whatever GHC is on your path first, and stores the output under that GHC version, but if you want two GHCs built with differing bits, you can pass --compiler=ghc-with-a-tweak to do the build under a separate directory.

A few more examples, assuming the build system is compiled as runner, each showing the old nofib command followed by the new equivalent:

Run a single test:

$ cd nofib/imaginary/exp3_8
$ make

runner exp3_8 --run

Run a test with some extra GHC flags:

$ cd nofib/imaginary/exp3_8
$ make EXTRA_HC_OPTS=-fllvm

runner exp3_8 --way="-O1 -fllvm" --run

Run a test with a different GHC:

$ cd nofib/imaginary/exp3_8
$ make HC=ghc-7.0.4

runner exp3_8 --compiler=ghc-7.0.4 --run

Just build a test, don't run it:

$ make NoFibRuns=0

runner

Writing the build system

This build system is well commented, so I encourage people to read the code. It makes use of CmdArgs for command line parsing (lines 60-114) combined with Shake for dependency based tracking (lines 187-281). The total system is 358 lines long, and the rest is mostly pieces for running tests.

When writing the build system I had no idea what the old Makefile system did - I read what docs I could find, took a brief stab at reading the Makefile, but failed to even run it on the laptop I was working on. Therefore, I wrote the new system from scratch, and then looked back to see what the Makefiles did that the new system did not.

As is common in Make based build systems, the Makefile stored a combination of build rules (how to compile files, line 187), static configuration data (which targets to build, line 31) and dynamic configuration data (what inputs to give when testing, line 289). The build rules and static configuration data can be easily moved into the new build system, either as function rules or top-level constants. The dynamic configuration can be moved over as top-level constants, but that means the build system requires modifying more than it should, and it is harder to track the dependencies on this data. Instead I followed an approach I've used in several build systems, by writing a converter which sucks the dynamic data out of the Makefile and produces a much simpler config file. The converter is in the build system, so the config files are generated from the Makefiles, and updated appropriately. To actually run the tests, I query the config files. The hope is that in future everyone will decide to delete the Makefiles, the config files can be checked in as source files, and the translation aspect can be deleted.
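As an illustration of the converter idea (a hypothetical sketch - the variable names are guesses, not necessarily those of the real nofib Makefiles):

import Data.List (isPrefixOf)

-- Pull the interesting "VAR = value" settings out of a Makefile and
-- write them to a simpler config file (variable names invented)
convertConfig :: FilePath -> FilePath -> IO ()
convertConfig makefile config = do
    src <- readFile makefile
    let keep l = any (`isPrefixOf` l) ["PROG_ARGS", "FAST_OPTS", "SLOW_OPTS"]
    writeFile config $ unlines $ filter keep $ lines src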

When writing the build system I ran into a GHC bug that when two linkers ran simultaneously they failed. To work round this bug I created a linker resource with quantity 1 (line 152). Whenever I call the GHC linker I acquire this resource (line 215). Now Shake will run the rest of the build system with maximum parallelism, but ensure no linkers ever run in parallel, working round the bug.

Thursday, January 31, 2013

Shake 0.8 is out

Summary: Shake 0.8 is out, ask any questions on StackOverflow

Shake is a build system I have been working on sporadically for the last four years, think of it as a better alternative to writing Makefiles. In the past few weeks I've released versions 0.6, 0.7 and 0.8.

Questions about Shake

Unlike many of my other libraries, Shake invites user questions. It's a complex tool with lots of power to wield, and lots of aspects that emerge from the API, rather than being obvious from it. Therefore, I encourage anyone with any questions about Shake to ask them against the shake-build-system StackOverflow tag (thanks to Lennart Augustsson for creating the tag, as my reputation is too low). I've already asked one question, but I'm sure there are lots of others - "how does Shake/Make compare to monad/arrow?", "why did the Oracle change type?", "how would I define a rule that downloads from the web?". The more questions the easier it will be for future Shake users to find the information they need.

API Change: getDirectoryFiles

There is only one real breaking API change in the above series of versions: getDirectoryFiles now takes a list of FilePatterns. So you can write:

getDirectoryFiles "Configuration" ["*.xml","*.json"]

to find all XML and JSON files in your Configuration directory. The reason for this change is to introduce a new and more powerful matching capability, so you can also write:

getDirectoryFiles "Configuration" ["//*.xml","//*.json"]

And that will find all XML and JSON files anywhere under the Configuration directory. Shake tries hard to issue the minimum number of directory traversals, so searching for a list of patterns results in fewer file system queries than searching for each pattern individually.
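As a sketch of how the call might sit inside a rule (the rule itself is invented, and </> comes from Development.Shake.FilePath; note that the returned paths are relative to the directory argument):

"config.zip" *> \out -> do
    xs <- getDirectoryFiles "Configuration" ["//*.xml","//*.json"]
    need ["Configuration" </> x | x <- xs]
    system' "zip" (out : ["Configuration" </> x | x <- xs])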

Sunday, January 06, 2013

Shake 0.5 is out

Summary: Shake 0.5 is out, a strongly recommended upgrade.

Shake is a build system I have been working on sporadically for the last four years, think of it as a better alternative to writing Makefiles. In the past week I've released version 0.4 followed by version 0.5. I strongly recommend all users upgrade to the latest version of Shake as it fixes a massive space leak compared to versions 0.2 and 0.3, I have measured memory savings of over 1Gb in some build systems (I intend to write a post on the space leak later). There are two notable API changes for people upgrading from version 0.3.

Change 1: The oracle is now strongly typed

Shake has a notion of oracle rules, which store information about the system, for example which GHC version you are running. The intent is to allow users to track this extra information, so if you upgrade GHC, the build system will automatically rebuild all Haskell files, but leave the C files alone, without requiring users to perform a wipe.

An Oracle is essentially a question/answer pair. In shake-0.3 both questions and answers were of type [String], the documentation attempted to justify this abomination by saying:

"This type is a compromise. Questions will often be the singleton list, but allowing a list of strings allows hierarchical schemes such as ghc-pkg shake, ghc-pkg base etc. The answers are often singleton lists, but sometimes are used as sets - for example the list of packages returned by ghc-pkg."

Shake required you to impose a global namespace on questions, and to encode results in an impoverished type. No more! The oracle now allows arbitrary question/answer types, namespacing is automatic since the type definitions act as unique questions, and the answer can be any type required.

The API for working with the oracle has changed, but the necessary modifications should be fairly localised. As an example conversion:

addOracle ["ghc"] $ return ["7.2.1"]
askOracle ["ghc"]

Would become:

newtype GhcVersion = GhcVersion () deriving (Show,Typeable,Eq,Hashable,Binary,NFData)
addOracle $ \GhcVersion{} -> return "7.2.1"
askOracleWith (GhcVersion ()) ""

There is a little more boilerplate with the new version, but all the problems caused by [String] are gone, and in larger projects it can lead to a significant reduction in complexity and cross-module namespace issues.


Change 2: validStored becomes storedValue

When defining your own rules (a rare occurrence), previously you had to supply a definition:

validStored :: key -> value -> IO Bool

Given a key, look up that value on disk and check it matches the value you were given. This API is very much what Shake needs to do at runtime - it needs to check that the value it has is still valid. Almost all definitions followed the form:

validStored key value = do
    v <- lookupMyKey key
    case v of
        Nothing -> return False -- not stored, so doesn't match
        Just v -> return $ v == value

In the new version of Shake, you would instead supply:

storedValue :: key -> IO (Maybe value)
storedValue = lookupMyKey

This revised definition simplifies most rules, still has the same power as the old system (because you can define a custom Eq instance for your value type), and has allowed the addition of the AssumeClean feature to Shake (more about that in a future post, or read the Shake Haddock documentation).
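As an example of that trick, if the old validStored tolerated small differences, a custom Eq instance recovers the same behaviour (a hypothetical sketch, with the tolerance chosen purely for illustration):

newtype ModTime = ModTime Double

-- Timestamps within a millisecond compare equal, so a tiny change
-- on disk will not cause a rebuild
instance Eq ModTime where
    ModTime a == ModTime b = abs (a - b) < 0.001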

Friday, November 02, 2012

Henry David Mitchell

I am delighted to announce that Henry David Mitchell was born 29th October 2012 weighing 4.02kg (8lb 14oz). Mother and baby are both doing well. Here is a picture of him in his first outfit.


Sunday, October 21, 2012

Haskell Exchange 2012 talks

Summary: The talks from the Haskell Exchange 2012 are online.

The Haskell Exchange 2012 took place last week. We had a good range of speakers, who all gave excellent talks. I would like to say thanks to all the speakers and to Skills Matter for organising and running the event. The talks are all available online, and I thought I'd just go through them now:

Simon Peyton Jones argued that purity and types are the key to Haskell. The purity thing is without doubt true - people cheat if they are allowed to, but Haskell discourages impurity both through the IO monad (effects are explicit) and community pressure (don't use unsafePerformIO unsafely). I also think that laziness, combined with an aggressively optimising compiler that messes up unsafePerformIO at the slightest opportunity, has helped (impurity always comes back to bite).

Simon Marlow explained the async package, showing how you can turn complex problems into simple ones with a simple mechanism that looks highly reusable. I have previously used a similar idea in my F# coding (although not nearly as thoroughly thought out), and been happy with the results - I suspect Simon's version will be even more useful. I particularly enjoyed the use of STM - it's rarely a concurrency solution I think of (I'm a threads and locks kind of person), but seeing it used so elegantly makes me want to experiment more.

Lennart Augustsson talked about embedded languages - where you write a language that looks a lot like Haskell, inside a Haskell program, only to find out later that it wasn't Haskell at all. Lennart is clearly the expert at these languages, having covered C, Basic and now financial combinators. Once you have these inner languages, you can take them in all kinds of directions, including right down to optimised assembly code.

Blake Rain gave an introduction to Yesod. I had read the Yesod documentation a while back, but had trouble figuring out where to start, and what it was that mattered. Blake did both the sales pitch and the beginner's guide - I'll certainly be trying Yesod at some point. In particular I really liked the type-safe routing, which would certainly have come in handy in my ASP developer days.

Duncan Coutts gave updates on the Cloud Haskell work, which, in only a year, has gone from a research project to a practical library/tool that can be distributed at scale. The work covers details like how you distribute, why you distribute and how the model was designed (basically, copy Erlang). Another interesting aspect was the real-world development challenges: both the diverse nature of network/cloud computing, and how you can fund turning an academic idea into a user tool.

Rob Harrop showed how Ruby, Erlang and Haskell can all communicate using a message passing framework (AMQP). I certainly prefer Haskell, and go fully Haskell where possible, but a heterogeneous environment provides an easier migration path. Rob showed how to start and stop Haskell processes, switch elements from Ruby to Haskell and back again, all while a website was continuously responding to requests. This development style certainly provides an easy route in for Haskell, but also highlighted that Haskell still lacks some of the nicer sugar over AMQP that Ruby provides.

The final session was an open session where people discussed topics relating to Haskell. This session was fun, but I felt it could be better. I think it meandered at some points, and had a core of people who talked, but a large number of people who just watched. I'm not sure how I'd do things better, but it felt like some of the questions after the talks (Simon Peyton Jones and Duncan Coutts talks in particular) led to more thorough discussions. I know this can be a problem for Haskell Symposium "Future of Haskell" discussions too, so perhaps there is some scope for tweaking the format?

Thursday, October 04, 2012

Haskell Exchange next Wednesday

The Haskell Exchange 2012 is happening all day next Wednesday, October 10th, in London. There is still time to register, and with the discount code HASKELLX-2012-TE1 you save £50, making it £175. This event is very much about learning how people use Haskell, and how you can use Haskell. It isn't academics giving their plans for the future, or perfecting the small details of the language, it is all about what you can do today.

I've been helping Skills Matter organise the program since the start of this year. The final list of speakers is Simon Peyton Jones, Simon Marlow, Lennart Augustsson, Duncan Coutts, Blake Rain and Rob Harrop. You can find details of their talks on the main page. I'm very happy with both the variety and quality of the speakers. It includes people who were there at the beginning of lazy functional languages and also people actively making their living developing things for clients with Haskell.

I've seen four of these speakers talk many times before, and they are always fun and informative. The two speakers who might be less familiar to some in the Haskell community are talking about topics which sound particularly interesting:

  • Blake Rain will be talking about Yesod, one of the three Haskell web frameworks. I've read the excellent documentation on Yesod, but I've never seen the overview. I ran an ASP web development company many years ago, and want to see how I could have avoided the problems of untyped development. I want to see how the ideas behind Haskell web frameworks work in practice, with real end-users who care far more about the shade of blue than type-safe URLs.
  • Rob Harrop will be talking about integrating Haskell components into an existing system via message passing over the network. I hear more and more people structuring complex systems as separate processes that talk to each other with messaging interfaces. Once you have process boundaries, then using a different language for each piece becomes easy. Translating 1 million lines of code to Haskell isn't usually an option, but prototyping one new process might be - this approach seems like a great gateway for Haskell.

I look forward to seeing some of you there! Register now (discount code HASKELLX-2012-TE1).

Monday, September 03, 2012

Haskell eXchange 2012

Skills Matter are having a 1-day Haskell conference in London on October 10th:

Haskell Exchange 2012

The conference is dedicated to practical Haskell, not academic Haskell. Each session will teach you about how Haskell is used in the real world, and include techniques you can apply to your Haskell programming. Speakers include Simon Peyton Jones, Simon Marlow, Lennart Augustsson, Duncan Coutts, Blake Rain and Rob Harrop. Topics include types, parallelism, high-performance concurrency, LLVM, EDSLs, Yesod and Cloud Haskell.

Registration is already open, and the early registration discount applies until 9th September (6 more days). I've been working with Skills Matter since May developing the program, and am very happy with the end result. I'll be there with 69% probability, and am really looking forward to it.

Wednesday, August 29, 2012

Shake User Feedback

I'll be at ICFP 2012 in a few weeks, talking about Shake. One thing I'd like to include on the slides are some information about how Shake is used by other people. Do you use Shake? If so:

  • What is the project? How do you use Shake?
  • Any comments on Shake, what you like, what you didn't?

Please either leave a comment on the blog, or email me - I'm mainly looking for snippets to put on slides. As always, if you have any other feedback, I'm always interested (this applies to any of my projects).

Shake Updates

Since I'm writing about Shake, I've made a number of enhancements in the darcs repo, which will be included in the next version. These changes include:

  • Faster database serialisation
  • Faster modification time checking
  • GHC 7.6 compatibility
  • Significant improvements to profiling output
  • Improved error messages

Anyone is welcome to try the darcs version out, but I may change the database format again before the final release (causing complete rebuilds), so I don't particularly recommend it - these are all refinements, not rewrites.

I look forward to seeing some Shake users in Copenhagen!

Tuesday, August 07, 2012

Standard Chartered is hiring

Standard Chartered, the bank where I work, is hiring – there are two vacancies for functional programmers in either London or Singapore. We’re looking for programmers to use Haskell full time (or more specifically, the Mu Haskell dialect), with the results used by many people, often the very next day. We write compilers, libraries, applications, web servers, DSLs and much more besides. All of the work is for use in house, and is usually geared towards finance, but no finance background is required. Standard Chartered has been using Haskell for over four years, and has hired lots of well-known Haskell programmers, including Lennart Augustsson, Ravi Nanavati, Malcolm Wallace, Roman Leshchinskiy and Don Stewart.

We’re looking for proper hackers – people who can quickly write top-quality code in response to what our users need. We have a large existing Haskell code base, which lots of people are continually developing, so experience with larger Haskell projects would be a plus. To apply, send a CV to neville.dwyer AT sc DOT com, and make sure the CV includes links to anything you’ve written on programming (Twitter, StackOverflow, blogs, academic papers) and links to any open-source software you may have written (github, hackage). If you have any more general questions feel free to email me.

Sunday, July 29, 2012

Adding the Lens derivation to Derive

Derive is a tool for generating code from Haskell data types - you specify the data type, and it automatically derives code, often instance declarations. I was asked how to add derivations for the data-lens library, and this post is the answer. I have also released derive-2.5.10 which supports Lens derivations. The Derive tool should contain derivations for as many patterns as possible, and I welcome new contributions.

Step 1: Checkout and run Derive

The first step is to fetch the code with darcs:

$ darcs get http://community.haskell.org/~ndm/darcs/derive

Then follow the instructions in the README.txt file to regenerate the derivations locally and run the test suite:

$ cd derive
$ ghci
> :main --generate
> :reload
> :main --test

Like all my projects, Derive contains a .ghci file which sets up include paths and other useful utilities for working with each package. We first run :main --generate which autogenerates all the boilerplate in the Derive tool. We then run :reload to reload the freshly generated code, and finally :main --test which runs the test suite. In order to run the test suite it is likely that you will need to install some packages, see derive.html for a list of the likely packages.

Step 2: Add the Lens derivation

The first step when adding a derivation is to find a similar derivation (there are 35 existing derivations, so there is probably something reasonably close) and copy it. For the Lens derivation, it turns out that Ref is quite similar, so we copy Data/Derive/Ref.hs to Lens.hs and start modifying. There are several sections we need to tweak.

Firstly, we update the Haddock comments at the top of the file. These comments are purely for the users of our library, and should explain how to use the derivation.

Secondly, we modify the test section. This section is in comments just below the module header. We add import "data-lens" Data.Lens.Common then, working from the data types listed in Data/Derive/Internal/Test.hs, we modify the test to match what we expect as the output. This test section is tested with :main --test and contributes to the documentation (mainly the import line).

Thirdly, we modify the top-level names in this module, so the module name is Data.Derive.Lens and the main function is makeLens.

Step 3: Test and fix

When developing new derivations I follow test driven development. The tests are written, but the code has not been changed from the Ref derivation, so we expect the tests to fail. We follow the three standard steps in GHCi and :main --test complains with:

*** Exception: Results don't match!
Expected:
lensSpeed :: Lens Computer Int
lensSpeed = lens speed (\ x v -> v{speed = x})

Got:
refSpeed :: Ref Computer
refSpeed = Ref{select = speed, update = \ f v -> v{speed = f (speed v)}}

As expected, the test fails, showing us that the copied Ref code does not do what we want the Lens code to do. Modifying the code, it takes a few minutes to arrive at:

lensSpeed :: Lens Computer
lensSpeed = lens speed (\ x v -> v{speed = x})

The result is not quite right - the Int argument is missing from Lens, but so far we have been renaming and juggling existing code. Now we need to find the type of the field at hand, given the name of the field and the DataDecl of the type (DataDecl is from the haskell-src-exts library). Hunting around some of the Derive utility modules, particularly Language.Haskell, we can come up with:

typ = tyApps (tyCon "Lens") [dataDeclType d, fromBangType t]
Just t = lookup field $ concatMap ctorDeclFields $ dataDeclCtors d

We rerun :main --test and the test passes. This command checks that the examples match, then checks that the result of the derivation type-checks for a larger range of examples. We have now added Lens derivations to Derive.

(If you are lucky, and your derivation can be added by example, then you might not have to write any code at all - simply modifying the test case automatically generates the right code. See the Eq type class for such an example.)

Step 4: Final test

While we have satisfied the test suite, it is always reassuring to run some tests by hand. Using the Example.hs file in the root of the repo we can try:

> :main Example.hs --derive=Lens

This command prints out the expected result:

lensMemory :: Lens Computer Int
lensMemory = lens memory (\ x v -> v{memory = x})

lensSpeed :: Lens Computer Int
lensSpeed = lens speed (\ x v -> v{speed = x})

lensWeight :: Lens Computer Int
lensWeight = lens weight (\ x v -> v{weight = x})
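For reference, those lenses correspond to a data type along the lines of the following (reconstructed from the output, so possibly not the exact Example.hs definition):

data Computer = Computer
    {memory :: Int
    ,speed :: Int
    ,weight :: Int
    } deriving Show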

Step 5: Send upstream

Before sending a patch, update CHANGES.txt with a one-line summary of what you have done, then you should see:

$ darcs add Data\Derive\Lens.hs

$ darcs w -s
M ./CHANGES.txt +1
M ./Data/Derive/All.hs -1 +2
A ./Data/Derive/Lens.hs
M ./derive.cabal +1
M ./derive.htm +1

Create a patch (using the standard darcs workflow) and send it to me. The more derivations the merrier.

Sunday, July 08, 2012

Shake ICFP paper

My ICFP 2012 Shake paper is now online: Shake Before Building - Replacing Make with Haskell. From the abstract:

Most complex software projects are compiled using a build tool (e.g. make), which runs commands in an order satisfying user-defined dependencies. Unfortunately, most build tools require all dependencies to be specified before the build starts. This restriction makes many dependency patterns difficult to express, especially those involving files generated at build time. We show how to eliminate this restriction, allowing additional dependencies to be specified while building. We have implemented our ideas in the Haskell library Shake, and have used Shake to write a complex build system which compiles millions of lines of code.

There are two primary sources of documentation for Shake, the ICFP paper (as above) and the package documentation. The ICFP paper covers the theory, including how Shake relates to other tools (specifically make) and general remarks about how Shake is designed/implemented and how you can build things on top of it. The package documentation gives concrete examples of using the package and an exhaustive list of all functions available.

Sunday, June 17, 2012

Shake Storage Layer

Summary: Shake maintains metadata as the build progresses. This metadata must remain up to date and consistent even if the process is killed. This post describes my new strategy for doing that.

Shake is an advanced build system, and in common with nearly all advanced build systems, it maintains extra metadata about rules - when the rule was last run, what the dependencies were, how long it took etc. If the metadata associated with a rule is not available, the rule must be rerun, which is often expensive. Any build system is likely to be interrupted on a regular basis - both due to failing rules (compile errors) and the user aborting a build. As a result, it is important that the metadata is robustly stored to disk as soon as it is produced.

In this post, I outline the old solution to maintaining metadata, along with the new solution available in shake-0.3, which I just released. The new solution has a number of benefits:

  • Reduces time loading/saving metadata by up to 75%. In practice this is unlikely to make a significant difference unless no rules need running.
  • Exceptions at any point will not cause file handles to be left open.
  • Previously there were very small windows where if the process died suddenly all metadata would be corrupted. These have been eliminated.
  • I removed all knowledge of the build system from the storage layer, making it properly decoupled.

Most of these improvements have been driven by people using Shake in new ways. When used as a replacement for Make, with one invocation per run, many of these issues are theoretical. Now people are running Shake in background threads and forcibly killing and restarting it on a regular basis, these issues can be observed in practice. However, the improvements will benefit everyone.

The Old Solution

The old solution has remained basically the same since the very first version of Shake, over three years ago. Shake maintains two files - the database contains the metadata, while the journal contains a list of metadata updates that can be appended to. The sequence of steps is:

  • Load the database
  • If the journal exists then:
    • Replay the journal into the database
    • Save the database
    • Delete the journal
  • Run the build, storing any updates to the journal
  • Save the database
  • Delete the journal

This solution works well, but has a couple of flaws. Whenever we save the database, if it gets corrupted half-way through, we lose the entire database, causing the build to start from scratch. Another problem is that if we are building nothing, we read in all the metadata, then write it all out again with only one single modification (incrementing the build time step). Since serialisation takes 3x longer than deserialisation (in benchmarks on the Shake metadata) about 75% of the time associated with the metadata is wasted. Even when we have made many updates, the data is already stored in the journal, so rewriting the database is not strictly necessary.

The New Solution

The new solution keeps a single database, containing a list of key/value pairs, which can be appended to. At certain points a backup file is made, simply a copy of an existing database. The sequence of steps is:

  • If the backup file exists, delete the database and use the backup file
  • Read all records from the database
  • Put the records into a Map
  • If the Map is significantly smaller than the number of records then
    • Rename the database to the backup
    • Resave the database
    • Delete the backup
  • Run the build, storing any updates to the database

In this method we never save the data after a successful run, but just close the file handles. The database accumulates key/value pairs, but only the last value associated with any key in the database is useful - earlier values are ignored. At some point the database will contain a significant number of keys that are no longer useful, and at that point we rewrite the database, taking care to make a backup before starting.
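A sketch of that replay logic (simplified, with an invented compaction threshold - the real code lives in Development.Shake.Storage):

import qualified Data.Map as Map

-- Replay an append-only record list: the last value per key wins.
-- If the live map is much smaller than the record count, the file
-- contains mostly dead data and is worth rewriting.
replay :: Ord k => [(k, v)] -> (Map.Map k v, Bool)
replay records = (m, Map.size m * 2 < length records)
    where m = Map.fromList records  -- fromList keeps the last value per key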

This post outlines the general steps, omitting details such as version stamps and consistency checks, which are highly important for a robust build system. These details are taken care of in the full implementation, available in the source as Development.Shake.Storage, which runs to about 100 lines.

Monday, June 04, 2012

The Flavours of MVar

Update: These functions have been cleaned up, improved with respect to error conditions and async exceptions, and put in the extra package.

The MVar is a flexible and powerful locking primitive, used extensively in Haskell. An MVar is like a box which is empty (has zero elements inside) or full (has one element inside). You block when trying to take from an empty MVar or put to a full MVar. On top of MVars, lots of interesting concurrent programs can be written. However, with such a flexible mechanism, there is scope for confusion. Every MVar can block on either a take or a put, but for any individual MVar it is likely you expect it to block on only one of those operations. In my programs I usually restrict my MVars to one of three flavours, each of which is described below.
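
As a tiny demonstration of the underlying primitive (a sketch, not from the original post), taking from an empty MVar blocks until some other thread puts:

import Control.Concurrent

main :: IO ()
main = do
    m <- newEmptyMVar            -- empty, so a take will block
    forkIO $ putMVar m "hello"   -- fill it from another thread
    putStrLn =<< takeMVar m      -- unblocks once the put happens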

Lock


The Lock guarantees single-threaded access, typically to some system resource.

-- Imports assumed by all the snippets in this post
import Control.Concurrent
import Control.Monad

type Lock = MVar ()

newLock :: IO Lock
newLock = newMVar ()

withLock :: Lock -> IO a -> IO a
withLock x = withMVar x . const

And as an example:

lock <- newLock
let output = withLock lock . putStrLn
forkIO $ do ...; output "hello"
forkIO $ do ...; output "world"

Here we are creating a lock to ensure that when writing output our messages do not get interleaved. This use of MVar never blocks on a put. It is permissible, but rare, for a withLock to contain another withLock on a different Lock - but if so, watch out for deadlocks. Taking the same Lock twice always deadlocks, since MVars are not reentrant.
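
As an illustration of the ordering pitfall (a hypothetical sketch - transfer and the lock names are not from the original post):

-- Nested locks are fine if every thread takes them in the same order
transfer :: Lock -> Lock -> IO () -> IO ()
transfer lockA lockB act = withLock lockA $ withLock lockB act

-- If one thread runs 'transfer lockA lockB ...' while another runs
-- 'transfer lockB lockA ...', each may hold one lock and wait forever
-- for the other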

Var


The Var operates on a mutable variable in a thread-safe way.

type Var a = MVar a

newVar :: a -> IO (Var a)
newVar = newMVar

modifyVar :: Var a -> (a -> IO (a, b)) -> IO b
modifyVar = modifyMVar

modifyVar_ :: Var a -> (a -> IO a) -> IO ()
modifyVar_ = modifyMVar_

readVar :: Var a -> IO a
readVar = readMVar

And as an example:

hits <- newVar 0
forkIO $ do ...; modifyVar_ hits (return . (+1)); ...
i <- readVar hits
print ("HITS",i)

Here we have a variable which we modify atomically, so modifications are not interleaved. This use of MVar never blocks on a put. No modifyVar operation should ever block, and they should always complete in a reasonable timeframe. A Var should not be used to protect some external resource, only the variable contained within. Information from a readVar should not be subsequently inserted back into the Var - another thread may have modified the Var in between, and its update would be lost; use modifyVar to read and write in one atomic step.
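
To see why that last rule matters, consider the hits counter from above (a sketch of the race, not from the original post):

-- Wrong: another thread can update hits between the read and the write,
-- and that update is then lost
i <- readVar hits
modifyVar_ hits $ \_ -> return (i + 1)

-- Right: read and write in a single atomic step
modifyVar_ hits (return . (+1))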

Barrier


A barrier starts with no value, is written to once, and read one or more times.

type Barrier a = MVar a

newBarrier :: IO (Barrier a)
newBarrier = newEmptyMVar

signalBarrier :: Barrier a -> a -> IO ()
signalBarrier = putMVar

waitBarrier :: Barrier a -> IO a
waitBarrier = readMVar

And as an example:

bar <- newBarrier
forkIO $ do ...; val <- ...; signalBarrier bar val
print =<< waitBarrier bar

Here we create a barrier which will contain some computed value. A thread is forked to fill the barrier, while the main thread waits for it to complete. A barrier has similarities to a future or promise from other languages, has been known as an IVar in other Haskell work, and in some ways is like a manually managed thunk. It is an error to signal a barrier more than once and a deadlock to never signal it. Since the barrier goes from empty to full, it never blocks on a put, unless you incorrectly call signal more than once.

Combining MVar Flavours - Once


The previous three MVar wrappers are the flavours of MVar which I use regularly. These can be combined into higher-level abstractions specific to certain situations. I now give two examples, intended to show how to combine these primitives.

The once function takes an action, and returns a new action. If the new action is never called, the argument action will never be executed; no matter how many times the new action is called, the argument action will be executed at most once. We can write this function as:

{-# LANGUAGE ScopedTypeVariables #-} -- for the pattern type signature on var

once :: forall a . IO a -> IO (IO a)
once act = do
    -- Nothing = not yet started; Just b = started, wait on b for the result
    var :: Var (Maybe (Barrier a)) <- newVar Nothing
    return $ join $ modifyVar var $ \v -> case v of
        Nothing -> do b <- newBarrier; return (Just b, do x <- act; signalBarrier b x; return x)
        Just b -> return (Just b, waitBarrier b)

Here we create a variable to store the result, whose state is either Nothing (we have not yet started computing) or Just a barrier (we have started computing, use this barrier to get the result out). I have found 'join $ modifyVar' is a common idiom, used to defer a blocking action (often waitBarrier) until after a modifyVar has completed, ensuring we preserve our invariant of not blocking inside a modifyVar. When running the resulting action, if the variable is a Nothing we create a new barrier, store it, and then start an action (after leaving the modifyVar) to compute the result, signal the barrier and return. If we already have a barrier, we just wait for this barrier.
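
As a usage sketch (the action and values are illustrative):

compute <- once $ do putStrLn "computing"; return 42
print =<< compute   -- prints "computing" then 42
print =<< compute   -- prints just 42; the argument action is not rerun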

[Note that you can implement once directly in terms of a single MVar, but doing so violates the simple rules of the restricted flavours - and for good reason: the MVar's empty state would have to mean both locked for atomic access and computation in progress.]

Combining MVar Flavours - Queue


As another practical example of using these restricted MVars, let us consider a special kind of queue. Messages arrive individually, but are collected in bulk. When someone tries to retrieve messages, if there are any messages waiting they are sent immediately. If there are no messages, the read blocks until either a message arrives or until a new reader arrives, in which case the old reader is sent away with nothing. This can be implemented as:

type Queue a = Var (Either [a] (Barrier [a]))

arrive :: Queue a -> a -> IO ()
arrive q x = modifyVar_ q $ \s -> case s of
    Left xs -> return $ Left $ xs ++ [x]                 -- no reader waiting: queue the message
    Right b -> do signalBarrier b [x]; return $ Left []  -- wake the waiting reader

collect :: Queue a -> IO [a]
collect q = join $ modifyVar q $ \s -> case s of
    Left xs@(_:_) -> return (Left [], return xs)         -- messages waiting: take them all
    _ -> do
        -- send away any earlier reader with nothing, then wait ourselves
        case s of Right b -> signalBarrier b []; _ -> return ()
        b <- newBarrier
        return (Right b, waitBarrier b)

The type of Queue tells us most of what we need to know about the invariants - Queue has a mutable state, which is either Left (zero or more messages waiting) or a Right (someone waiting to collect messages). If we had used MVar instead of both Var and Barrier, the invariant and design would be far less clear. With these invariants clearly stated, the code just follows directly.
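
The post does not include a constructor, but the starting state is clearly an empty message list, so a sketch would be:

newQueue :: IO (Queue a)
newQueue = newVar (Left [])

And as an example:

q <- newQueue
forkIO $ do ...; arrive q "message"
print =<< collect q   -- returns waiting messages, or blocks until one arrives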

Creating New Flavours


I find the three MVar wrappers (Lock, Var, Barrier) much easier to understand since the rules are simpler, making maintenance easier. I have also found that most projects benefit from higher-level abstractions in some places. As an example, I defined Queue in one recent project, and Shake defines a Resource type, on top of which the resources feature is implemented. Concurrency is hard, but robust abstractions split the complexity, and thus simplify the programs.
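
As a flavour of such an abstraction, here is a much-simplified sketch of a quantity-limited resource in the style of the above - illustrative only, and not Shake's actual Resource implementation:

-- (units free, threads waiting for a unit)
type Resource = Var (Int, [Barrier ()])

newResource :: Int -> IO Resource
newResource n = newVar (n, [])

acquire :: Resource -> IO ()
acquire r = join $ modifyVar r $ \(free, waiting) ->
    if free > 0
        then return ((free - 1, waiting), return ())
        else do b <- newBarrier; return ((free, waiting ++ [b]), waitBarrier b)

release :: Resource -> IO ()
release r = modifyVar_ r $ \(free, waiting) -> case waiting of
    [] -> return (free + 1, [])
    b:bs -> do signalBarrier b (); return (free, bs)   -- hand the unit straight over

Note how acquire uses the join $ modifyVar idiom to defer waitBarrier until after the modifyVar has completed, exactly as in once.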

Sunday, June 03, 2012

Hoogle Update

Summary: I just updated the Hoogle website. It looks nicer on the iPhone and the source is on GitHub.

The Website

The Hoogle website is (as always) at http://haskell.org/hoogle. I've just uploaded a fresh Hackage index (currently a manual operation, but one I'm intending to automate imminently). I've also made a number of improvements for anyone using Hoogle on the iPhone - to simulate the iPhone experience click here.

The Source

The source code has moved to GitHub: https://github.com/ndmitchell/hoogle. While darcs is a much nicer version control system than Git, GitHub offers a lot of nice features, so I'm using Hoogle as an experiment. I've been promised that projects on GitHub get lots of contributions, so now I wait!

I'm leaving the bug tracker on Google Code for the moment, and am considering where the Hoogle manual should live, but a GitHub wiki site is currently looking likely.

Monday, March 26, 2012

Finding Two unordered-containers Bugs

Summary: Over the past few weeks I found two bugs in unordered-containers. I strongly recommend everyone upgrades to unordered-containers-0.2.1.0 or later, which fixes these bugs.

Both Shake and Uniplate use the unordered-containers library, which provides fast Map and Set types. Both my libraries had bugs reported against them which turned out to be introduced by unordered-containers-0.2.0.0. As soon as these bugs were identified Johan released a new version (0.2.1.0) fixing the bugs, and both my packages now work as expected.

Bug 1: Shake breaks because delete fails

Just over two weeks ago Evan Laforge reported he'd started seeing "thread blocked indefinitely in an MVar operation" if there was a compile error. Shake goes to a lot of effort to clean up nicely, so something was going wrong. Since Evan's project compiles on Linux, and I only have access to Windows machines, I tried replicating it on one of my projects. I perturbed the project in several ways and did manage to replicate the error once, but the same perturbation next time made it succeed - something was clearly non-deterministic (which is the norm in a highly concurrent program such as Shake).

I'd been wanting to write a random build system generator and tester for a while, to thoroughly test Shake, and used this bug as motivation. I wrote the tester, and usually within 5 random runs (about 60 seconds of churning) the error was reproduced on my desktop. However, the error wouldn't reproduce on my laptop even after an hour (because I was on an older version of unordered-containers, I now know), and as I was off travelling I had to put the bug away to look into on my return.

On my return I investigated further. Figuring out the command line to run the random test took about an hour (note to self - write these things down). Once I had that, it still reproduced reliably in about 60 seconds. I cut out the random parts of the test that I didn't think were contributing to the bug (tests for building successfully, minimal rebuilding etc) and cut the reproduction time down to about 5 seconds. I then started augmenting the code with print statements. The cleanup code for exceptions is almost exclusively within the thread pool implementation (see Development.Shake.Pool if you are reading the source). In particular, the code that cleans up after an exception is:


t <- myThreadId
-- kill every thread in the pool except ourselves
mapM_ killThread $ Set.toList $ Set.delete t $ threads s
-- tell whoever called the thread pool that we failed with exception e
signalBarrier done $ Just e


This code finds all threads in the thread pool, with the exception of this thread (the Set.delete t), and kills them. After that has finished, it signals to the thread that first called the thread pool that everything has finished with an exception.

Adding trace statements at every step showed that the exception started being cleaned up, a handful of threads were killed, but not all the threads were killed and the barrier was never signaled. An initial guess was that my thread was being killed, and thus that Set.delete had failed to remove t. Since I had debugged a separate unordered-containers bug a week before (bug 2 started after and finished before this one), I went to the unordered-containers commit list and immediately found a bug fixed against delete. I upgraded, and ran my random tester for 3 hours without failure.

Bug 2: Uniplate breaks because filter fails

The Uniplate bug was reported just over a week ago, while I was travelling, by Aleksey Khudyakov. This bug report included a relatively small test case demonstrating a clear bug, along with the vital piece of information that the test case worked with unordered-containers-0.1, but not 0.2. With that information the question became whether Uniplate was using some non-guaranteed behaviour (such as assuming the result from Set.toList is ordered), or whether unordered-containers had a bug. The supplied test case was very sensitive - it involved three modules, and simply changing any module name caused the bug to disappear. The code was also exercising a very sensitive and complex part of Uniplate. After 12 hours of cooperative debugging between myself, Aleksey and Johan, I came up with the reduced test case:


let broken = Set.filter (`elem` good) $ Set.fromList $ good ++ bad
    working = Set.fromList $ Set.toList broken
print $ Set.member (head good) broken
print $ Set.member (head good) working


This printed False/True, when it should print True/True. Of course, the choice of the good and bad lists is critical, and was based on the Hashable instance for TypeRep, which uses the fully qualified type name, and thus changes if you rename the modules.

Once I had provided Johan with the reduced test case, it was fixed the same day.