Thursday, February 07, 2013

A Nofib build system using Shake

Last February I had a few discussions with David Terei about replacing the Nofib benchmark suite build system with something based on Shake. Naturally, like all Make to Shake conversions, I thought this was an excellent idea. In further discussions with Simon Peyton Jones and Simon Marlow I refined the build system to match the useful features of the old Make based system.

I have now put the build system online: Shake build system for Nofib.

Unfortunately, I don't think much has happened with the build system in the last year. I invite people to play with it and use it for whatever purpose they want. The rest of this post is divided into two sections - the first section is how to use the build system (for people running nofib) the second half is how it works and how it was written (for people writing build systems with Shake).

Running the nofib suite

Grab the nofib suite, take the above file and put it in the root directory. You can then run:

$ runhaskell Nofib -j2 imaginary --way="-O2 -fvia-C" --run
... build related output ...
Build completed
Running imaginary/bernouilli...501ms
Running imaginary/digits-of-e1...513ms
... more tests ...
Running imaginary/wheel-sieve2...238ms
Running imaginary/x2n1...27ms

The -j2 is the parallelism to use when building (not when testing), the --way is which flags to pass to the compiler (so it should be trivial to experiment LLVM vs not), the "imaginary" is which section to run, but naming an individual test such as "x2n1" or just leaving it blank for all also works. The --run says run the tests afterwards, defaulting to norm, but --run=fast/--run=slow also works.

The system uses whatever GHC is on your path first, and stores the output under that GHC version, but if you want two GHC's built with differing bits, you can pass --compiler=ghc-with-a-tweak to do the build under a separate directory.

A few more examples, assuming the build system is compiled as runner, with the nofib command and the new equivalent:

Run a single test:

$ cd nofib/imaginary/exp3_8
$ make

runner exp3_8 --run

Run a test with some extra GHC flags:

$ cd nofib/imaginary/exp3_8
$ make EXTRA_HC_OPTS=-fllvm

runner exp3_8 --way="-O1 -fllvm" --run

Run a test with a different GHC:

$ cd nofib/imaginary/exp3_8
$ make HC=ghc-7.0.4

runner exp3_8 --compiler=ghc-7.0.4 --run

Just build a test, don't run it:

$ make NoFibRuns=0


Writing the build system

This build system is well commented, so I encourage people to read the code. It makes use of CmdArgs for command line parsing (lines 60-114) combined with Shake for dependency based tracking (lines 187-281). The total system is 358 lines long, and the rest is mostly pieces for running tests.

When writing the build system I had no idea what the old Makefile system did - I read what docs I could find, took a brief stab and reading the Makefile, but failed to even run it on the laptop I was working on. Therefore, I wrote the new system from scratch, and then looked back to see what the Makefiles did that the new system did not.

As is common is Make based build systems, the Makefile stored a combination of build rules (how to compile files, line 187), static confirmation data (which targets to build, line 31) and dynamic configuration data (what inputs to give when testing, line 289). The build rules and static configuration data can be easily moved into the new build system, either as function rules or top-level constants. The dynamic configuration can be moved over as top-level constants, but that means the build system requires modifying more than it should, and it is harder to track the dependencies on this data. Instead I followed an approach I've used in several build systems, by writing a converter which sucks the dynamic data out of the Makefile and produces a much simpler config file. The converter is in the build system, so the config files are generated from the Makefiles, and updated appropriately. To actually run the tests, I query the config files. The hope is that in future everyone will decide to delete the Makefiles, and the config files can be checked in as source files, and the translation aspect can be deleted.

When writing the build system I ran into a GHC bug that when two linkers ran simultaneously they failed. To work round this bug I created a resource of a linker, with quantity 1 (line 152). Whenever I call the GHC linker I acquire this resource (line 215). Now Shake will run the rest of the build system with maximum parallelism, but ensure no linkers ever run in parallel, working round the bug.


David Terei said...

Hey Neil!

I was actually talking about the Shake alternative with Johan just yesterday! His recent work has re-sparked my nofib interest so I plan on looking at the Shake alternative again likely merging it in... only 13 months later :). Funny timing for this blog post, although we were both probably prompted by the same things.

Neil Mitchell said...

Yes, Simon Marlow pointed me at Johan's stuff, and I realised I had exactly one well commented Shake script in the world and I hadn't shared it widely! Good luck with your efforts.