Sunday, February 17, 2013

Finite Resources in Shake

Summary: Management of finite resources is an important part of any modern build system, only properly available in Shake and Ninja.

I've just released Shake 0.9, a build system library, with a few bug fixes and a bunch of new features (the change log has a complete list). This release contains an incompatible change which makes the Resource feature easier to use, so I thought I'd describe the motivation and use of Resources in Shake. A full upgrade guide is at the bottom of this post.

What are Resources for?

When you run -j10 (shakeThreads=10) you are asking the build system to limit computation so it uses no more than ten CPU resources at a time. The CPU is certainly a precious resource, but there are other resource limitations a build system may need to obey:

  • Some APIs are global in nature, if you run two programs that access the Excel API at the same time things start to fail.
  • Many people have large numbers of CPUs, but only one slow rotating hard drive. If you run ten hard-drive thrashing linkers simultaneously the computer is likely to grind to a halt.
  • Some proprietary software requires licenses, a fixed number of which can be purchased and managed using a license manager. As an example, the Kansas Lava team only have access to 48 licenses for modelsim.

Resources in other build systems

I know of two approaches used by other build systems to obey resource constraints:

  • Limit the number of CPUs to hit your target - for example, the Lava build system could cap the number of CPUs to the number of licenses. People with 24 CPUs might ask the build system to use only 8, so the linkers do not make their machines unusable (and even then, a link heavy rebuild may still harm interactive performance). This solution wastes CPU resources, leaving CPUs that could be building your code idling.
  • Add locks to suspend jobs that are competing for the shared resource. For example any rule using Excel could take the Excel lock, either a mutex/MVar in some build systems, or creating a file to serve as the lock in make based build systems. Locking can be made to work, but is tricky if you have to fake locks using the file system, and still squanders CPU resources - instead of blocking the CPU should be running another rule.

The one exception is the Ninja build system which has a concept of "pools", which properly model finite resources.

Resources in Shake

In Shake the Resource type represents a finite resource, which multiple build rules can use. Resource values are created with newResource and used with withResource. As an example, only one set of calls to the Excel API can occur at one time, therefore Excel is a finite resource of quantity 1. You can write:

shake shakeOptions{shakeThreads=2} $ do
    want ["a.xls","b.xls"]
    excel <- newResource "Excel" 1
    "*.xls" *> \out ->
        withResource excel 1 $
            system' "excel" [out,...]

Now we will never run two copies of excel simultaneously. Moreover, it will never block waiting for excel if there are other rules that could be run.

For most programming languages the compiler is CPU bound but the linker is disk bound. Running 8 linkers will often cause an 8 CPU system to grid to a halt. We can limit ourselves to 4 linkers with:

disk <- newResource "Disk" 4
want [show i <.> "exe" | i <- [1..100]]
    "*.exe" *> \out ->
        withResource disk 1 $
            system' "ld" ["-o",out,...]
    "*.o" *> \out ->
        system' "cl" ["-o",out,...]

Now we can use 7 or 8 CPUs while still leaving the computer responsive enough to browse the web.

Software licenses are another finite resource and can be managed in the same way. For a complete example see the Kansas Lava test program, which uses Shake.

Porting from Shake 0.8

In Shake 0.9 the newResource function has been renamed to newResourceIO - rename newResource to newResourceIO everywhere and your code will work again.

However, you may have noticed that newResourceIO (as it is now called) forces you to create the resource before calling the shake function, meaning that often the creation and use of the resource are far apart. I have introduced a function newResource which runs in the Rules monad, allowing you to create a resource and then use it nearby. Moving the creation and use of resources closer together makes it much easier to check your resource constraints are met.

The only other breaking change is that shakeVersion has become a String rather than an Int, allowing you to store more precise information about the version (for example, your build system might want to encode the GHC version and the version of the build system in the string).

Updated 18 Feb 2013: Ninja also supports finite resources.

5 comments:

Cédric Venet said...

Maybe a little off topic but thanks for the newCache function. I was thinking that I needed this functionality just yesterday and here it is :) [meaning I don't need to learn about StateT and monad transformer right now]

I suppose that newCache and newCacheIO use the same cache?

Neil Mitchell said...

Cédric: Each call to newCache/newCacheIO creates a brand new cache distinct from all previous ones. So you call 'cache <- newCache' exactly once, then reuse the resulting 'cache' function many times. If you call the functions twice, they use different caches.

This feature is implemented entirely using IO and MVars, so no StateT/monad-transformer stuff.
I'm not sure you could do this in Shake using StateT since Shake locks down the Action and Rules types. You'd need to put it inside the core of Shake.

Neil Mitchell said...

I notice the documentation for newCache isn't great, so I'll clarify your question there as well, and add an example.

Cédric Venet said...

Thanks for the clarification. I didn't take the time to read the function signature clearly. It is much clearer now.

(My understanding of haskell is still far from good, this is why I mentioned StateT, but I can see the problem.)

Neil Mitchell said...

In many other circumstances, StateT would be exactly the right solution. I had to modify the core of Shake to allow an IO function in the Rules monad, so it wasn't trivial (which is part of the reason that adding it to the library is useful).