Thursday, December 12, 2013

Progress Reporting in Shake

Summary: Shake can predict you how long your build will take. The prediction makes use of stateful applicative streams.

The Shake build system can predict how long your build will take. As an example, on Windows 7 I see:

The progress message is displayed in the titlebar of the console window (on Windows, Linux and Mac), for example 3m12s (82%) to indicate that the build is 82% complete and is predicted to take a further 3 minutes and 12 seconds. If you are running Windows 7 or higher and place the shake-progress utility somewhere on your %PATH% then the progress will also be displayed in the taskbar progress indicator.

Progress reporting is great for end users of Shake based build systems. They may not care that Shake is fast (since they probably have no frame of comparison for your particular build), or that it is robust, or that it is fully featured etc. But everyone wants to know if they should grab a coffee or just engage in some quick sword-fighting.


Predicting the future is tricky. Shake is a very dynamic build system, so unlike Make, while running it can both discover new things to do, and decide that things it previously intended to do are unnecessary. It also runs rules in parallel, and each rule takes a different amount of time. All these factors make the progress a "best guess", rather than a statement of fact. In particular:

  • The first run (or after deleting the Shake database) will have worse predictions, as Shake has no idea how long each rule takes to execute.
  • In the first few seconds of a build, the predicted progress may vary more dramatically, as Shake is still determining how much is left to do and how fast things are going.
  • If you have a small number of long executing rules (e.g. taking minutes) then progress changes will be very granular, as no attempt is made at predicting in-progress rules.

These limitations aside, the feature has changed the way I work, and is valuable to many people. One user remarked that "showing the estimated build time remaining in the terminal window's title is genius".

Turning the feature on

Using Shake 0.10.2 or above (from March 2012), you can pass the command line flag --progress or the set the ShakeOptions field shakeProgress to progressSimple. Using the shakeProgress field you can customise how the progress messages are calculated and displayed, for example you can generate annotations for the Team City continuous-integration tool.

Collecting the data

The primary goal for this feature was to have zero overhead for people not using it, and to not complicate the internals of the build system. I achieved that goal by making progress generation take the internal Shake state every 5 seconds (configurable, using shakeProgress) and generate summary statistics from the state. The result is the Progress data type (to follow the rest of this post, you do not need to understand the actual fields below):

data Progress = Progress
    {isFailure :: Maybe String -- ^ Starts out 'Nothing', becomes 'Just' a target name if a rule fails.
    ,countSkipped :: Int -- ^ Number of rules which were required, but were already in a valid state.
    ,countBuilt :: Int -- ^ Number of rules which were have been built in this run.
    ,countUnknown :: Int -- ^ Number of rules which have been built previously, but are not yet known to be required.
    ,countTodo :: Int -- ^ Number of rules which are currently required (ignoring dependencies that do not change), but not built.
    ,timeSkipped :: Double -- ^ Time spent building 'countSkipped' rules in previous runs.
    ,timeBuilt :: Double -- ^ Time spent building 'countBuilt' rules.
    ,timeUnknown :: Double -- ^ Time spent building 'countUnknown' rules in previous runs.
    ,timeTodo :: (Double,Int) -- ^ Time spent building 'countTodo' rules in previous runs, plus the number which have no known time (have never been built before).

This structure counts the number of rules in each state, and where available, provides execution times. It reflects the information easily computable by the internals of Shake.

Computing the progress

To compute the progress I first spent a long time measuring real builds, and the Progress values, and trying to detect some correlation. The results were disappointing, and I got nowhere - the progress jumps around too much (I tried Kalman filters etc, but they didn't help). I then thought hard and decided that the best predictor of completion time is:

(number-of-rules-to-build * time-to-build-each-rule) / work-rate

So you take:

  • The number of things you have to do.
  • The time you expect each one to take.
  • How fast you are running rules, which you would expect to be roughly the -j parallelism argument.

This measure does not attempt to predict if rules are skipped or discovered, but these are unpredictable events, so ignoring them is sensible (and likely why the data-driven approach fell down). These measures can be computed by:

  • number-of-rules-to-build is known, and stored as countTodo.
  • time-to-build-each-rule is influenced by timeTodo, comprising the time to build some of the rules, and the number of rules that have never been built before. We can therefore compute the time-to-build-each-rule from the times we have, and predicted-time-to-build for the times we are missing.
  • predicted-time-to-build can be computed by taking the time things took to build this run (timeBuilt) and dividing by the number of things that rebuilt (countBuilt). Given the available data, there are a number of plausible alternatives for calculating predicted-time-to-build.
  • work-rate is roughly the parallelism flag, but for laptops, may be skewed by things like whether the machine was using battery power. We compute the work rate by tracking how long we have been building, and how much timeBuilt time we have managed to execute.

Using these definitions, given the Progress structure and a Double for the number of seconds the build has been running, we can define:

progressMessage :: Progress -> Double -> String

Stateful progress

The formulae above give a reasonable approximation of progress time, but they have a number of weaknesses:

  • At the beginning we have completed no work, so timeBuilt is zero, and therefore work-rate is zero.
  • If we are running a long-running rule (e.g. 15s) then the work-rate keeps going down, as timeBuilt only increases at the end, but the time building keeps going up.
  • Builds often go in phases, meaning that unknown rules at the beginning are typically fast code generators, in the middle are typically compilers, and at the end are slow linkers. The average of all rules is often too small by the end.

Many of these weaknesses are due to taking the current state and computing the progress, without incorporating some information from previous progress calculations. We need to redefine our progressMessage function as:

progressMessage :: ProgressState -> Progress -> (ProgressState, String)

Now we take some ProgressState (which can include the time spent building), and in computing the message, we produce some new state, which will be used to produce the next message. For example, ProgressState can include some decaying value of predicted-time-to-build, where the new value is a weighted average between the old value and the new value, allowing new values to dominate.

Streamed progress

My first stateful progress predictor made use of an explicit state, but it's hard to thread around, and doesn't feel very Haskelly. It also made minor local edits to the definitions require global state modifications, and significantly hampered tweaking the definitions. The solution was to introduce an abstraction. I played with a number of variants, but eventually settled on:

newtype Stream i a = Stream {runStream :: i -> (a, Stream i a)}

A Stream can be supplied with an input value of type i and produces an output value a and a new Stream. In our case, we will have a value of type Stream Progress String, and provide new Progress values to get out progress messages. Any stateful information can be captured in the closure. (This definition is unrelated to other Stream definitions in Haskell, which are usually just infinite lists.)

I define four primitive operations on Stream, namely:

  • pure from the Applicative class, which produces a new Stream which, regardless of the input, always produces the same output.
  • <*> from the Applicative class, which applies a Stream of functions to a Stream of values to produce a Stream of results.
  • idStream :: Stream i i which is the stream which always echoes its input as its output. This primitive is the only way to get at the input.
  • foldStream :: (a -> b -> a) -> a -> Stream i b -> Stream i a, which is like scanl but for streams, keeping an accumulating a value. This primitive is the only one to keep any state between successive stream outputs.

On top of these primitives I define a number of useful auxiliaries. As an example, oldStream produces a stream where each value is a pair of the previous value and the new value. We can define oldStream as:

oldStream :: a -> Stream i a -> Stream i (a,a)
oldStream old = foldStream (\(_,old) new -> (old,new)) (old,old)

Using these primitives we can build up complex definitions such as:

decay :: Double -> Stream i Double -> Stream i Double -> Stream i Double
decay f a b = foldStream step 0 $ (,) <$> oldStream 0 a <*> oldStream 0 b
    where step r ((a,a'),(b,b')) = ((r*b) + f*(a'-a)) / (b + f*(b'-b))

Here we are computing a / b, but using f to decay the old value, where 1 is no decay. This definition maintains a lot of state between steps of the stream, but thanks to the Stream abstraction, we can focus on the mathematics, instead of being lost in a sea of state.

Using these functions we can start to calculate the progress, for example:

unknownRuleTime = decay 10 (timeBuild <$> progress) (fromInt . countBuild <$> progress)

Here unknownRuleTime is the time spent building, divided by the number of rules that have been built, decaying at a fairly rapid rate, meaning more recent rules will have a greater impact on unknownRuleTime.

The entire progress generator takes 37 lines, and has some of the highest comment densities in Shake. The complete Stream abstraction and progress prediction code can be found in the git repo. I would be surprised if someone has not already discovered Stream, given it a name, and investigated its properties - any pointers most welcome.

Update: I have renamed Stream to Mealy, following the machines package.


Brent said...

Cool stuff! Your Streams are also called "mealy machines". See e.g. (though I confess I have never been able to figure out how to use the machines package).

Neil Mitchell said...

Brent: Thanks for the pointer, I think their unfoldMealy is a generalisation of my foldStream, which might turn out to be easier to define some of my functions in terms of.

Unknown said...

I'm not familiar with a package that provides just your Stream abstraction, but variants of it are extremely common. They underlie Yampa and Netwire, for example (it's more apparent in older releases). And for that matter, the iteratee-style packages do something quite similar.

The fundamental problem (from my perspective) is that they're recursive data structures, which ghc doesn't optimize as well as one would like. But nevertheless they're a powerful abstraction.

Edward Kmett said...

You can get GHC to optimize that sort of Stream much more efficiently using an explicit existential.

I talked a bit about that here:

Neil Mitchell said...

I happen to be lucky that this isn't in any way performance critical, far more important to be concise.

I have now renamed Stream to Mealy in my code, to match the machines package.