Monday, February 01, 2016

New HLint version and API features

Summary: In the new version of HLint, errors are warnings and warnings are suggestions. I plan to remove the old API very soon.

I've just released a new version of HLint, the tool for suggesting hints to improve your Haskell code. This version comes with two big changes.

Firstly, hints have been reclassified in severity. What used to be errors are now warnings, and hints that used to be warnings are now suggestions. As people have mentioned in the past, nothing HLint suggested was really an error, and now HLint matches that.

Secondly, there is now an hlint API entry point in Language.Haskell.HLint3 which captures the pattern of running HLint with command-line arguments, but in process as a library. With that, I don't think there are any API patterns that are better captured by the Language.Haskell.HLint API. If no one contacts me with issues, I will be making the module Language.Haskell.HLint a copy of Language.Haskell.HLint3 in the next release.
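
As a rough sketch of how I expect the new entry point to be used (treat the details as illustrative - in particular, the exact type of the result differs between the old and new modules):

import Language.Haskell.HLint3 (hlint)
import Control.Monad (unless)
import System.Exit (exitFailure)

main :: IO ()
main = do
    -- pass exactly the strings you would give to the hlint executable
    hints <- hlint ["src"]
    -- the result is a list of the hints that fired, so fail a CI job if any did
    unless (null hints) exitFailure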

This release (like all minor releases) fixes a few bugs and adds a few new features. As a few random examples, HLint now warns on a redundant DefaultSignatures language extension, suggests fmap in preference to liftM and warns when otherwise is used as a pattern variable. There is a complete change log in the repo.

Tuesday, January 19, 2016

One Bag of Huel

I've been a fan of the idea of meal replacement products for a while - the Matrix gruel always seemed simple and appealing. One such product is Huel, which is available in the UK, seems fairly sensibly thought out, and isn't cadmium-yummy. When having lunch at home, I found myself eating stuff that was nutritionally garbage, inconvenient and time-consuming to get, and not very nice. Given the downsides, Huel seemed worth a try. Having gone through one bag, it seems to be working nicely, and I intend to continue this as my default lunch-at-home plan.

General random thoughts on Huel:

  • My wife pronounces Huel with a strong and long 'u', huuuuuuel, to imitate the sound of someone vomiting. Soylent is probably a better name.
  • I bought a first bag, and they threw in a free flask/shaker and t-shirt. I've been using the flask they supplied, and the t-shirt is very nice.
  • I like that it has all the vitamins and minerals required, so I'm not likely to be missing anything.
  • The taste is OK. Not horrid, perfectly drinkable, but not too nice, so unlikely to go overboard. They got the balance right.
  • Most of the flavouring suggestions seem like someone trolling. The default vanilla flavour works fine for me.
  • The taste/consistency depends a lot on exactly how much water goes in, so experiment a bit (I find 420ml and 3 scoops works for me).
  • By the end of the flask, it's significantly more dense, so I add a little more water when there's about 40ml to go.
  • If you add the powder before the water it's a bit of a disaster and you get clumped powder at the bottom.
  • Wash the flask as soon as you finish, or it sets quite hard.
  • I think I spend about 3 mins preparation time making the mixture and washing up after, which isn't bad.
  • The pouches come sealed at the top, above a resealable strip that starts unsealed. Cut the top of the pouch, don't tear it, or you bump into the resealable strip. Before starting, you have to clean the resealable strip out with a knife (which gives a bad first impression, but otherwise is unproblematic).
  • The powder has a habit of escaping a bit when filling up the flask. If it gets on clothes, it stays there until you wash them, and a gentle brushing down has little effect. A little annoying, but not fatal.
  • It's sufficiently filling that I think I'm probably going under the number of calories I should be getting at lunch. I'm experimenting with trying to start my lunch Huel a bit earlier, and then have another Huel or snack in the afternoon - splitting lunch in two.

Since this is a post about a specific product, I should probably mention I have no relationship with the company other than having spent £45 on Huel.

Thursday, January 14, 2016

A simple Haskell function

Summary: An example of a small function I recently wrote - from type signature to tests.

When writing a build system there are lots of nasty corner cases to consider. One is that command line length limits, combined with lots of arguments, sometimes require splitting a single command up into multiple commands, each of which is under some maximum length. In this post I'll describe a function that was required, my implementation, and how I tested it.

Type signature and documentation

Before I even got to the function, it already had a type signature and some Haddock documentation:

-- | @chunksOfSize size strings@ splits a given list of strings into chunks not
--   exceeding @size@ characters. If that is impossible, it uses singleton chunks.
chunksOfSize :: Int -> [String] -> [[String]]

As an example:

chunksOfSize 5 ["this","is","a","test"] == [["this"],["is","a"],["test"]]

Implementation

My implementation was:

chunksOfSize n = repeatedly $ \xs ->
    let i = length $ takeWhile (<= n) $ scanl1 (+) $ map length xs
    in splitAt (max 1 i) xs

First we use the repeatedly function from the extra library. This has the signature:

repeatedly :: ([a] -> (b, [a])) -> [a] -> [b]

Given a list of input, you supply a function that splits off an initial piece and returns the rest. One of the examples in the documentation is:

repeatedly (splitAt 3) xs  == chunksOf 3 xs
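
For reference, a definition equivalent to the one in extra is roughly:

repeatedly :: ([a] -> (b, [a])) -> [a] -> [b]
repeatedly f [] = []
repeatedly f xs = b : repeatedly f rest
    where (b, rest) = f xs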

So we can see how repeatedly lets us focus on just the "next step" of this list, ignoring the recursion. For the function argument we have two tasks - first to decide how many items to put in this chunk, then to split that chunk off. Splitting the chunks is the easy bit, and can be written:

splitAt (max 1 i) xs

If we know the next i elements will be at or below the limit, then we can use splitAt to divide the elements. As a special case, if no elements would be allowed, we allow one, using max 1 to ensure we never pass 0 to splitAt (and thus enter an infinite loop). That leaves us with:

i = length $ takeWhile (<= n) $ scanl1 (+) $ map length xs

Reading from right to left, we reduce each element to its length, then use scanl1 to produce a running total - so each element represents the total length up to that point. We then use takeWhile (<= n) to keep grabbing elements while the running total is within the limit, and finally length to convert back to something we can use with splitAt.
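
To make the pipeline concrete, here is the single step pulled out as a standalone function (the name step is mine, not from the original code), with the intermediate values for the running example:

step :: Int -> [String] -> ([String], [String])
step n xs = splitAt (max 1 i) xs
    where i = length $ takeWhile (<= n) $ scanl1 (+) $ map length xs

-- step 5 ["this","is","a","test"]:
--   map length gives [4,2,1,4], scanl1 (+) gives [4,6,7,11],
--   takeWhile (<= 5) keeps [4], so i = 1 and the result is
--   (["this"], ["is","a","test"]); repeatedly then repeats the step on the rest.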

Tests

When testing, I tend to start with a few concrete examples then move on to QuickCheck properties. As an initial example we can do:

quickCheck $
    chunksOfSize 3 ["a","b","c","defg","hi","jk"] ==
    [["a","b","c"],["defg"],["hi"],["jk"]]

Here we are explicitly testing some of the corner cases - we want to make sure the full complement of 3 gets into the first chunk (and we haven't got an off-by-one error), and we also test a singleton chunk of size 4. Now we move on to QuickCheck properties:

quickCheck $ \n xs ->
    let res = chunksOfSize n xs
    in concat res == xs &&
       all (\r -> length r == 1 || length (concat r) <= n) res

There are really two properties here - first, the chunks concat together to form the original; second, each chunk is either under the limit or a singleton. These properties capture the requirements in the documentation.

A final property we can check is that it should never be possible to move the first piece from a chunk to the previous chunk. We can write such a property as:

all (> n) $ zipWith (+)
    (map (sum . map length) res)
    (drop 1 $ map (length . head) res)

This property isn't as important as the other invariants, and is somewhat tested in the example, so I didn't include it in the test suite.

Performance and alternatives

The complexity is O(n) in the number of Char values, which is as expected, since we have to count them all. Some observations about this point in the design space:

  • In a strict language this would be an O(n^2) implementation, since we would repeatedly call length and scanl on the remaining tail each time. As it is, we are calling length on the first element of each chunk twice, so there is a minor constant overhead.
  • Usually in Haskell, instead of counting the number of elements and then doing splitAt, we would prefer to use span - something like span ((<= n) . fst) .... While possible, it makes the special singleton case more difficult, and requires lots of tuples/contortions to associate each element with its rolling sum (see the sketch after this list).
  • For a build system, the entire input will be evaluated before, and the entire output will be kept in memory afterwards. However, if we think about this program with lazy streaming inputs and outputs, it will buffer each element of the output list separately. As a result memory would be bounded by the maximum of the longest string and the Int argument to chunksOfSize.
  • It is possible to write a streaming version of this function, which returns each String as soon as it is consumed, with memory bounded by the longest string alone. Moreover, if the solution above was to use lazy naturals, it would actually come quite close to being streaming (albeit gaining a quadratic complexity term from the takeWhile (<= n)).
  • The type signature could be generalised to [a] instead of String, but I would suspect in this context it's more likely for String to be replaced by Text or ByteString, rather than to be used on [Bool]. As a result, sticking to String seems best.
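
As promised above, a hedged sketch of what the span-based version might look like (my code, not from the original post - the tupling is exactly the contortion mentioned):

-- Pair each element with its running total, then span on the total.
-- Assumes repeatedly from extra, as in the version at the top of the post.
chunksOfSizeSpan :: Int -> [String] -> [[String]]
chunksOfSizeSpan n = repeatedly $ \xs ->
    let totals = zip xs $ scanl1 (+) $ map length xs
        (ok, rest) = span ((<= n) . snd) totals
    in if null ok
        then splitAt 1 xs              -- oversized singleton, as before
        else (map fst ok, map fst rest)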

Refactoring the previous version

The function already existed in the codebase I was working on, so below is the original implementation. This implementation does not handle the long singleton special case (it loops forever). We can refactor it to support the singleton case, which we do in several steps. The original version was:

chunksOfSize _    [] = []
chunksOfSize size strings = reverse chunk : chunksOfSize size rest
  where
    (chunk, rest) = go [] 0 strings
    go res _         []     = (res, [])
    go res chunkSize (s:ss) =
        if newSize > size then (res, s:ss) else go (s:res) newSize ss
      where
        newSize = chunkSize + length s

Refactoring to use repeatedly we get:

chunksOfSize size = repeatedly $ second reverse . go [] 0
  where
    go res _         []     = (res, [])
    go res chunkSize (s:ss) =
        if newSize > size then (res, s:ss) else go (s:res) newSize ss
      where
        newSize = chunkSize + length s

Changing go to avoid the accumulator we get:

chunksOfSize size = repeatedly $ go 0
  where
    go _         []     = ([], [])
    go chunkSize (s:ss) =
        if newSize > size then ([], s:ss) else first (s:) $ go newSize ss
      where
        newSize = chunkSize + length s

It is then reasonably easy to fix the singleton bug:

chunksOfSize size = repeatedly $ \(x:xs) -> first (x:) $ go (length x) xs
  where
    go _         []     = ([], [])
    go chunkSize (s:ss) =
        if newSize > size then ([], s:ss) else first (s:) $ go newSize ss
      where
        newSize = chunkSize + length s

Finally, it is slightly simpler to keep track of the number of characters still allowed, rather than the number of characters already produced:

chunksOfSize size = repeatedly $ \(x:xs) -> first (x:) $ go (size - length x) xs
  where
    go n (x:xs) | let n2 = n - length x, n2 >= 0 = first (x:) $ go n2 xs
    go n xs = ([], xs)

Now we have an alternative version that is maximally streaming, only applies length to each element once, and would work nicely in a strict language. I find the version at the top of this post more readable, but this version is a reasonable alternative.

Acknowledgements: Thanks to Andrey Mokhov for providing the repo, figuring out all the weird corner cases with ar, and distilling it down into a Haskell problem.

Monday, December 21, 2015

Solving the GCHQ puzzle "by hand"

The GCHQ 2015 Christmas puzzle is a Nonogram puzzle, which involves filling in squares on a grid to reveal a picture, guided by constraints on the rows and columns. For a computer, a nice way to solve this problem is using a SAT solver. But humans aren't great at SAT solving, and I was given a print-out of this puzzle while on holiday, with no computer. I'd never encountered such a puzzle before, so working with a friend (and some wine) we came up with an approach, and set about applying it. Alas, robustly applying an algorithm with many steps is not easy for a human, and we eventually ended up with contradictions. On returning from holiday, I automated our approach, and tested it. Our approach worked, and the code is below.

The Problem

The puzzle (shown as an image in the original post) comprises a 25x25 grid, some pre-filled squares, and alongside each row/column the lengths of the runs of consecutive filled squares required in that line. For example, the 8th row down must have two runs of three filled squares, with a gap in between, and potentially gaps before or after.

Our Approach

Our approach was to take each line and compute the number of "free" gaps - how many spaces could be inserted with choice. For one row (4 from the bottom) the entire row is constrained, with no free gaps. Starting with the most constrained lines, we tried to figure out where the pieces could go, based on the existing squares. We quickly realised that negative information was important, so tagged each square with "don't know" (left blank), must be filled (we shaded it in) or must be unfilled (we drew a circle in it). For each line in isolation, looking at the constraints, we inferred squares to be filled or unfilled by examining the possible locations of each run.

The Code

I implemented our approach in Haskell, complete code is available here.

Our constraint system works over a grid where each square is in one of three states. We can encode the grid as [[Maybe Bool]]. The [[.]] is a list of lists, where the outer list is a list of rows, and the inner list is a list of squares. Each of the inner lists must be the same length, and for the GCHQ puzzle, they must all be 25 elements long. For the squares we use Maybe Bool, with Nothing for unknown and Just for known, using True as filled and False as unfilled.
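
Spelled out as type synonyms (purely illustrative - the code below uses the raw types directly):

type Square = Maybe Bool   -- Nothing = unknown, Just True = filled, Just False = unfilled
type Grid   = [[Square]]   -- outer list = rows, each inner list = the 25 squares of a row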

Given the [[Maybe Bool]] grid and the constraints, our approach was to pick a single line, and try to layout the runs, identifying squares that must be True/False. To replicate that process on a computer, I wrote a function tile that given the constraints and the existing line, produces all possible lines that fit. The code is:

tile :: [Int] -> [Maybe Bool] -> [[Bool]]
tile [] xs = maybeToList $ xs ~> replicate (length xs) False
tile (c:cs) xs = concat [map (\r -> a ++ b ++ c ++ r) $ tile cs xs
    | gap <- [0 .. length xs - (c + sum cs + length cs)]
    , (false,xs) <- [splitAt gap xs]
    , (true,xs) <- [splitAt c xs]
    , (space,xs) <- [splitAt 1 xs]
    , Just a <- [false ~> replicate gap False]
    , Just b <- [true ~> replicate c True]
    , Just c <- [space ~> replicate (length space) False]]

The first equation (second line) says that if there are no remaining constraints we set all remaining elements to False. We use the ~> operator to check our desired assignment is consistent with the information already in the line:

(~>) :: [Maybe Bool] -> [Bool] -> Maybe [Bool]
(~>) xs ys | length xs == length ys &&
             and (zipWith (\x y -> maybe True (== y) x) xs ys)
           = Just ys
(~>) _ _ = Nothing

This function takes a line of the grid (which may have unknowns), and a possible line (which is entirely concrete), and either returns Nothing (inconsistent) or Just the proposed line. We first check the sizes are consistent, then that everything which is concrete (not a Nothing) matches the proposed value.

Returning to the second equation in tile, the idea is to compute how many spaces could occur at this point. Taking the example of a line 25 long, with two runs of size 3, we could have anywhere between 0 and 18 (25-3-3-1) spaces first. For each possible size of gap, we split the line up (the splitAt calls), then constrain each piece to match the existing grid (using ~>).
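
A couple of tiny examples (mine, not from the post): on a line of four unknowns with a single run of two, tile enumerates all three placements, and marking the first square as unfilled rules out the first of them.

tile [2] (replicate 4 Nothing)
    == [[True,True,False,False],[False,True,True,False],[False,False,True,True]]

tile [2] (Just False : replicate 3 Nothing)
    == [[False,True,True,False],[False,False,True,True]]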

Given a way of returning all possible lines, we then collapse that into a single line, by marking all squares which could be either True or False as Nothing:

constrainLine :: [Int] -> [Maybe Bool] -> Maybe [Maybe Bool]
constrainLine cs xs = if null xs2 then Nothing else mapM f $ transpose xs2
    where xs2 = tile cs xs
          f (x:xs) = Just $ if not x `elem` xs then Nothing else Just x

If there are no satisfying assignments for the line, we return Nothing - that implies the constraints are unsatisfiable. Next, we scale up to a side of constraints, by combining all the constraints and lines:

constrainSide :: [[Int]] -> [[Maybe Bool]] -> Maybe [[Maybe Bool]]
constrainSide cs xs = sequence $ zipWith constrainLine cs xs

Finally, to constrain the entire grid, we constrain one side, then the other. To simplify the code, we just transpose the grid in between, so we can treat the rows and columns identically:

constrainGrid :: [[Int]] -> [[Int]] -> [[Maybe Bool]] -> Maybe [[Maybe Bool]]
constrainGrid rows cols xs =
    fmap transpose . constrainSide cols .
    transpose =<< constrainSide rows xs

To constrain the whole problem we apply constrainGrid repeatedly, until it returns Nothing (the problem is unsatisfiable), we have a complete solution (problem solved), or nothing changes. If nothing changes then there might be two solutions, or our approach might not be powerful enough without using search.
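
The post doesn't show that driver, but a minimal sketch of the loop (my code, with the hypothetical name solve) might be:

solve :: [[Int]] -> [[Int]] -> [[Maybe Bool]] -> Maybe [[Maybe Bool]]
solve rows cols = go
    where
        go xs = case constrainGrid rows cols xs of
            Nothing -> Nothing         -- some line became unsatisfiable
            Just xs2
                | xs2 == xs -> Just xs -- no progress: solved, or stuck without search
                | otherwise -> go xs2  -- keep constraining

As a small illustration of the per-line deduction this loop builds on, constrainLine [3] (replicate 4 Nothing) evaluates to Just [Nothing, Just True, Just True, Nothing] - wherever a run of three sits in a line of four, the middle two squares must be filled.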

The Result

After four iterations we end up with a fully constrained answer. To see the progress, after one iteration we have:

..XXXXX...X.OO..X.XXXXX..
..OOOOX.X.O.....O.XOOOO..
..XXX.X....O...OX.X.XX...
X.XXX.X....XXXXXX.X.XX...
X.XXX.X..XXXX..XX.X.XX..X
X.OOOOX...XO...OO.XOOO..X
XXXXXXXOXOXOXOXOXOXXXXXXX
..OOO.OO..XOOOX.XOOOOO.O.
..XX..XX.OXOXOXXXOXO...X.
..XO..OO....OXX.O.O....X.
..X...X......X..X......O.
..O...O......XO.X........
..XX..X.X....O.OO.X......
..OXX.O.X....XXXX.X......
..XX..XXXXX..O.OO........
..X...O.X..O..O.X...O....
..X...X.X.OXO.O.X...X....
..OOO.O.X..O..O.X...X..X.
X.XXXXX.......O.X...X..X.
X.OOO.X.....XOO.X...X..X.
X.XXX.X.....XXOOX...X...O
XOXXXOXOXXXOXXXXXXXXXXOXX
..XXX.X.....XXXXX..XXXX.O
..OOOOX......OOOO...O.X..
..XXXXX......XOXX.O.X.X..

Here a . stands for Nothing. After four iterations we reach the answer in a total of 0.28s:

XXXXXXXOXXXOOOXOXOXXXXXXX
XOOOOOXOXXOXXOOOOOXOOOOOX
XOXXXOXOOOOOXXXOXOXOXXXOX
XOXXXOXOXOOXXXXXXOXOXXXOX
XOXXXOXOOXXXXXOXXOXOXXXOX
XOOOOOXOOXXOOOOOOOXOOOOOX
XXXXXXXOXOXOXOXOXOXXXXXXX
OOOOOOOOXXXOOOXXXOOOOOOOO
XOXXOXXXOOXOXOXXXOXOOXOXX
XOXOOOOOOXXXOXXOOOOXOOOXO
OXXXXOXOXXXXOXXOXOOOOXXOO
OXOXOOOXOOOXOXOXXXXOXOXXX
OOXXOOXOXOXOOOOOOXXOXXXXX
OOOXXXOXXOXXOXXXXXXOXXXOX
XOXXXXXXXXXOXOXOOXXOOOOXO
OXXOXOOXXOOOXXOXXXOOOOOXO
XXXOXOXOXOOXOOOOXXXXXOXOO
OOOOOOOOXOOOXXOXXOOOXXXXX
XXXXXXXOXOOXXOOOXOXOXOXXX
XOOOOOXOXXOOXOOXXOOOXXOXO
XOXXXOXOOOXXXXOOXXXXXOOXO
XOXXXOXOXXXOXXXXXXXXXXOXX
XOXXXOXOXOOXXXXXXOXXXXXXO
XOOOOOXOOXXOOOOOOXOXOXXOO
XXXXXXXOXXOOOXOXXOOOXXXXX

Update: On the third attempt, my friend managed to solve it manually using our technique, showing it does work for humans too.

Wednesday, December 09, 2015

MinGHC is Dead, Long Live Stack

Summary: The MinGHC project has now finished.

The MinGHC project was started to produce a minimal Windows installer which didn't contain many packages, but which could install many packages - in particular the network package. But time has moved on, and Stack now offers everything MinGHC does, but cross-platform and better. To install GHC using Stack, just do stack setup, then stack exec -- my command. Even if you prefer to use Cabal, stack exec -- cabal install hlint is a reasonable approach.

A few final remarks on MinGHC:

  • MinGHC was an experiment started by Michael Snoyman, which I (Neil Mitchell) and Elliot Cameron joined in with. I had fun working with those guys.
  • The ideas and approaches behind MinGHC got reused in Stack, so the knowledge learnt has transferred.
  • The MinGHC project involved a Shake build system coupled to an NSIS EDSL installer. I think the technologies for building the installer worked out quite nicely.
  • The existing MinGHC installers are not going away, but we won't be making any changes, and probably won't upload new versions for GHC 7.10.3.
  • It's possible to build Windows installers on a Linux box using a Wine version of NSIS. Very cool.
  • I find maintaining such fundamental pieces of an ecosystem, involving installation and system configuration, to be much less fun than just writing Haskell code. Kudos to the Stack and Cabal guys.

Tuesday, December 08, 2015

What's the point of Stackage LTS?

Summary: Stackage LTS is mostly pointless.

Stackage provides a set of precise versions of a subset of Hackage which all play nicely together. At any one time, there are two such version sets:

  • Nightly, which aims to be the latest version of all packages.
  • LTS (Long Term Support), where each major release updates every package, and each minor release updates only those packages whose new versions are minor-version bumps.

Stack currently defaults to LTS. I don't think LTS fulfils any need, and does cause harm (albeit very mild harm), so should be abandoned. Instead, people should use a specific nightly that suits them, and upgrade to later nightlies on whatever schedule suits them. I share these somewhat half-baked thoughts to hopefully guide how I think Stackage should evolve, and there may be good counterarguments I haven't thought of.

Why is Nightly better?

Nightly always has newer versions of packages. I believe that newer versions of packages are better.

  • As a package author, with each new release, I fix bugs and evolve the library in ways which I think are beneficial. I make mistakes, and new versions are how I fix them, so people using old versions of my software are suffering for my past mistakes.
  • As a package user, if I find a bug, I will endeavour to report it to the author. If I'm not using the latest version, it's highly likely the author will ask me to upgrade first. If a feature I want gets added, it's going to be to the latest version. The latest version of a package always has better support.
  • If the latest version of a package does not suit my needs, that's an important alarm bell, and I need to take action. In some cases, that requires finding a fork or alternative package to do the same thing. In some cases, that involves talking to the author to make them aware of my particular use case. In either case, doing this sooner rather than later makes it easier to find a good solution.

What's the benefit of LTS?

As far as I am aware, LTS major releases are just Nightly at a particular point in time - so if you only ever use x.0 LTS, you might as well just use Nightly. The main benefit of LTS comes from picking a major LTS version, then upgrading to the subsequent minor LTS versions, to access new minor versions of your dependencies.

Upgrading packages is always a risk, as LTS Haskell says. What LTS minor releases do is minimize the risk of having to fix compilation errors, at the cost of not upgrading to the latest packages. However, the reasons I do upgrades are:

  • Sometimes, if one package has a known bug that impacts me, I specifically upgrade just that one package. I don't want to upgrade other packages (it introduces unnecessary risk), so I take my package set (be it LTS or Nightly) and replace one constraint.
  • Every so often, I want to upgrade all my packages, so that I'm not missing out on improvements and so I'm not falling behind the latest versions - reducing the time to upgrade in future. I do that by picking the latest version of all packages, fixing breakages, and upgrading.
  • When upgrading, compile-time errors are a minor issue at worst. Before Stackage, my major headache was finding a compatible set of versions, but now that is trivial. Fixing compile-time errors is usually a little work, but fairly easy and predictable. Checking for regressions is more time-consuming, and any regressions that escape the test suite carry a significant cost; tracking them down afterwards takes a lot of time.

I don't see a use case for upgrading "a little bit", since I get a lot less benefit for only marginally less work.

Why is LTS better?

When I asked this question on Twitter, I got two reasons that did seem plausible:

  • Matt Parsons suggested that LTS provided a higher likelihood of precompiled binaries shared between projects. That does make sense, but a similar benefit could be achieved by preferring Nightly on the 1st of the month.
  • Gabriel Gonzalez suggested that when including a resolver in tutorials, LTS looks much better than a Nightly. I agree, but if there only was Nightly, then it wouldn't look quite as bad.

If Haskell packages had bug fixes that were regularly backported to minor releases, then the case for an LTS version would be stronger. However, my experience is that fixes are rarely backported, even for the rare security issues.

Is this a problem?

Currently everyone developing Haskell has two choices, and thanks to Stack, might not even be aware which one they've ended up making (Stack will pick nightly on its own if it thinks it needs to). Reducing the choices and simplifying the story removes work for the Stackage maintainers, and cognitive burden from the Stackage users.

Stackage was a radical experiment in doing things in an entirely different way. I think it's fair to say it has succeeded. Now, after experience using the features, I think it's right to critically look at tweaks to the model.

Wednesday, October 28, 2015

ViewPatterns and lambda expansion

Summary: One of HLint's rules reduced sharing in the presence of view patterns. Lambda desugaring and optimisation could be improved in GHC.

HLint has the rule:

function x = \y -> body
    ==>
function x y = body

Given a function whose body is a lambda, you can use the function syntactic sugar to move the lambda arguments to the left of the = sign. One side condition is that you can't have a where binding, for example:

function x = \y -> xx + y
    where xx = trace "hit" x

This is equivalent to:

function x = let xx = trace "hit" x in \y -> xx + y

Moving a let under a lambda can cause arbitrary additional computation, as I previously described, so is not allowed (hint: think of map (function 1) [2..5]).
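
A small, self-contained way to observe the difference (my example, not from the original post; the counts in the comments are what you see without optimisation - GHC's full-laziness pass can sometimes recover the sharing):

import Debug.Trace (trace)

-- xx is computed once per x, before the lambda is returned
shared :: Int -> Int -> Int
shared x = let xx = trace "hit" x in \y -> xx + y

-- xx is recomputed for every y
unshared :: Int -> Int -> Int
unshared x y = xx + y
    where xx = trace "hit" x

main :: IO ()
main = do
    print $ sum $ map (shared 1) [2..5]    -- "hit" is printed once
    print $ sum $ map (unshared 1) [2..5]  -- "hit" is printed four times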

View Patterns

One side condition I hadn't anticipated is that if x is a view pattern, the transformation can still reduce sharing. Consider:

function (trace "hit" -> xx) = \y -> xx + y

This is equivalent to:

function x = case trace "hit" x of xx -> \y -> xx + y

And moving y to the right of the = causes trace "hit" to be recomputed for every value of y.

I've now fixed HLint 1.9.22 to spot this case. Using Uniplate, I added the side condition:

null (universeBi pats :: [Exp_])

Specifically, there must be no expressions inside the pattern, which covers the PViewPat constructor, and any others that might harbour expressions (and thus computation) in future.

The problem with function definitions also applies equally to \p1 -> \p2 -> e, which cannot be safely rewritten as \p1 p2 -> e if p1 contains a view pattern.

The Problem Worsens (Pattern Synonyms)

Pattern synonyms make this problem worse, as they can embody arbitrary computation in a pattern, which is lexically indistinguishable from a normal constructor. As an example:

pattern Odd <- (odd -> True)
f Odd = 1
f _ = 2

However, putting complex computation behind a pattern is probably not a good idea, since it makes it harder for the programmer to understand the performance characteristics. You could also argue that using view patterns plus partial application to capture computation before the final lambda is itself confusing, so I've refactored Shake to avoid that.

Potential Fixes

I think it should be possible to fix the problem by optimising the desugaring of functions, ensuring patterns are matched left-to-right where allowable, and that each match happens before the lambda requesting the next argument. The question is whether such a change would improve performance generally. Let's take an example:

test [1,2,3,4,5,6,7,8,9,10] x = x
test _ x = negate x

Could be changed to:

test [1,2,3,4,5,6,7,8,9,10] = \x -> x
test _ = trace "" $ \x -> negate x

Which goes 3-4x faster when running map (test [1..]) [1..n] at -O2 (thanks to Jorge Acereda Maciá for the benchmarks). The trace is required to avoid GHC deoptimising the second variant to the first, as per GHC bug #11029.

There are two main downsides I can think of. Firstly, the desugaring becomes more complex. Secondly, these extra lambdas introduce overhead, as the STG machine GHC uses makes multiple argument lambdas cheaper. That overhead could be removed using call-site analysis and other optimisations, so those optimisations might need improving before this optimisation produces a general benefit.