Tuesday, June 20, 2017

Announcing Weeder: dead export detection

Most projects accumulate code over time. To combat that, I've written Weeder, which detects unused Haskell exports, allowing dead code to be removed (pulling up the weeds). When used in conjunction with the GHC flags -fwarn-unused-binds and -fwarn-unused-imports, and with HLint, it will enable deleting unused definitions, imports and extensions.

Weeder piggy-backs off files generated by stack, so first obtain stack, then:

  • Install weeder by running stack install weeder --resolver=nightly.
  • Ensure your project has a stack.yaml file. If you don't normally build with stack then run stack init to generate one.
  • Run weeder . --build, which builds your project with stack and reports any weeds.
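
The steps above can be sketched as a small shell helper. This is only an illustration: the function names run and weed_project and the DRY_RUN variable are made up for this example, and only the stack and weeder commands themselves come from the steps above.

```shell
#!/bin/sh
# Sketch only: run, weed_project and DRY_RUN are illustrative names,
# not part of stack or weeder.

# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Install weeder, make sure a stack.yaml exists in the current
# directory, then build the project and report any weeds.
weed_project() {
    run stack install weeder --resolver=nightly || return 1
    if [ ! -f stack.yaml ]; then
        run stack init || return 1
    fi
    run weeder . --build
}
```

Calling weed_project from the project root runs the steps in order; setting DRY_RUN=1 beforehand only prints the commands, which is handy for checking what would happen before a long build.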

What does Weeder detect?

Weeder detects a bunch of weeds, including:

  • You export a function helper from module Foo.Bar, but nothing else in your package uses helper, and Foo.Bar is not listed in exposed-modules. Therefore, the export of helper is a weed. Note that helper itself may or may not be a weed - once it is no longer exported, -fwarn-unused-binds will tell you if it is entirely redundant.
  • Your package depends on another package but doesn't use anything from it - the dependency should usually be deleted. This functionality is quite like packunused, but implemented quite differently.
  • Your package has entries in the other-modules field that are either unused (and thus should be deleted), or are missing (and thus should be added). The stack tool warns about the latter already.
  • A source file is shared between two different sections in a .cabal file - e.g. in both the library and the executable. Usually it's better to arrange for the executable to depend on the library, but sometimes that would unnecessarily pollute the interface. Useful to be aware of, and sometimes worth fixing, but not always.
  • A file has not been compiled despite being mentioned in the .cabal file. This can happen because the file is unused, or because the stack compilation was incomplete. I recommend compiling both benchmarks and tests to avoid this warning where possible - running weeder . --build will use a suitable command line.

Beware of conditional compilation (e.g. CPP and the Cabal flag mechanism), as these may mean that something is currently a weed, but in different configurations it is not.

I recommend fixing the warnings relating to other-modules and files not being compiled first, as these may cause other warnings to disappear.

Ignoring weeds

If you want your package to be detected as "weed free", but it has some weeds you know about but don't consider important, you can add a .weeder.yaml file adjacent to the stack.yaml with a list of exclusions. To generate an initial list of exclusions run weeder . --yaml > .weeder.yaml.

You may wish to generalise/simplify the .weeder.yaml by removing anything above or below the interesting part. As an example, here is the .weeder.yaml file from ghcid:

- message: Module reused between components
- message:
  - name: Weeds exported
  - identifier: withWaiterPoll

This configuration declares that I am not interested in the message about modules being reused between components (that's the way ghcid works, and I am aware of it). It also says that I am not concerned about withWaiterPoll being a weed - it's a simplified method of file change detection I use for debugging, so even though it's dead now, I sometimes do switch to it.

Running with Continuous Integration

Before running Weeder on your continuous integration (CI) server, you should first ensure there are no existing weeds. One way to achieve that is to ignore the existing weeds by running weeder . --yaml > .weeder.yaml and checking in the resulting .weeder.yaml.
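
A minimal sketch of that baseline step, assuming it is run from the directory containing stack.yaml. The function name weeder_baseline is hypothetical; the only real command in it is weeder . --yaml.

```shell
#!/bin/sh
# Sketch only: weeder_baseline is an illustrative name.

# Write a baseline .weeder.yaml unless one is already present, so an
# exclusion list that has been edited by hand is never overwritten.
# When the file is generated, the exit status is whatever weeder returns.
weeder_baseline() {
    if [ -f .weeder.yaml ]; then
        echo "keeping existing .weeder.yaml"
    else
        weeder . --yaml > .weeder.yaml
    fi
}
```

After running weeder_baseline once, review the generated file and check it in alongside stack.yaml.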

On the CI you should then run weeder . (or weeder . --build to compile as well). To avoid the cost of compilation you may wish to fetch the latest Weeder binary release. For certain CI environments there are helper scripts to do that.

Travis: Execute the following command:

curl -sL https://raw.github.com/ndmitchell/weeder/master/misc/travis.sh | sh -s .

The arguments after -s are passed to weeder, so modify the final . if you want other arguments.

Appveyor: Add the following statement to .appveyor.yml:

- ps: Invoke-Command ([Scriptblock]::Create((Invoke-WebRequest 'https://raw.githubusercontent.com/ndmitchell/weeder/master/misc/appveyor.ps1').Content)) -ArgumentList @('.')

The arguments inside @() are passed to weeder, so add new arguments surrounded by ', comma separated - e.g. @('.', '--build').

What about Cabal users?

Weeder requires the textual .hi file for each source file in the project. Stack generates that already, so it was easy to integrate with. There's no reason that information couldn't be extracted by either passing flags to Cabal, or converting the .hi files afterwards. I welcome patches to do that integration.


7 comments:

Sylvain HENRY said...

Thank you for this tool!

I have had two false positives:
1) an import used only in a Template Haskell splice. It is reported as unused.
2) an import X where X reexports something (in my case Word8 from Data.Word). X is reported as unused but it can't be removed.

Echo Nolan said...

Can Weeder "do the right thing" when you have a package with one or more executables and a library they depend on that nobody outside that package is supposed to depend on? I.e. report things that are exported from modules in exposed-modules but not used by any of the executables as weeds?

Neil Mitchell said...

@Sylvain: I raised https://github.com/ndmitchell/weeder/issues/14 for the first issue. That isn't too surprising, as I haven't tested it on Template Haskell. For the Word8 reexport, it should deal with the reexport properly - can you share an example where it goes wrong?

@Echo: No, anything exported by exposed-modules is considered not a weed, even if no one ever uses it. That's a fundamental limitation of .hi files - they list explicit uses within a package, but outside that they just list the package name.

Echo Nolan said...

That makes sense. Too bad though.

Sylvain HENRY said...

@Neil: I have filed https://github.com/ndmitchell/weeder/issues/15

Unknown said...

Tried to build weeder, needed the install-ghc flag since I never used 8.2.1 before.

C:\TorXakisSandbox\TorXakis.git>stack install weeder --resolver=nightly --install-ghc

everything goes fine until
...
Extracting ghc-8.2.1\perl
Extracting ghc-8.2.1\perl\perl.exe
Extracting ghc-8.2.1\perl\perl56.dll

Everything is Ok

Folders: 450
Files: 8685
Size: 1670722456
Compressed: 1677639680
GHC installed to C:\Users\laarpjljvd\AppData\Local\Programs\stack\x86_64-windows\ghc-integersimple-8.2.1\

Error: While constructing the build plan, the following exceptions were encountered:

In the dependencies for hashable-1.2.6.1:
integer-gmp must match >=0.2, but the stack configuration has no specified version (latest applicable is 1.0.0.1)
needed since hashable-1.2.6.1 is a build target.

In the dependencies for integer-logarithms-1.0.2:
integer-gmp must match <1.1, but the stack configuration has no specified version (latest applicable is 1.0.0.1)
needed since integer-logarithms-1.0.2 is a build target.

In the dependencies for scientific-0.3.5.1:
integer-gmp must match -any, but the stack configuration has no specified version (latest applicable is 1.0.0.1)
needed since scientific-0.3.5.1 is a build target.

Recommended action: try adding the following to your extra-deps in C:\TorXakisSandbox\TorXakis.git\stack.yaml:
- integer-gmp-1.0.0.1

You may also want to try the 'stack solver' command
Plan construction failed.


The problem might be caused by our project (https://github.com/TorXakis/TorXakis) needing integer-simple due to license issues...

Neil Mitchell said...

Pierre: These are issues with other packages that Weeder depends on - but I know they can be worked around since I've used code for hashable that uses integer-simple. I suggest raising them on the Stack bug tracker or mailing list.