Thursday, October 26, 2006

Hat+Windows, 1 happy user!

I just got an email:

"I've got hat-observe working on my code, which makes me very happy."

This is a happy Hat user, who is using my Windows port. As far as I know, this is the first Hat user on Windows who has a real project that they actually want to debug - not just test stuff.

This makes me very happy too :)

Saturday, October 21, 2006

30% faster than GHC!

I have been working on a back end to my analysis framework Catch for a while. I do lots of transformations as part of Catch, some of which speed up the code, so hooking a back end up seems sensible. Using this back end, I can be 30% faster than GHC.

Before I show any pretty graphs, there are a few big and important disclaimers:

  • I use GHC as a back end
    • This means that GHC optimises my code as well!
  • Benchmarked against GHC 6.4.2, using -O2
  • Only tested on one single file! Prime numbers, from the nofib suite.
  • All experiments run on a P4, 3GHz
  • All experiments run 5 times, with the lowest time recorded
  • I do whole program analysis


And so on to some pretty graphs of *HC vs GHC on the prime number benchmark:





These numbers show a pretty much one third increase in speed.

One of my future tasks is to hook this directly up to a code generator, and hopefully my speed will increase even further - at the moment I have to add things in to make the output valid Haskell which slows down the generated code. A custom back end would help with this, plus I have other techniques for speeding up the back end given some of the "knowledge" accumulated by Catch. I am reasonably confident that GHC is not doing too much of the heavy work when compiling my code, as compiling without any optimisation does not penalise my GHC output too much.

How do I get fast code?

I take the Yhc compiler, and generate Yhc.Core. I take all the Core generated for all bits in the program and splat them together, including the Prelude etc. I run some analysis passes on this code, including making the code completely first order, a little bit of deforestation and a touch of partial evaluation. I then output Haskell, however my Core language is not a subset of Haskell, so some additional things need to be handled.
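To give a flavour of what "a little bit of deforestation" means, here is a hand-written before and after. This is only an illustrative sketch of the general technique, not output from Catch; the function names sumSquares and sumSquaresFused are made up for the example.

```haskell
module Main where

-- Before: 'map' builds an intermediate list that 'sum' immediately consumes.
sumSquares :: [Int] -> Int
sumSquares xs = sum (map (\x -> x * x) xs)

-- After (written by hand for illustration): a single loop with an
-- accumulator. No intermediate list is built, and no function value is
-- passed around, so the loop is first order as well.
sumSquaresFused :: [Int] -> Int
sumSquaresFused = go 0
  where
    go acc []     = acc
    go acc (x:xs) = go (acc + x * x) xs

main :: IO ()
main = do
    print (sumSquares [1 .. 10])       -- 385
    print (sumSquaresFused [1 .. 10])  -- 385
```

Both versions compute the same result; the second just does less allocation along the way.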

Will this only work for Primes?

Hopefully not! In theory the back end is general purpose, and should work for anything. In practice, I'm still working on it, and not everything is finished yet.

Whole program analysis? That's slow!

Not really. I develop Catch in Hugs, and it takes around 10 seconds to compile Primes in Hugs, using an entirely unoptimised pipeline - I even use associative lists for function name lookup - and still the performance is not too bad. Only a very small number of the steps I perform are whole program, and the ones that are only get done once, in a linear fashion. It probably won't scale to "a whole compiler", but it can certainly hit 1000 line programs with no issue.

What's next?

Stop getting distracted by developing a compiler, and get back to my PhD!

Thursday, September 21, 2006

DriFT

I tried out DriFT recently, and was most impressed. It has the feel of a program that hasn't had much love recently, but that's ok, it's still a very useful program.

Firstly there is no easy way to compile it on Windows - it's not hard, it's just not obvious either. For reference the steps are 1) cd src, 2) ren Version.hs.in Version.hs, 3) ghc --make Drift.hs. For those Windows users who don't want to do that, I've shoved up a binary on my Windows distribution page. It's certainly not hard, but it's not as easy as Cabal based thingies.

Once you've done that, "drift -r File.hs" produces the goods, in a very straightforward way. What I wanted was a deriving Binary, with a loadBinary and a saveBinary interface. DriFT offers two separate binary output modes, Binary and GhcBinary - I wanted to use it in Hugs, but a bit of experimentation showed that GhcBinary has a nicer output, so I went with that. Once that's done I wanted to combine it with the Binary library to do the actual serialisation - there is one in the repo but it seems to have not been given the attention it needs, so it was easier to write my own. See this file in the Yhc repo for how I did it - the answer is not very nicely, but quite workable.

With all this done, now Yhc spits out Core in binary, but more importantly my PhD program now has binary caches - changing some operations from 30 seconds to 2 seconds, which is nice :)

Wednesday, September 06, 2006

Hoogle mailing list

There is now a hoogle mailing list: http://www.haskell.org/mailman/listinfo/hoogle

So instead of emailing me individually, or talking to me on IRC, you can just email that list and hopefully others can give useful feedback as well.

Friday, September 01, 2006

Over 65,000 Hoogle Searches!

I just checked the Hoogle logs, and found a staggering 67,112 searches have been performed.
18,495 of these searches were for different terms, and people searched for "map" 4,036 times. That's quite a lot of searching!

I think those logs are since about October, which makes this just under a year's data. I have no idea of what the breakdown over time is, but Hoogle 4 will have better logging, and should be able to tell me.

Update: dons tells me there have been 3849 lambdabot @hoogle searches

Wednesday, August 23, 2006

Parsec and Hoogle

For the last few days I've been rewriting the parsing in Hoogle to use parsec. As an end result, the parser is a lot more powerful, more maintainable and more extendable - on the downside it's longer and more complex.

The thing that most impressed me about parsec was its compositionality. In Hoogle there are type signatures, which are both on database lines (map :: (a -> b) -> [a] -> [b]), and there are user queries which might have a type signature in them, amongst other junk. Thanks to parsec I can use the same type signature parser in both of them, with extensions for the relevant bits. I couldn't really do this with a traditional yacc/bison/happy parser generator. It's also great for the fuzzy nature of user searches - you don't want to give a parse error if there is any sensible interpretation of what the user wrote.

Thanks to this rewrite, I now get a few query goodies that I always wanted but could never properly parse. These include multiple words ("concat map") and names with type signatures ("map :: [a] -> [a]"). The search parser also now checks for command line options, which stops the bugs with things like "->" being misinterpreted.
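As a rough sketch of that reuse (not Hoogle's actual parser), the code below defines one small type signature parser and plugs it into both a database line parser and a query parser. Everything here is an assumption made for illustration: the Type and Query data types, the names typeP, dbLineP, queryP and runP, and the heavily cut-down type grammar (variables, constructors, lists, brackets and arrows only). It uses the Text.Parsec module names from current parsec releases.

```haskell
module Main where

import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical, heavily simplified type language.
data Type = TVar String | TCon String | TList Type | TFun Type Type
    deriving (Eq, Show)

-- Hypothetical query: some names, optionally followed by ":: type".
data Query = Query [String] (Maybe Type)
    deriving (Eq, Show)

-- Run a parser and then skip trailing spaces.
lexeme :: Parser a -> Parser a
lexeme p = p <* spaces

symbol :: String -> Parser String
symbol = lexeme . try . string

nameP :: Parser String
nameP = lexeme (many1 (alphaNum <|> char '_' <|> char '\''))

-- The shared type signature parser; "->" associates to the right.
typeP :: Parser Type
typeP = do
    t    <- atomP
    rest <- optionMaybe (symbol "->" *> typeP)
    return (maybe t (TFun t) rest)

atomP :: Parser Type
atomP = lexeme (choice
    [ TList <$> between (symbol "[") (symbol "]") typeP
    , between (symbol "(") (symbol ")") typeP
    , TVar <$> ((:) <$> lower <*> many alphaNum)
    , TCon <$> ((:) <$> upper <*> many alphaNum)
    ])

-- A database line: name :: type
dbLineP :: Parser (String, Type)
dbLineP = (,) <$> nameP <* symbol "::" <*> typeP

-- A user query: zero or more names, then an optional ":: type".
queryP :: Parser Query
queryP = Query <$> many nameP <*> optionMaybe (symbol "::" *> typeP)

-- Parse a whole string, returning Nothing on any parse error.
runP :: Parser a -> String -> Maybe a
runP p = either (const Nothing) Just . parse (spaces *> p <* eof) ""

main :: IO ()
main = do
    print (runP dbLineP "map :: (a -> b) -> [a] -> [b]")
    print (runP queryP "concat map")
    print (runP queryP "map :: [a] -> [a]")
```

The point is that typeP is an ordinary value, so dbLineP and queryP can both mention it directly; nothing needs to be duplicated or generated.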

Thursday, August 17, 2006

Neil vs Cabal

Today I've been trying to get into Cabal, since it seems a pretty cool thing, and it looks like the way forward.

As I've been doing this, trying to compile various projects using Cabal, it turns out that I spent all day encountering bugs! I've hit things that seem a bit curious in loads of programs, sent off patches, reported bugs, asked for clarification etc. Hopefully this will be fixed at some point soon, after enough people have bashed through it.

In particular, to try and get this going better, I'm going to try and keep the HEAD versions of various projects compiled regularly with Cabal on Windows - and then probably distribute the binaries as part of my Haskell on Windows drive.

I'm also trying to get Hoogle working properly with Cabal, since that's going to be the future way of building it, probably.

Which brings me on to a final question about the Hoogle license: what should it be? Currently Hoogle is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License. Nothing else in the Haskell world is, so it's not particularly sensible that Hoogle is. My basic thoughts are GPL vs BSD. What do people think one way or the other?

Wednesday, August 16, 2006

Hoogle 4 plans

After spending about the last 3 months replying to most people's Hoogle comments with "that will be fixed in Hoogle 4", it's about time I actually implemented Hoogle 4. Just to give people an idea of where I'm going, I thought I'd summarise what Hoogle 4 means to me :)

First off, I abandoned .ths after talking with Niklas who does Haskell-source-exts and HSP, and will be using his stuff. It's got advantages of tag safety and a better syntax, lacks a few bits, but being written by someone else saves me a bit of work. It's also well supported, something that's essential!

Anyway, the plans for Hoogle 4 fall into a few areas:

No bugs: type classes, type aliases, higher kinded type classes - all these things confuse Hoogle. Either they are bugs, or they come close enough to count as them. These will all be fixed.

Help the user: searching for ThreadID doesn't work (it's ThreadId), searching for Just a doesn't work, searching for Maybe -> Int doesn't work, and all just fail silently. I want to tell people when they are searching a bit dubiously and fix it for them.

Do what users keep asking for: often users do searches for multiple words, "map concat", this currently gives them very confusing results (the type "m a" is equivalent).

Be a database: I want to give more database like features, lookup Just will give you the functions that use it.

Faster: I want to make text searching hundreds of times faster, which isn't so the results come back faster but so that...

Packages: I want to be able to search packages other than the default ones, such as Gtk2hs (which you can already Hoogle search), and lots, lots more. Which requires a speed boost.

AJAX: I have a few good AJAX ideas for making searching just a little bit quicker.

Lots to do, and it will probably be an entire rewrite (again...), but this is hopefully going to be the version that sticks around for a very long time and comes out of Beta.

Sunday, August 13, 2006

.ths (Textual Haskell Source)

I've been doing some work on Hoogle 4 over the last week while I've been away from a computer. Lots of cool new ideas, some paper code, and other goodies - will probably be a few weeks before I start to crank out implementations and improvements get seen on the main Hoogle site, and perhaps 2 or 3 months before Hoogle 4 starts to take shape.

Anyway, one thing Hoogle needs to do is to output a web page, and at the moment it does that by reading in text files, and writing them out. To do a typical search page it shoves out a standard prefix, a top bit, the answers and a suffix. Only the answers are generated on the fly, the others are included in. Of course, this means that the HTML is in 4 places, and the reusability is poor (files are chunks of text, not functions). The pages also have small tweaks to them based on dynamic data - for example the title of the page is the search query. To accommodate this I had to add $ replacement, so $ becomes the search query. Messy, and not very general.

So to answer all this, I devised .ths - Textual Haskell Source. Currently you have .hs (source code is the main thing), .lhs (comments are the main thing) and now you have .ths (text is the main thing). Let's start with an example:

> showHtml :: String -> String
> showHtml search = pure
[html]
[head]
[title]<% search %> - Hoogle[/title]
[/head]

Note that in this example I am escaping the code (with > ticks), and the text is just the main bit. I also have <% code %> which is substituted inline.

I can also do more advanced things (naturally)

> showList :: FilePath -> [Bool] -> IO String
> showList filename items =
> src <- readFile filename
The length of <% filename %> is <% length src %>
And the booleans are
<%for i = items %> <%if i %>ON<%else%>off<%end if%><%end for%>

I have all these bits implemented, and hope to make a release in a few days. I kind of have to release, because the current darcs version of Hoogle will be using them in a few days anyway.

And of course, since all this stuff is Haskell, it's easier to compose, call as functions, etc.

Thoughts or comments?

Friday, July 07, 2006

Haskell debugging with Hat on Windows

For the last couple of days I've been trying to get Hat working on Windows. I now have over half the tools working on Windows, and have a bundle ready for Windows users to install: http://www.cs.york.ac.uk/fp/temp/hat-win32-05_jul_2006.zip

How to install: Extract the contents of the .zip file into a folder, preserving the
directory structure. Add the folder containing hat-make to your %PATH% variable. This is
100% required, even if you give the explicit path to hat-make when you
use it. Make sure ghc is available on your system and has been added to the %PATH%.

How to use: cd to the directory containing your Haskell source
hat-make Main.hs
main
hat-stack Main.hat
hat-observe Main.hat

I am also working on a graphical user interface for these tools, using Gtk2Hs; a screenshot is here. It's not ready yet, but hopefully soon.

If any Windows users try this out and find that it either works or does not, please let me know.

Thursday, June 29, 2006

Haskell Suggest

Often a more experienced Haskell user can point out some clever trick in some Haskell code that a beginner may not know about, for example:

  • concat (map f x)
  • map g (map f x)
  • putStr . (++) "\n"


Often this is because the new user is unfamiliar with the existence of a particular function. Of course they could Hoogle it if they thought it might exist, but often it never occurs to them.

The solution is Haskell Suggest: a tool that automatically spots and suggests these things. I think the best implementation would use Yhc Core, for a couple of reasons: it's relatively unmodified (no advanced transformations like inlining), has source positions and is simple.
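To make the idea concrete, here is a toy sketch of such a tool. It does not use Yhc Core: the Exp type is a made-up, minimal expression tree, and suggest, suggestions and the two rewrite rules (concat/map to concatMap, and map/map to a single map over a composition) are assumptions chosen purely to illustrate the pattern matching involved.

```haskell
module Main where

-- Hypothetical, minimal expression tree: variables and application only.
data Exp = Var String | App Exp Exp
    deriving (Eq, Show)

-- Apply a function expression to a list of arguments.
app :: Exp -> [Exp] -> Exp
app = foldl App

-- Try to suggest a replacement for the expression at the root.
suggest :: Exp -> Maybe Exp
suggest (App (Var "concat") (App (App (Var "map") f) x)) =
    Just (app (Var "concatMap") [f, x])
suggest (App (App (Var "map") g) (App (App (Var "map") f) x)) =
    Just (app (Var "map") [app (Var ".") [g, f], x])
suggest _ = Nothing

-- Collect (original, suggestion) pairs for every subexpression.
suggestions :: Exp -> [(Exp, Exp)]
suggestions e =
    [(e, s) | Just s <- [suggest e]] ++ concatMap suggestions (children e)
  where
    children (App a b) = [a, b]
    children _         = []

-- concat (map f x)
example :: Exp
example = App (Var "concat") (app (Var "map") [Var "f", Var "x"])

main :: IO ()
main = mapM_ print (suggestions example)
```

A real tool would also need source positions to report where each suggestion applies, which is one of the reasons given above for building on Yhc Core.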

Sounds like a good idea to go and implement.... Credits to dons for discussing this on Haskell IRC.

Wednesday, June 21, 2006

Hoogle improvements

I have been making some improvements to Hoogle - nothing to do with the actual searching engine, but lots of tweaks to the page layout. It's now less cluttered, has fewer pointless pages, and has a good link to a Firefox plugin. I have also moved all the documentation into the wiki, so hopefully other people will be able to contribute.

I'm still looking for a nice logo if anyone has any artistic talent - I certainly don't!

Monday, June 12, 2006

Windows and Haskell

I have decided to start the Haskell on Windows software repository. It's located here:
http://www-users.cs.york.ac.uk/~ndm/projects/windows.php

The idea is that Linux users can use their relevant package manager and in one click do "emerge ghc" or something and get GHC installed quickly and easily. For Windows users this means downloading either a .zip or an installer of the project quickly and directly.
The page above just links directly to the most appropriate installation file, and is going to be kept up to date with new versions.

It's kind of depressing to see how few precompiled Windows binaries there are for Haskell programs - only 8, and I compiled 3 of them myself. If anyone has a Haskell project and would like a Windows build contributing please email me, and I'll make a binary and add it to that list.

Monday, June 05, 2006

The Play class

One useful trick I've found when manipulating data structures is the Play class, which I created to "play" with various data structures. Often a data structure will contain the same type within it - for example:

data Expr = Sum [Expr]
          | Literal Int
          | Div Expr Expr

Now I define a Play class as:

class PlayExpr a where
    mapExpr :: (Expr -> Expr) -> a -> a
    allExpr :: a -> [Expr]


mapExpr just maps over every element in the data structure, and allExpr gives every element back. This makes lots of things quite easy.

For example, with these operations you can find all the negative literals in an expression:

[n | Literal n <- allExpr x, n < 0]

And operations like replacing Sum [x] with x can be coded easily as follows:

mapExpr f x
    where
        f (Sum [x]) = x
        f x = x

This could be done as just two functions, not in a class, but by putting it in a class you can add instances for [x], (a,b) etc. And also, if this expression is embedded in a larger data structure, you can then traverse that larger data structure in exactly the same way.
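For anyone who wants to try it, here is one way the instances might be filled in. The class and data type are as above; the instance bodies, the bottom-up traversal order in mapExpr, and the helper names negatives, simplify and example are my assumptions for this sketch rather than the exact code I use.

```haskell
module Main where

data Expr = Sum [Expr]
          | Literal Int
          | Div Expr Expr
          deriving (Eq, Show)

class PlayExpr a where
    mapExpr :: (Expr -> Expr) -> a -> a
    allExpr :: a -> [Expr]

instance PlayExpr Expr where
    -- Bottom-up: rewrite the children first, then the node itself.
    mapExpr f x = f $ case x of
        Sum xs    -> Sum (mapExpr f xs)
        Div a b   -> Div (mapExpr f a) (mapExpr f b)
        Literal n -> Literal n
    -- The expression itself, followed by everything inside it.
    allExpr x = x : case x of
        Sum xs    -> allExpr xs
        Div a b   -> allExpr a ++ allExpr b
        Literal _ -> []

instance PlayExpr a => PlayExpr [a] where
    mapExpr f = map (mapExpr f)
    allExpr   = concatMap allExpr

instance (PlayExpr a, PlayExpr b) => PlayExpr (a, b) where
    mapExpr f (a, b) = (mapExpr f a, mapExpr f b)
    allExpr (a, b)   = allExpr a ++ allExpr b

-- All negative literals anywhere in the structure.
negatives :: PlayExpr a => a -> [Int]
negatives x = [n | Literal n <- allExpr x, n < 0]

-- Replace every Sum [x] with x.
simplify :: PlayExpr a => a -> a
simplify = mapExpr f
  where
    f (Sum [x]) = x
    f x         = x

example :: Expr
example = Div (Sum [Sum [Literal (-1)]]) (Literal 2)

main :: IO ()
main = do
    print (negatives example)  -- [-1]
    print (simplify example)   -- Div (Literal (-1)) (Literal 2)
```

Because negatives and simplify are written against the class, they work unchanged on an Expr, a list of Exprs, or a pair containing them.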

I have used this quite extensively in some of my code.

Saturday, May 27, 2006

Abusing Haskell for fun and profit

At the moment I am working on a System.FilePath module combining Lemmih's one from Cabal, and the one from Yhc. In order to do this I have had to abuse Monads to the extreme (instance Monad Test) and CPP to the extreme (#define module --). Hopefully the result will be useful to a large number of people, and might even make it into base. [Note: the interface to System.FilePath is unstable, and will change - if you have any suggestions please let me know!]

I have also been advocating Haskell to my research group, to the stage where in my group there is only one holdout Python programmer, and everyone else has moved to Haskell even for non-Haskell related projects/PhDs. Now I have to start trying to persuade them to move to Windows...

Tuesday, May 16, 2006

WinHugs release

A release of WinHugs has just gone out:

http://cvs.haskell.org/Hugs/pages/downloading-May2006.htm

This is the first final release of Haskell software that I have contributed to!

For Windows users, this should be an essential upgrade - an entirely rewritten WinHugs, updated libraries, FFI that works with Visual Studio and lots of other goodies.

Saturday, April 29, 2006

Windows and Haskell

I use Haskell on Windows, and I always tend to feel like a second class citizen... Lots of things just don't work as well on Windows as on Linux with Haskell tools. For instance, there is no hmake for Windows, nhc never worked on Windows, and GHC ships with something close to a Linux distribution on Windows. I once read the instructions to build GHC on Windows and I cried. To make the standard libraries for Haskell it's pretty much Linux, or something terrible like MSYS or Cygwin - the list goes on...

The reason I'm complaining is that I've been working on getting Hugs and FFI working on Windows. The actual Windows code is all relatively easy, but getting the base package to compile on Windows does not seem possible. Since the base package also compiles FFI .dlls, these are also built in MSYS with GCC. Hopefully in the future Cabal will come to the rescue, but at the moment I'm still not convinced - first off Cabal seems to match the way Linux users think, and not Windows users in any way. Although at the same time, it does seem quite impressive, and the way out for the future.

Hopefully, one day everyone will see the light and stop using Linux, move to Windows, and we can all have nice user interfaces and nice programming languages in one package.

--
Just a quick note, I really am very grateful for all the projects that have Windows ports,
I just hit my head against a brick wall every time I see a makefile :)

Tuesday, April 25, 2006

Hoogle Logo

I just got an email from someone pointing out that the Hoogle logo might be infringing Google's copyright or trademark. He might be right, he might not, but I think Hoogle is getting to the stage where I probably need to change the logo to something less like Google. If anyone has any ideas, I'd be happy to see them. I just want something vector based (SVG or Xara or something else) that looks nice and is essentially the word "Hoogle" with a lambda for the l, in reasonably clear writing. Fades, gradients, transparencies, textures are all fine.

I've just filled out a report on Hoogle for the HCAR, including some light plans for the future. It seems that between every HCAR I release a new version and rewrite the existing version, and never make a release. I'll try to change that before the next one.

The future plans for Hoogle are to make it go faster, and once that happens I can add loads more libraries and applications into the search. I also want to fix a few remaining bugs (nothing is considered a Monad due to higher kinds), and add a few features that never got finished (type aliasing). I also want searching for multiple words, since it seems a lot of people do that, and currently Hoogle considers it a type search. If anyone did want to do any work on it, there is plenty there, and I'm happy to accept patches :)

With all those fixes, I want the following searches to be "better":

  • Monad a => [a b] -> a [b]
  • zip with
  • [Char] -> String

And I want the following searches to have better error messages:
  • Just a -> a
  • Maybe -> a

Saturday, April 15, 2006

My Haskell related projects

Just for general information, I am involved in the following Haskell projects:

As author:
Hoogle - a Haskell search engine
Catch - a safety checker for Haskell (my PhD)
WinHaskell - A GUI for Haskell use on Windows
WinHugs - the GUI bit of Hugs (I rewrote the old WinHugs from scratch)

As a major contributor:
Yhc - the York Haskell compiler, I do the -core stuff, and other related bits.

And have submitted patches to:
Haddock - Add hoogle output
GHCi - :set prompt feature
Hugs - :main support (which is now in GHCi as well, thanks to someone else)

What do all the projects that I am mainly working on have in common? None have ever had an official release. Hoogle is approaching version 4 without having ever left beta, WinHaskell is just basically functional but definitely not finished, Catch is coming along but far away from end user use, and WinHugs is pretty much done, with just release work remaining really.

At the moment I'm focused on WinHaskell, my progress can be tracked roughly here, but there are about 10 additional patches on my computer and I'm currently working on number 11.

Friday, April 14, 2006

Planet Haskell

I thought I'd turn this blog into one for my Haskell related stuff, since I doubt my friends in real life want to hear lots about Haskell, and I doubt Haskell people want to hear about me getting drunk and ranting about the world.

Just a few links for the first Haskell related post:

Planet Haskell - http://planet.haskell.org/

My academic website - http://www.cs.york.ac.uk/~ndm/