I've now finished my Hoogle Summer of Code work, though I still intend to continue working on Hoogle when I get the chance. Before the coding period expired, I was able to add a number of new features to Hoogle. These features are all available at Hoogle, under http://haskell.org/hoogle/.
More Compact Text Searching
The old text search feature was very fast, using an on-disk trie to navigate around the possible matches. The downside to this trie was the space it consumed: about half the database was devoted to it. Fortunately, I came up with an alternative way to get fast text searching (albeit slightly slower than the trie) in a much more compact form.
Much smaller database files also mean much faster database generation, as the time spent in the IO routines is the main bottleneck.
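The post does not say which structure replaced the trie, so the following is only an illustration of the general space-versus-speed trade-off, not Hoogle's actual method. It assumes the names are kept in one sorted list and finds prefix matches by binary search, which needs no per-character nodes at the cost of a logarithmic lookup. The function name prefix_matches and the sample names are made up for the example.

```python
from bisect import bisect_left

def prefix_matches(sorted_names, prefix):
    """Return every name that starts with prefix.

    Assumes sorted_names is sorted lexicographically, so all names
    sharing a prefix sit next to each other in the list.
    """
    # Binary search for the first position where prefix could appear.
    start = bisect_left(sorted_names, prefix)
    matches = []
    for name in sorted_names[start:]:
        # Stop at the first name that no longer shares the prefix.
        if not name.startswith(prefix):
            break
        matches.append(name)
    return matches

# Hypothetical sample data, sorted once up front.
names = sorted(["map", "mapM", "mapM_", "filter", "foldr", "maybe"])
print(prefix_matches(names, "map"))
```

A sorted list like this only answers prefix queries; matching text in the middle of a name would need a different index.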
Faster IO routines
I rewrote the underlying binary layer in Hoogle, to make it faster. It's not as fast as I would like, and I think that moving to memory-mapped files is probably a good idea. With these improvements, along with the compact text searching, I am able to generate databases in about 2 seconds (compared to about 20 seconds before).
Database Restricted Searches
Hoogle has been able to run database restricted searches for some time, but now the databases contain enough information to make it practical. By adding +package or -package to the search you can include or exclude certain packages. For example, to find out which map functions are in the containers package try map +containers. To find out which map functions are not in the containers or bytestring packages try map -containers -bytestring. I have also split out the GHC.* modules from base, so if you want to find some unboxed types in GHC's libraries try # +ghc. Note that not all the documentation links work from the GHC modules; I am still trying to fix this.
By default Hoogle searches the following packages: array, base, bytestring, cabal, containers, directory, filepath, haskell-src, hunit, keyword, mtl, parallel, parsec, pretty, process, quickcheck, random, stm, template-haskell, time, xhtml
The "ghc" package is also available if specified with +ghc and includes the GHC.* modules of base only.
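To make the +package and -package syntax concrete, here is a minimal sketch of how such a query could be split into search terms and package filters. It is not Hoogle's parser. It assumes a modifier is a + or - followed directly by a package name, and that any +package replaces the default package list rather than adding to it. The names split_query and packages_to_search are hypothetical.

```python
import re

# Assumption: a package modifier is "+" or "-" followed by a name made of
# letters, digits, and hyphens. Hoogle's real query parser may differ.
MODIFIER = re.compile(r"^([+-])([A-Za-z][A-Za-z0-9-]*)$")

def split_query(query):
    """Split a query into (search terms, included packages, excluded packages)."""
    terms, include, exclude = [], [], []
    for token in query.split():
        match = MODIFIER.match(token)
        if match is None:
            # Not a modifier (this includes "->" in a type search).
            terms.append(token)
        elif match.group(1) == "+":
            include.append(match.group(2).lower())
        else:
            exclude.append(match.group(2).lower())
    return terms, include, exclude

def packages_to_search(default_packages, include, exclude):
    """Pick the packages to search.

    Assumption: explicit includes replace the defaults; excludes are
    then removed from whichever list was chosen.
    """
    chosen = include if include else default_packages
    return [p for p in chosen if p not in exclude]

print(split_query("map -containers -bytestring"))
```

Under these assumptions, map +containers searches only containers, while map -containers -bytestring searches every default package except those two.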
Hoogle 3
I have now replaced the default Hoogle with Hoogle 4, but have copied Hoogle 3 to http://haskell.org/hoogle/3. Unfortunately, it doesn't yet work, as I need some admin help. But it will in the next few days, I hope. The only reason I can think of for using Hoogle 3 is Gtk2hs library searching, which I do want to add to Hoogle 4 when possible.
Give Me Feedback
There are quite a lot of enhancements to Hoogle that I still want to make. I have tried to list all these improvements in my bug tracker. If you find a bug, or want some feature, open an issue. If you have a particular interest in a bug, you can star it, to be informed on its progress and to indicate to me that you care.
I'm particularly interested in two pieces of feedback:
I don't use Hoogle 4 because ...
Do you use any type/name search engine? Do you want to still use Hoogle 3? Do you use Hayoo? If you use something else, what feature draws you to it? What do you dislike about Hoogle 4?
I use Hoogle 4, but my life would be nicer if ...
There are many things which affect Hoogle 4 users that I'm not aware of. If you open a bug saying what annoys you (or leave a comment and I'll do it for you) then I can keep track of this information. Even if you don't necessarily see any way to fix the problems, I'd still like to know about them.
Thanks to everyone who has given feedback on Hoogle so far; it has been very useful.
Good work, Neil! I really like the new Hoogle. I do use Hayoo! every now and then, because they index all the packages on Hackage. That's pretty much the only reason.
Hey Neil, really cool, I especially love the extendable description display. In long term, I think we should really throw together our search engines to get one single powerful Haskell API search.
Keep up the good work!
Timo: It seems Hoogle and Hayoo have very different building blocks at their heart - but anything we can do to make one search engine with the best bits from both is a step in the right direction. If you have any ideas do let me know :-)
Are you going to be at ICFP? If so, we can have a chat then.