Sunday, June 22, 2008

GSoC Hoogle: Week 4

This week I've stayed in one place, and had lots of opportunity to get on with Hoogle. I've done a number of different things this week:

More on Haddock databases I fixed a number of issues with the Haddock generated Hoogle information. These patches have been submitted back to Haddock.

Binary Defer library I merged the binary defer library into the Hoogle sources, and modified it substantially. Some of the modifications were thanks to suggestions from the Haskell community, particularly David Roundy. The library is now more robust, and is being used as a solid foundation to build the rest of Hoogle on top of.

Text Searching You can now search for words, even multiple words, and the search will be performed. The text searching uses efficient data structures, scales excellently, and returns better results first.

Suggestions These improvements were detailed earlier in the week.

Next week: Type searching. I have various ideas on how to go about this, but it is the most tricky part of the whole project. I hope to come up with the perfect solution by the end of the week, but if not, will come up with something good enough for Hoogle 4 then revise it after the Summer is over (it could easily suck in a whole Summer of time if I am not careful!). Much of the low-level infrastructure is already present, so it is just the search algorithm.

User visible changes: Text searching works. A session with Hoogle as it currently stands:

> cabal haddock --hoogle
-- generates tagsoup.txt
> hoogle --convert=tagsoup.txt
Generating Hoogle database
Written tagsoup.hoo
> hoogle +tagsoup is open --color
Text.HTML.TagSoup.Type isTagOpen :: Tag -> Bool
Text.HTML.TagSoup.Type isTagOpenName :: String -> Tag -> Bool


No comments: