<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-7094652</id><updated>2012-01-12T10:37:32.401Z</updated><category term='paper'/><category term='haddock'/><category term='catch'/><category term='office'/><category term='derive'/><category term='supero'/><category term='filepath'/><category term='tutorial'/><category term='tagsoup'/><category term='hlint'/><category term='r'/><category term='hiw'/><category term='binarydefer'/><category term='yhc'/><category term='safe'/><category term='cabal'/><category term='conference'/><category term='xmonad'/><category term='neil'/><category term='ghc'/><category term='emily'/><category term='c#'/><category term='outlook'/><category term='darcs'/><category term='firstify'/><category term='wpf'/><category term='uniplate'/><category term='cmdargs'/><category term='windows'/><category term='hoogle'/><category term='ada'/><category term='soc'/><category term='rant'/><title type='text'>Neil Mitchell's Haskell Blog</title><subtitle type='html'></subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><link rel='next' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default?start-index=101&amp;max-results=100'/><author><name>Neil 
Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>172</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-7094652.post-3806981827422662779</id><published>2012-01-08T13:03:00.003Z</published><updated>2012-01-08T14:34:10.157Z</updated><title type='text'>Pascal's Triangle in Haskell</title><content type='html'>&lt;i&gt;Summary: I'm continually amazed how concise and powerful Haskell is, compared to mainstream languages. This post describes how to write Pascal's Triangle, and gives some of the advantages of using Haskell.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;Often, when programming in Haskell, I feel like I'm cheating. As an example, I recently came across &lt;a href="http://www.cforcoding.com/2012/01/interview-programming-problems-done.html"&gt;this article by William Shields&lt;/a&gt;, which suggests that prospective interview candidates be given simple programming tasks like generating &lt;a href="http://en.wikipedia.org/wiki/Pascal's_triangle"&gt;Pascal's Triangle&lt;/a&gt;. William gives examples in Python, including some of the answers a typical candidate might give.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Pascal's Triangle in Haskell&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;William describes Pascal's Triangle as:&lt;br /&gt;&lt;br /&gt;&lt;i&gt;"The root element is 1. Every other element is the sum of the one or two above it (diagonally left and diagonally right)."&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;As a Haskell programmer, the obvious technique to use is induction. 
The first row of the triangle is &lt;tt&gt;[1]&lt;/tt&gt;, and each row can be computed from the previous row by adding the row shifted left, and the row shifted right:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;next xs = zipWith (+) ([0] ++ xs) (xs ++ [0])&lt;br /&gt;pascal = iterate next [1]&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Here, we define &lt;tt&gt;next&lt;/tt&gt; to take one row and produce the next row. We then use the &lt;tt&gt;iterate&lt;/tt&gt; function to repeatedly apply &lt;tt&gt;next&lt;/tt&gt; starting at the root element. The solution is short, and follows from the definition.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Laziness for Modularity&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;William originally posed three questions:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;Print out the triangle to a specific row: &lt;tt&gt;print $ take 100 pascal&lt;/tt&gt;&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Return a given row of the triangle: &lt;tt&gt;pascal !! 50&lt;/tt&gt;&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Return a given element (by row and index) of the triangle: &lt;tt&gt;pascal !! 10 !! 5&lt;/tt&gt;&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;Thanks to laziness, we can concisely answer all these questions in terms of the original &lt;tt&gt;pascal&lt;/tt&gt; definition. In contrast, using a language such as Python, the best solution (Dynamic Programming from the original article) can only perform the first task.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Interview problems&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The original article was not about the choice of programming language, but about choosing suitable questions for interviewing programmers. I agree with William that Pascal's Triangle is a suitable problem - it isn't a trick puzzle, it isn't an API knowledge quiz - it's about understanding how to program. Given how much easier the problem is to solve in Haskell, I wonder if using Haskell in a job interview should be considered cheating? 
;-)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-3806981827422662779?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/3806981827422662779/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=3806981827422662779' title='7 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/3806981827422662779'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/3806981827422662779'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2012/01/pascals-triangle-in-haskell.html' title='Pascal&apos;s Triangle in Haskell'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>7</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-6195171823147689673</id><published>2012-01-08T11:38:00.006Z</published><updated>2012-01-08T12:03:30.848Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='windows'/><title type='text'>Hiding Win32 Windows</title><content type='html'>&lt;i&gt;Summary: This post describes how to hide windows on the Windows operating system, by using the Win32 API from Haskell.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;Imagine that whenever your computer restarts, it pops up a message box:&lt;br /&gt;&lt;br /&gt;&lt;img src="http://4.bp.blogspot.com/-BynXXBF-jWw/TwmA0bFrmMI/AAAAAAAAAI8/HhAposE6TVI/s1600/msgbox.png" /&gt;&lt;br /&gt;&lt;br /&gt;If you interact with this window, your computer will be destroyed. 
You can't kill the process, or dismiss the window harmlessly. (This scenario isn't hypothetical...) The solution is to &lt;i&gt;hide&lt;/i&gt; the window, so it still exists but is out of the way of misplaced clicks. To hide the window, we first find its OS handle, then we call some Win32 API functions.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Find the Window&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;To find the handle of the window, we use Spy++. Spy++ comes with Visual Studio, and is bundled in the Express edition (the free version) from 2010 onwards. Start Spy++, go to &lt;tt&gt;Search&lt;/tt&gt;, &lt;tt&gt;Find Window&lt;/tt&gt;, then use the finder tool to select the window in question. Check that the caption of the window matches what Spy++ reports:&lt;br /&gt;&lt;br /&gt;&lt;img src="http://4.bp.blogspot.com/-jGoprc-Mozw/TwmA60l1wQI/AAAAAAAAAJI/S8-a1FvHlkM/s1600/spy%252B%252B.png" /&gt;&lt;br /&gt;&lt;br /&gt;The important information is the handle: &lt;tt&gt;0004061E&lt;/tt&gt;.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Hide the Window&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;To hide the window you need a programming language capable of making Win32 API calls. In the past I have used Word VBA as the host language, but Haskell is probably easier. Start GHCi, and type:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;$ import System.Win32&lt;br /&gt;$ import Graphics.Win32&lt;br /&gt;$ showWindow (castUINTToPtr 0x0004061E) sW_HIDE&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Replace &lt;tt&gt;0x0004061E&lt;/tt&gt; on the final line with &lt;tt&gt;0x&lt;i&gt;your-handle&lt;/i&gt;&lt;/tt&gt;. The final line should cause the window to be hidden, saving your computer from destruction.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Thanks:&lt;/b&gt; Thanks to Roman Leshchinskiy for reminding me that there were better solutions than just trying not to click the window. 
Thanks to the Win32 Haskell developers - the Win32 binding was a lot of work, which not many people ever see.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-6195171823147689673?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/6195171823147689673/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=6195171823147689673' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/6195171823147689673'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/6195171823147689673'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2012/01/hiding-win32-windows.html' title='Hiding Win32 Windows'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/-BynXXBF-jWw/TwmA0bFrmMI/AAAAAAAAAI8/HhAposE6TVI/s72-c/msgbox.png' height='72' width='72'/><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-148014221981417638</id><published>2011-12-14T21:14:00.004Z</published><updated>2011-12-14T21:23:47.249Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='outlook'/><category scheme='http://www.blogger.com/atom/ns#' term='office'/><title type='text'>Enabling Reply To All in Outlook - version 2</title><content type='html'>I previously described &lt;a 
href="http://neilmitchell.blogspot.com/2008/12/enabling-reply-to-all-in-outlook.html"&gt;how to enable Reply To All in Outlook&lt;/a&gt;. Since then I've modified the VBA script so it works on both Office 2003 and 2007, gives better error messages, automatically replaces the disabled Reply To All buttons, allows editing messages (even if that feature has been disabled) and strips messages of inline images. Installation instructions and links to the code are in the user manual:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://community.haskell.org/~ndm/darcs/office/ReplyToAll.htm"&gt;http://community.haskell.org/~ndm/darcs/office/ReplyToAll.htm&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;I don't want to duplicate the installation instructions here, as I will likely revise them over time. However, at the bottom of the manual is a plea to Outlook administrators not to disable Reply To All in the first place. I don't think arguments against disabling essential email functionality can be made often enough, so here is my plea:&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;Plea to Outlook Administrators&lt;/h2&gt;&lt;br /&gt;&lt;br /&gt;Please do not disable essential email functionality. With the workarounds linked to above, the attempt is futile, but remains deeply inconvenient. Consider the situation where Alice sends an email to Bob, Charlie and Dave asking for some financial details of Mega Corp. Bob has the details on a post-it note on his desk and quickly replies to everyone with the information. But in a world without Reply To All...&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt; &lt;li&gt;Bob replies just to Alice. Charlie, being the helpful soul he is, decides to search through the filing cabinet for the information. 15 minutes later Charlie finds the details and emails Alice, only to be told "Thanks, Bob answered this 15 minutes ago". 
Charlie  realises he just wasted 15 minutes of his life, and goes to get a cookie to make himself feel better.&lt;/li&gt;&lt;br /&gt; &lt;li&gt;Bob replies just to Alice. At the end of the week Dave is reviewing his emails and realises that Alice still hasn't got a reply, but before he potentially wastes 15 minutes, he drops Alice an email - "Do you still want those details?". Alice replies "No". Dave concludes that he's a clever bunny for not going to the filing cabinet, and decides a one week latency when replying to Alice is just common sense.&lt;/li&gt;&lt;br /&gt; &lt;li&gt;Bob replies, but realising the potential cost of replying just to Alice, also replies to Charlie and Dave. Bob spent a minute retyping the recipient list, and wonders "Why can't Outlook have a button that does this for me?".&lt;/li&gt;&lt;br /&gt; &lt;li&gt;Bob replies, also including Charles and Dave. Woops! That should have been Charlie, not Charles. As a best case scenario, Charles gets annoyed with a useless email, but at worst Charles brings down Mega Corp with the sensitive information gleaned from the email. Charlie still ends up going to get a cookie.&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;Removing Reply To All increases the volume of email required, and increases the risk of email accidents. I've heard only two arguments against Reply To All, both of which are wrong:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt; &lt;li&gt;When Bob replies to everyone, the financial information about Mega Corp gets sent to Charlie, but what if Charlie shouldn't have access to that information? Of course, if Charlie shouldn't have access to that information then Alice was at fault for sending the request to Charlie in the first place. 
Bob should also check when sending sensitive information, but most emails are not sensitive (it is a poor default), checking the recipients should not require retyping the recipients (it is a poor user interface) and by default mailing lists are not fully expanded (so he won't be able to tell the full list of recipients anyway).&lt;/li&gt;&lt;br /&gt; &lt;li&gt;When Zorg, the owner of Mega Corp, sends a Christmas email to all his employees, what if one of them hits Reply To All? That wastes a lot of company time, and should be prevented - but not on the client side. It is possible to restrict the number of recipients to an email, Zorg could send out the email with everyone in the BCC field, or IT could set up a mailing list which does not permit replies.&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-148014221981417638?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/148014221981417638/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=148014221981417638' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/148014221981417638'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/148014221981417638'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2011/12/enabling-reply-to-all-in-outlook.html' title='Enabling Reply To All in Outlook - version 2'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' 
src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-303935856416108785</id><published>2011-11-05T12:17:00.005Z</published><updated>2011-11-11T10:57:07.881Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='uniplate'/><title type='text'>Abstract Generics with Uniplate</title><content type='html'>&lt;i&gt;Summary: The new version of Uniplate has several wrappers for working with abstract data types, such as Map from the containers package. These wrappers let you transform abstract data types without breaking their invariants.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;Abstract data types, such as &lt;tt&gt;Map&lt;/tt&gt; from the containers package, hide their internal structure so they can maintain invariants necessary for their correct operation. Generic programming, such as the Data class used by SYB, allows programs to generically traverse data types, without detailed knowledge of their structure. The two are somewhat incompatible - if &lt;tt&gt;Map&lt;/tt&gt; allows manipulating its internal structure, the invariants could be broken causing &lt;tt&gt;Map&lt;/tt&gt; to perform incorrectly.&lt;br /&gt;&lt;br /&gt;Using the &lt;tt&gt;Data&lt;/tt&gt; class, a &lt;tt&gt;Map String Int&lt;/tt&gt; claims it has no constructors, but if you operate on it with &lt;tt&gt;gfoldl&lt;/tt&gt; it behaves as though it were &lt;tt&gt;[(String,Int)]&lt;/tt&gt;. When Uniplate traverses the &lt;tt&gt;Map&lt;/tt&gt;, such as when searching for an &lt;tt&gt;Int&lt;/tt&gt;, it first analyses the available constructors to see if a &lt;tt&gt;Map String Int&lt;/tt&gt; can possibly contain an &lt;tt&gt;Int&lt;/tt&gt;. 
Since &lt;tt&gt;Map&lt;/tt&gt; has no constructors, it concludes that &lt;tt&gt;Map&lt;/tt&gt; cannot contain an &lt;tt&gt;Int&lt;/tt&gt;, and Uniplate fails to operate on the contained &lt;tt&gt;Int&lt;/tt&gt;'s.&lt;br /&gt;&lt;br /&gt;For people who use the Data-style Uniplate module, using the new version of Uniplate you can now correctly operate over &lt;tt&gt;Map String Int&lt;/tt&gt;, provided you use the newtype &lt;tt&gt;Map&lt;/tt&gt; wrapper from &lt;tt&gt;Data.Generics.Uniplate.Data.Instances&lt;/tt&gt;. When you transform over &lt;tt&gt;Bool&lt;/tt&gt; (which does not touch the &lt;tt&gt;Map&lt;/tt&gt;) it will ignore the &lt;tt&gt;Map&lt;/tt&gt; and take &lt;i&gt;O(1)&lt;/i&gt;. When you transform over &lt;tt&gt;Int&lt;/tt&gt; it will reconstruct the &lt;tt&gt;Map&lt;/tt&gt; in &lt;i&gt;O(n)&lt;/i&gt;, and if you transform &lt;tt&gt;String&lt;/tt&gt; or &lt;tt&gt;Char&lt;/tt&gt; it will reconstruct the &lt;tt&gt;Map&lt;/tt&gt; in &lt;i&gt;O(n log n)&lt;/i&gt;. Regardless of what operations you do, it will work efficiently and correctly. As an example:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;$ import qualified Data.Map as Map&lt;br /&gt;$ import Data.Char&lt;br /&gt;$ import Data.Generics.Uniplate.Data&lt;br /&gt;$ import Data.Generics.Uniplate.Data.Instances&lt;br /&gt;$ fromMap $ transformBi toUpper $ toMap $ Map.fromList [("haskell",12),("test",18)]&lt;br /&gt;fromList [("HASKELL",12),("TEST",18)]&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;There are two approaches for dealing with the &lt;tt&gt;Map&lt;/tt&gt; problem in Uniplate. For users of the Direct-style Uniplate module, there is a function &lt;tt&gt;plateProject&lt;/tt&gt;, which has been available for some time. 
For users of the Data-style Uniplate module, or for users of the SYB package, there is a new module &lt;tt&gt;Data.Generics.Uniplate.Data.Instances&lt;/tt&gt; in uniplate-1.6.4 (released today) which provides three types with special &lt;tt&gt;Data&lt;/tt&gt; instances (&lt;tt&gt;Hide&lt;/tt&gt;, &lt;tt&gt;Trigger&lt;/tt&gt;, &lt;tt&gt;Invariant&lt;/tt&gt;). Using these three types we can construct wrappers providing Data instances for abstract types, and Uniplate includes wrappers for several of the types in the containers package (&lt;tt&gt;Map&lt;/tt&gt;, &lt;tt&gt;Set&lt;/tt&gt;, &lt;tt&gt;IntMap&lt;/tt&gt;, &lt;tt&gt;IntSet&lt;/tt&gt;).&lt;br /&gt;&lt;br /&gt;&lt;b&gt;plateProject&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The plateProject function helps you define Direct Uniplate instances for abstract types. Instead of defining how to examine the data type, you instead define how to transform the data type into one you can examine, and how to transform it back. As an example:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;instance Biplate (Map.Map [Char] Int) Char where&lt;br /&gt;    biplate = plateProject Map.toList Map.fromList&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;If the types ensure that any operations will not change the keys we can optimise and use the &lt;tt&gt;fromDistinctAscList&lt;/tt&gt; function to reconstruct the &lt;tt&gt;Map&lt;/tt&gt;:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;instance Biplate (Map.Map [Char] Int) Int where&lt;br /&gt;    biplate = plateProject Map.toAscList Map.fromDistinctAscList&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Hide&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The &lt;tt&gt;Hide&lt;/tt&gt; data type is useful for wrapping values that you want to ignore with Uniplate. 
The type is defined as:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;data Hide a = Hide {fromHide :: a}&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;But has a &lt;tt&gt;Data&lt;/tt&gt; instance that pretends it is defined using the extension &lt;tt&gt;EmptyDataDecls&lt;/tt&gt; as:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;data Hide a&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;As an example of avoiding particular values, you can write:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;transformBi (+1) (1, 2, Hide 3, Just 4) == (2, 3, Hide 3, Just 5)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;As a result of having no constructors, any calls to the methods &lt;tt&gt;toConstr&lt;/tt&gt; or &lt;tt&gt;gunfold&lt;/tt&gt; will raise an error.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Trigger&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The &lt;tt&gt;Trigger&lt;/tt&gt; data type is useful for determining when a value was constructed with the &lt;tt&gt;Data&lt;/tt&gt; methods. It is defined as:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;data Trigger a = Trigger {trigger :: Bool, fromTrigger :: a}&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;But the &lt;tt&gt;Data&lt;/tt&gt; instance pretends that it is defined as:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;data Trigger a = Trigger a&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;However, whenever &lt;tt&gt;gfoldl&lt;/tt&gt; or &lt;tt&gt;gunfold&lt;/tt&gt; constructs a new value, it will have the &lt;tt&gt;trigger&lt;/tt&gt; field set to &lt;tt&gt;True&lt;/tt&gt;. The trigger information is useful to indicate whether any invariants have been broken, and thus need fixing. 
As an example:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;data SortedList a = SortedList (Trigger [a]) deriving (Data,Typeable)&lt;br /&gt;toSortedList xs = SortedList $ Trigger False $ sort xs&lt;br /&gt;fromSortedList (SortedList (Trigger t xs)) = if t then sort xs else xs&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;This data type represents a sorted list. When constructed the items are initially sorted, but operations such as &lt;tt&gt;gmapT&lt;/tt&gt; could break that invariant. The &lt;tt&gt;Trigger&lt;/tt&gt; type is used to detect when the Data operations have been performed, and re-sort the list.&lt;br /&gt;&lt;br /&gt;The &lt;tt&gt;Trigger&lt;/tt&gt; type is often used in conjunction with &lt;tt&gt;Invariant&lt;/tt&gt;, which fixes the invariants.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Invariant&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The &lt;tt&gt;Invariant&lt;/tt&gt; data type is useful for ensuring that an invariant is always applied to a data type. It is defined as:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;data Invariant a = Invariant {invariant :: a -&gt; a, fromInvariant :: a}&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;But the &lt;tt&gt;Data&lt;/tt&gt; instance pretends that it is defined as:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;data Invariant a = Invariant a&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Whenever a &lt;tt&gt;gfoldl&lt;/tt&gt; constructs a new value, it will have the function in the &lt;tt&gt;invariant&lt;/tt&gt; field applied to it. As an example:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;data SortedList a = SortedList (Invariant [a]) deriving (Data,Typeable)&lt;br /&gt;toSortedList xs = SortedList $ Invariant sort (sort xs)&lt;br /&gt;fromSortedList (SortedList (Invariant _ xs)) = xs&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Any time an operation such as &lt;tt&gt;gmapT&lt;/tt&gt; is applied to the data type, the &lt;tt&gt;invariant&lt;/tt&gt; function is applied to the result. 
The &lt;tt&gt;fromSortedList&lt;/tt&gt; function can then rely on this invariant.&lt;br /&gt;&lt;br /&gt;The &lt;tt&gt;gunfold&lt;/tt&gt; method is partially implemented - all constructed values will have an undefined value for all fields, regardless of which function is passed to &lt;tt&gt;fromConstrB&lt;/tt&gt;. If you only use &lt;tt&gt;fromConstr&lt;/tt&gt; (as Uniplate does) then the &lt;tt&gt;gunfold&lt;/tt&gt; method is sufficient.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Map&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Using the &lt;tt&gt;Hide&lt;/tt&gt;, &lt;tt&gt;Trigger&lt;/tt&gt; and &lt;tt&gt;Invariant&lt;/tt&gt; types, we can define a wrapper for the containers &lt;tt&gt;Map&lt;/tt&gt; type as:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;newtype Map k v = Map (Invariant (Trigger [k], Trigger [v], Hide (Map.Map k v)))&lt;br /&gt;    deriving (Data, Typeable)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The &lt;tt&gt;Map&lt;/tt&gt; type is defined as an &lt;tt&gt;Invariant&lt;/tt&gt; of three components - the keys, the values, and the underlying &lt;tt&gt;Map&lt;/tt&gt;. We use &lt;tt&gt;Invariant&lt;/tt&gt; to ensure that the keys/values/map always remain in sync. We use &lt;tt&gt;Trigger&lt;/tt&gt; on the keys and values to ensure that whenever the keys or values change we rebuild the &lt;tt&gt;Map&lt;/tt&gt;, but if they don't, we reuse the previous value. 
The function to extract a containers &lt;tt&gt;Map&lt;/tt&gt; from the wrapper requires only simple pattern matching:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;fromMap (Map (Invariant _ (_,_,Hide x))) = x&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The function to wrap a containers &lt;tt&gt;Map&lt;/tt&gt; is slightly harder, as we need to come up with an invariant restoring function:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;toMap :: Ord k =&gt; Map.Map k v -&gt; Map k v&lt;br /&gt;toMap x = Map $ Invariant inv $ create x&lt;br /&gt;    where&lt;br /&gt;        create x = (Trigger False ks, Trigger False vs, Hide x)&lt;br /&gt;            where (ks,vs) = unzip $ Map.toAscList x&lt;br /&gt;&lt;br /&gt;        inv (ks,vs,x)&lt;br /&gt;            | trigger ks = create $ Map.fromList $ zip (fromTrigger ks) (fromTrigger vs)&lt;br /&gt;            | trigger vs = create $ Map.fromDistinctAscList $ zip (fromTrigger ks) (fromTrigger vs)&lt;br /&gt;            | otherwise = (ks,vs,x)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The &lt;tt&gt;create&lt;/tt&gt; function creates a value from a &lt;tt&gt;Map&lt;/tt&gt;, getting the correct keys and values. The &lt;tt&gt;inv&lt;/tt&gt; function looks at the triggers on the keys/values. If the keys trigger has been tripped, then we reconstruct the &lt;tt&gt;Map&lt;/tt&gt; using &lt;tt&gt;fromList&lt;/tt&gt;. If the values trigger has been tripped, but the keys trigger has not, we can use &lt;tt&gt;fromDistinctAscList&lt;/tt&gt;, reducing the complexity of constructing the &lt;tt&gt;Map&lt;/tt&gt;. 
If nothing has changed we can reuse the previous value.&lt;br /&gt;&lt;br /&gt;The end result is that all Uniplate (or SYB) traversals over &lt;tt&gt;Map&lt;/tt&gt; result in a valid value, which has had all appropriate transformations applied.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-303935856416108785?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/303935856416108785/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=303935856416108785' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/303935856416108785'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/303935856416108785'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2011/11/abstract-generics-with-uniplate.html' title='Abstract Generics with Uniplate'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-103569694669985581</id><published>2011-11-01T18:00:00.003Z</published><updated>2011-11-01T18:06:40.174Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='hiw'/><title type='text'>Haskell Implementors Workshop 2011</title><content type='html'>The videos from the &lt;a href="http://www.haskell.org/haskellwiki/HaskellImplementorsWorkshop/2011"&gt;Haskell Implementors Workshop 2011&lt;/a&gt; are now online at &lt;a 
href="http://www.youtube.com/playlist?list=PLF5363394606160CE"&gt;YouTube&lt;/a&gt;. I'm currently uploading them to &lt;a href="http://vimeo.com/album/1736561"&gt;Vimeo&lt;/a&gt;, but that will take another week (I ran out of free quota).&lt;br /&gt;&lt;br /&gt;Many thanks to &lt;a href="http://www.ipl.t.u-tokyo.ac.jp/~emoto/"&gt;Kento EMOTO&lt;/a&gt; who did all the editing and conversion, and to all the speakers for the great talks.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-103569694669985581?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/103569694669985581/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=103569694669985581' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/103569694669985581'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/103569694669985581'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2011/11/haskell-implementors-workshop-2011.html' title='Haskell Implementors Workshop 2011'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-3532878487315634840</id><published>2011-10-30T09:55:00.004Z</published><updated>2011-11-04T11:47:49.135Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='r'/><title type='text'>Calling Haskell from R</title><content type='html'>&lt;i&gt;Summary: This 
post describes how to write a function in Haskell, and then call it from R.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.r-project.org/"&gt;R&lt;/a&gt; is a very popular language for statistics, particularly with biologists (and &lt;a href="http://neilmitchell.blogspot.com/2011/03/experience-report-functional.html"&gt;computational paleobiologists&lt;/a&gt;). For writing high performance code, the R developers recommend the use of C or Fortran - not languages that are particularly easy for beginners. However, you can instead write a Haskell function that can be called directly from R. The basic idea is to create a C compatible library using Haskell (as described in &lt;a href="http://www.haskell.org/ghc/docs/latest/html/users_guide/win32-dlls.html"&gt;the GHC users manual&lt;/a&gt;) and then call that library from R (as described &lt;a href="http://users.stat.umn.edu/~charlie/rc/"&gt;in this document&lt;/a&gt;). As a simple example, let's write a function that adds up the square roots of a list of numbers.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Create an R-compatible Haskell library&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;In normal Haskell, we would define the function to add up the square roots of a list as:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;sumRoots :: [Double] -&gt; Double&lt;br /&gt;sumRoots xs = sum (map sqrt xs)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;However, to make a function that is compatible with R, we have to follow two rules:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;Every argument must be a &lt;tt&gt;Ptr&lt;/tt&gt; to a C compatible type, typically &lt;tt&gt;Int&lt;/tt&gt;, &lt;tt&gt;Double&lt;/tt&gt; or &lt;tt&gt;CString&lt;/tt&gt;. 
(To be pedantic, we should probably use &lt;tt&gt;CInt&lt;/tt&gt; or &lt;tt&gt;CDouble&lt;/tt&gt;, but using GHC on Windows these types are equivalent - keeping things simpler.)&lt;/li&gt;&lt;br /&gt;&lt;li&gt;The result must be &lt;tt&gt;IO ()&lt;/tt&gt;&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;Obeying these restrictions, we need to use the type:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;sumRootsR :: Ptr Int -&gt; Ptr Double -&gt; Ptr Double -&gt; IO ()&lt;br /&gt;sumRootsR n xs result = ...&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Instead of passing in the list &lt;tt&gt;xs&lt;/tt&gt;, we now pass in:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;&lt;tt&gt;n&lt;/tt&gt;, the length of the list &lt;tt&gt;xs&lt;/tt&gt;&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;tt&gt;xs&lt;/tt&gt;, the elements of the list&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;tt&gt;result&lt;/tt&gt;, a space to put the result&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;We can implement &lt;tt&gt;sumRootsR&lt;/tt&gt; by using the functions available in the &lt;tt&gt;Foreign&lt;/tt&gt; module:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;sumRootsR :: Ptr Int -&gt; Ptr Double -&gt; Ptr Double -&gt; IO ()&lt;br /&gt;sumRootsR n xs result = do&lt;br /&gt;    n &amp;lt;- peek n&lt;br /&gt;    xs &amp;lt;- peekArray n xs&lt;br /&gt;    poke result $ sumRoots xs&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;This function first gets the value for &lt;tt&gt;n&lt;/tt&gt;, then for each element in &lt;tt&gt;0..n-1&lt;/tt&gt; gets the element out of the pointer array &lt;tt&gt;xs&lt;/tt&gt; and puts it in a nice list. We then call the original &lt;tt&gt;sumRoots&lt;/tt&gt;, and store the value in the space provided by result. As a general rule, you should put all the logic in one function (&lt;tt&gt;sumRoots&lt;/tt&gt;), and the wrapping in another (&lt;tt&gt;sumRootsR&lt;/tt&gt;). 
We can then export this function with the definition:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;foreign export ccall sumRootsR :: Ptr Int -&gt; Ptr Double -&gt; Ptr Double -&gt; IO ()&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Putting everything together, we end up with the Haskell file:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;-- SumRoots.hs&lt;br /&gt;{-# LANGUAGE ForeignFunctionInterface #-}&lt;br /&gt;module SumRoots where&lt;br /&gt;&lt;br /&gt;import Foreign&lt;br /&gt;&lt;br /&gt;foreign export ccall sumRootsR :: Ptr Int -&gt; Ptr Double -&gt; Ptr Double -&gt; IO ()&lt;br /&gt;&lt;br /&gt;sumRootsR :: Ptr Int -&gt; Ptr Double -&gt; Ptr Double -&gt; IO ()&lt;br /&gt;sumRootsR n xs result = do&lt;br /&gt;    n &amp;lt;- peek n&lt;br /&gt;    xs &amp;lt;- peekArray n xs&lt;br /&gt;    poke result $ sumRoots xs&lt;br /&gt;&lt;br /&gt;sumRoots :: [Double] -&gt; Double&lt;br /&gt;sumRoots xs = sum (map sqrt xs)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;We also need a C stub file. 
The one described in the &lt;a href="http://www.haskell.org/ghc/docs/latest/html/users_guide/win32-dlls.html"&gt;GHC users guide&lt;/a&gt; works well:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;// StartEnd.c&lt;br /&gt;#include &amp;lt;Rts.h&amp;gt;&lt;br /&gt;&lt;br /&gt;void HsStart()&lt;br /&gt;{&lt;br /&gt;   int argc = 1;&lt;br /&gt;   char* argv[] = {"ghcDll", NULL}; // argv must end with NULL&lt;br /&gt;&lt;br /&gt;   // Initialize Haskell runtime&lt;br /&gt;   char** args = argv;&lt;br /&gt;   hs_init(&amp;amp;argc, &amp;amp;args);&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;void HsEnd()&lt;br /&gt;{&lt;br /&gt;   hs_exit();&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;We can now compile our library with the commands:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;ghc -c SumRoots.hs&lt;br /&gt;ghc -c StartEnd.c&lt;br /&gt;ghc -shared -o SumRoots.dll SumRoots.o StartEnd.o&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;This creates the library &lt;tt&gt;SumRoots.dll&lt;/tt&gt;.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Calling Haskell from R&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;At the R command prompt, we can load the library with:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;dyn.load("C:/SumRoots.dll") # use the full path to the SumRoots library&lt;br /&gt;.C("HsStart")&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;We can now invoke our function:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;input = c(9,3.5,5.58,64.1,12.54)&lt;br /&gt;.C("sumRootsR", n=as.integer(length(input)), xs=as.double(input), result=as.double(0))$result&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;This prints out the answer 18.78046.&lt;br /&gt;&lt;br /&gt;We can make this function easier to use on the R side by writing a wrapper, for example:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;sumRoots &amp;lt;- function(input)&lt;br /&gt;{&lt;br /&gt;    return(.C("sumRootsR", n=as.integer(length(input)), xs=as.double(input), result=as.double(0))$result)&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;br 
/&gt;&lt;br /&gt;Now we can write:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;sumRoots(c(12,444.34))&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;And get back the answer 24.54348. With a small amount of glue code, it's easy to call Haskell libraries from R programs.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-3532878487315634840?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/3532878487315634840/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=3532878487315634840' title='8 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/3532878487315634840'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/3532878487315634840'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2011/10/calling-haskell-from-r.html' title='Calling Haskell from R'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>8</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-5182023435954281588</id><published>2011-10-01T18:28:00.003+01:00</published><updated>2011-10-01T18:34:17.025+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='conference'/><title type='text'>Haskell Implementors Workshop 2011 slides</title><content type='html'>I've just uploaded the slides for all the talks given at the &lt;a href="http://www.haskell.org/haskellwiki/HaskellImplementorsWorkshop/2011"&gt;Haskell Implementors Workshop 
2011&lt;/a&gt; - just click on "Slides" next to any of the talks. Many thanks to all the speakers for very interesting talks, and for sending me their slides.&lt;br /&gt;&lt;br /&gt;There will be video, hosted on either YouTube or Vimeo, at some point in the future. I've got all the copyright permission forms, but I don't yet have the video footage. I'm currently working to get a copy of the video sent from Japan.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-5182023435954281588?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/5182023435954281588/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=5182023435954281588' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/5182023435954281588'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/5182023435954281588'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2011/10/haskell-implementors-workshop-2011.html' title='Haskell Implementors Workshop 2011 slides'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-2869676683645623007</id><published>2011-10-01T08:00:00.004+01:00</published><updated>2011-10-01T08:05:33.400+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='uniplate'/><title type='text'>Template Haskell fights with Generic Programming</title><content 
type='html'>&lt;i&gt;Summary: The InfixE construction in Template Haskell fits poorly with generic programming, because its type does not capture all its restrictions. This mismatch can result in confusing bugs, but there is a simple workaround.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;I have often said that anyone manipulating abstract syntax trees, without using some form of generic programming, is doing it wrong. Recently I have been manipulating Template Haskell syntax trees using &lt;a href="http://hackage.haskell.org/packages/uniplate"&gt;Uniplate&lt;/a&gt;, my preferred generic programming library. Consider the problem of replacing all instances of &lt;tt&gt;delete&lt;/tt&gt; with &lt;tt&gt;deleteBy (==)&lt;/tt&gt; - this task can be done with Template Haskell:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;{-# LANGUAGE TemplateHaskell #-}&lt;br /&gt;module IHateDelete where&lt;br /&gt;import Data.List&lt;br /&gt;import Language.Haskell.TH&lt;br /&gt;import Data.Generics.Uniplate.Data&lt;br /&gt;&lt;br /&gt;iHateDelete :: Q [Dec] -&gt; Q [Dec]&lt;br /&gt;iHateDelete = fmap (transformBi f)&lt;br /&gt;    where&lt;br /&gt;        f :: Exp -&gt; Exp&lt;br /&gt;        f (VarE x) | x == 'delete = VarE 'deleteBy `AppE` VarE '(==)&lt;br /&gt;        f x = x&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;We can test this function with:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;{-# LANGUAGE TemplateHaskell #-}&lt;br /&gt;import IHateDelete&lt;br /&gt;import Data.List&lt;br /&gt;&lt;br /&gt;$(iHateDelete&lt;br /&gt;    [d|&lt;br /&gt;        mapDelete x = map (delete x)&lt;br /&gt;        myElem x xs = length (delete x xs) /= length xs&lt;br /&gt;    |])&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;To see the result of running &lt;tt&gt;iHateDelete&lt;/tt&gt; pass the flag &lt;tt&gt;-ddump-splices&lt;/tt&gt;. As far as we can tell, our &lt;tt&gt;iHateDelete&lt;/tt&gt; function works perfectly. 
But wait - let's try another example:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;$(iHateDelete&lt;br /&gt;    [d|&lt;br /&gt;        myDelete x xs = x `delete` xs&lt;br /&gt;    |])&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;In GHC 6.12, we get a GHC panic. In GHC 7.2 we get the error message:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;Operator application with a non-variable operator: deleteBy (==)&lt;br /&gt;(Probably resulting from a Template Haskell splice)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;(I would find this message far clearer if it said "Infix application..." rather than "Operator application...")&lt;br /&gt;&lt;br /&gt;The body of &lt;tt&gt;myDelete&lt;/tt&gt; starts out as:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;InfixE (Just (VarE 'x)) (VarE 'delete) (Just (VarE 'xs))&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;After our transformation, this becomes:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;InfixE (Just (VarE 'x)) (AppE (VarE 'deleteBy) (VarE '(==))) (Just (VarE 'xs))&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Or, written in Haskell syntax:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;x `deleteBy (==)` xs&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;This expression is not valid Haskell, and causes an error when spliced back in (when inserted back into the Haskell code).&lt;br /&gt;&lt;br /&gt;&lt;b&gt;The Problem&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The underlying problem is called out in the Template Haskell &lt;tt&gt;Exp&lt;/tt&gt; documentation:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;InfixE (Maybe Exp) Exp (Maybe Exp)&lt;br /&gt;    -- ^ It's a bit gruesome to use an Exp as the operator, but how else can we distinguish constructors from non-constructors?&lt;br /&gt;    --   Maybe there should be a var-or-con type? 
Or maybe we should leave it to the String itself?&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The operator in &lt;tt&gt;InfixE&lt;/tt&gt; has a type which permits any expression, but has the restriction that when spliced back in the expression must only be a &lt;tt&gt;VarE&lt;/tt&gt; or &lt;tt&gt;ConE&lt;/tt&gt;. This restriction poses a problem for generic programming, where values are treated in a uniform manner. Sadly, both of the suggested fixes would also cause problems for generic programming. Perhaps the true fix is to let Haskell have arbitrary expressions for infix operators? Or perhaps Template Haskell should translate &lt;tt&gt;InfixE&lt;/tt&gt; to &lt;tt&gt;AppE&lt;/tt&gt; if the operator is incompatible with Haskell?&lt;br /&gt;&lt;br /&gt;&lt;b&gt;The Workaround&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;As a workaround, you can translate away all &lt;tt&gt;InfixE&lt;/tt&gt; expressions that have a complex middle expression. I use the following function:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;fixupInfix :: [Dec] -&gt; [Dec]&lt;br /&gt;fixupInfix = transformBi f&lt;br /&gt;    where&lt;br /&gt;        bad VarE{} = False&lt;br /&gt;        bad ConE{} = False&lt;br /&gt;        bad _ = True&lt;br /&gt;&lt;br /&gt;        f (InfixE a b c) | bad b = case (a,c) of&lt;br /&gt;            (Nothing, Nothing) -&gt; b&lt;br /&gt;            (Just a , Nothing) -&gt; b `AppE` a&lt;br /&gt;            (Nothing, Just c ) -&gt; VarE 'flip `AppE` b `AppE` c&lt;br /&gt;            (Just a , Just c ) -&gt; b `AppE` a `AppE` c&lt;br /&gt;        f x = x&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The original &lt;tt&gt;iHateDelete&lt;/tt&gt; can then be modified to become:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;iHateDelete = fmap (fixupInfix . 
transformBi f)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-2869676683645623007?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/2869676683645623007/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=2869676683645623007' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/2869676683645623007'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/2869676683645623007'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2011/10/template-haskell-fights-with-generic.html' title='Template Haskell fights with Generic Programming'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-4094078378910420890</id><published>2011-09-06T18:43:00.002+01:00</published><updated>2011-09-06T18:52:33.224+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='wpf'/><category scheme='http://www.blogger.com/atom/ns#' term='c#'/><title type='text'>Faster WPF ComboBoxes</title><content type='html'>If you create a &lt;a href="http://en.wikipedia.org/wiki/Windows_Presentation_Foundation"&gt;WPF&lt;/a&gt; &lt;a href="http://msdn.microsoft.com/en-us/library/system.windows.controls.combobox.aspx"&gt;ComboBox&lt;/a&gt; with 2000 items it will take about 2 seconds to drop down, and about 1 second to retract 
back. But you can make all operations complete instantly if the items are simple (such as strings). If you are writing XAML, add:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;&amp;lt;ComboBox&amp;gt;&lt;br /&gt;  &amp;lt;ComboBox.ItemsPanel&amp;gt;&lt;br /&gt;    &amp;lt;ItemsPanelTemplate&amp;gt;&lt;br /&gt;      &amp;lt;VirtualizingStackPanel /&amp;gt;&lt;br /&gt;    &amp;lt;/ItemsPanelTemplate&amp;gt;&lt;br /&gt;  &amp;lt;/ComboBox.ItemsPanel&amp;gt;&lt;br /&gt;&lt;br /&gt;  ...&lt;br /&gt;&amp;lt;/ComboBox&amp;gt;&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;If you are writing C#, add:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;comboBox.ItemsPanel = new ItemsPanelTemplate(new FrameworkElementFactory(typeof(VirtualizingStackPanel)));&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Thanks to &lt;a href="http://blogs.msdn.com/b/henryh/archive/2007/06/15/loading-a-combobox-with-many-items-is-slow.aspx"&gt;Henry Hahn&lt;/a&gt; for the XAML code I started from. Reading the C# version, I am reminded of the &lt;a href="http://discuss.joelonsoftware.com/default.asp?joel.3.219431"&gt;Hammer Factory&lt;/a&gt; joke.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-4094078378910420890?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/4094078378910420890/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=4094078378910420890' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/4094078378910420890'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/4094078378910420890'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2011/09/faster-wpf-comboboxes.html' 
title='Faster WPF ComboBoxes'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-7334274688121679446</id><published>2011-09-04T13:07:00.002+01:00</published><updated>2011-09-04T13:13:58.188+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='tutorial'/><title type='text'>Sharing in Haskell</title><content type='html'>&lt;i&gt;Summary: The let and lambda constructs give a precise way to control sharing, but their exact use can be tricky. This post gives some worked examples and general guidance.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;An easy way to improve performance is to &lt;a href="http://neilmitchell.blogspot.com/2010/01/optimising-hlint.html"&gt;call something fewer times&lt;/a&gt;, which requires understanding how many times something gets called. One topic I find myself regularly explaining is how lambda expressions under let expressions affect sharing. Consider the two following examples:&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Example 1&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;f x y = sqrt x + y&lt;br /&gt;result = f 1 2 + f 1 4&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Example 2&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;f x = let sqrt_x = sqrt x in \y -&gt; sqrt_x + y&lt;br /&gt;result = let f_1 = f 1 in f_1 2 + f_1 4&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Question&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;In each example, how many times is &lt;tt&gt;sqrt&lt;/tt&gt; executed to compute &lt;tt&gt;result&lt;/tt&gt;? (Assume no advanced optimisations - these often break down on larger examples.)
&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Answer&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;In Example 1 we execute &lt;tt&gt;sqrt&lt;/tt&gt; twice, while in Example 2 we execute &lt;tt&gt;sqrt&lt;/tt&gt; once. To go from Example 1 to Example 2 we need to make two changes:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;Step 1: Rewrite &lt;tt&gt;f&lt;/tt&gt; to compute &lt;tt&gt;sqrt&lt;/tt&gt; after one argument instead of two.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Step 2: Rewrite &lt;tt&gt;result&lt;/tt&gt; to share the result of &lt;tt&gt;f&lt;/tt&gt; with one argument.&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;Performing either rewrite alone will still result in &lt;tt&gt;sqrt&lt;/tt&gt; being executed twice.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Step 1: Rewriting &lt;tt&gt;f&lt;/tt&gt;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Let's take a look at the original definition of &lt;tt&gt;f&lt;/tt&gt;:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;f x y = sqrt x + y&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Rewriting this function in English, we can describe it as:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;given x and y, compute sqrt x + y&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;But the computation of &lt;tt&gt;sqrt x&lt;/tt&gt; does not depend on &lt;tt&gt;y&lt;/tt&gt;. 
If the computation of &lt;tt&gt;sqrt x&lt;/tt&gt; is expensive, and if we know the function will often be called with the same &lt;tt&gt;x&lt;/tt&gt; for many different values of &lt;tt&gt;y&lt;/tt&gt;, it is better to describe it as:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;given x, compute sqrt x, then given y, add that value to y&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The Haskell syntax for this description is:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;f = \x -&gt; let sqrt_x = sqrt x in \y -&gt; sqrt_x + y&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Which would usually be written in the equivalent declaration form as:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;f x = let sqrt_x = sqrt x in \y -&gt; sqrt_x + y&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Step 2: Using the rewritten &lt;tt&gt;f&lt;/tt&gt;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;If we look at the definition of &lt;tt&gt;result&lt;/tt&gt;:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;result = f 1 2 + f 1 4&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;We see that the subexpression &lt;tt&gt;f 1&lt;/tt&gt; occurs twice. We can perform common subexpression elimination (CSE) and write:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;result = let f_1 = f 1 in f_1 2 + f_1 4&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;With the original definition of &lt;tt&gt;f&lt;/tt&gt;, commoning up &lt;tt&gt;f 1&lt;/tt&gt; would have had no performance benefit - after &lt;tt&gt;f&lt;/tt&gt; was applied to 1 argument it did nothing but wait for the second argument. However, with the revised definition of &lt;tt&gt;f&lt;/tt&gt;, the value &lt;tt&gt;f_1&lt;/tt&gt; will create the computation of &lt;tt&gt;sqrt 1&lt;/tt&gt;, which will be performed only once when executed by &lt;tt&gt;f_1 2&lt;/tt&gt; and &lt;tt&gt;f_1 4&lt;/tt&gt;.
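&lt;br /&gt;&lt;br /&gt;As a rough way to observe this difference, the sketch below wraps &lt;tt&gt;sqrt&lt;/tt&gt; in &lt;tt&gt;trace&lt;/tt&gt; from &lt;tt&gt;Debug.Trace&lt;/tt&gt; so that each evaluation is reported. The names &lt;tt&gt;noisySqrt&lt;/tt&gt;, &lt;tt&gt;f1&lt;/tt&gt; and &lt;tt&gt;f2&lt;/tt&gt; are made up for illustration, and the counts assume the program is run without optimisation (for example with &lt;tt&gt;runghc&lt;/tt&gt;):&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;import Debug.Trace (trace)&lt;br /&gt;&lt;br /&gt;-- Hypothetical stand-in for an expensive sqrt, reporting each evaluation&lt;br /&gt;noisySqrt :: Double -&gt; Double&lt;br /&gt;noisySqrt x = trace "sqrt evaluated" (sqrt x)&lt;br /&gt;&lt;br /&gt;-- Example 1 style: sqrt is only reached once both arguments are supplied&lt;br /&gt;f1 :: Double -&gt; Double -&gt; Double&lt;br /&gt;f1 x y = noisySqrt x + y&lt;br /&gt;&lt;br /&gt;-- Example 2 style: the lambda closes over a single shared sqrt_x&lt;br /&gt;f2 :: Double -&gt; Double -&gt; Double&lt;br /&gt;f2 x = let sqrt_x = noisySqrt x in \y -&gt; sqrt_x + y&lt;br /&gt;&lt;br /&gt;main :: IO ()&lt;br /&gt;main = do&lt;br /&gt;    print (f1 1 2 + f1 1 4)&lt;br /&gt;    print (let f_1 = f2 1 in f_1 2 + f_1 4)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Both lines should print 8.0, with the message appearing twice for the first and once for the second. With optimisation enabled the compiler may introduce sharing itself, so the counts can differ.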
&lt;br /&gt;&lt;br /&gt;&lt;b&gt;The Optimisation&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;This optimisation technique can be described as:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;Step 1: Rewrite the function to perform some computation before all arguments are supplied.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Step 2: Share the partially applied function.&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;Crucially the function in Step 1 must take it's arguments in an order that allows computation to be performed incrementally.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;A Practical Example&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;In previous versions of &lt;a href="haskell.org/hoogle/"&gt;Hoogle&lt;/a&gt;, the function I wrote to resolve type synonyms (e.g. &lt;tt&gt;type String = [Char]&lt;/tt&gt;) was:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;resolveSynonyms :: [Synonym] -&gt; TypeSig -&gt; TypeSig&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Given a list of type synonyms, and a type signature, return the type signature with all synonyms expanded out. However, searching through a list of type synonyms is expensive - it is more efficient to compute a table allowing fast lookup by synonym name. Therefore, I used the optimisation technique above to write:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;resolveSynonyms synonyms = let info = buildSynonymTable synonyms in \x -&gt; transformSynonyms info x&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;This technique worked well, especially given that the list of synonyms was usually constant. However, from simply looking at the type signatures, someone else is unlikely to guess that &lt;tt&gt;resolveSynonyms&lt;/tt&gt; should be partially applied where possible. 
An alternative is to make the sharing more explicit in the types, and provide:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;data SynonymTable&lt;br /&gt;buildSynonymTable :: [Synonym] -&gt; SynonymTable&lt;br /&gt;resolveSynonyms :: SynonymTable -&gt; TypeSig -&gt; TypeSig&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The disadvantage is the increase in the size of the API - we have gone from one function to two functions and a data type. Something that used to take one function call now takes two.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Conclusion&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;I think all Haskell programmers benefit from understanding how the interaction of lambda and let affects sharing. Pushing lambda under let is often a useful optimisation technique, particularly when the resulting function is used in a map. However, I wouldn't usually recommend exporting public APIs that rely on partial application to get acceptable performance - it's too hard to discover.&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-7334274688121679446?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/7334274688121679446/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=7334274688121679446' title='7 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/7334274688121679446'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/7334274688121679446'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2011/09/sharing-in-haskell.html' title='Sharing in Haskell'/><author><name>Neil 
Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>7</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-4259915861029180219</id><published>2011-08-19T11:54:00.003+01:00</published><updated>2011-08-19T12:20:50.442+01:00</updated><title type='text'>IFL 2011 submission</title><content type='html'>&lt;i&gt;Summary: Submit to IFL 2011. It's a great breeding ground for ideas.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;The &lt;a href="http://www.ittc.ku.edu/ifl2011/"&gt;IFL 2011&lt;/a&gt; submission deadline has been extended until 31st August, and, in a move I thoroughly agree with, the notification of acceptance is now &lt;i&gt;within 24 hours of submission!&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;IFL (and &lt;a href="http://www-fp.cs.st-andrews.ac.uk/tifp/"&gt;TFP&lt;/a&gt;) do not work in the same way as &lt;a href="http://www.icfpconference.org/"&gt;ICFP&lt;/a&gt;/&lt;a href="http://www.haskell.org/haskell-symposium/"&gt;Haskell Symposium&lt;/a&gt;. You submit a paper, describing the work you've done (something functional programming related) and then present the paper. After the conference, you resubmit the paper, and the result is reviewed in greater depth. There is a great opportunity to learn from the experience, and produce a second paper that is far better.&lt;br /&gt;&lt;br /&gt;I submitted my original supercompilation work to IFL 2007. You can compare &lt;a href="http://community.haskell.org/~ndm/downloads/paper-supero_making_haskell_faster-27_sep_2007.pdf"&gt;the paper I submitted before the conference&lt;/a&gt; with &lt;a href="http://community.haskell.org/~ndm/downloads/paper-a_supercompiler_for_core_haskell-01_may_2008.pdf"&gt;the paper I submitted afterwards&lt;/a&gt;. 
Note that the original paper doesn't even mention the word supercompilation! At the conference I talked to many people, learnt a lot about supercompilation, program optimisation and termination orderings (and lots of other interesting topics). Using this experience, and building on the enthusiasm people had expressed, I was able to produce a much better second paper. My work on supercompilation went into &lt;a href="http://community.haskell.org/~ndm/downloads/paper-transformation_and_analysis_of_functional_programs-4_jun_2008.pdf"&gt;my thesis&lt;/a&gt;, and led to a subsequent &lt;a href="http://community.haskell.org/~ndm/downloads/paper-rethinking_supercompilation-29_sep_2010.pdf"&gt;ICFP paper&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;I think that IFL/TFP are great venues to present your work, and in presenting and discussing it, to understand more about the work - making it better in the process. You have just under 2 weeks to make the &lt;a href="http://www.ittc.ku.edu/ifl2011/"&gt;IFL 2011&lt;/a&gt; deadline of 31st August.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-4259915861029180219?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/4259915861029180219/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=4259915861029180219' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/4259915861029180219'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/4259915861029180219'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2011/08/ifl-2011-submission.html' title='IFL 2011 submission'/><author><name>Neil 
Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-2800792400842264518</id><published>2011-07-17T16:40:00.002+01:00</published><updated>2011-07-17T16:45:00.142+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='hiw'/><title type='text'>Submit to the Haskell Implementors Workshop</title><content type='html'>Are you going to &lt;a href="http://www.icfpconference.org/icfp2011/"&gt;ICFP 2011&lt;/a&gt; in Japan? Do you work with Haskell? You should consider offering a talk for the &lt;a href="http://haskell.org/haskellwiki/HaskellImplementorsWorkshop/2011"&gt;Haskell Implementors Workshop&lt;/a&gt;! You have until this coming Friday (22nd July 2011) to offer a talk, by emailing an abstract (less than 200 words) to &lt;tt&gt;benl -AT- cse.unsw.edu.au&lt;/tt&gt; - for full details see the &lt;a href="http://haskell.org/haskellwiki/HaskellImplementorsWorkshop/2011/Call_for_Talks"&gt;Call for Talks&lt;/a&gt;. Talks are 20 minutes long, plus 10 minutes for questions.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;What topics are suitable?&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;If you have been doing or thinking stuff that would interest a bunch of Haskell implementors in a pub, you probably have something suitable for a talk. The aim of the workshop is to provide an "opportunity to bat around ideas, share experiences, and ask for feedback from fellow experts". Unlike &lt;a href="http://www.icfpconference.org/"&gt;ICFP&lt;/a&gt; or the &lt;a href="http://www.haskell.org/haskell-symposium/"&gt;Haskell Symposium&lt;/a&gt;, there are no published proceedings. If you have some work that isn't quite ready for a more formal setting, this workshop is a useful venue to seek feedback. 
If you have an idea which seems a bit crazy, it's probably also very interesting!&lt;br /&gt;&lt;br /&gt;&lt;b&gt;What if I have less to say?&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;In addition to the 30 minute talks, the Haskell Implementors Workshop will also have 2-10 minute lightning talks, organised on the day. Attend the workshop, and sign up on the day to give a brief talk.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-2800792400842264518?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/2800792400842264518/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=2800792400842264518' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/2800792400842264518'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/2800792400842264518'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2011/07/submit-to-haskell-implementors-workshop.html' title='Submit to the Haskell Implementors Workshop'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-8761228674900845003</id><published>2011-06-18T13:08:00.005+01:00</published><updated>2011-06-18T13:15:53.635+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='windows'/><category scheme='http://www.blogger.com/atom/ns#' term='c#'/><title type='text'>Changing the Windows titlebar gradient 
colors</title><content type='html'>&lt;i&gt;Summary: This post describes how to change the gradient colors of the titlebar of a single window on Windows XP, using C#.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;On Windows XP the titlebars of most windows have a gradient coloring:&lt;br /&gt;&lt;br /&gt;&lt;img src="http://2.bp.blogspot.com/-sbxQUyeyQGg/TfyVXM0jOrI/AAAAAAAAAGU/RkkkbRu_z2U/s400/normal-titlebar.png"&gt;&lt;br /&gt;&lt;br /&gt;For various reasons, I wanted to override the titlebar gradient colors on a single window. The &lt;tt&gt;&lt;a href="http://msdn.microsoft.com/en-us/library/ms724940(v=vs.85).aspx"&gt;SetSysColors&lt;/a&gt;&lt;/tt&gt; function can change the colors, but that applies to all windows and immediately forces all windows to repaint, which is slow and very flickery. You can paint the titlebar yourself (like Chrome does), but then you need to draw and handle min/max buttons etc. After some experimentation, using the unofficial &lt;tt&gt;SetSysColorsTemp&lt;/tt&gt; function, I was able to change the gradient for a single window. As an example, here are some screenshots, overriding the left color with green, then overriding both the colors with green/blue and orange/blue:&lt;br /&gt;&lt;br /&gt;&lt;img src="http://4.bp.blogspot.com/-9YAuuXGvsTs/TfyVisbnvrI/AAAAAAAAAGc/dNj_wSdCjGQ/s400/new-titlebar.png"&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Disclaimer&lt;/b&gt;: This code uses an undocumented function, and while it seems to work robustly on my copy of Windows XP, it is unlikely to work on future versions of Windows. Generally, it is a bad idea to override user preferences - for example the titlebar text may not be visible with overridden gradient colors.&lt;br /&gt;&lt;br /&gt;All the necessary code is available at the bottom of this post, and can be used by anyone, for any purpose. The crucial function is &lt;tt&gt;SetSysColorsTemp&lt;/tt&gt;, which takes an array of colors and an array of brushes created from those colors. 
If you call &lt;tt&gt;SetSysColorsTemp&lt;/tt&gt; passing a new array of colors/brushes they will override the global system colors until a reboot, without forcing a repaint. Passing &lt;tt&gt;null&lt;/tt&gt; for both arrays will restore the colors to how they were before. To make the color change only for the current window we hook into &lt;tt&gt;&lt;a href="http://msdn.microsoft.com/en-us/library/system.windows.forms.control.wndproc.aspx"&gt;WndProc&lt;/a&gt;&lt;/tt&gt;, and detect when the colors are about to be used in a paint operation. We set the system colors before, call &lt;tt&gt;base.WndProc&lt;/tt&gt; which paints the titlebar (using our modified system colors), then put the colors back.&lt;br /&gt;&lt;br /&gt;There are two known issues:&lt;br /&gt;&lt;br /&gt;1) If you change the colors after the titlebar has drawn, it will not use the colors until the titlebar next repaints. I suspect this problem is solvable.&lt;br /&gt;2) As you can see in the demo screenshots, the area near the min/max buttons does not usually get recolored. 
If you only change the left gradient color this doesn't matter.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;public class GradientForm : Form&lt;br /&gt;{&lt;br /&gt;    // Win32 API calls, often based on those from pinvoke.net&lt;br /&gt;    [DllImport("gdi32.dll")] static extern bool DeleteObject(int hObject);&lt;br /&gt;    [DllImport("user32.dll")] static extern int SetSysColorsTemp(int[] lpaElements, int [] lpaRgbValues, int cElements);&lt;br /&gt;    [DllImport("gdi32.dll")] static extern int CreateSolidBrush(int crColor);&lt;br /&gt;    [DllImport("user32.dll")] static extern int GetSysColorBrush(int nIndex);&lt;br /&gt;    [DllImport("user32.dll")] static extern int GetSysColor(int nIndex);&lt;br /&gt;    [DllImport("user32.dll")] private static extern IntPtr GetForegroundWindow();&lt;br /&gt;&lt;br /&gt;    // Magic constants&lt;br /&gt;    const int SlotLeft = 2;&lt;br /&gt;    const int SlotRight = 27;&lt;br /&gt;    const int SlotCount = 28; // Math.Max(SlotLeft, SlotRight) + 1;&lt;br /&gt;&lt;br /&gt;    // The colors/brushes to use&lt;br /&gt;    int[] Colors = new int[SlotCount];&lt;br /&gt;    int[] Brushes = new int[SlotCount];&lt;br /&gt;&lt;br /&gt;    // The colors the user wants to use&lt;br /&gt;    Color titleBarLeft, titleBarRight;&lt;br /&gt;    public Color TitleBarLeft{get{return titleBarLeft;} set{titleBarLeft=value; UpdateBrush(SlotLeft, value);}}&lt;br /&gt;    public Color TitleBarRight{get{return titleBarRight;} set{titleBarRight=value; UpdateBrush(SlotRight, value);}}&lt;br /&gt;&lt;br /&gt;    void CreateBrushes()&lt;br /&gt;    {&lt;br /&gt;        for (int i = 0; i &amp;lt; SlotCount; i++)&lt;br /&gt;        {&lt;br /&gt;            Colors[i] = GetSysColor(i);&lt;br /&gt;            Brushes[i] = GetSysColorBrush(i);&lt;br /&gt;        }&lt;br /&gt;        titleBarLeft = ColorTranslator.FromWin32(Colors[SlotLeft]);&lt;br /&gt;        titleBarRight = ColorTranslator.FromWin32(Colors[SlotRight]);&lt;br /&gt;    }&lt;br 
/&gt;&lt;br /&gt;    void UpdateBrush(int Slot, Color c)&lt;br /&gt;    {&lt;br /&gt;        DeleteObject(Brushes[Slot]);&lt;br /&gt;        Colors[Slot] = ColorTranslator.ToWin32(c);&lt;br /&gt;        Brushes[Slot] = CreateSolidBrush(Colors[Slot]);&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    void DestroyBrushes()&lt;br /&gt;    {&lt;br /&gt;        DeleteObject(Brushes[SlotLeft]);&lt;br /&gt;        DeleteObject(Brushes[SlotRight]);           &lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    // Hook up to the Window&lt;br /&gt;&lt;br /&gt;    public GradientForm()&lt;br /&gt;    {&lt;br /&gt;        CreateBrushes();&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    protected override void Dispose(bool disposing)&lt;br /&gt;    {&lt;br /&gt;        if (disposing) DestroyBrushes();&lt;br /&gt;        base.Dispose(disposing);&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    protected override void WndProc(ref System.Windows.Forms.Message m) &lt;br /&gt;    {&lt;br /&gt;        const int WM_NCPAINT = 0x85; &lt;br /&gt;        const int WM_NCACTIVATE = 0x86;&lt;br /&gt;&lt;br /&gt;        if ((m.Msg == WM_NCACTIVATE &amp;&amp; m.WParam.ToInt32() != 0) ||&lt;br /&gt;            (m.Msg == WM_NCPAINT &amp;&amp; GetForegroundWindow() == this.Handle))&lt;br /&gt;        {&lt;br /&gt;&lt;br /&gt;            int k = SetSysColorsTemp(Colors, Brushes, Colors.Length);&lt;br /&gt;            base.WndProc(ref m); &lt;br /&gt;            SetSysColorsTemp(null, null, k);&lt;br /&gt;        }&lt;br /&gt;        else&lt;br /&gt;            base.WndProc(ref m); &lt;br /&gt;    }&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-8761228674900845003?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/8761228674900845003/comments/default' title='Post Comments'/><link rel='replies' 
type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=8761228674900845003' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/8761228674900845003'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/8761228674900845003'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2011/06/changing-windows-titlebar-gradient.html' title='Changing the Windows titlebar gradient colors'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/-sbxQUyeyQGg/TfyVXM0jOrI/AAAAAAAAAGU/RkkkbRu_z2U/s72-c/normal-titlebar.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-3495889072691189889</id><published>2011-06-07T20:03:00.002+01:00</published><updated>2011-06-07T20:11:07.916+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='safe'/><title type='text'>INLINE pragmas in the Safe library</title><content type='html'>&lt;i&gt;Summary: The Safe library has lots of small functions, none of which have INLINE pragmas on them. The lack of INLINE pragmas probably has no effect on the optimisation.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;I was recently asked a question about the &lt;a href="http://community.haskell.org/~ndm/safe"&gt;Safe library&lt;/a&gt;:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;br /&gt;I was wondering why you didn't use any &lt;tt&gt;INLINE&lt;/tt&gt; pragmas in the Safe library. 
I'm not a Haskell expert yet, but I've noticed that in many other libraries most one-liners are annotated by &lt;tt&gt;INLINE&lt;/tt&gt; pragmas, so is there any reason you didn't add them to Safe as well?&lt;br /&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;When compiling with optimisations, GHC tries to inline functions/values that are either "small enough" or have an &lt;a href="http://www.haskell.org/ghc/docs/7.0.3/html/users_guide/pragmas.html#inline-pragma"&gt;&lt;tt&gt;INLINE&lt;/tt&gt; pragma&lt;/a&gt;. These functions will be inlined within their module, and will also be placed in their module's interface file, for inlining into other modules. The advantage of inlining is avoiding the function call overhead, and possibly exposing further optimisations. The disadvantage is that the code size may grow, which may result in poor utilisation of the processor's instruction cache.&lt;br /&gt;&lt;br /&gt;The Safe library contains wrappers for lots of partial &lt;tt&gt;Prelude&lt;/tt&gt; and &lt;tt&gt;Data.List&lt;/tt&gt; functions (e.g. &lt;tt&gt;head&lt;/tt&gt;), with versions that don't fail, or fail with more debugging information. 
Using the &lt;tt&gt;tail&lt;/tt&gt; function as an example, the library supplies four additional variants:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt; &lt;br /&gt;&lt;li&gt;&lt;tt&gt;tail :: [a] -&gt; [a]&lt;/tt&gt;, crashes on &lt;tt&gt;tail []&lt;/tt&gt;&lt;/li&gt; &lt;br /&gt;&lt;li&gt;&lt;tt&gt;tailNote :: &lt;b&gt;String&lt;/b&gt; -&gt; [a] -&gt; [a]&lt;/tt&gt;, takes an extra argument which supplements the error message&lt;/li&gt; &lt;br /&gt;&lt;li&gt;&lt;tt&gt;tailDef :: &lt;b&gt;[a]&lt;/b&gt; -&gt; [a] -&gt; [a]&lt;/tt&gt;, takes a default to return instead of errors&lt;/li&gt; &lt;br /&gt;&lt;li&gt;&lt;tt&gt;tailMay :: [a] -&gt; &lt;b&gt;Maybe&lt;/b&gt; [a]&lt;/tt&gt;, wraps the result in a &lt;tt&gt;Maybe&lt;/tt&gt;&lt;/li&gt; &lt;br /&gt;&lt;li&gt;&lt;tt&gt;tailSafe :: [a] -&gt; [a]&lt;/tt&gt;, returns some sensible default if possible, &lt;tt&gt;[]&lt;/tt&gt; in the case of &lt;tt&gt;tail&lt;/tt&gt;&lt;/li&gt; &lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;As the questioner correctly noted, there are no &lt;tt&gt;INLINE&lt;/tt&gt; pragmas in the library. Taking the example of &lt;tt&gt;tailSafe&lt;/tt&gt;, it is defined as:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;tailSafe :: [a] -&gt; [a]&lt;br /&gt;tailSafe = liftSafe tail null&lt;br /&gt;&lt;br /&gt;liftSafe :: (a -&gt; a) -&gt; (a -&gt; Bool) -&gt; (a -&gt; a)&lt;br /&gt;liftSafe func test val = if test val then val else func val&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Given this indirection, and the lack of an &lt;tt&gt;INLINE&lt;/tt&gt; pragma, is calling &lt;tt&gt;tailSafe&lt;/tt&gt; as cheap as you might hope? We can answer this question by looking at the interface file at the standard optimisation level, by running &lt;tt&gt;ghc -ddump-hi Safe.hs -O -c&lt;/tt&gt;. 
Looking at the definition attached to &lt;tt&gt;tailSafe&lt;/tt&gt;, we see:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;tailSafe = \val -&gt; case val of&lt;br /&gt;    [] -&gt; []&lt;br /&gt;    ds1:ds2 -&gt; ds2&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The &lt;tt&gt;tailSafe&lt;/tt&gt; function has been optimised as much as it can be, and will be inlined into other modules. Adding an &lt;tt&gt;INLINE&lt;/tt&gt; pragma would have no effect at standard optimisation. When compiling without optimisation, even with an &lt;tt&gt;INLINE&lt;/tt&gt; pragma, the function is not included in the module's interface. Adding the &lt;tt&gt;INLINE&lt;/tt&gt; pragma to &lt;tt&gt;tailSafe&lt;/tt&gt; is of no benefit.&lt;br /&gt;&lt;br /&gt;Let us now look at all functions in the &lt;tt&gt;Safe&lt;/tt&gt; module. When compiled without optimisation (&lt;tt&gt;-O0&lt;/tt&gt;) no functions are placed in the interface file. At all higher levels of optimisation (&lt;tt&gt;-O1&lt;/tt&gt; and &lt;tt&gt;-O2&lt;/tt&gt;) the interface files are identical. Looking through the functions, only 6 functions are not included in the interface files:&lt;br /&gt;&lt;br /&gt;&lt;b&gt;&lt;tt&gt;abort&lt;/tt&gt;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The &lt;tt&gt;abort&lt;/tt&gt; function is defined as:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;abort = error&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Neither adding an &lt;tt&gt;INLINE&lt;/tt&gt; pragma nor eta expanding (adding an additional argument) includes the function in the interface file. I suspect that GHC decides no further optimisation is possible, so doesn't ever inline &lt;tt&gt;abort&lt;/tt&gt;.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;&lt;tt&gt;atMay&lt;/tt&gt; and &lt;tt&gt;$watMay&lt;/tt&gt;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The function &lt;tt&gt;atMay&lt;/tt&gt; is recursive, and thus cannot be inlined even when its definition is available, so GHC sensibly omits it from the interface file. 
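&lt;br /&gt;&lt;br /&gt;As a rough illustration, a self-recursive lookup in the style of &lt;tt&gt;atMay&lt;/tt&gt; might look like the sketch below. The name &lt;tt&gt;atMaySketch&lt;/tt&gt; is hypothetical, and this is not the Safe library's actual source; it only shows the kind of recursion that stops GHC from inlining a definition at its call sites:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;-- Hypothetical stand-in for atMay, not the Safe library's source&lt;br /&gt;atMaySketch :: [a] -&gt; Int -&gt; Maybe a&lt;br /&gt;atMaySketch [] _ = Nothing&lt;br /&gt;atMaySketch (x:xs) n&lt;br /&gt;    | n == 0 = Just x&lt;br /&gt;    | n &gt; 0 = atMaySketch xs (n - 1) -- the function calls itself here&lt;br /&gt;    | otherwise = Nothing            -- negative index&lt;br /&gt;&lt;br /&gt;main :: IO ()&lt;br /&gt;main = do&lt;br /&gt;    print (atMaySketch "abc" 1) -- Just 'b'&lt;br /&gt;    print (atMaySketch "abc" 5) -- Nothing&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;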
The function &lt;tt&gt;$watMay&lt;/tt&gt; is the &lt;a href="http://www.cs.nott.ac.uk/~gmh/wrapper.pdf"&gt;worker/wrapper&lt;/a&gt; definition of &lt;tt&gt;atMay&lt;/tt&gt;, which is also recursive.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;&lt;tt&gt;readNote&lt;/tt&gt; and &lt;tt&gt;readMay&lt;/tt&gt;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The functions &lt;tt&gt;readNote&lt;/tt&gt; and &lt;tt&gt;readMay&lt;/tt&gt; both follow exactly the same pattern, so I describe only &lt;tt&gt;readNote&lt;/tt&gt;. The function &lt;tt&gt;readNote&lt;/tt&gt; is quite big, and GHC has split it into two top-level functions - producing both &lt;tt&gt;readNote&lt;/tt&gt; and &lt;tt&gt;readNote1&lt;/tt&gt;. The function &lt;tt&gt;readNote&lt;/tt&gt; is included in the interface file, but &lt;tt&gt;readNote1&lt;/tt&gt; is not. The result is that GHC will be able to partially inline the original definition of &lt;tt&gt;readNote&lt;/tt&gt; - however, the &lt;tt&gt;read&lt;/tt&gt; functions tend to be rather complex, so GHC is unlikely to benefit from the additional inlining it has managed to achieve.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;&lt;tt&gt;atNote&lt;/tt&gt;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The &lt;tt&gt;atNote&lt;/tt&gt; function is bigger than the inlining threshold, so does not appear in the interface file unless an &lt;tt&gt;INLINE&lt;/tt&gt; pragma is given. Since the inner loop of &lt;tt&gt;atNote&lt;/tt&gt; is recursive, it is unlikely that inlining would give any noticeable benefit, so GHC's default behaviour is appropriate.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Conclusion&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Having analysed the interface files, I do not think it is worth adding &lt;tt&gt;INLINE&lt;/tt&gt; pragmas to the Safe library - GHC does an excellent job without my intervention. 
My advice is to only add &lt;tt&gt;INLINE&lt;/tt&gt; pragmas after either profiling, or analysing the resulting Core program.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-3495889072691189889?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/3495889072691189889/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=3495889072691189889' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/3495889072691189889'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/3495889072691189889'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2011/06/inline-pragmas-in-safe-library.html' title='INLINE pragmas in the Safe library'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-3627842556430093450</id><published>2011-05-22T14:26:00.003+01:00</published><updated>2011-05-22T14:34:16.168+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='hoogle'/><title type='text'>Hoogle talk from TFP 2011 [PDF]</title><content type='html'>Last week I went to &lt;a href="http://dalila.sip.ucm.es/tfp11/"&gt;TFP 2011&lt;/a&gt;, and gave a talk on &lt;a href="http://haskell.org/hoogle/"&gt;Hoogle&lt;/a&gt; entitled "Finding Functions from Types". 
The &lt;a href="http://community.haskell.org/~ndm/downloads/slides-hoogle_finding_functions_from_types-16_may_2011.pdf"&gt;slides are now available online&lt;/a&gt;. These slides give some information about how the type searching works in Hoogle, and I intend to write further details in the future.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-3627842556430093450?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/3627842556430093450/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=3627842556430093450' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/3627842556430093450'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/3627842556430093450'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2011/05/hoogle-talk-from-tfp-2011-pdf.html' title='Hoogle talk from TFP 2011 [PDF]'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-8882924469783688675</id><published>2011-05-08T20:25:00.002+01:00</published><updated>2011-05-08T20:31:52.051+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='cmdargs'/><title type='text'>CmdArgs is not dangerous</title><content type='html'>&lt;i&gt;Summary: CmdArgs can be used purely, and even if you choose to use it in an impure manner, you don't need to worry.&lt;/i&gt;&lt;br 
/&gt;&lt;br /&gt;As a result of my &lt;a href="http://neilmitchell.blogspot.com/2011/05/cmdargs-fighting-ghc-optimiser.html"&gt;blog post yesterday&lt;/a&gt;, a few people have said they have been put off using &lt;a href="http://community.haskell.org/~ndm/cmdargs/"&gt;CmdArgs&lt;/a&gt;. There are three reasons why you shouldn't be put off using CmdArgs, even though it has some impure code within it.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;1: You can use CmdArgs entirely purely&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;You do not need to use the impure code at all. The module &lt;a href="http://hackage.haskell.org/packages/archive/cmdargs/0.7/doc/html/System-Console-CmdArgs-Implicit.html"&gt;System.Console.CmdArgs.Implicit&lt;/a&gt; provides two ways to write annotated records. The first way is impure, and uses &lt;tt&gt;cmdArgs&lt;/tt&gt; and &lt;tt&gt;&amp;=&lt;/tt&gt;. The second way is pure, and uses &lt;tt&gt;cmdArgs_&lt;/tt&gt; and &lt;tt&gt;+=&lt;/tt&gt;. Both ways have exactly the same power. For example, you can write either of:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;sample = cmdArgs $&lt;br /&gt;    Sample{hello = def &amp;= help "World argument" &amp;= opt "world"}&lt;br /&gt;    &amp;= summary "Sample v1"&lt;br /&gt;&lt;br /&gt;sample = cmdArgs_ $&lt;br /&gt;    record Sample{} [hello := def += help "World argument" += opt "world"]&lt;br /&gt;    += summary "Sample v1"&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The first definition is impure. The second is pure. Both are equivalent. I prefer the syntax of the first version, but the second version is not much longer or uglier. If the impurities scare you, just switch. 
The &lt;tt&gt;Implicit&lt;/tt&gt; module documentation describes four simple rules for converting between the two methods.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;2: You do not need to use annotated records at all&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Notice that the above module is called &lt;tt&gt;Implicit&lt;/tt&gt;. CmdArgs also features an &lt;a href="http://hackage.haskell.org/packages/archive/cmdargs/0.7/doc/html/System-Console-CmdArgs-Explicit.html"&gt;Explicit&lt;/a&gt; version where you create a data type representing your command line arguments (which is entirely pure). For example:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;arguments :: Mode [(String,String)]&lt;br /&gt;arguments = mode "explicit" [] "Explicit sample program" (flagArg (upd "file") "FILE")&lt;br /&gt;     [flagOpt "world" ["hello","h"] (upd "world") "WHO" "World argument"&lt;br /&gt;     ,flagReq ["greeting","g"] (upd "greeting") "MSG" "Greeting to give"&lt;br /&gt;     ,flagHelpSimple (("help",""):)]&lt;br /&gt;     where upd msg x v = Right $ (msg,x):v&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Here you construct modes and flags explicitly, and pass functions which describe how to update the command line state. Everything in the &lt;tt&gt;Implicit&lt;/tt&gt; module maps down to the &lt;tt&gt;Explicit&lt;/tt&gt; module. In addition, you can use the &lt;a href="http://hackage.haskell.org/packages/archive/cmdargs/0.7/doc/html/System-Console-CmdArgs-GetOpt.html"&gt;GetOpt&lt;/a&gt; compatibility layer, which also maps down to the &lt;tt&gt;Explicit&lt;/tt&gt; parser.&lt;br /&gt;&lt;br /&gt;Having written command line parsers with annotated records, I'd never go back to writing them out in full. 
However, if you want to map your own command line parser description down to the &lt;tt&gt;Explicit&lt;/tt&gt; version you can get help messages and parsing for free.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;3: I fight the optimiser so you don't have to&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Even if you choose to use the impure implicit version of CmdArgs, you don't need to do anything, even at high optimisation levels. I have an extensive test suite, and will continue to ensure CmdArgs programs work properly - I rely on it for several of my programs. While I find the impure implicit version nicer to work with, I am still working on making a pure version with the same syntax, developing alternative methods of describing the annotations, developing quasi-quoters etc.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-8882924469783688675?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/8882924469783688675/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=8882924469783688675' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/8882924469783688675'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/8882924469783688675'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2011/05/cmdargs-is-not-dangerous.html' title='CmdArgs is not dangerous'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' 
src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-3519543689115519938</id><published>2011-05-07T22:37:00.004+01:00</published><updated>2011-05-08T20:36:40.562+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='cmdargs'/><title type='text'>CmdArgs - Fighting the GHC Optimiser</title><content type='html'>&lt;i&gt;Summary: Everyone using GHC 7 should upgrade to CmdArgs 0.7. GHC's optimiser got better, so CmdArgs optimiser avoidance code had to get better too.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Update: Has this post scared you off CmdArgs? Read &lt;a href="http://neilmitchell.blogspot.com/2011/05/cmdargs-is-not-dangerous.html"&gt;why CmdArgs is not dangerous&lt;/a&gt;.&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://community.haskell.org/~ndm/cmdargs/"&gt;CmdArgs&lt;/a&gt; is a Haskell library for concisely specifying the command line arguments; see &lt;a href="http://neilmitchell.blogspot.com/2010/08/cmdargs-example.html"&gt;this post&lt;/a&gt; for an introduction. I have just released versions 0.6.10 and 0.7 of CmdArgs, which are a strongly recommended upgrade, particularly for GHC 7 users. (The 0.6.10 is so anyone who specified 0.6.* in their Cabal file gets an automatic upgrade, and 0.7 is so people can write 0.7.* to require the new version.)&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Why CmdArgs is impure&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;CmdArgs works by annotating fields to determine what command line argument parsing you want. Consider the data type:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;data Opts = Opts {file :: FilePath}&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;By default this will create a command line which would respond to &lt;tt&gt;myprogram --file=foo&lt;/tt&gt;. 
If instead we want to use &lt;tt&gt;myprogram foo&lt;/tt&gt; we need to attach the annotation &lt;tt&gt;args&lt;/tt&gt; to the &lt;tt&gt;file&lt;/tt&gt; field. We can do this in CmdArgs with:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;cmdArgs $ Opts {file = "" &amp;= args}&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;However, CmdArgs does not have to be impure - you can equally use the pure variant of CmdArgs and write:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;cmdArgs_ $ record Opts{} [file := "" += args]&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Sadly, this code is a little more ugly. I prefer the impure version, but everything is supported in both versions.&lt;br /&gt;&lt;br /&gt;I am still experimenting with other ways of writing an annotated record, weighing the trade-offs between purity, safety and the syntax required. I have been experimenting with pure variants tonight, and hope in a future release to make CmdArgs pure by default.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;GHC vs CmdArgs&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Because CmdArgs uses unsafe and untracked side effects, GHC's optimiser can manipulate the program in ways that change the semantics. 
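&lt;br /&gt;&lt;br /&gt;To see how untracked side effects and optimisation interact, here is a minimal sketch that is independent of CmdArgs. The names &lt;tt&gt;counter&lt;/tt&gt; and &lt;tt&gt;tag&lt;/tt&gt; are made up for illustration, and &lt;tt&gt;tag&lt;/tt&gt; is only a stand-in for an impure annotation function, not anything from the library. Compiled without optimisation I would expect the counter to be bumped twice, but with &lt;tt&gt;-O&lt;/tt&gt; GHC is free to share the two identical &lt;tt&gt;tag ""&lt;/tt&gt; expressions, in which case the effect may happen only once:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;import Data.IORef&lt;br /&gt;import System.IO.Unsafe (unsafePerformIO)&lt;br /&gt;&lt;br /&gt;-- Hypothetical hidden state, standing in for whatever an impure library tracks&lt;br /&gt;{-# NOINLINE counter #-}&lt;br /&gt;counter :: IORef Int&lt;br /&gt;counter = unsafePerformIO (newIORef 0)&lt;br /&gt;&lt;br /&gt;-- Looks pure, but bumps the counter whenever its result is forced&lt;br /&gt;{-# NOINLINE tag #-}&lt;br /&gt;tag :: String -&gt; String&lt;br /&gt;tag x = unsafePerformIO (modifyIORef counter (+1) &gt;&gt; return x)&lt;br /&gt;&lt;br /&gt;main :: IO ()&lt;br /&gt;main = do&lt;br /&gt;    let a = tag ""&lt;br /&gt;        b = tag ""&lt;br /&gt;    -- Force both values so any side effects have happened&lt;br /&gt;    length a `seq` length b `seq` return ()&lt;br /&gt;    -- The number printed depends on whether GHC shared a and b&lt;br /&gt;    readIORef counter &gt;&gt;= print&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;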
A standard example where GHC's optimiser can harm CmdArgs is:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;Opts2 {file1 = "" &amp;= typFile, file2 = "" &amp;= typFile}&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Here the subexpression &lt;tt&gt;"" &amp;= typFile&lt;/tt&gt; is duplicated, and if GHC spots this duplication, it can use common sub-expression elimination to transform the program to:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;let x = "" &amp;= typFile in Opts2 {file1 = x, file2 = x}&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Unfortunately, because CmdArgs is impure, this program attaches the annotation to &lt;tt&gt;file1&lt;/tt&gt;, but not &lt;tt&gt;file2&lt;/tt&gt;.&lt;br /&gt;&lt;br /&gt;This optimisation problem happens in practice, and can be eliminated by writing &lt;tt&gt;{-# OPTIONS_GHC -fno-cse #-}&lt;/tt&gt; in the source file defining the annotations. However, it is burdensome to require all users of CmdArgs to add pragmas to their code, so I investigated how to reduce the problem.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Beating GHC 6.10.4&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The key function used to beat GHC's optimiser is &lt;tt&gt;&amp;=&lt;/tt&gt;. It is defined as:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;(&amp;=) :: (Data val, Data ann) =&gt; val -&gt; ann -&gt; val&lt;br /&gt;(&amp;=) x y = addAnn x y&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;In order to stop CSE, I use two tricks. Firstly, I mark &lt;tt&gt;&amp;=&lt;/tt&gt; as &lt;tt&gt;INLINE&lt;/tt&gt;, so that its definition ends up in the annotations - allowing me to try and modify the expression so it doesn't become suitable for CSE. For GHC 6.10.4 I then made up increasingly random expressions, with increasingly random pragmas, until the problem went away. 
The end solution was:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;{-# INLINE (&amp;=) #-}&lt;br /&gt;(&amp;=) :: (Data val, Data ann) =&gt; val -&gt; ann -&gt; val&lt;br /&gt;(&amp;=) x y = addAnn (id_ x) (id_ y)&lt;br /&gt;&lt;br /&gt;{-# NOINLINE const_ #-}&lt;br /&gt;const_ :: a -&gt; b -&gt; b&lt;br /&gt;const_ f x = x&lt;br /&gt;&lt;br /&gt;{-# INLINE id_ #-}&lt;br /&gt;id_ :: a -&gt; a&lt;br /&gt;id_ x = const_ (\() -&gt; ()) x&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Beating GHC 7.0.1&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Unfortunately, after upgrading to GHC 7.0.1, the problem happened again. I asked for &lt;a href="http://stackoverflow.com/questions/5920200/how-to-prevent-common-sub-expression-elimination-cse-with-ghc/"&gt;help&lt;/a&gt;, and then started researching GHC's CSE optimiser (using the &lt;a href="http://neilmitchell.blogspot.com/2011/05/searching-ghc-with-hoogle.html"&gt;Hoogle support for GHC&lt;/a&gt; I added last weekend - a happy coincidence). The information I found is summarised &lt;a href="http://stackoverflow.com/questions/5920200/how-to-prevent-common-sub-expression-elimination-cse-with-ghc/5921087#5921087"&gt;here&lt;/a&gt;. Using this information I was able to construct a much more targeted solution (which I can actually understand!):&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;{-# INLINE (&amp;=) #-}&lt;br /&gt;(&amp;=) x y = addAnn x (unique y)&lt;br /&gt;&lt;br /&gt;{-# INLINE unique #-}&lt;br /&gt;unique x = case unit of () -&gt; x&lt;br /&gt;    where unit = reverse "" `seq` ()&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;As before, we &lt;tt&gt;INLINE&lt;/tt&gt; the &lt;tt&gt;&amp;=&lt;/tt&gt;, so it gets into the annotations. Now we want to make all annotations appear different, even though they are the same. We use &lt;tt&gt;unique&lt;/tt&gt;, which is equivalent to &lt;tt&gt;id&lt;/tt&gt;, but when wrapped around an expression causes all instances to appear different under CSE. 
The &lt;tt&gt;unit&lt;/tt&gt; binding has the value &lt;tt&gt;()&lt;/tt&gt;, but in a way that GHC can't reduce, so the &lt;tt&gt;case&lt;/tt&gt; does not get eliminated. GHC does not CSE case expressions, so all annotations are safe.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;How GHC will probably defeat CmdArgs next&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;There are three avenues GHC could explore to defeat CmdArgs:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;Reorder the optimisation phases. Performing CSE before inlining would defeat everything, as any tricks in &lt;tt&gt;&amp;=&lt;/tt&gt; would be ignored.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Better CSE. Looking inside &lt;tt&gt;case&lt;/tt&gt; expressions would defeat my scheme.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Reduce &lt;tt&gt;unit&lt;/tt&gt; to &lt;tt&gt;()&lt;/tt&gt;. This reduction could be done in a number of ways:&lt;br /&gt;    &lt;ul&gt;&lt;br /&gt;    &lt;li&gt;Constructor specialisation of &lt;tt&gt;reverse&lt;/tt&gt; should manage to reduce &lt;tt&gt;reverse ""&lt;/tt&gt; to &lt;tt&gt;""&lt;/tt&gt;, which &lt;tt&gt;seq&lt;/tt&gt; can then evaluate, eliminating the case. I'm a little surprised this isn't already happening, but I'm sure it will one day.&lt;/li&gt;&lt;br /&gt;    &lt;li&gt;Supercompilation can inline recursive functions, and inlining &lt;tt&gt;reverse&lt;/tt&gt; would eliminate the case.&lt;/li&gt;&lt;br /&gt;    &lt;li&gt;If GHC could determine &lt;tt&gt;reverse ""&lt;/tt&gt; was total it could eliminate the &lt;tt&gt;seq&lt;/tt&gt; without knowing its value. This is somewhat tricky for &lt;tt&gt;reverse&lt;/tt&gt; as it isn't total for infinite lists.&lt;/li&gt;&lt;br /&gt;    &lt;/ul&gt;&lt;br /&gt;&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;Of these optimisations, I consider the reduction of &lt;tt&gt;unit&lt;/tt&gt; to be most likely, but also the easiest to counteract.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;How CmdArgs will win&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The only safe way for CmdArgs to win is to rewrite the library to be pure. 
I am working on various annotation schemes and hope to have something available shortly.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-3519543689115519938?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/3519543689115519938/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=3519543689115519938' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/3519543689115519938'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/3519543689115519938'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2011/05/cmdargs-fighting-ghc-optimiser.html' title='CmdArgs - Fighting the GHC Optimiser'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-7513442696177829841</id><published>2011-05-01T17:56:00.004+01:00</published><updated>2011-05-01T22:04:01.861+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='hoogle'/><title type='text'>Searching GHC with Hoogle</title><content type='html'>&lt;i&gt;Summary: Hoogle can now search the GHC source code. 
There are also lots of small improvements in the latest version.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;A few weeks ago &lt;a href="http://goto.ucsd.edu/~rjhala/"&gt;Ranjit Jhala&lt;/a&gt; asked  me for help getting &lt;a href="http://haskell.org/hoogle/"&gt;Hoogle&lt;/a&gt; working on the &lt;a href="http://www.haskell.org/ghc/docs/latest/html/libraries/ghc/"&gt;GHC documentation&lt;/a&gt;. As a result of this conversation, I've now released Hoogle 4.2.3, and upgraded the Hoogle web tool.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;For GHC developers&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;You can search the GHC documentation using the standard Hoogle website, for example: &lt;a href="http://haskell.org/hoogle/?hoogle=llvm+%2Bghc"&gt;&lt;tt&gt;llvm +ghc&lt;/tt&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;To search within a package simply write &lt;tt&gt;+&lt;i&gt;package&lt;/i&gt;&lt;/tt&gt; in your search query. The &lt;tt&gt;ghc&lt;/tt&gt; package on Hoogle includes all the internals for GHC.&lt;br /&gt;&lt;br /&gt;If you want to search using the console, you can install Hoogle and generate the GHC package database with:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;cabal update&lt;br /&gt;cabal install hoogle&lt;br /&gt;hoogle data default ghc&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;You can now perform searches with:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;hoogle +ghc llvm&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;For all Hoogle users&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The new release of Hoogle contains a number of small enhancements:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;The web server has been upgraded to &lt;a href="http://hackage.haskell.org/package/warp"&gt;Warp&lt;/a&gt;. 
I'll write a blog post shortly on the move to Warp - but generally it's been a very positive step.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Some documentation snippets, where the markup was interpreted wrongly, have been fixed.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;An expand button now appears next to the documentation only if there is more information to expand.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Some iPad integration, so you can now add it to your home page with a nice icon.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Work on a deployment script to automate uploading a new version to the web server, which will allow for more frequent updates (until now it took over 2 hours to deploy a new version).&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Updates as some web resources moved around, particularly the &lt;a href="http://hackage.haskell.org/platform/"&gt;Haskell Platform&lt;/a&gt; cabal file.&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;The theory behind Hoogle&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;I'll be talking about the theory behind type searching in Hoogle at &lt;a href="http://dalila.sip.ucm.es/tfp11/"&gt;Trends in Functional Programming 2011&lt;/a&gt; in Madrid in a few weeks' time. 
It's not too late to register.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-7513442696177829841?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/7513442696177829841/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=7513442696177829841' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/7513442696177829841'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/7513442696177829841'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2011/05/searching-ghc-with-hoogle.html' title='Searching GHC with Hoogle'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-1267772236928211031</id><published>2011-04-29T11:31:00.002+01:00</published><updated>2011-04-29T11:44:46.460+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='darcs'/><title type='text'>Darcs for my Wife</title><content type='html'>&lt;span style="font-style:italic;"&gt;&lt;/span&gt;&lt;i&gt;Summary: Using the version control system Darcs is simple. This document describes the commands I use in my workflow.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://darcs.net/"&gt;Darcs&lt;/a&gt; is my favourite version control system because of its simple user interface. I am able to do everything I need with a handful of simple commands. 
I work with darcs as a single user, without branches, occasionally accepting small external contributions. This guide is designed to provide a sufficient reference for someone who isn't a computer expert to continue using Darcs after their friendly comp-sci has set up the SSH bits for them (e.g. my wife).&lt;br /&gt;&lt;br /&gt;A Darcs repository (or repo) is a set of changes (sometimes referred to as patches). There are three places changes may live:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;&lt;b&gt;Remote Repo&lt;/b&gt;, such as on &lt;a href="http://community.haskell.org/"&gt;community.haskell.org&lt;/a&gt;.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;b&gt;Local Repo&lt;/b&gt;, on your computer. You may have many local repos all using the same remote repo, especially if you use multiple computers.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;b&gt;Local Changes&lt;/b&gt;, modifications you have made to your local repo, by editing files.&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;The way changes move between locations is described by:&lt;br /&gt;&lt;br /&gt;&lt;img src="http://1.bp.blogspot.com/-phmT-5pT31k/TbqUQCE0zUI/AAAAAAAAAGI/KyBbQQxnrBQ/s1600/temp.png"&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Basic Commands&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;I use four commands daily, shown in red on the above diagram, which are:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;&lt;b&gt;darcs pull&lt;/b&gt; - copy changes from the remote repo to your local repo. Use pull at the beginning of the day, and during the day if you are sharing a remote repo with other people.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;b&gt;darcs whatsnew&lt;/b&gt; - see what local changes you have made. 
A useful flag is --summary, which lists only the names of files which have changed.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;b&gt;darcs record&lt;/b&gt; - after you have decided to keep your changes, use record to store them in your local repo.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;b&gt;darcs push --no-set-default ndm@code.haskell.org:/srv/code/hoogle&lt;/b&gt; - at the end of the day use push to move your changes to the remote repo. This command is the trickiest to use: you need SSH access to the server, and you need to use the file path of the repo, not the URL exposed by the website. I recommend writing a .bat file to automate this command, or using the &lt;a href="http://neilmitchell.blogspot.com/2010/10/enhanced-cabal-sdist.html"&gt;neil push&lt;/a&gt; command if appropriate.&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;In addition to the four commands above, there are two further basic commands that are used less often:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;&lt;b&gt;darcs add filename&lt;/b&gt; - tell darcs that you have created a new file that should be kept in the repo. If you forget to add a file, darcs will not store it for you.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;b&gt;darcs mv oldname newname&lt;/b&gt; - rename a file stored in darcs. Use this command instead of directly renaming the file using a file explorer.&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Creating Repos&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Usually I work with existing repos, but it's useful to know how to create new repos:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;&lt;b&gt;darcs init&lt;/b&gt; - create an empty remote repo, with no changes in it. 
This command is usually run on the server.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;b&gt;darcs get http://code.haskell.org/hoogle&lt;/b&gt; - create a new local repo based on a remote repo.&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Advanced Commands&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;These commands are useful to correct mistakes, but they aren't essential - you can do everything you need without them.&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;&lt;b&gt;darcs revert&lt;/b&gt; - undo local changes that have not yet been recorded.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;b&gt;darcs unrecord&lt;/b&gt; - undo a previous record command. Do not unrecord if you have pushed the changes, or have done substantial work since the initial record. Instead, manually create a new change undoing the previous record, and record that.&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Collaboration Commands&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;If you are accepting occasional contributions from other people, you will need the following commands:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;&lt;b&gt;darcs send -o filename.patch&lt;/b&gt; - make a file containing all the changes in your local repo, but not in the remote repo. This file can be emailed to other people.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;b&gt;darcs apply filename.patch&lt;/b&gt; - apply a set of changes sent to you by email.&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Conclusion&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Effective use of darcs requires only a few simple commands, with only a few command line switches. This guide includes all the commands I ever use. 
Darcs rarely surprises me, and I think it is a suitable version control system for people who are familiar with the command line, but aren't computer experts.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-1267772236928211031?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/1267772236928211031/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=1267772236928211031' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/1267772236928211031'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/1267772236928211031'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2011/04/darcs-for-my-wife.html' title='Darcs for my Wife'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/-phmT-5pT31k/TbqUQCE0zUI/AAAAAAAAAGI/KyBbQQxnrBQ/s72-c/temp.png' height='72' width='72'/><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-9135612148341174249</id><published>2011-03-20T23:08:00.002Z</published><updated>2011-03-20T23:15:24.659Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='emily'/><title type='text'>Experience Report: Functional Programming through Deep Time</title><content type='html'>My wife has just completed a draft of the experience report she's intending to submit to &lt;a 
href="http://www.icfpconference.org/icfp2011/"&gt;ICFP 2011&lt;/a&gt;. It's called &lt;i&gt;&lt;a href="http://community.haskell.org/~ndm/temp/EGMitchell-ExperienceReport.pdf"&gt;Functional Programming through Deep Time&lt;/a&gt;&lt;/i&gt;:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;This experience report describes how Haskell was used to model the beginnings of complex life on Earth. My work combines ecological modeling in Haskell with statistical analysis in R, to answer some long standing paleontological questions. For my work, I found that neither Haskell nor R was sufficient - statistical analysis in Haskell is overly burdensome, while R lacks the structure to express complex algorithms in a maintainable manner. The reaction from my colleagues has ranged from indifferent to excited - but I have yet to tempt any of them over to the pure side!&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;I initially persuaded my wife to switch to Haskell, but since then, I have had little involvement with her code. If you have any feedback for her (ideally before Wednesday!) 
please leave it in the comments.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-9135612148341174249?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/9135612148341174249/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=9135612148341174249' title='7 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/9135612148341174249'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/9135612148341174249'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2011/03/experience-report-functional.html' title='Experience Report: Functional Programming through Deep Time'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>7</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-4718557179401297128</id><published>2011-03-13T14:05:00.003Z</published><updated>2011-03-13T14:57:13.883Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='hoogle'/><title type='text'>Hoogle for your language (i.e. F#, Scala, ML, Clean...)</title><content type='html'>&lt;i&gt;Summary: If you offer to help, I'll make Hoogle search your statically typed functional language.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://haskell.org/hoogle/"&gt;Hoogle&lt;/a&gt; is a search engine for &lt;a href="http://haskell.org/"&gt;Haskell&lt;/a&gt; functions, that allows you to search by either name, or by type. 
But very little of Hoogle is actually Haskell specific - most is applicable to any language with a &lt;a href="http://en.wikipedia.org/wiki/Type_inference"&gt;Hindley-Milner&lt;/a&gt; based type system.&lt;br /&gt;&lt;br /&gt;Recently I have been asked by several people what they can do to allow Hoogle to search their preferred language. There are four steps to integrating a language with Hoogle, detailed below. If you are interested in helping please &lt;a href="http://community.haskell.org/~ndm/contact/"&gt;email me&lt;/a&gt; - I already have volunteers for both F# and Scala, but additional volunteers for other languages are welcome.&lt;br /&gt;&lt;br /&gt;To allow searching a language from Hoogle, there are four steps:&lt;br /&gt;&lt;br /&gt;1) A volunteer needs to generate some Hoogle input files containing details of the modules/functions/packages etc. to be searched. These files should be plain text, but can be in a language specific format - i.e. ML syntax for type signatures. For a rough idea of how these files could look see &lt;a href="http://hackage.haskell.org/packages/archive/cmdargs/0.6.5/doc/html/cmdargs.txt"&gt;this example&lt;/a&gt; - for Haskell I get these files from Hackage. The code to generate these input files can be written in any language, and can live outside Hoogle.&lt;br /&gt;&lt;br /&gt;2) Someone needs to write a parser that converts these language specific inputs into internal Hoogle representations. The equivalent code for Haskell is &lt;a href="http://code.haskell.org/hoogle/src/Hoogle/Language/Haskell.hs"&gt;in the Hoogle repo&lt;/a&gt;. If a volunteer writes this code, I'll happily use it. If I have to write this code then that's OK, although I might take a bit longer. This code needs to be written in Haskell and live inside Hoogle.&lt;br /&gt;&lt;br /&gt;At this stage, Hoogle will be able to search the new language. 
The remaining stages will just make the experience more pleasant.&lt;br /&gt;&lt;br /&gt;3) Someone needs to write a query parser for the language, inside Hoogle. I may do this, as I'm intending to rewrite the Haskell query parser anyway, and I could probably find some savings by doing them together. This code needs to be written in Haskell and live inside Hoogle.&lt;br /&gt;&lt;br /&gt;4) A volunteer would be useful to keep the function definitions up to date, generate new definitions, and ensure they get uploaded.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://community.haskell.org/~ndm/contact/"&gt;Email me&lt;/a&gt; if you want to volunteer!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-4718557179401297128?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/4718557179401297128/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=4718557179401297128' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/4718557179401297128'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/4718557179401297128'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2011/03/hoogle-for-your-language-ie-f-scala-ml.html' title='Hoogle for your language (i.e. 
F#, Scala, ML, Clean...)'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-9154789723564397050</id><published>2011-02-13T18:31:00.003Z</published><updated>2011-02-13T18:38:22.676Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='c#'/><title type='text'>Corner Cases And Zero (and how WinForms gets it wrong)</title><content type='html'>&lt;i&gt;Summary: Zero is a perfectly good number and functions should deal with it sensibly. In WinForms, both the Bitmap object and the DrawToBitmap function fail on zero, which is wrong. Functional programming (and recursion) make it harder to get the corner cases wrong.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;Lots of programming is about reusing existing functions/objects. Many types have natural corner cases, e.g. an empty string, the number zero, and an array with zero elements. 
If the functions you reuse don't deal sensibly with corner cases, your functions are likely to contain bugs, or be more verbose in working around other people's bugs.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;C#/WinForms has bugs with zero&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Let's write a function in &lt;a href="http://en.wikipedia.org/wiki/C_Sharp_(programming_language)"&gt;C#&lt;/a&gt;/&lt;a href="http://en.wikipedia.org/wiki/Windows_Forms"&gt;WinForms&lt;/a&gt; which, given a &lt;a href="http://msdn.microsoft.com/en-us/library/system.windows.forms.control.aspx"&gt;&lt;tt&gt;Control&lt;/tt&gt;&lt;/a&gt; (something that can be displayed), produces a &lt;a href="http://msdn.microsoft.com/en-us/library/system.drawing.bitmap.aspx"&gt;&lt;tt&gt;Bitmap&lt;/tt&gt;&lt;/a&gt; of how it will be drawn:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;public Bitmap Draw(Control c)&lt;br /&gt;{&lt;br /&gt;    Bitmap bmp = new Bitmap(c.Width, c.Height);&lt;br /&gt;    c.DrawToBitmap(bmp, new Rectangle(0, 0, c.Width, c.Height));&lt;br /&gt;    return bmp;&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Our function &lt;tt&gt;Draw&lt;/tt&gt; makes use of the existing WinForms function &lt;a href="http://msdn.microsoft.com/en-us/library/system.windows.forms.control.drawtobitmap.aspx"&gt;&lt;tt&gt;DrawToBitmap&lt;/tt&gt;&lt;/a&gt;:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;void Control.DrawToBitmap(Bitmap bitmap, Rectangle bounds)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The function &lt;tt&gt;DrawToBitmap&lt;/tt&gt; draws the control into &lt;tt&gt;bitmap&lt;/tt&gt; at the position specified by &lt;tt&gt;bounds&lt;/tt&gt;. This function is useful, but impure (it mutates the &lt;tt&gt;bitmap&lt;/tt&gt; argument), and somewhat fiddly (&lt;tt&gt;bounds&lt;/tt&gt; have to satisfy various invariants with respect to the &lt;tt&gt;bitmap&lt;/tt&gt; and the control). 
Our &lt;tt&gt;Draw&lt;/tt&gt; function only handles the common case where you want the entire bitmap, but is pure and simpler. (Our &lt;tt&gt;Draw&lt;/tt&gt; function can be renamed &lt;tt&gt;DrawToBitmap&lt;/tt&gt; and added as an &lt;a href="http://msdn.microsoft.com/en-us/library/bb383977.aspx"&gt;extension method&lt;/a&gt; of &lt;tt&gt;Control&lt;/tt&gt;, making it quite convenient to use.)&lt;br /&gt;&lt;br /&gt;Unfortunately our &lt;tt&gt;Draw&lt;/tt&gt; function has a bug, due to the incorrect handling of zero in the functions we rely on. Let's consider a control with width 0, and height 10. First we crash with the exception "Parameter is not valid." when executing:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;new Bitmap(0, 10);&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Unfortunately the .NET &lt;tt&gt;Bitmap&lt;/tt&gt; type doesn't allow bitmaps which don't contain any pixels. This limitation probably comes from the &lt;a href="http://msdn.microsoft.com/en-us/library/dd183485(v=vs.85).aspx"&gt;&lt;tt&gt;CreateBitmap&lt;/tt&gt;&lt;/a&gt; Win32 API function, which doesn't allow empty bitmaps. The result is that our function cannot return a 0x10 bitmap, meaning that lots of nice properties (e.g. the resulting bitmap will be the same size as the control) are necessarily violated. We can patch around the limitations of &lt;tt&gt;Bitmap&lt;/tt&gt; by writing:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;Bitmap bmp = new Bitmap(Math.Max(1, c.Width), Math.Max(1, c.Height));&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;This change is horrid, but it's the best we can do within the limitations of the .NET &lt;tt&gt;Bitmap&lt;/tt&gt; type. 
We run again and now get the exception "targetbounds" when executing:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;c.DrawToBitmap(bmp, new Rectangle(0, 0, 0, 10));&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Unfortunately &lt;tt&gt;DrawToBitmap&lt;/tt&gt; throws an exception when either the width or height of the bounds is zero. We have to add another workaround to avoid calling &lt;tt&gt;DrawToBitmap&lt;/tt&gt; in these cases (or at this stage perhaps just add an &lt;tt&gt;if&lt;/tt&gt; at the top which returns early if either dimension is 0). The &lt;tt&gt;Bitmap&lt;/tt&gt; limitation is annoying but somewhat understandable - it is driven by legacy code. However, &lt;tt&gt;DrawToBitmap&lt;/tt&gt; could easily have been modified to accept 0 width or height and simply avoid doing anything, which would be the only sensible behaviour at this corner case.&lt;br /&gt;&lt;br /&gt;The problem with bugs in corner cases is that they propagate. &lt;tt&gt;Bitmap&lt;/tt&gt; has a limitation, so everything which uses &lt;tt&gt;Bitmap&lt;/tt&gt; inherits this limitation. The &lt;tt&gt;DrawToBitmap&lt;/tt&gt; function has a bug, so everything built on top of it has a bug (or needs to include a workaround). The documentation for &lt;tt&gt;Bitmap&lt;/tt&gt; and &lt;tt&gt;DrawToBitmap&lt;/tt&gt; doesn't mention that they fail at corner cases, which is unfortunate.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Induction for Corner Cases&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;One of the advantages of functional programming is that defining functions recursively forces you to consider corner cases. Consider the &lt;a href="http://haskell.org/"&gt;Haskell&lt;/a&gt; function &lt;a href="http://hackage.haskell.org/packages/archive/base/latest/doc/html/Prelude.html#v:replicate"&gt;&lt;tt&gt;replicate&lt;/tt&gt;&lt;/a&gt;, which takes a number and a value, and repeats the value that number of times. To define the function it is natural to use recursion over the number. 
This scheme leads to the definition:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;replicate 0 x = []&lt;br /&gt;replicate n x = x : replicate (n-1) x&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The function &lt;tt&gt;replicate 0 'x'&lt;/tt&gt; returns &lt;tt&gt;[]&lt;/tt&gt;. To get the corner case wrong would have required additional effort. As a result, most Haskell functions work the way you would expect in corner cases - and consequently functions built from them also work sensibly in corner cases. When programming in Haskell my code is less likely to fail in corner cases, and  more likely to work first time.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Exercise&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;As a final thought exercise, consider the following function:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;orAnd :: [[Bool]] -&gt; Bool&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Where &lt;tt&gt;orAnd [[a,b],[c,d]]&lt;/tt&gt; returns &lt;tt&gt;True&lt;/tt&gt; if either both &lt;tt&gt;a&lt;/tt&gt; and &lt;tt&gt;b&lt;/tt&gt; are &lt;tt&gt;True&lt;/tt&gt;, or if both &lt;tt&gt;c&lt;/tt&gt; and &lt;tt&gt;d&lt;/tt&gt; are &lt;tt&gt;True&lt;/tt&gt;. What should &lt;tt&gt;[]&lt;/tt&gt; return? What should &lt;tt&gt;[[]]&lt;/tt&gt; return? If you write this function recursively (or on top of other recursive functions such as &lt;tt&gt;or&lt;/tt&gt;/&lt;tt&gt;and&lt;/tt&gt;) there will be a natural answer. 
Writing the function imperatively makes it hard to ensure the corner cases are correct.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-9154789723564397050?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/9154789723564397050/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=9154789723564397050' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/9154789723564397050'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/9154789723564397050'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2011/02/corner-cases-and-zero-and-how-winforms.html' title='Corner Cases And Zero (and how WinForms gets it wrong)'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-6606312841370270143</id><published>2011-01-23T09:35:00.010Z</published><updated>2011-01-23T19:01:13.386Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='hoogle'/><title type='text'>Hoogle Embed</title><content type='html'>&lt;i&gt;Summary: Hoogle Embed lets you include a small interactive Hoogle search box on your web page.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;I have just released Hoogle 4.2, which adds the feature Hoogle Embed, letting you embed a small Hoogle powered search box on any web page. 
For an example, visit &lt;a href="http://community.haskell.org/~ndm/hoogle/"&gt;the Hoogle page on my website&lt;/a&gt;, and try typing "database" in the search box on the right. You should see:&lt;br /&gt;&lt;br /&gt;&lt;img src="http://1.bp.blogspot.com/_EUIHJJkVM3M/TTv5dCZvO0I/AAAAAAAAAF8/c-QpWSxQ41E/s1600/hoogle-embed-search.png" /&gt;&lt;br /&gt;&lt;br /&gt;As you type, the search box will perform Hoogle searches on the Hoogle API, and display the results. Selecting a result will visit the associated documentation. Pressing the Search button will perform the search at &lt;a href="http://haskell.org/hoogle/"&gt;the Hoogle website&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;Hoogle Embed has been tested in Chrome, Firefox and IE. &lt;strike&gt;Using IE&lt;/strike&gt; Using IE 7 or below you may not see results unless the page being displayed is on the same server as the Hoogle instance (i.e. haskell.org), due to restrictions on cross-domain AJAX requests. This limitation can probably be overcome with additional work, if people are interested.&lt;/li&gt; &lt;br /&gt;&lt;li&gt;Hoogle Embed degrades gracefully if the browser does not support JavaScript, leaving just the search box.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Configuration options allow you to automatically add a prefix or suffix to the user's search, for example adding &lt;tt&gt;+hoogle&lt;/tt&gt; to search only the Hoogle API.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;This feature works with either a custom Hoogle instance, or the standard version on haskell.org.&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Using Hoogle Embed in your web page&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;To include Hoogle Embed on a web page, simply add the following piece of HTML:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;&amp;lt;script type="text/javascript" src="http://haskell.org/hoogle/datadir/resources/jquery-1.4.2.js"&amp;gt;&amp;lt;/script&amp;gt;&lt;br /&gt;&amp;lt;script type="text/javascript" 
src="http://haskell.org/hoogle/datadir/resources/hoogle.js"&amp;gt;&amp;lt;/script&amp;gt;&lt;br /&gt;&amp;lt;form action="http://haskell.org/hoogle/" method="get"&amp;gt;&lt;br /&gt;  &amp;lt;input type="text"   name="hoogle" id="hoogle" accesskey="1" /&amp;gt;&lt;br /&gt;  &amp;lt;input type="hidden" name="prefix" value="+base" /&amp;gt;&lt;br /&gt;  &amp;lt;input type="submit" value="Search" /&amp;gt;&lt;br /&gt;&amp;lt;/form&amp;gt;&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;To use a different Hoogle server change the &lt;tt&gt;action&lt;/tt&gt; field of the &lt;tt&gt;form&lt;/tt&gt;. To specify a prefix/suffix for all searches add an &lt;tt&gt;input&lt;/tt&gt; field with the name &lt;tt&gt;prefix&lt;/tt&gt;/&lt;tt&gt;suffix&lt;/tt&gt;. For example, the above snippet only searches the base package. By eliminating the &lt;tt&gt;prefix&lt;/tt&gt; line it will search using the default Hoogle settings (the Haskell platform).&lt;br /&gt;&lt;br /&gt;The Hoogle Embed feature is usable on any web page, but I think it would be particularly effective on pages such as &lt;a href="http://hackage.haskell.org/package/hoogle"&gt;the Hackage page for a package&lt;/a&gt;, or for any &lt;a href="http://hackage.haskell.org/packages/archive/hoogle/latest/doc/html/Hoogle.html"&gt;Haddock documentation&lt;/a&gt; (perhaps when using a flag such as &lt;tt&gt;--hoogle-embed&lt;/tt&gt;). I encourage anyone who is interested to submit patches to the relevant projects.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Hoogle Manual&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;I am currently considering the issue of documentation, and would welcome other people's thoughts. Currently the Hoogle manual is hosted on the &lt;a href="http://www.haskell.org/haskellwiki/Hoogle"&gt;Haskell Wiki&lt;/a&gt;, but is somewhat out of date. For all other packages, I tend to write an HTML manual stored in the darcs repo, such as for &lt;a href="http://community.haskell.org/~ndm/darcs/hlint/hlint.htm"&gt;hlint&lt;/a&gt;. 
There are advantages to both formats - the wiki can be easily edited by many people, but the darcs manual can be updated simultaneously with the code and is available offline (most Hoogle work is done on a train without internet access, so this issue is very relevant). My current thought is to remove the wiki page and move its contents into darcs.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Edit&lt;/i&gt;: Fixed the JavaScript links.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Edit 2&lt;/i&gt;: Hoogle Embed now works cross domain in IE 8 and above.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-6606312841370270143?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/6606312841370270143/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=6606312841370270143' title='12 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/6606312841370270143'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/6606312841370270143'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2011/01/hoogle-embed.html' title='Hoogle Embed'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_EUIHJJkVM3M/TTv5dCZvO0I/AAAAAAAAAF8/c-QpWSxQ41E/s72-c/hoogle-embed-search.png' height='72' 
width='72'/><thr:total>12</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-1625503927211787054</id><published>2011-01-16T19:19:00.004Z</published><updated>2011-01-16T21:03:35.915Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='hoogle'/><title type='text'>Hoogle At 1.7 Million Searches</title><content type='html'>&lt;i&gt;Summary: I detail some of the changes in today's Hoogle release; I outline future plans for Hoogle; I plot some statistics, noting that Hoogle has been used for 1.7 million searches.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;New Features&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://haskell.org/hoogle"&gt;Hoogle&lt;/a&gt; is a Haskell API search engine, which I've been working on since 2004. Today I updated the web version at &lt;a href="http://haskell.org/hoogle"&gt;http://haskell.org/hoogle&lt;/a&gt;, and uploaded a new version to &lt;a href="http://hackage.haskell.org/package/hoogle"&gt;Hackage&lt;/a&gt;. Since my &lt;a href="http://neilmitchell.blogspot.com/2010/12/new-version-of-hoogle-41.html"&gt;last release&lt;/a&gt; I've been working on several new features:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;Instant Search - in the top right corner you will see a link entitled "Instant is off". Click that link to turn instant search on, and then searches will be performed as you type. This feature is still experimental, but I already rely on it for my searches.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Visual Refresh - I've modified the style and layout, trying to improve the general feel, especially when used in conjunction with instant search. I added links to make the package filtering options more accessible and I collapse identical results available from different modules.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Database Update - thanks to Ian Lynagh and Ross Paterson I've been able to update all the Hoogle databases, including for the base package. 
In the future I hope to keep Hoogle continuously updated.&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Future Plans&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;There are three big improvements I plan to make to Hoogle:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;Better Data - Hoogle relies on data from &lt;a href="http://www.haskell.org/haddock/"&gt;Haddock&lt;/a&gt;, uploaded to &lt;a href="http://hackage.haskell.org/packages/hackage.html"&gt;Hackage&lt;/a&gt;, by &lt;a href="http://www.haskell.org/cabal/"&gt;Cabal&lt;/a&gt;. This release improves the data, but there are still further improvements to be made. There are several bugs in Haddock, and data for the base library is not yet available on Hackage. If Hoogle could integrate more closely with Cabal then Hoogle could search users' local packages. Many of these problems require coordination between several projects, and any offers of help would be welcome. Some bugs: &lt;a href="http://code.google.com/p/ndmitchell/issues/detail?id=80"&gt;#80&lt;/a&gt;, &lt;a href="http://code.google.com/p/ndmitchell/issues/detail?id=11"&gt;#11&lt;/a&gt;, &lt;a href="http://code.google.com/p/ndmitchell/issues/detail?id=339"&gt;#339&lt;/a&gt;, &lt;a href="http://code.google.com/p/ndmitchell/issues/detail?id=60"&gt;#60&lt;/a&gt;, &lt;a href="http://code.google.com/p/ndmitchell/issues/detail?id=80"&gt;#183&lt;/a&gt;.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Better Search Syntax - the search syntax in Hoogle is acceptable, but isn't that close to other search engines, doesn't always mesh well with instant search and has a number of bugs. I intend to overhaul the search syntax, hopefully improving the feel of Hoogle. 
Some bugs: &lt;a href="http://code.google.com/p/ndmitchell/issues/detail?id=34"&gt;#34&lt;/a&gt;, &lt;a href="http://code.google.com/p/ndmitchell/issues/detail?id=61"&gt;#61&lt;/a&gt;, &lt;a href="http://code.google.com/p/ndmitchell/issues/detail?id=398"&gt;#398&lt;/a&gt;, &lt;a href="http://code.google.com/p/ndmitchell/issues/detail?id=130"&gt;#130&lt;/a&gt;.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Improved Database/Searching - type search needs to execute faster and give better results. By executing faster Hoogle will be able to search the whole of Hackage at once. Hoogle has gone through four entirely different iterations of type search already, and I have a design for the fifth version. Some bugs: &lt;a href="http://code.google.com/p/ndmitchell/issues/detail?id=79"&gt;#79&lt;/a&gt;, &lt;a href="http://code.google.com/p/ndmitchell/issues/detail?id=324"&gt;#324&lt;/a&gt;, &lt;a href="http://code.google.com/p/ndmitchell/issues/detail?id=30"&gt;#30&lt;/a&gt;, &lt;a href="http://code.google.com/p/ndmitchell/issues/detail?id=336"&gt;#336&lt;/a&gt;, &lt;a href="http://code.google.com/p/ndmitchell/issues/detail?id=381"&gt;#381&lt;/a&gt;.&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Statistics&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Hoogle logs the date and contents of each search, but stores no personally identifiable information. These statistics all relate to the number of searches made, not including blank searches or suggestions offered by the Firefox search plugin. In the time between adding a logging facility, and making it log the date (2009-Apr-24), there were 631930 searches. Since then there have been at least 1012414 searches, not taking into account about a month where logging was disabled. 
Generally, searches range between 1000 and 2500 a day.&lt;br /&gt;&lt;br /&gt;&lt;img src="http://4.bp.blogspot.com/_EUIHJJkVM3M/TTNFYlZsjQI/AAAAAAAAAF0/aCNXQDOkwPw/s1600/searches_per_day.png" &gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-1625503927211787054?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/1625503927211787054/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=1625503927211787054' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/1625503927211787054'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/1625503927211787054'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2011/01/hoogle-at-17-million-searches.html' title='Hoogle At 1.7 Million Searches'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_EUIHJJkVM3M/TTNFYlZsjQI/AAAAAAAAAF0/aCNXQDOkwPw/s72-c/searches_per_day.png' height='72' width='72'/><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-521882933081528192</id><published>2010-12-19T16:01:00.003Z</published><updated>2010-12-19T16:15:47.634Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='hoogle'/><title type='text'>New Version of Hoogle (4.1)</title><content type='html'>I've just released a new version of Hoogle to both &lt;a 
href="http://hackage.haskell.org/package/hoogle"&gt;Hackage&lt;/a&gt; and to &lt;a href="http://haskell.org/hoogle"&gt;haskell.org/hoogle&lt;/a&gt;. Hoogle is a Haskell search engine that allows you to search for functions by either name or approximate type signature.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;What's New&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;The first release in over a year, building with up to date packages and compilers.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Up to date library definitions for all of &lt;a href="http://hackage.haskell.org/"&gt;Hackage&lt;/a&gt;.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Searches the &lt;a href="http://hackage.haskell.org/platform/"&gt;Haskell Platform&lt;/a&gt; by default.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Significant improvements when installing Hoogle locally.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Lots of additional small improvements.&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Searching all of Hackage&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Hoogle can now search all of Hackage. By default it will search the Haskell Platform, but you can search additional packages using &lt;tt&gt;+package-name&lt;/tt&gt;, for example &lt;a href="http://haskell.org/hoogle/?hoogle=%2Btagsoup+Tag+a+-%3E+Bool"&gt;+tagsoup Tag a -&gt; Bool&lt;/a&gt;. You can search both the platform and additional packages by including &lt;tt&gt;+default&lt;/tt&gt;, for example &lt;a href="http://haskell.org/hoogle/?hoogle=([a]+-%3E+(b,+[a]))+-%3E+[a]+-%3E+[b]+%2Bsplit+%2Bdefault"&gt;([a] -&gt; (b, [a])) -&gt; [a] -&gt; [b] +split +default&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;I'm still not sure what should be searched by default, and which collections of modules should be available, but I'm open to suggestions.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Installing Hoogle Locally&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Many of the improvements to Hoogle are of specific benefit when installing Hoogle yourself, not using the web version. 
To install Hoogle:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;cabal update&lt;br /&gt;cabal install hoogle&lt;br /&gt;hoogle data&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The last step will download information and generate databases for the Haskell Platform. You can then run searches, such as &lt;tt&gt;hoogle filter -n10&lt;/tt&gt;. Hoogle now uses &lt;a href="http://hackage.haskell.org/package/cmdargs"&gt;cmdargs&lt;/a&gt;, so &lt;tt&gt;hoogle --help&lt;/tt&gt; will detail some of the options available.&lt;br /&gt;&lt;br /&gt;You can also run Hoogle as a web server by typing &lt;tt&gt;hoogle server&lt;/tt&gt;. Now visit &lt;tt&gt;localhost&lt;/tt&gt; in a web browser and you'll have the power of Hoogle on your computer. If you often work offline, you can run &lt;tt&gt;hoogle data --local&lt;/tt&gt; and &lt;tt&gt;hoogle server --local&lt;/tt&gt; to use documentation on your local machine where available.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;What Now?&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Hoogle is currently the spare time project I'm focusing on - there are lots of improvements I am intending to make. Hoogle 4.1 is about getting the code up to a standard that can be easily maintained, allowing future versions to deliver more features. 
Please try out Hoogle, &lt;a href="http://code.google.com/p/ndmitchell/issues/list?q=proj=Hoogle"&gt;report any bugs you find&lt;/a&gt;, and let me know your thoughts.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-521882933081528192?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/521882933081528192/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=521882933081528192' title='7 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/521882933081528192'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/521882933081528192'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2010/12/new-version-of-hoogle-41.html' title='New Version of Hoogle (4.1)'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>7</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-8386658137883385952</id><published>2010-12-12T10:31:00.007Z</published><updated>2011-08-12T10:47:35.632+01:00</updated><title type='text'>Installing the Haskell network library on Windows</title><content type='html'>&lt;i&gt;Summary: This post describes how to install the Haskell network library on Windows.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;The &lt;a href="http://hackage.haskell.org/package/network"&gt;network library&lt;/a&gt; used to be bundled with &lt;a href="http://haskell.org/ghc/"&gt;GHC&lt;/a&gt;. 
Unfortunately it was unbundled from the standard installer (in my opinion, a mistake), and people were required to either install it using &lt;a href="http://www.haskell.org/cabal/"&gt;Cabal&lt;/a&gt; (rather tricky) or switch to using the &lt;a href="http://hackage.haskell.org/platform/"&gt;Haskell Platform&lt;/a&gt; (which is not suitable for power developers who want to test with newer/older GHC versions, or those who want to upgrade/downgrade network). This post describes how to install the library on Windows using Cabal.&lt;br /&gt;&lt;br /&gt;First, install &lt;a href="http://cygwin.com/"&gt;Cygwin&lt;/a&gt;. I selected lots of options (configure tools, auto conf, compilers) - but I'm not sure which are necessary.&lt;br /&gt;&lt;br /&gt;Then start a Cygwin window and run:&lt;br /&gt;&lt;br /&gt;&lt;tt&gt;WHICHGHC=`which ghc` &amp;&amp; PATH=`dirname $WHICHGHC`/../mingw/bin:$PATH &amp;&amp; cabal install network --configure-option --host=i386-unknown-mingw32 --global --enable-library-profiling&lt;br /&gt;&lt;/tt&gt;&lt;br /&gt;&lt;br /&gt;This configures network to use mingw32, and sets up the PATH so that mingw binaries from GHC are first, and get configured in. You can then use the network library as normal, without ever using Cygwin again. I have successfully executed this command on GHC 6.12.3 and 7.0.1.&lt;br /&gt;&lt;br /&gt;Dependencies on foreign libraries are a problem for any cross platform language. Several years ago, installing Haskell libraries was a painful process - but &lt;a href="http://hackage.haskell.org/"&gt;Hackage&lt;/a&gt; and Cabal have made it impressively easy. The only remaining packages that are complex to install are those that bind to foreign libraries, most of which use configure scripts, which are not well suited to Windows. I hope over time even libraries with foreign dependencies will become easy to install.
&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Update:&lt;/b&gt; Ivan Perez suggests the alternative form:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;WHICHGHC=`which ghc` &amp;&amp; PATH=`dirname $WHICHGHC`/../mingw/bin:$PATH &amp;&amp; cabal install network --configure-option --build=i386-unknown-mingw32 --configure-option --host=i686-pc-cygwin --global --enable-library-profiling&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-8386658137883385952?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/8386658137883385952/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=8386658137883385952' title='15 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/8386658137883385952'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/8386658137883385952'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2010/12/installing-haskell-network-library-on.html' title='Installing the Haskell network library on Windows'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>15</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-700723260117425612</id><published>2010-10-14T19:57:00.002+01:00</published><updated>2010-10-14T20:03:35.414+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='neil'/><title type='text'>Enhanced Cabal SDist</title><content type='html'>&lt;i&gt;Summary&lt;/i&gt;: I 
wrote a little script on top of cabal sdist that checks for common mistakes.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.haskell.org/cabal/"&gt;Cabal&lt;/a&gt; is the standard way to distribute Haskell packages, and has significantly simplified the process of installing libraries. However, I often make mistakes when creating cabal distributions using &lt;tt&gt;sdist&lt;/tt&gt;. One common problem is that if you don't include a file in your .cabal file it will not be put in the distribution package and the package will fail to build for other people, but if the file is available locally it will still work for you. To work around this problem, I often perform a &lt;tt&gt;cabal install&lt;/tt&gt; immediately after I upload a cabal package. Last week I discovered a package that I'd requested to be uploaded had suffered the same fate (&lt;a href="http://hackage.haskell.org/package/hws"&gt;hws&lt;/a&gt; – from the fantastic Haskell Workshop paper on writing a scalable web server in Haskell), showing this mistake isn't specific to me.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.nltg.brighton.ac.uk/home/Eric.Kow/"&gt;Eric Kow&lt;/a&gt; often says that programmers spend too little time writing tools to help their workflow. In response to this observation I have created the "neil" tool, available in darcs from &lt;tt&gt;http://community.haskell.org/~ndm/darcs/neil&lt;/tt&gt; - darcs get and then cabal install. One feature of this neil tool is &lt;tt&gt;neil sdist&lt;/tt&gt;. 
The actions &lt;tt&gt;neil sdist&lt;/tt&gt; performs are:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;&lt;tt&gt;cabal check&lt;/tt&gt;&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;tt&gt;cabal configure&lt;/tt&gt; to a temporary directory&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;tt&gt;cabal sdist&lt;/tt&gt; to a temporary directory&lt;/li&gt;&lt;br /&gt;&lt;li&gt;change to that temporary directory&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;tt&gt;cabal configure -Werror -fwarn-unused-imports&lt;/tt&gt;&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;tt&gt;cabal build&lt;/tt&gt;&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;tt&gt;cabal haddock --executables&lt;/tt&gt;&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;The intention is that this sequence of actions will fail if anything bad would happen on a user's machine. If the package fails to build then &lt;tt&gt;cabal build&lt;/tt&gt; will fail. If the haddock fails to generate then &lt;tt&gt;cabal haddock&lt;/tt&gt; will fail. If there would be warnings during build then &lt;tt&gt;-Werror&lt;/tt&gt; will cause it to fail. If all these checks pass then the package is likely to install without problems (once the &lt;tt&gt;cabal test&lt;/tt&gt; feature is present then that can be included, which will give even more assurance).&lt;br /&gt;&lt;br /&gt;I have deliberately not uploaded the neil tool to &lt;a href="http://hackage.haskell.org/packages/hackage.html"&gt;Hackage&lt;/a&gt;, and have no intention of doing so. The name "neil" is not suitable for Hackage, and I intend to add specific actions to help my workflow. If anyone wants to take any of the actions included in this tool and break them out into separate tools, roll them back into cabal itself etc, they are very welcome.&lt;br /&gt;&lt;br /&gt;Taking Eric's advice, I have found that by automating aspects of my workflow I can take steps which were error prone, or verbose, and make them concise and accurate. 
With &lt;tt&gt;neil sdist&lt;/tt&gt; I have eliminated an entire class of potential errors, with very little ongoing work.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-700723260117425612?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/700723260117425612/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=700723260117425612' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/700723260117425612'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/700723260117425612'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2010/10/enhanced-cabal-sdist.html' title='Enhanced Cabal SDist'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-6313110452701919984</id><published>2010-09-18T21:56:00.002+01:00</published><updated>2010-09-18T22:05:23.072+01:00</updated><title type='text'>Three Closed GHC Bugs I Wish Were Open</title><content type='html'>&lt;i&gt;Summary&lt;/i&gt;: I want three changes to the Haskell standard libraries. 
&lt;a href="http://hackage.haskell.org/trac/ghc/ticket/1590"&gt;System.Info.isWindows&lt;/a&gt; should be added, &lt;a href="http://hackage.haskell.org/trac/ghc/ticket/2042"&gt;Control.Monad.concatMapM&lt;/a&gt; should be added, and &lt;a href="http://hackage.haskell.org/trac/ghc/ticket/3159"&gt;Control.Concurrent.QSem&lt;/a&gt; should work with negative quantities.&lt;br /&gt;&lt;br /&gt;Over the last few hours I've been going through my inbox, trying to deal with some of my older emails. In that process, I've had to admit defeat on three GHC bugs that I'd left in my inbox to come back to. All these bugs relate to changes to the Haskell standard libraries that were opened as bugs and then resolved as closed/wontfix. I will never get time to tackle these bugs, but perhaps someone will? The bugs are:&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Add System.Info.isWindows - &lt;a href="http://hackage.haskell.org/trac/ghc/ticket/1590"&gt;bug 1590&lt;/a&gt;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;module System.Info where&lt;br /&gt;&lt;br /&gt;-- | Check if the operating system is a Windows derivative. Returns True on&lt;br /&gt;--   all Windows systems (Win95, Win98 ... Vista, Win7), and False on all others&lt;br /&gt;isWindows :: Bool&lt;br /&gt;isWindows = os == "mingw32"&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Currently the recognised way to test at runtime if your application is being run on Windows is:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;import System.Info&lt;br /&gt;&lt;br /&gt;.... 
 = os == "mingw32"&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;This is wrong for many reasons:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;The return result of &lt;tt&gt;os&lt;/tt&gt; is &lt;i&gt;not&lt;/i&gt; an operating system, but a ported toolchain.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;The result &lt;tt&gt;"mingw32"&lt;/tt&gt; does not imply that MinGW is installed on the computer.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;String comparisons are unsafe and unchecked; a simple typo breaks this code.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;In GHC this comparison will take place at runtime, even though the result is a constant.&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;The Haskell abstractions and command line tools for almost all non-Windows operating systems have converged to the point where most programs with operating system specific behaviour have two cases - one for Windows and one for everything else. It makes sense to directly support what is probably the most common usage of the &lt;tt&gt;os&lt;/tt&gt; function, and to encourage people away from the C preprocessor where possible.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Add Control.Monad.concatMapM - &lt;a href="http://hackage.haskell.org/trac/ghc/ticket/2042"&gt;bug 2042&lt;/a&gt;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;module Control.Monad where&lt;br /&gt;&lt;br /&gt;-- | The 'concatMapM' function generalizes 'concatMap' to arbitrary monads.&lt;br /&gt;concatMapM        :: (Monad m) =&gt; (a -&gt; m [b]) -&gt; [a] -&gt; m [b]&lt;br /&gt;concatMapM f xs   =  liftM concat (mapM f xs)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;I've personally defined this function in &lt;a href="http://hackage.haskell.org/package/binarydefer"&gt;binarydefer&lt;/a&gt;, &lt;a href="http://community.haskell.org/~ndm/catch/"&gt;catch&lt;/a&gt;, &lt;a href="http://hackage.haskell.org/package/derive"&gt;derive&lt;/a&gt;, &lt;a href="http://hackage.haskell.org/package/hlint"&gt;hlint&lt;/a&gt;, &lt;a 
href="http://hackage.haskell.org/package/hoogle"&gt;hoogle&lt;/a&gt;, my &lt;a href="http://community.haskell.org/~ndm/darcs/website/"&gt;website generator&lt;/a&gt; and &lt;a href="http://community.haskell.org/~ndm/yhc/"&gt;yhc&lt;/a&gt;. There's even a copy in &lt;a href="http://darcs.haskell.org/ghc/compiler/utils/MonadUtils.hs"&gt;GHC&lt;/a&gt;. If a function has been defined identically that many times, it clearly deserves to be in the standard library. We have mapM, filterM, zipWithM, but concatMapM is suspiciously absent.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Make Control.Concurrent.QSem work with negatives - &lt;a href="http://hackage.haskell.org/trac/ghc/ticket/3159"&gt;bug 3159&lt;/a&gt;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The &lt;tt&gt;QSem&lt;/tt&gt; module defines a quantity semaphore, where the quantities must be natural numbers. Attempts to construct semaphores with negative numbers raise an error. There is, however, a perfectly sensible and obvious interpretation if negative numbers are allowed. It is a shame that this module could provide total functions, which never raise an error, but does not. In addition, for some problems the use of negative quantity semaphores is more natural.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;What Now?&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;I've removed all these bugs from my inbox, and invite someone else to take up the cause - I just don't have the time. Until these issues are resolved, I will test for Windows in a horrible way, define concatMapM whenever I start a new project, and lament the lack of generality in QSem. None of the issues is particularly serious, but all are slightly annoying.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Email etiquette&lt;/i&gt;: Today I've cleared about 50 emails from my inbox. Unfortunately my inbox remains big and unwieldy. If you ever email me, and I don't get back to you, email me again a week later. 
As long as you reply to the first message, Gmail will collapse the reply into the original conversation, and there won't be any additional load on my inbox - it will just remind me that I should have dealt with your email. I apologise for any emails that have fallen through the cracks.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-6313110452701919984?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/6313110452701919984/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=6313110452701919984' title='10 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/6313110452701919984'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/6313110452701919984'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2010/09/three-closed-ghc-bugs-i-wish-were-open.html' title='Three Closed GHC Bugs I Wish Were Open'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>10</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-1394092189701797642</id><published>2010-08-23T19:47:00.005+01:00</published><updated>2010-08-24T13:31:58.937+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='cmdargs'/><title type='text'>CmdArgs Example</title><content type='html'>&lt;i&gt;Summary: A simple CmdArgs parser is incredibly simple (just a data type). 
A more advanced CmdArgs parser is still pretty simple (a few annotations). People shouldn't be using getArgs even for quick one-off programs.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;A few days ago I went to the &lt;a href="http://www.meetup.com/hoodlums/"&gt;Haskell Hoodlums&lt;/a&gt; meeting - an event in London aimed towards people learning Haskell. The event was good fun, and very well organised - professional quality Haskell tuition for free by friendly people - the Haskell community really is one of Haskell's strengths! The final exercise was to write a program that picks a random number between 1 and 100, then has the user take guesses, with higher/lower hints. After writing the program, Ganesh suggested adding command line flags to control the minimum/maximum numbers. It's not too hard to do this directly with getArgs:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;import System.Environment&lt;br /&gt;&lt;br /&gt;main = do&lt;br /&gt;    [min,max] &amp;lt;- getArgs&lt;br /&gt;    print (read min :: Int, read max :: Int)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Next we discussed adding an optional limit on the number of guesses the user is allowed. It's certainly possible to extend the getArgs variant to take in a limit, but things are starting to get a bit ugly. If the user enters the wrong number of arguments they get a pattern match error. There is no help message to inform the user which flags the program takes. While getArgs is simple to start with, it doesn't have much flexibility, and handles errors very poorly. However, for years I used getArgs for all one-off programs - I found the other command line parsing libraries (including GetOpt) added too much overhead, and always required referring back to the documentation. 
To solve this problem I wrote &lt;a href="http://community.haskell.org/~ndm/cmdargs/"&gt;CmdArgs&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;A Simple CmdArgs Parser&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;To start using CmdArgs we first define a record to capture the information we want from the command line:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;data Guess = Guess {min :: Int, max :: Int, limit :: Maybe Int} deriving (Data,Typeable,Show)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;For our number guessing program we need a minimum, a maximum, and an optional limit. The deriving clause is required to operate with the CmdArgs library, and provides some basic reflection capabilities for this data type. Once we've written this data type, a CmdArgs parser is only one function call away:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;{-# LANGUAGE DeriveDataTypeable #-}&lt;br /&gt;import System.Console.CmdArgs&lt;br /&gt;&lt;br /&gt;data Guess = Guess {min :: Int, max :: Int, limit :: Maybe Int} deriving (Data,Typeable,Show)&lt;br /&gt;&lt;br /&gt;main = do&lt;br /&gt;    x &amp;lt;- cmdArgs $ Guess 1 100 Nothing&lt;br /&gt;    print x&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Now we have a simple command line parser. Some sample interactions are:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;$ guess --min=10&lt;br /&gt;Guess {min = 10, max = 100, limit = Nothing}&lt;br /&gt;&lt;br /&gt;$ guess --min=10 --max=1000&lt;br /&gt;Guess {min = 10, max = 1000, limit = Nothing}&lt;br /&gt;&lt;br /&gt;$ guess --limit=5&lt;br /&gt;Guess {min = 1, max = 100, limit = Just 5}&lt;br /&gt;&lt;br /&gt;$ guess --help&lt;br /&gt;The guess program&lt;br /&gt;&lt;br /&gt;guess [OPTIONS]&lt;br /&gt;&lt;br /&gt;  -? 
--help       Display help message&lt;br /&gt;  -V --version    Print version information&lt;br /&gt;     --min=INT&lt;br /&gt;     --max=INT&lt;br /&gt;  -l --limit=INT&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Adding Features to CmdArgs&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Our simple CmdArgs parser is probably sufficient for this task. I doubt anyone will be persuaded to use my guessing program without a fancy iPhone interface. However, CmdArgs provides all the power necessary to customise the parser, by adding annotations to the input value. First, we can modify the parser to make it easier to add our annotations:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;guess = cmdArgsMode $ Guess {min = 1, max = 100, limit = Nothing}&lt;br /&gt;&lt;br /&gt;main = do&lt;br /&gt;    x &amp;lt;- cmdArgsRun guess&lt;br /&gt;    print x&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;We have changed Guess to use record syntax for constructing the values, which helps document what we are doing. We've also switched to using cmdArgsMode/cmdArgsRun (cmdArgs is just a composition of those two functions) - this helps avoid any problems with capturing the annotations when running repeatedly in GHCi. Now we can add annotations to the guess value:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;guess = cmdArgsMode $ Guess&lt;br /&gt;    {min = 1   &amp;= argPos 0 &amp;= typ "MIN"&lt;br /&gt;    ,max = 100 &amp;= argPos 1 &amp;= typ "MAX"&lt;br /&gt;    ,limit = Nothing &amp;= name "n" &amp;= help "Limit the number of choices"}&lt;br /&gt;    &amp;= summary "Neil's awesome guessing program"&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Here we've specified that min/max must be at argument position 0/1, which more closely matches the original getArgs parser - this means the user is always forced to enter a min/max (they could be made optional with the opt annotation). 
For the limit we've added a name annotation to say that we'd like the flag -n to map to limit, instead of using the default -l. We've also given limit some help text, which will be displayed with --help. Finally, we've given a different summary line to the program.&lt;br /&gt;&lt;br /&gt;We can now interact with our new parser:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;$ guess&lt;br /&gt;Requires at least 2 arguments, got 0&lt;br /&gt;&lt;br /&gt;$ guess 1 100&lt;br /&gt;Guess {min = 1, max = 100, limit = Nothing}&lt;br /&gt;&lt;br /&gt;$ guess 1 100 -n4&lt;br /&gt;Guess {min = 1, max = 100, limit = Just 4}&lt;br /&gt;&lt;br /&gt;$ guess -?&lt;br /&gt;Neil's awesome guessing program&lt;br /&gt;&lt;br /&gt;guess [OPTIONS] MIN MAX&lt;br /&gt;&lt;br /&gt;  -? --help       Display help message&lt;br /&gt;  -V --version    Print version information&lt;br /&gt;  -n --limit=INT  Limit the number of choices&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;The Complete Program&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;For completeness, here is the complete program. 
I think for this program the most suitable CmdArgs parser is the simpler one initially written, which I have used here:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;{-# LANGUAGE DeriveDataTypeable, RecordWildCards #-}&lt;br /&gt;&lt;br /&gt;import System.Random&lt;br /&gt;import System.Console.CmdArgs&lt;br /&gt;&lt;br /&gt;data Guess = Guess {min :: Int, max :: Int, limit :: Maybe Int} deriving (Data,Typeable)&lt;br /&gt;&lt;br /&gt;main = do&lt;br /&gt;    Guess{..} &amp;lt;- cmdArgs $ Guess 1 100 Nothing&lt;br /&gt;    answer &amp;lt;- randomRIO (min,max)&lt;br /&gt;    game limit answer&lt;br /&gt;&lt;br /&gt;game (Just 0) answer = putStrLn "Limit exceeded"&lt;br /&gt;game limit answer = do&lt;br /&gt;    putStr "Have a guess: "&lt;br /&gt;    guess &amp;lt;- fmap read getLine&lt;br /&gt;    if guess == answer then&lt;br /&gt;        putStrLn "Awesome!!!1"&lt;br /&gt;     else do&lt;br /&gt;        putStrLn $ if guess &gt; answer then "Too high" else "Too low"&lt;br /&gt;        game (fmap (subtract 1) limit) answer&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;(The code in this post can be freely reused for any purpose, unless you are porting it to the iPhone, in which case I want 10% of all revenues.)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-1394092189701797642?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/1394092189701797642/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=1394092189701797642' title='29 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/1394092189701797642'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/1394092189701797642'/><link 
rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2010/08/cmdargs-example.html' title='CmdArgs Example'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>29</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-4708353206162548228</id><published>2010-08-16T20:12:00.002+01:00</published><updated>2010-08-16T20:24:52.739+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='cmdargs'/><title type='text'>CmdArgs 0.2 - command line argument processing</title><content type='html'>I've just released &lt;a href="http://hackage.haskell.org/package/cmdargs"&gt;CmdArgs 0.2&lt;/a&gt; to &lt;a href="http://hackage.haskell.org/"&gt;Hackage&lt;/a&gt;. &lt;a href="http://community.haskell.org/~ndm/cmdargs/"&gt;CmdArgs&lt;/a&gt; is a library for defining and parsing command lines. The focus of CmdArgs is allowing the concise definition of fully-featured command line argument processors, in a mainly declarative manner (i.e. little coding needed). CmdArgs also supports multiple mode programs, as seen in darcs and Cabal. For some examples of CmdArgs, please see the manual or the Haddock documentation.&lt;br /&gt;&lt;br /&gt;For the last month I've been working on a complete rewrite of CmdArgs. The original version of CmdArgs did what I was hoping it would - it was concise and easy to use. However, CmdArgs 0.1 had lots of rough edges - some features didn't work together, there were lots of restrictions on which types of fields appear where, and it was hard to add new features. CmdArgs 0.2 is a ground up rewrite which is designed to make it easy to maintain and improve in the future.&lt;br /&gt;&lt;br /&gt;The CmdArgs 0.2 API is incompatible with CmdArgs 0.1, for which I apologise. 
Some of the changes you will notice:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;Several of the annotations have changed name (&lt;tt&gt;text&lt;/tt&gt; has become &lt;tt&gt;help&lt;/tt&gt;, &lt;tt&gt;empty&lt;/tt&gt; has become &lt;tt&gt;opt&lt;/tt&gt;).&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Instead of writing annotations &lt;tt&gt;value &amp;= a1 &amp; a2&lt;/tt&gt;, you write &lt;tt&gt;value &amp;= a1 &amp;= a2&lt;/tt&gt;.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Instead of &lt;tt&gt;cmdArgs "Summary" [mode1]&lt;/tt&gt;, you write &lt;tt&gt;cmdArgs (mode1 &amp;= summary "Summary")&lt;/tt&gt;. If you need a multi mode program use the &lt;tt&gt;modes&lt;/tt&gt; function.&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;All the basic principles have remained the same, all the old features have been retained, and translating parsers should be simple local tweaks. If you need any help please contact me.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Explicit Command Line Processor&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The biggest change is the introduction of an explicit command line framework in &lt;tt&gt;System.Console.CmdArgs.Explicit&lt;/tt&gt;. CmdArgs 0.1 is an implicit command line parser - you define a value with annotations from which a command line parser is inferred. Unfortunately there was not complete separation between determining what parser the user was hoping to define, and then executing it. The result was that even at the flag processing stage there were still complex decisions being made based on type. CmdArgs 0.2 has a fully featured explicit parser that can be used separately, which can process command line flags and display help messages. Now the implicit parser first translates to the explicit parser (capturing the user's intentions), then executes it. 
The advantages of having an explicit parser are substantial.&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;It is possible to translate the implicit parser to an explicit parser (&lt;tt&gt;cmdArgsMode&lt;/tt&gt;), which makes testing substantially easier. As a result CmdArgs 0.2 has far more tests.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;I was able to write the CmdArgs command line itself in CmdArgs. This command line is a multiple mode explicit parser, which has many sub modes defined by implicit parsers.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;The use of explicit parsers also alleviates one of the biggest worries of users, the impurity. Only the implicit parser relies on impure functions to extract annotations. In particular, you can create the explicit parser once, then rerun it multiple times, without worrying that GHC will optimise away the annotations.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Once you have an explicit parser, you can modify it afterwards - for example programmatically adding some flags.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;The explicit parser better separates the internals of the program, making each stage simpler.&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;The introduction of the explicit parser was a great idea, and has dramatically improved CmdArgs. However, there are still some loose ends. The translation from implicit to explicit is a multi-stage translation, where each stage infers some additional information. While this process works, it is not very direct - the semantics of an annotation are hard to determine, and there are ordering constraints on these stages. It would be much better if I could concisely and directly express the semantics of the annotations, which is something I will be thinking about for CmdArgs 0.3.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;GetOpt Compatibility&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;While the intention is that CmdArgs users write implicit parsers, it is not required. 
It is perfectly possible to write explicit parsers directly, and benefit from the argument processing and help text generation. Alternatively, it is possible to define your own command line framework, which is then translated into an explicit parser. CmdArgs 0.2 now includes a GetOpt translator, which presents an API compatible with GetOpt, but operates by translating to an explicit parser. I hope that other people writing command line frameworks will consider reusing the explicit parser.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Capturing Annotations&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;As part of the enhanced separation of CmdArgs, I have extracted all the code that captures annotations, and made it more robust. The essential operation is now &lt;tt&gt;&amp;=&lt;/tt&gt; which takes a value and attaches an annotation. &lt;tt&gt;(x &amp;= a) &amp;= b&lt;/tt&gt; can then be used to attach the two annotations a and b (the brackets are optional). Previously the capturing of annotations and processing of flags was interspersed, meaning the entire library was impure - now all impure operations are performed inside capture. The capturing framework is not fully generic (that would require module functors, to parameterise over the type of annotation), but otherwise is entirely separate.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Consistency&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;One of the biggest changes for users should be the increase in consistency. In CmdArgs 0.1 certain annotations were bound to certain types - the &lt;tt&gt;args&lt;/tt&gt; annotation had to be on a &lt;tt&gt;[String]&lt;/tt&gt; field, the &lt;tt&gt;argPos&lt;/tt&gt; annotation had to be on &lt;tt&gt;String&lt;/tt&gt;. In the new version any field can be a list or a maybe, of any atomic type - including Bool, Int/Integer, Double/Float and tuples of atomic types. 
With this support it's trivial to write:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;data Sample = Sample {size :: Maybe (Int,Int)}&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Now you can run:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;$ sample&lt;br /&gt;Sample {size = Nothing}&lt;br /&gt;&lt;br /&gt;$ sample --size=1,2&lt;br /&gt;Sample {size = Just (1,2)}&lt;br /&gt;&lt;br /&gt;$ sample --size=1,2 --size=3,4&lt;br /&gt;Sample {size = Just (3,4)}&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Hopefully this increase in consistency will make CmdArgs much more predictable for users.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Help Output&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The one thing I'm still wrestling with is the help output. While the help output is reasonably straightforward for single mode programs, I am still undecided how multiple mode programs should be displayed. Currently CmdArgs displays something I think is acceptable, but not optimal. I am happy to take suggestions for how to improve the help output.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Conclusion&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;CmdArgs 0.1 was an experiment - is it possible to define concise command line argument processors? The result was a working library, but whose potential for enhancement was limited by a lack of internal separation. CmdArgs 0.2 is a complete rewrite, designed to put the ideas from CmdArgs 0.1 into practical use, and allow lots of scope for further improvements. 
I hope CmdArgs will become the standard choice for most command line processing tasks.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-4708353206162548228?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/4708353206162548228/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=4708353206162548228' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/4708353206162548228'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/4708353206162548228'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2010/08/cmdargs-02-command-line-argument.html' title='CmdArgs 0.2 - command line argument processing'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-184534293882599261</id><published>2010-07-10T00:25:00.002+01:00</published><updated>2010-07-10T00:50:34.486+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='supero'/><title type='text'>Rethinking Supercompilation</title><content type='html'>I have written a paper building on my &lt;a href="http://community.haskell.org/~ndm/supero/"&gt;Supero&lt;/a&gt; work that has been accepted to &lt;a href="http://www.icfpconference.org/icfp2010/"&gt;ICFP 2010&lt;/a&gt;. 
Various people have asked for copies of the paper, so I have put &lt;a href="http://community.haskell.org/~ndm/temp/supero.pdf"&gt;the current version&lt;/a&gt; online. This version will be removed in about three weeks and replaced with the final version, although the linked copy has nearly all the changes suggested by the reviewers. The abstract is:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;Supercompilation is a program optimisation technique that is particularly effective at eliminating unnecessary overheads. We have designed a new supercompiler, making many novel choices, including different termination criteria and handling of let bindings. The result is a supercompiler that focuses on simplicity, compiles programs quickly and optimises programs well. We have benchmarked our supercompiler, with some programs running more than twice as fast than when compiled with GHC.&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;I've also uploaded a corresponding package to &lt;a href="http://hackage.haskell.org/package/supero"&gt;Hackage&lt;/a&gt;, but that code should be considered an early version - I intend to revise it before ICFP, to make it easier to run (currently I change the options by editing the source code) and include all the benchmarks. I don't recommend downloading or using the current version, and won't be supporting it in any way, but it's there for completeness.&lt;br /&gt;&lt;br /&gt;In the near future I will be posting more general discussions about supercompilation, and about the work covered in this paper. In the meantime, if you find any mistakes in the paper, please mention them in the comments!&lt;br /&gt;&lt;br /&gt;&lt;i&gt;PS.&lt;/i&gt; Currently &lt;a href="http://community.haskell.org/"&gt;community.haskell.org&lt;/a&gt; is down, but I have uploaded the paper there anyway. 
It should be present when the site recovers.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-184534293882599261?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/184534293882599261/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=184534293882599261' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/184534293882599261'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/184534293882599261'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2010/07/rethinking-supercompilation.html' title='Rethinking Supercompilation'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-3100730169953975302</id><published>2010-04-25T15:20:00.001+01:00</published><updated>2010-04-25T15:23:42.689+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='uniplate'/><title type='text'>Dangerous Primes - Why Uniplate Doesn't Contain transform'</title><content type='html'>The &lt;a href="http://community.haskell.org/~ndm/uniplate"&gt;Uniplate library&lt;/a&gt; contains many traversal operations, some based on functions in &lt;a href="http://research.microsoft.com/en-us/um/people/simonpj/papers/hmap/"&gt;SYB (Scrap Your Boilerplate)&lt;/a&gt;. 
SYB provides &lt;tt&gt;everywhere&lt;/tt&gt; for bottom-up traversals and &lt;tt&gt;everywhere'&lt;/tt&gt; for top-down traversals. Uniplate provides &lt;tt&gt;transform&lt;/tt&gt; for bottom-up traversals, but has no operation similar to &lt;tt&gt;everywhere'&lt;/tt&gt;. This article explains why I didn't include a &lt;tt&gt;transform'&lt;/tt&gt; operation, and why I believe that most uses of &lt;tt&gt;everywhere'&lt;/tt&gt; are probably incorrect.&lt;br /&gt;&lt;br /&gt;The &lt;tt&gt;transform&lt;/tt&gt; operation applies a function to every node in a tree, starting at the bottom and working upwards. To give a simple example:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;data Exp = Lit Int | Add Exp Exp | Mul Exp Exp&lt;br /&gt;&lt;br /&gt;f (Mul (Lit 1) x) = x&lt;br /&gt;f (Mul (Lit 3) x) = Add x (Add x x)&lt;br /&gt;f x = x&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Calling &lt;tt&gt;transform f&lt;/tt&gt; on the input &lt;tt&gt;3 * (1 * 2)&lt;/tt&gt; gives &lt;tt&gt;2 + (2 + 2)&lt;/tt&gt;. We can write &lt;tt&gt;transform&lt;/tt&gt; in terms of the Uniplate operation &lt;tt&gt;descend&lt;/tt&gt;, which applies a function to every child node:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;transform f x = f (descend (transform f) x)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;On every application of &lt;tt&gt;f&lt;/tt&gt; the argument always consists of a root node which has not yet been processed, along with child nodes that have been processed. My thesis explains how we can guarantee a transform reaches a fixed point, by calling &lt;tt&gt;f&lt;/tt&gt; again before every constructor on the RHS of any clause:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;f (Mul (Lit 1) x) = x&lt;br /&gt;f (Mul (Lit 3) x) = f (Add x (f (Add x x)))&lt;br /&gt;f x = x&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Now &lt;tt&gt;transform f&lt;/tt&gt; is guaranteed to reach a fixed point. 
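&lt;br /&gt;&lt;br /&gt;As a minimal sketch of the definitions so far, the following self-contained module runs the fixed-point version of &lt;tt&gt;f&lt;/tt&gt; on the earlier example. It assumes a hand-written &lt;tt&gt;descend&lt;/tt&gt; specialised to &lt;tt&gt;Exp&lt;/tt&gt; (the real Uniplate &lt;tt&gt;descend&lt;/tt&gt; is more general), and the names &lt;tt&gt;example&lt;/tt&gt; and &lt;tt&gt;main&lt;/tt&gt; are only for illustration:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;
data Exp = Lit Int | Add Exp Exp | Mul Exp Exp
    deriving (Eq, Show)

-- Hand-written for this sketch: apply g to each immediate child.
descend :: (Exp -&gt; Exp) -&gt; Exp -&gt; Exp
descend g (Add a b) = Add (g a) (g b)
descend g (Mul a b) = Mul (g a) (g b)
descend _ e = e

-- Bottom-up: process the children first, then the node itself.
transform :: (Exp -&gt; Exp) -&gt; Exp -&gt; Exp
transform g x = g (descend (transform g) x)

-- The fixed-point version of f: f is re-applied before every
-- constructor on the right-hand side.
f :: Exp -&gt; Exp
f (Mul (Lit 1) x) = x
f (Mul (Lit 3) x) = f (Add x (f (Add x x)))
f x = x

-- The input 3 * (1 * 2).
example :: Exp
example = Mul (Lit 3) (Mul (Lit 1) (Lit 2))

main :: IO ()
main = do
    print (transform f example)
    -- A second pass over the result changes nothing.
    print (transform f (transform f example) == transform f example)
&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Running &lt;tt&gt;main&lt;/tt&gt; should print &lt;tt&gt;Add (Lit 2) (Add (Lit 2) (Lit 2))&lt;/tt&gt;, that is &lt;tt&gt;2 + (2 + 2)&lt;/tt&gt;, followed by &lt;tt&gt;True&lt;/tt&gt;.&lt;br /&gt;&lt;br /&gt;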
The &lt;tt&gt;transform&lt;/tt&gt; operation is predictable, and naturally defines bottom-up transformations matching the user's intention. Unfortunately, the ordering and predictability of &lt;tt&gt;transform'&lt;/tt&gt; is significantly more subtle. We can easily define &lt;tt&gt;transform'&lt;/tt&gt;:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;transform' f x = descend (transform' f) (f x)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Here the transformation is applied to every child of the result of &lt;tt&gt;f&lt;/tt&gt;. With &lt;tt&gt;transform&lt;/tt&gt; every node is processed exactly once, but with &lt;tt&gt;transform'&lt;/tt&gt; some nodes are processed multiple times, and some are not processed at all. The first clause of &lt;tt&gt;f&lt;/tt&gt;, which returns &lt;tt&gt;x&lt;/tt&gt;, does not result in the root of &lt;tt&gt;x&lt;/tt&gt; being processed. Similarly, our second clause returns two levels of constructor, causing the inner &lt;tt&gt;Add&lt;/tt&gt; to be both generated and then processed.&lt;br /&gt;&lt;br /&gt;When people look at &lt;tt&gt;transform'&lt;/tt&gt; the intuitive feeling tends to be that all the variables on the RHS will be processed (i.e. &lt;tt&gt;x&lt;/tt&gt;), which in many cases mostly matches the behaviour of &lt;tt&gt;transform'&lt;/tt&gt;. Being mostly correct means that many tests work, but certain cases fail - with our function &lt;tt&gt;f&lt;/tt&gt;, the first example works, but &lt;tt&gt;1 * (1 * 1)&lt;/tt&gt; results in &lt;tt&gt;1 * 1&lt;/tt&gt;. The original version of Uniplate contained &lt;tt&gt;transform'&lt;/tt&gt;, and I spent an entire day tracking down a bug whose cause turned out to be a function whose RHS was just a variable, much like the first clause of &lt;tt&gt;f&lt;/tt&gt;.&lt;br /&gt;&lt;br /&gt;Before describing the solution to top-down transformations, it's interesting to first explore where top-down transformations are necessary. I have identified two cases, but there may be more. 
Firstly, top-down transformations are useful when one LHS is contained within another LHS, and you wish to favour the larger LHS. For example:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;f (Mul (Add _ _) _) = ...&lt;br /&gt;f (Add _ _) = Mul ...&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Here the second LHS is contained within the first. If we perform a bottom-up transformation then the inner &lt;tt&gt;Add&lt;/tt&gt; expression will be transformed to &lt;tt&gt;Mul&lt;/tt&gt;, and the first clause will never match. Changing to a top-down transformation allows the larger rule to match first.&lt;br /&gt;&lt;br /&gt;Secondly, top-down transformations are useful when some information about ancestors is accumulated as you proceed downwards. The typical example is a language with let expressions which builds up a set of bindings as it proceeds downwards; these bindings then affect which transformations are made. Sadly, &lt;tt&gt;transform'&lt;/tt&gt; cannot express such a transformation.&lt;br /&gt;&lt;br /&gt;The solution to both problems is to use the &lt;tt&gt;descend&lt;/tt&gt; operation, and explicitly control the recursive step. We can rewrite the original example using &lt;tt&gt;descend&lt;/tt&gt;:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;f (Mul (Lit 1) x) = f x&lt;br /&gt;f (Mul (Lit 3) x) = Add (f x) (Add (f x) (f x))&lt;br /&gt;f x = descend f x&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Here we explicitly call &lt;tt&gt;f&lt;/tt&gt; to continue the transformation. The intuition that all variables are transformed is now stated explicitly. In this particular example we can also common-up the three subexpressions &lt;tt&gt;f x&lt;/tt&gt; in the second clause, giving a more efficient transformation. 
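&lt;br /&gt;&lt;br /&gt;To make the difference concrete, here is a small self-contained sketch. It assumes a hand-written &lt;tt&gt;descend&lt;/tt&gt; for &lt;tt&gt;Exp&lt;/tt&gt;, a &lt;tt&gt;transform'&lt;/tt&gt; defined as earlier, and a hypothetical name &lt;tt&gt;rules&lt;/tt&gt; for the original clauses without explicit recursion; the &lt;tt&gt;descend&lt;/tt&gt;-based &lt;tt&gt;f&lt;/tt&gt; shares the result of &lt;tt&gt;f x&lt;/tt&gt; with a let:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;
data Exp = Lit Int | Add Exp Exp | Mul Exp Exp
    deriving (Eq, Show)

-- Hand-written for this sketch: apply g to each immediate child.
descend :: (Exp -&gt; Exp) -&gt; Exp -&gt; Exp
descend g (Add a b) = Add (g a) (g b)
descend g (Mul a b) = Mul (g a) (g b)
descend _ e = e

-- The problematic top-down traversal, kept only for comparison.
transform' :: (Exp -&gt; Exp) -&gt; Exp -&gt; Exp
transform' g x = descend (transform' g) (g x)

-- The original clauses, with no explicit recursion.
rules :: Exp -&gt; Exp
rules (Mul (Lit 1) x) = x
rules (Mul (Lit 3) x) = Add x (Add x x)
rules x = x

-- Explicit recursion: every variable on the right-hand side is
-- transformed, and the result of f x is computed once and shared.
f :: Exp -&gt; Exp
f (Mul (Lit 1) x) = f x
f (Mul (Lit 3) x) = let y = f x in Add y (Add y y)
f x = descend f x

-- The input 1 * (1 * 1).
example :: Exp
example = Mul (Lit 1) (Mul (Lit 1) (Lit 1))

main :: IO ()
main = do
    print (transform' rules example)
    print (f example)
&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Running &lt;tt&gt;main&lt;/tt&gt; should print &lt;tt&gt;Mul (Lit 1) (Lit 1)&lt;/tt&gt; for &lt;tt&gt;transform' rules&lt;/tt&gt;, the &lt;tt&gt;1 * 1&lt;/tt&gt; result described earlier, and &lt;tt&gt;Lit 1&lt;/tt&gt; for &lt;tt&gt;f&lt;/tt&gt;.&lt;br /&gt;&lt;br /&gt;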
If needed, we can add an extra argument to &lt;tt&gt;f&lt;/tt&gt; to pass down information from the ancestors.&lt;br /&gt;&lt;br /&gt;After experience with Uniplate I decided that using &lt;tt&gt;transform'&lt;/tt&gt;/&lt;tt&gt;everywhere'&lt;/tt&gt; correctly was difficult. I looked at all my uses of &lt;tt&gt;transform'&lt;/tt&gt; and found that a few of them had subtle bugs, and in most cases using &lt;tt&gt;transform&lt;/tt&gt; would have done exactly the same job. I looked at all the code on Hackage (several years ago) and found only six uses of &lt;tt&gt;everywhere'&lt;/tt&gt;, all of which could be replaced with &lt;tt&gt;everywhere&lt;/tt&gt; without changing the meaning. I consider top-down transformations in the style of &lt;tt&gt;everywhere'&lt;/tt&gt; to be dangerous, and strongly recommend using operations like &lt;tt&gt;descend&lt;/tt&gt; instead.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Upcoming Workshops&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Are you interested in either generic programming or Haskell? Are you going to &lt;a href="http://www.icfpconference.org/icfp2010/"&gt;ICFP 2010&lt;/a&gt;? Why not:&lt;br /&gt;&lt;br /&gt;&lt;a href=""&gt;Submit a paper&lt;/a&gt; to the &lt;a href="http://osl.iu.edu/wgp2010"&gt;&lt;i&gt;Workshop on Generic Programming&lt;/i&gt;&lt;/a&gt; - Generic programming is about making programs more adaptable by making them more general. 
This workshop brings together leading researchers and practitioners in generic programming from around the world, and features papers capturing the state of the art in this important area.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://haskell.org/haskellwiki/HaskellImplementorsWorkshop/2010/Call_for_Talks"&gt;Offer a talk&lt;/a&gt; at the &lt;a href="http://haskell.org/haskellwiki/HaskellImplementorsWorkshop/2010"&gt;&lt;i&gt;Haskell Implementors Workshop&lt;/i&gt;&lt;/a&gt; - an informal gathering of people involved in the design and development of Haskell implementations, tools, libraries, and supporting infrastructure.&lt;br /&gt;&lt;br /&gt;I'm on the program committee for both, and look forward to receiving your contributions.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-3100730169953975302?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/3100730169953975302/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=3100730169953975302' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/3100730169953975302'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/3100730169953975302'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2010/04/dangerous-primes-why-uniplate-doesnt.html' title='Dangerous Primes - Why Uniplate Doesn&apos;t Contain transform&apos;'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' 
src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-2031507691587844190</id><published>2010-04-07T19:25:00.004+01:00</published><updated>2010-04-07T20:14:32.652+01:00</updated><title type='text'>File Recovery with Haskell</title><content type='html'>Haskell has long been my favoured scripting language, and in this post I thought I'd share one of my more IO-heavy scripts. I have an external hard drive that, due to regular dropping, is somewhat unreliable. I have a 1Gb file on this drive which I'd like to copy, but which is partly corrupted. I'd like to copy as much as I can.&lt;br /&gt;&lt;br /&gt;In the past I've used &lt;a href="http://www.jfilerecovery.com/"&gt;JFileRecovery&lt;/a&gt;, which I thoroughly recommend. The basic algorithm is that it copies the file in chunks, and if a chunk copy exceeds a timeout it is discarded. It has a nice graphical interface, and some basic control over timeout and block sizes. Unfortunately, JFileRecovery didn't work for this file - it has three basic problems:&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;br /&gt;&lt;li&gt;The timeout sometimes fails to stop the IO, causing the program to hang.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;If the copy takes too long, it sometimes gives up before the end of the file.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;If the block size is too small it takes forever, if it is too large it drops large parts of the file.&lt;/li&gt;&lt;br /&gt;&lt;/ol&gt;&lt;br /&gt;&lt;br /&gt;To recover my file I needed something better, so I wrote a quick script in Haskell. The basic algorithm is to copy the file in 10Mb chunks. If any chunk fails to copy, I split the chunk and retry it after all other pending chunks. The result is that the file is complete after the first pass, but the program then goes back and recovers more information where it can. 
I can terminate the program at any point with a working file, but waiting longer will probably recover more of the file.&lt;br /&gt;&lt;br /&gt;I have included the script at the bottom of this post. I ran this script from GHCi, but am not going to turn it in to a proper program. If someone does wish to build on this script please do so (I hereby place this code in the public domain, or if that is not possible then under the &lt;i&gt;&amp;forall; n . BSD n&lt;/i&gt; licenses).&lt;br /&gt;&lt;br /&gt;The script took about 15 minutes to write, and makes use of exceptions and file handles - not the kind of program traditionally associated with Haskell. A lot of hard work has been spent polishing the GHC runtime, and the Haskell libraries (bytestring, exceptions). Now this work has been done, slotting together reasonably complex scripts is simple.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;{-# LANGUAGE ScopedTypeVariables #-}&lt;br /&gt;&lt;br /&gt;import Data.ByteString(hGet, hPut)&lt;br /&gt;import System.IO&lt;br /&gt;import System.Environment&lt;br /&gt;import Control.Monad&lt;br /&gt;import Control.Exception&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;src = "file on dodgy drive (source)"&lt;br /&gt;dest = "file on safe drive (destination)"&lt;br /&gt;&lt;br /&gt;main :: IO ()&lt;br /&gt;main =&lt;br /&gt;    withBinaryFile src ReadMode $ \hSrc -&gt; &lt;br /&gt;    withBinaryFile dest WriteMode $ \hDest -&gt; do&lt;br /&gt;    nSrc &lt;- hFileSize hSrc&lt;br /&gt;    nDest &lt;- hFileSize hDest&lt;br /&gt;    when (nSrc /= nDest) $ hSetFileSize hDest nSrc&lt;br /&gt;    copy hSrc hDest $ split start (0,nSrc)&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;copy :: Handle -&gt; Handle -&gt; [(Integer,Integer)] -&gt; IO ()&lt;br /&gt;copy hSrc hDest [] = return ()&lt;br /&gt;copy hSrc hDest chunks = do&lt;br /&gt;    putStrLn $ "Copying " ++ show (length chunks) ++ " of at most " ++ show (snd $ head chunks)&lt;br /&gt;    chunks &amp;lt;- forM chunks $ \(from,len) -&gt; do&lt;br /&gt;   
     res &amp;lt;- Control.Exception.try $ do&lt;br /&gt;            hSeek hSrc AbsoluteSeek from&lt;br /&gt;            hSeek hDest AbsoluteSeek from&lt;br /&gt;            bs &amp;lt;- hGet hSrc $ fromIntegral len&lt;br /&gt;            hPut hDest bs&lt;br /&gt;        case res of&lt;br /&gt;            Left (a :: IOException) -&gt; do putChar '#' ; return $ split (len `div` 5) (from,len)&lt;br /&gt;            Right _ -&gt; do putChar '.' ; return []&lt;br /&gt;    putChar '\n'&lt;br /&gt;    copy hSrc hDest $ concat chunks&lt;br /&gt;&lt;br /&gt;start = 10000000&lt;br /&gt;stop = 1000&lt;br /&gt;&lt;br /&gt;split :: Integer -&gt; (Integer,Integer) -&gt; [(Integer,Integer)]&lt;br /&gt;split i (a,b) | i &amp;lt; stop = []&lt;br /&gt;              | i &gt;= b = [(a,b)]&lt;br /&gt;              | otherwise = (a,i) : split i (a+i, b-i)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;There are many limitations in this code, but it was sufficient to recover my file quickly and accurately.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-2031507691587844190?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/2031507691587844190/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=2031507691587844190' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/2031507691587844190'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/2031507691587844190'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2010/04/file-recovery-with-haskell.html' title='File Recovery with Haskell'/><author><name>Neil 
Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-5793946885372537076</id><published>2010-01-23T18:07:00.001Z</published><updated>2010-01-23T18:09:42.426Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='hlint'/><category scheme='http://www.blogger.com/atom/ns#' term='uniplate'/><category scheme='http://www.blogger.com/atom/ns#' term='derive'/><title type='text'>Optimising HLint</title><content type='html'>&lt;a href="http://community.haskell.org/~ndm/hlint/"&gt;HLint&lt;/a&gt; is a tool for suggesting improvements to Haskell code. Recently I've put some effort in to optimisation and HLint is now over 20 times faster. The standard benchmark (running HLint over the source code of HLint) has gone from 30 seconds to just over 1 second. This blog post is the story of that optimisation, the dead ends I encountered, and the steps I took. I've deliberately included reasonable chunks of code in this post, so interested readers can see the whole story - less technical readers should feel free to skip them. 
The results of the optimisation are all available on &lt;a href="http://hackage.haskell.org/"&gt;Hackage&lt;/a&gt;, as new versions of &lt;a href="http://hackage.haskell.org/package/hlint"&gt;hlint&lt;/a&gt;, &lt;a href="http://hackage.haskell.org/package/uniplate"&gt;uniplate&lt;/a&gt; and &lt;a href="http://hackage.haskell.org/package/derive"&gt;derive&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Before I start, I'd like to share my five guiding principles of optimisation:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;&lt;b&gt;Tests&lt;/b&gt; - make sure you have tests, so you don't break anything while optimising.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;b&gt;Profile&lt;/b&gt; - if you don't profile first, you are almost certainly optimising the wrong thing.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;b&gt;Necessity&lt;/b&gt; - only start the optimisation process if something is running too slowly.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;b&gt;Benchmark&lt;/b&gt; - use a sensible and representative test case to benchmark, to make sure you optimise the right thing.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;b&gt;Optimising&lt;/b&gt; - to make a function faster, either call it less, or write it better.&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;Below are the steps in the optimisation, along with their speed improvement.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Special support for &lt;tt&gt;Rational&lt;/tt&gt; in Uniplate.Data, 30s to 10s&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;HLint uses &lt;a href="http://community.haskell.org/~ndm/uniplate/"&gt;Uniplate&lt;/a&gt; extensively. HLint works over large abstract syntax trees, from the library &lt;a href="http://www.cs.chalmers.se/~d00nibro/haskell-src-exts/"&gt;haskell-src-exts&lt;/a&gt; (HSE), so a generics library is essential. There are two main variants of Uniplate - Data builds on the Scrap Your Boilerplate (SYB) instances, and Direct requires special instances. 
HLint uses the Data variant, as it requires no instances to be written.&lt;br /&gt;&lt;br /&gt;One of the advantages of Uniplate is that it generally outperforms most generics libraries. In particular, the variant written on top of Data instances is often many times faster than using SYB directly. The reason for outperforming SYB is documented in my &lt;a href="http://community.haskell.org/~ndm/thesis/"&gt;PhD thesis&lt;/a&gt;. The essential idea is that Uniplate builds a "hit table", a mapping noting which types can be contained within which other types - e.g. that there is potentially an &lt;tt&gt;Int&lt;/tt&gt; inside &lt;tt&gt;Either String [Int]&lt;/tt&gt;, but there isn't an &lt;tt&gt;Int&lt;/tt&gt; inside &lt;tt&gt;String&lt;/tt&gt;. By consulting this mapping while traversing a value, Uniplate is able to skip large portions, leading to an efficiency improvement.&lt;br /&gt;&lt;br /&gt;When computing the hit table it is necessary for Uniplate to create dummy values of each type, which it then traverses using SYB functions. To create dummy values Uniplate uses &lt;tt&gt;undefined&lt;/tt&gt;. Unfortunately, given the definition &lt;tt&gt;data Foo = Foo !Int&lt;/tt&gt;, the value &lt;tt&gt;Foo undefined&lt;/tt&gt; will be forced due to the strictness annotation, and the code will raise an error - as described in &lt;a href="http://code.google.com/p/ndmitchell/issues/detail?id=243"&gt;bug 243&lt;/a&gt;. Uniplate 1.2 had a special case for &lt;tt&gt;Rational&lt;/tt&gt;, which is the only type with strict components contained within HSE. Uniplate 1.3 fixed this problem more generally by catching the exception and turning off the hit table facility on such types. Unfortunately this caused Uniplate 1.3 to turn off the hit table for HSE, causing HLint to run three times slower.&lt;br /&gt;&lt;br /&gt;The fix was simple, and pragmatic, but not general. 
In Uniplate 1.4 I reinstated the special case for &lt;tt&gt;Rational&lt;/tt&gt;, so now HLint makes use of the hit table, and goes three times faster. A more general solution would be to manufacture dummy values for certain types (it's usually an &lt;tt&gt;Int&lt;/tt&gt; or &lt;tt&gt;Integer&lt;/tt&gt; that is a strict component), or to create concrete dummy values using SYB. It's interesting to observe that if HLint used SYB as its generics library, it would not be using the hit table trick, and would run three times slower.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Use Uniplate.Direct, 10s to 5s, reverted&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Uniplate also provides a Direct version, which performs better, but requires instances to be written. In order to further improve the speed of HLint I decided to try the Direct version. The biggest hurdle to using the Direct version is that many instances need to be generated - in the case of HLint it required 677. The first step was to modify the &lt;a href="http://community.haskell.org/~ndm/derive/"&gt;Derive&lt;/a&gt; tool to generate these instances (which Derive 2.1 now does), and to write a small script to decide which instances were necessary. With these instances in place, the time dropped to 5 seconds.&lt;br /&gt;&lt;br /&gt;Unfortunately, the downside was that compilation time skyrocketed, and the instances are very specific to a particular version of HSE. While these problems are not insurmountable, I did not consider the benefit to be worthwhile, so reverted the changes. It's worth pointing out that most users of Uniplate won't require so many instances to use the Direct versions, and that a program can be switched between Direct and Data versions without any code changes (just a simple import). 
I also considered the possibility of discovering which Uniplate instances dominated and using the Direct method only for those - but I did not investigate further.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Add and optimise &lt;tt&gt;eqExpShell&lt;/tt&gt;, 10s to 8s&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The next step was to profile. I compiled and ran HLint with the following options:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;ghc --make -O1 -prof -auto-all -o hlint&lt;br /&gt;hlint src +RTS -p&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Looking at the profiling output, I saw that the function &lt;tt&gt;unify&lt;/tt&gt; took up over 60% of the execution time:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;unify :: NameMatch -&gt; Exp -&gt; Exp -&gt; Maybe [(String,Exp)]&lt;br /&gt;unify nm (Do xs) (Do ys) | length xs == length ys = concatZipWithM (unifyStmt nm) xs ys&lt;br /&gt;unify nm (Lambda _ xs x) (Lambda _ ys y) | length xs == length ys = liftM2 (++) (unify nm x y) (concatZipWithM unifyPat xs ys)&lt;br /&gt;unify nm x y | isParen x || isParen y = unify nm (fromParen x) (fromParen y)&lt;br /&gt;unify nm (Var (fromNamed -&gt; v)) y | isUnifyVar v = Just [(v,y)]&lt;br /&gt;unify nm (Var x) (Var y) | nm x y = Just []&lt;br /&gt;unify nm x y | ((==) `on` descend (const $ toNamed "_")) x y = concatZipWithM (unify nm) (children x) (children y)&lt;br /&gt;unify nm _ _ = Nothing&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The function &lt;tt&gt;unify&lt;/tt&gt; is an essential part of the rule matching in HLint, and attempts to compare a rule to a target expression, and if successful returns the correct substitution. Each rule is compared to every expression in all files, which means that &lt;tt&gt;unify&lt;/tt&gt; is called millions of times in a normal HLint run. Looking closely, my first suspicion was the second line from the bottom in the guard - the call to &lt;tt&gt;descend&lt;/tt&gt; and &lt;tt&gt;(==)&lt;/tt&gt;. 
This line compares the outer shell of two expressions, ignoring any inner expressions. It first uses the Uniplate &lt;tt&gt;descend&lt;/tt&gt; function to insert a dummy value as each subexpression, then compares for equality. To test my hypothesis that this method was indeed the culprit I extracted it to a separate function, and modified &lt;tt&gt;unify&lt;/tt&gt; to call it:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;eqExpShell :: Exp_ -&gt; Exp_ -&gt; Bool&lt;br /&gt;eqExpShell = (==) `on` descend (const $ toNamed "_")&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;I reran the profiling, and now all the time was being spent in &lt;tt&gt;eqExpShell&lt;/tt&gt;. My first thought was to expand out the function to not use Uniplate. I quickly rejected that idea - there are expressions within statements and patterns, and to follow all the intricacies of HSE would be fragile and verbose.&lt;br /&gt;&lt;br /&gt;The first optimisation I tried was to replace &lt;tt&gt;toNamed "_"&lt;/tt&gt;, the dummy expression, with something simpler. The &lt;tt&gt;toNamed&lt;/tt&gt; call expands to many constructors, so instead I used &lt;tt&gt;Do an []&lt;/tt&gt; (&lt;tt&gt;an&lt;/tt&gt; is a dummy source location), which is the simplest expression HSE provides. This change had a noticeable speed improvement.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;eqExpShell :: Exp_ -&gt; Exp_ -&gt; Bool&lt;br /&gt;eqExpShell = (==) `on` descend (const $ Do an [])&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;My second thought was to add a quick test, so that if the outer constructors were not equal then the expensive test was not tried. 
Determining the outer constructor of a value can be done by calling &lt;tt&gt;show&lt;/tt&gt; then only looking at the first word (assuming a sensible &lt;tt&gt;Show&lt;/tt&gt; instance, which HSE has).&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;eqExpShell :: Exp_ -&gt; Exp_ -&gt; Bool&lt;br /&gt;eqExpShell x y =&lt;br /&gt;    ((==) `on` constr) x y &amp;&amp;&lt;br /&gt;    ((==) `on` descend (const $ Do an [])) x y&lt;br /&gt;    where constr = takeWhile (not . isSpace) . show&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;This change had a bigger speed improvement. I found that of the 1.5 million times &lt;tt&gt;eqExpShell&lt;/tt&gt; was called, the quick equality test rejected 1 million cases.&lt;br /&gt;&lt;br /&gt;My next thought was to try replacing &lt;tt&gt;constr&lt;/tt&gt; with the SYB function &lt;tt&gt;toConstr&lt;/tt&gt;. There was no noticeable performance impact, but the code is neater, and doesn't rely on the &lt;tt&gt;Show&lt;/tt&gt; instance, so I stuck with it. After all these changes HLint was 2 seconds faster, but &lt;tt&gt;eqExpShell&lt;/tt&gt; was still the biggest culprit on the profiling report.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Write &lt;tt&gt;eqExpShell&lt;/tt&gt; entirely in SYB, 8s to 8s, reverted&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;My next thought was to rewrite &lt;tt&gt;eqExpShell&lt;/tt&gt; entirely in SYB functions, not using the &lt;tt&gt;Eq&lt;/tt&gt; instance at all. The advantages of this approach would be that I can simply disregard all subexpressions, I only walk the expression once, and I can skip source position annotations entirely. Starting from the &lt;tt&gt;geq&lt;/tt&gt; function in SYB, I came up with:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;data Box = forall a . 
Data a =&gt; Box a&lt;br /&gt;&lt;br /&gt;eqExpShellSYB :: Exp_ -&gt; Exp_ -&gt; Bool&lt;br /&gt;eqExpShellSYB = f&lt;br /&gt;    where&lt;br /&gt;        f :: (Data a, Data b) =&gt; a -&gt; b -&gt; Bool&lt;br /&gt;        f x y = toConstr x == toConstr y &amp;&amp;&lt;br /&gt;                and (zipWith g (gmapQ Box x) (gmapQ Box y))&lt;br /&gt;&lt;br /&gt;        g (Box x) (Box y) = tx == typeAnn || tx == typeExp || f x y&lt;br /&gt;            where tx = typeOf x&lt;br /&gt;&lt;br /&gt;typeAnn = typeOf an&lt;br /&gt;typeExp = typeOf ex&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Unfortunately, this code takes exactly the same time as the previous version, despite being significantly more complex. My guess is that &lt;tt&gt;toConstr&lt;/tt&gt; is not as fast as the &lt;tt&gt;Eq&lt;/tt&gt; instance, and that this additional overhead negates all the other savings. I decided to revert back to the simpler version.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Call &lt;tt&gt;eqExpShell&lt;/tt&gt; less, 8s to 4s&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Having failed to optimise &lt;tt&gt;eqExpShell&lt;/tt&gt; further, I then thought about how to call it less. I added a trace and found that of the 1.5 million calls, in 1.3 million times at least one of the constructors was an &lt;tt&gt;App&lt;/tt&gt;. Application is very common in Haskell programs, so this is not particularly surprising. 
By looking back at the code for &lt;tt&gt;unify&lt;/tt&gt; I found several other constructors were already handled, so I added a special case for &lt;tt&gt;App&lt;/tt&gt;.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;unify nm (App _ x1 x2) (App _ y1 y2) = liftM2 (++) (unify nm x1 y1) (unify nm x2 y2)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;It is easy to show that if an &lt;tt&gt;App&lt;/tt&gt; (or any specially handled constructor) is passed as either argument to &lt;tt&gt;eqExpShell&lt;/tt&gt; then the result will be &lt;tt&gt;False&lt;/tt&gt;, as if both shells had been equal a previous case would have matched. Taking advantage of this observation, I rewrote the line with the generic match as:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;unify nm x y | isOther x &amp;&amp; isOther y &amp;&amp; eqExpShell x y = concatZipWithM (unify nm) (children x) (children y)&lt;br /&gt;    where &lt;br /&gt;        isOther Do{} = False&lt;br /&gt;        isOther Lambda{} = False&lt;br /&gt;        isOther Var{} = False&lt;br /&gt;        isOther App{} = False&lt;br /&gt;        isOther _ = True&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;With this change the &lt;tt&gt;eqExpShell&lt;/tt&gt; function was called substantially less, it disappeared from the profile, and the speed improved to 4 seconds.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Fix Uniplate bug, 4s to 1.3s&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The next step was to rerun the profiling. However, the results were very confusing - almost 70% of the execution time was recorded to three CAF's, while I could see no obvious culprit. I reran the profiler with the &lt;tt&gt;-caf-all&lt;/tt&gt; flag to more precisely report the location of CAF's, and was again confused - the optimiser had rearranged the functions to make &lt;tt&gt;-caf-all&lt;/tt&gt; useless. I then reran the profiler with optimisation turned off using &lt;tt&gt;-O0&lt;/tt&gt; and looked again. 
This time the profiling clearly showed Uniplate being the source of the CAF's. The hit table that Uniplate creates is stored inside a CAF, so was an obvious candidate.&lt;br /&gt;&lt;br /&gt;Turning back to Uniplate, I attempted to reproduce the bug outside HLint. I enhanced the benchmarking suite by adding a method to find all the &lt;tt&gt;String&lt;/tt&gt;'s inside very small HSE values. The standard Uniplate benchmarks are for testing the performance of running code, and I had neglected to check the creation of the hit table, assuming it to be negligible.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;testN "Module" $ Module ssi Nothing [] [] []&lt;br /&gt;testN "Decl" $ FunBind ssi []&lt;br /&gt;testN "Exp" $ Do ssi []&lt;br /&gt;testN "Pat" $ PWildCard ssi&lt;br /&gt;&lt;br /&gt;testN :: Biplate a String =&gt; String -&gt; a -&gt; IO ()&lt;br /&gt;testN msg x = do&lt;br /&gt;    t &lt;- timer $ evaluate $ length (universeBi x :: [String])&lt;br /&gt;    putStrLn $ "HSE for " ++ msg ++ " takes " ++ dp2 t ++ "s"&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;My initial worry was that at 2 decimal places I was likely to see &lt;tt&gt;0.00&lt;/tt&gt; for all values. However, that turned out not to be a problem! What I saw was:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;HSE for Module takes 0.54&lt;br /&gt;HSE for Decl takes 0.86&lt;br /&gt;HSE for Exp takes 2.54&lt;br /&gt;HSE for Pat takes 0.32&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;These results surprised me. In particular, the hit table from &lt;tt&gt;Exp&lt;/tt&gt; to &lt;tt&gt;String&lt;/tt&gt; is a subset of the &lt;tt&gt;Module&lt;/tt&gt; one, so should always be faster. The computation of the hit table is reasonably complex, and I was unable to isolate the problem. 
The tricky part of the hit table is that it is necessary to take the fixed point of the transitive closure of reachability - I had tried to keep track of recursive types and reach a fixed point with the minimal number of recomputations. I clearly failed, and probably had an exponential aspect to the algorithm that under certain circumstances caused ridiculously bad behaviour.&lt;br /&gt;&lt;br /&gt;Rather than try to diagnose the bug, I decided instead to rethink the approach, and simplify the design. In particular, the fixed point of the transitive closure is now written as:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;fixEq trans (populate box)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Where &lt;tt&gt;populate&lt;/tt&gt; finds the immediate children of a constructor, &lt;tt&gt;trans&lt;/tt&gt; takes the transitive closure based on the current state, and &lt;tt&gt;fixEq&lt;/tt&gt; takes the fixed point. Using this new simpler design I was also able to compute which types contained which other types recursively and cache it, meaning that now computing the hit table for &lt;tt&gt;Module&lt;/tt&gt; not only computes the hit table for &lt;tt&gt;Exp&lt;/tt&gt;, but does so in a way that means the result can be reused when asking about &lt;tt&gt;Exp&lt;/tt&gt;. After rewriting the code I reran the benchmark:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;HSE for Module takes 0.03&lt;br /&gt;HSE for Decl takes 0.00&lt;br /&gt;HSE for Exp takes 0.00&lt;br /&gt;HSE for Pat takes 0.00&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;I had hoped that the rewrite would fix the performance problems, and it did. I have not diagnosed the original performance bug, but at 0.03 seconds I was satisfied. I have now released Uniplate 1.5 with the revised hit table code. 
With this change, the time for HLint drops to 1.3 seconds and all the CAF's went away from the profile.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Sharing computation, 1.3s to 1.2s&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;At this point, I was happy to finish, but decided to profile just one last time. The top function in the list was &lt;tt&gt;matchIdea&lt;/tt&gt;:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;matchIdea :: NameMatch -&gt; Setting -&gt; Exp_ -&gt; Maybe Exp_&lt;br /&gt;matchIdea nm MatchExp{lhs=lhs,rhs=rhs,side=side} x = do&lt;br /&gt;    u &lt;- unify nm (fmap (const an) lhs) (fmap (const an) x)&lt;br /&gt;    u &lt;- check u&lt;br /&gt;    guard $ checkSide side u&lt;br /&gt;    let rhs2 = subst u rhs&lt;br /&gt;    guard $ checkDot lhs rhs2&lt;br /&gt;    return $ unqualify nm $ dotContract $ performEval rhs2&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;This function first strips both the rule (&lt;tt&gt;lhs&lt;/tt&gt;) and the expression (&lt;tt&gt;x&lt;/tt&gt;) of their source position information to ensure equality works correctly. However, both the rules and expressions are reused multiple times, so I moved the &lt;tt&gt;fmap&lt;/tt&gt; calls backwards so each rule/expression is only ever stripped of source position information once. 
With this change the runtime was reduced to 1.2 seconds.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Final Results&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;After all that effort, I reran the profile results and got:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;parseString                    HSE.All               16.2   17.7&lt;br /&gt;matchIdea                      Hint.Match            14.4    1.6&lt;br /&gt;eqExpShellSlow                 HSE.Eq                 9.2   11.4&lt;br /&gt;listDecl                       Hint.List              6.1    4.1&lt;br /&gt;lambdaHint                     Hint.Lambda            5.1    5.6&lt;br /&gt;bracketHint                    Hint.Bracket           4.1    6.4&lt;br /&gt;hasS                           Hint.Extensions        4.1    5.0&lt;br /&gt;monadHint                      Hint.Monad             4.1    2.9&lt;br /&gt;~=                             HSE.Match              4.1    2.5&lt;br /&gt;isParen                        HSE.Util               3.1    0.0&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Now the largest contributor to the HLint runtime is the parsing of Haskell files. There are no obvious candidates for easy optimisations, and the code runs sufficiently fast for my purposes.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Conclusions&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;There is still some scope for optimisation of HLint, but I leave that for future work. One possible avenue for exploration would be turning on selected packages of hints, to see which one takes the most time - profiling on a different measure.&lt;br /&gt;&lt;br /&gt;In optimising HLint I've found two issues in Uniplate, the first of which I was aware of, and the second of which came as a total surprise. These optimisations to Uniplate will benefit everyone who uses it. 
I have achieved the goal of optimising HLint, simply by following the profile reports, and as a result HLint is now substantially faster than I had ever expected.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Footnote:&lt;/i&gt; I didn't actually profile first, as I knew that a performance regression was caused by the upgrade to Uniplate 1.3, so knew where to start looking. Generally, I would start with a profile.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-5793946885372537076?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/5793946885372537076/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=5793946885372537076' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/5793946885372537076'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/5793946885372537076'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2010/01/optimising-hlint.html' title='Optimising HLint'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-2090717554551135089</id><published>2010-01-14T18:03:00.003Z</published><updated>2010-01-14T18:11:24.496Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='hlint'/><title type='text'>Better .ghci files</title><content type='html'>A few days ago I &lt;a 
href="http://neilmitchell.blogspot.com/2010/01/using-ghci-files-to-run-projects.html"&gt;posted&lt;/a&gt; about my plan to use .ghci files for all my projects. I am now doing so in at least five projects, and it's working great. There were two disadvantages: 1) every command had to be squeezed on to a single line; 2) some names were introduced into the global namespace. Thanks to a hint from &lt;a href="http://www.reddit.com/user/doliorules"&gt;doliorules&lt;/a&gt; about &lt;tt&gt;:{ :}&lt;/tt&gt;, I can eliminate these disadvantages.&lt;br /&gt;&lt;br /&gt;Let's take the previous example from HLint's .ghci file:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;let cmdHpc _ = return $ unlines [":!ghc --make -isrc -i. src/Main.hs -w -fhpc -odir .hpc -hidir .hpc -threaded -o .hpc/hlint-test",":!del hlint-test.tix",":!.hpc\\hlint-test --help",":!.hpc\\hlint-test --test",":!.hpc\\hlint-test src --report=.hpc\\hlint-test-report.html +RTS -N3",":!.hpc\\hlint-test data --report=.hpc\\hlint-test-report.html +RTS -N3",":!hpc.exe markup hlint-test.tix --destdir=.hpc",":!hpc.exe report hlint-test.tix",":!del hlint-test.tix",":!start .hpc\\hpc_index_fun.html"]&lt;br /&gt;:def hpc cmdHpc&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;It works, but it's ugly. However, it can be rewritten as:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;:{&lt;br /&gt;:def hpc const $ return $ unlines&lt;br /&gt;    [":!ghc --make -isrc -i. 
src/Main.hs -w -fhpc -odir .hpc -hidir .hpc -threaded -o .hpc/hlint-test"&lt;br /&gt;    ,":!del hlint-test.tix"&lt;br /&gt;    ,":!.hpc\\hlint-test --help"&lt;br /&gt;    ,":!.hpc\\hlint-test --test"&lt;br /&gt;    ,":!.hpc\\hlint-test src --report=.hpc\\hlint-test-report.html +RTS -N3"&lt;br /&gt;    ,":!.hpc\\hlint-test data --report=.hpc\\hlint-test-report.html +RTS -N3"&lt;br /&gt;    ,":!hpc.exe markup hlint-test.tix --destdir=.hpc"&lt;br /&gt;    ,":!hpc.exe report hlint-test.tix"&lt;br /&gt;    ,":!del hlint-test.tix"&lt;br /&gt;    ,":!start .hpc\\hpc_index_fun.html"]&lt;br /&gt;:}&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The &lt;tt&gt;:{ :}&lt;/tt&gt; notation allows multi-line input in GHCi. GHCi also allows full expressions after a &lt;tt&gt;:def&lt;/tt&gt;. Combined, we now have a much more readable .ghci file.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-2090717554551135089?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/2090717554551135089/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=2090717554551135089' title='7 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/2090717554551135089'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/2090717554551135089'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2010/01/better-ghci-files.html' title='Better .ghci files'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' 
src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>7</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-8859450121269781525</id><published>2010-01-09T16:50:00.004Z</published><updated>2010-01-14T18:12:08.186Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='hlint'/><category scheme='http://www.blogger.com/atom/ns#' term='cmdargs'/><title type='text'>Using .ghci files to run projects</title><content type='html'>I develop a reasonable number of different Haskell projects, and tend to switch between them regularly. I often come back to a project after a number of months and can't remember the basics - how to load it up, how to test it. Until yesterday, my technique was to add a &lt;tt&gt;ghci.bat&lt;/tt&gt; file to load the project, and invoke the tests with either &lt;tt&gt;:main test&lt;/tt&gt; or &lt;tt&gt;:main --test&lt;/tt&gt;. For some projects I also had commands to run hpc, or perform profiling. Using &lt;a href="http://www.haskell.org/ghc/docs/latest/html/users_guide/ghci-dot-files.html"&gt;.ghci files&lt;/a&gt;, I can do much better.&lt;br /&gt;&lt;br /&gt;All my projects are now gaining &lt;tt&gt;.ghci&lt;/tt&gt; files in the root directory. For example, the &lt;a href="http://community.haskell.org/~ndm/cmdargs/"&gt;CmdArgs&lt;/a&gt; project has:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;:set -w -fwarn-unused-binds -fwarn-unused-imports&lt;br /&gt;:load Main&lt;br /&gt;&lt;br /&gt;let cmdTest _ = return ":main test"&lt;br /&gt;:def test cmdTest&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The first line sets some additional warnings. I usually develop my projects in GHCi with lots of useful warnings turned on. 
I could include these warnings in the .cabal file, but I'd rather have them when developing, and not display them when other people are installing.&lt;br /&gt;&lt;br /&gt;The second line loads the Main.hs file, which for CmdArgs is where I've put the main tests.&lt;br /&gt;&lt;br /&gt;The last two lines define a command &lt;tt&gt;:test&lt;/tt&gt; which, when invoked, just runs &lt;tt&gt;:main test&lt;/tt&gt;, which is how I run the tests for CmdArgs.&lt;br /&gt;&lt;br /&gt;To load GHCi with this configuration file I simply change to the directory, and type &lt;tt&gt;ghci&lt;/tt&gt;. It automatically loads the right files, and provides a &lt;tt&gt;:test&lt;/tt&gt; command to run the test suite.&lt;br /&gt;&lt;br /&gt;I've also converted the &lt;a href="http://community.haskell.org/~ndm/hlint/"&gt;HLint&lt;/a&gt; project to use a .ghci file. This time the .ghci file is slightly different, but the way I load/test my project is identical:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;:set -fno-warn-overlapping-patterns -w -fwarn-unused-binds -fwarn-unused-imports&lt;br /&gt;:set -isrc;.&lt;br /&gt;:load Main&lt;br /&gt;&lt;br /&gt;let cmdTest _ = return ":main --test"&lt;br /&gt;:def test cmdTest&lt;br /&gt;&lt;br /&gt;let cmdHpc _ = return $ unlines [":!ghc --make -isrc -i. 
src/Main.hs -w -fhpc -odir .hpc -hidir .hpc -threaded -o .hpc/hlint-test",":!del hlint-test.tix",":!.hpc\\hlint-test --help",":!.hpc\\hlint-test --test",":!.hpc\\hlint-test src --report=.hpc\\hlint-test-report.html +RTS -N3",":!.hpc\\hlint-test data --report=.hpc\\hlint-test-report.html +RTS -N3",":!hpc.exe markup hlint-test.tix --destdir=.hpc",":!hpc.exe report hlint-test.tix",":!del hlint-test.tix",":!start .hpc\\hpc_index_fun.html"]&lt;br /&gt;:def hpc cmdHpc&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The first section turns on/off the appropriate warnings, then sets the include path and loads the main module.&lt;br /&gt;&lt;br /&gt;The second section defines a command named &lt;tt&gt;:test&lt;/tt&gt;, which runs the tests.&lt;br /&gt;&lt;br /&gt;The final section defines a command named &lt;tt&gt;:hpc&lt;/tt&gt; which runs hpc and pops up a web browser with the result. Unfortunately GHCi requires definitions entered in a .ghci file to be on one line, so the formatting isn't ideal, but it's just a list of commands to run at the shell.&lt;br /&gt;&lt;br /&gt;Using a .ghci file for all my projects has a number of advantages:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;I have a consistent interface for all my projects.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Typing &lt;tt&gt;:def&lt;/tt&gt; at the GHCi prompt says which definitions are in scope, and thus which commands exist for this project.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;I've eliminated the Windows-specific .bat files.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;The .ghci file mechanism is quite powerful - I've yet to explore it fully, but could imagine much more complex commands.&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Update:&lt;/b&gt; For a slight improvement on this technique see &lt;a href="http://neilmitchell.blogspot.com/2010/01/better-ghci-files.html"&gt;this post&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Thanks to Gwern Branwen for submitting a .ghci file for running HLint, and 
starting my investigation of .ghci files.&lt;/i&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-8859450121269781525?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/8859450121269781525/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=8859450121269781525' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/8859450121269781525'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/8859450121269781525'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2010/01/using-ghci-files-to-run-projects.html' title='Using .ghci files to run projects'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-4165368562608175375</id><published>2010-01-03T10:29:00.007Z</published><updated>2011-02-08T17:00:17.187Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='tutorial'/><title type='text'>Explaining Haskell IO without Monads</title><content type='html'>This tutorial explains how to perform IO in &lt;a href="http://haskell.org/"&gt;Haskell&lt;/a&gt;, without attempting to give any understanding of monads. We start with the simplest example of IO, then build up to more complex examples. You can either read the tutorial to the end, or stop at the end of any section - each additional section will let you tackle new problems. 
We assume basic familiarity with Haskell, such as the material covered in chapters 1 to 6 of &lt;a href="http://www.cs.nott.ac.uk/~gmh/book.html"&gt;Programming in Haskell&lt;/a&gt; by &lt;a href="http://www.cs.nott.ac.uk/~gmh/"&gt;Graham Hutton&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;IO Functions&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;In this tutorial I use four standard IO functions:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;&lt;tt&gt;&lt;a href="http://haskell.org/ghc/docs/latest/html/libraries/base/Prelude.html#v:readFile"&gt;readFile&lt;/a&gt; :: FilePath -&gt; IO String&lt;/tt&gt; -- read in a file&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;tt&gt;&lt;a href="http://haskell.org/ghc/docs/latest/html/libraries/base/Prelude.html#v:writeFile"&gt;writeFile&lt;/a&gt; :: FilePath -&gt; String -&gt; IO ()&lt;/tt&gt; -- write out a file&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;tt&gt;&lt;a href="http://haskell.org/ghc/docs/latest/html/libraries/base/System-Environment.html#v:getArgs"&gt;getArgs&lt;/a&gt; :: IO [String]&lt;/tt&gt; -- get the command line arguments, from the module System.Environment&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;tt&gt;&lt;a href="http://haskell.org/ghc/docs/latest/html/libraries/base/Prelude.html#v:putStrLn"&gt;putStrLn&lt;/a&gt; :: String -&gt; IO ()&lt;/tt&gt; -- write out a string, followed by a new line, to the console&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Simple IO&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The simplest useful form of IO is to read a file, do something, then write out a file.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;main :: IO ()&lt;br /&gt;main = do&lt;br /&gt;   src &amp;lt;- readFile "file.in"&lt;br /&gt;   writeFile "file.out" (operate src)&lt;br /&gt;&lt;br /&gt;operate :: String -&gt; String&lt;br /&gt;operate = ... 
-- your code here&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;This program gets the contents of &lt;tt&gt;file.in&lt;/tt&gt;, runs the operate function on it, then writes the result to &lt;tt&gt;file.out&lt;/tt&gt;. The main function contains all the IO operations, while operate is entirely pure. When writing operate you do not need to understand any details of IO. This pattern of IO was sufficient for my first two years of programming Haskell.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Action List&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;If the pattern described in Simple IO is insufficient, the next step is a list of actions. A main function can be written as:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;main :: IO ()&lt;br /&gt;main = do&lt;br /&gt;    x1 &amp;lt;- expr1&lt;br /&gt;    x2 &amp;lt;- expr2&lt;br /&gt;    ...&lt;br /&gt;    xN &amp;lt;- exprN&lt;br /&gt;    return ()&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The main function starts with &lt;tt&gt;do&lt;/tt&gt;, then has a sequence of &lt;tt&gt;xI &amp;lt;- exprI&lt;/tt&gt; statements, and ends with &lt;tt&gt;return ()&lt;/tt&gt;. Each statement has a pattern on the left of the arrow (often just a variable), and an expression on the right. If the expression is not of type IO, then you must write &lt;tt&gt;xI &amp;lt;- return (exprI)&lt;/tt&gt;. The return function takes a value, and wraps it in the IO type.&lt;br /&gt;&lt;br /&gt;As a simple example we can write a program that gets the command line arguments, reads the file given by the first argument, operates on it, then writes out to the file given by the second argument:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;main :: IO ()&lt;br /&gt;main = do&lt;br /&gt;    [arg1,arg2] &amp;lt;- getArgs&lt;br /&gt;    src &amp;lt;- readFile arg1&lt;br /&gt;    res &amp;lt;- return (operate src)&lt;br /&gt;    _ &amp;lt;- writeFile arg2 res&lt;br /&gt;    return ()&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;As before, operate is a pure function. 
The first line after the do uses a pattern match to extract the command line arguments. The second line reads the file specified by the first argument. The third line uses return to wrap a pure value. The fourth line provides no useful result, so we ignore it by writing &lt;tt&gt;_ &amp;lt;-&lt;/tt&gt;.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Simplifying IO&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The action list pattern is very rigid, and people usually simplify the code using the following three rules:&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;br /&gt;&lt;li&gt;&lt;tt&gt;_ &amp;lt;- x&lt;/tt&gt; can be rewritten as &lt;tt&gt;x&lt;/tt&gt;.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;If the penultimate line doesn't have a binding arrow (&lt;tt&gt;&amp;lt;-&lt;/tt&gt;) and is of type IO (), then the &lt;tt&gt;return ()&lt;/tt&gt; can be removed.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;tt&gt;x &amp;lt;- return y&lt;/tt&gt; can be rewritten as &lt;tt&gt;let x = y&lt;/tt&gt; (provided you don't reuse variable names).&lt;/li&gt;&lt;br /&gt;&lt;/ol&gt;&lt;br /&gt;&lt;br /&gt;With these rules we can rewrite our example as:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;main :: IO ()&lt;br /&gt;main = do&lt;br /&gt;    [arg1,arg2] &amp;lt;- getArgs&lt;br /&gt;    src &amp;lt;- readFile arg1&lt;br /&gt;    let res = operate src&lt;br /&gt;    writeFile arg2 res&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Nested IO&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;So far only the main function has been of type IO, but we can create other IO functions, to wrap up common patterns. 
For example, we can write a utility function to print nice looking titles:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;title :: String -&gt; IO ()&lt;br /&gt;title str = do&lt;br /&gt;    putStrLn str&lt;br /&gt;    putStrLn (replicate (length str) '-')&lt;br /&gt;    putStrLn ""&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;We can use this title function multiple times within main:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;main :: IO ()&lt;br /&gt;main = do&lt;br /&gt;    title "Hello"&lt;br /&gt;    title "Goodbye"&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Returning IO Values&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The functions we've written so far have all been of type IO (), which lets us perform IO actions, but not give back interesting results. To give back the value x, we write &lt;tt&gt;return x&lt;/tt&gt; as the final line of the do block. Unlike the imperative language return statement, this return must be on the final line.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;readArgs :: IO (String,String)&lt;br /&gt;readArgs = do&lt;br /&gt;    xs &amp;lt;- getArgs&lt;br /&gt;    let x1 = if length xs &gt; 0 then xs !! 0 else "file.in"&lt;br /&gt;    let x2 = if length xs &gt; 1 then xs !! 1 else "file.out"&lt;br /&gt;    return (x1,x2)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;This function returns the first two command line arguments, or supplies default values if fewer arguments are given. 
We can now use this in the main program from before:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;main :: IO ()&lt;br /&gt;main = do&lt;br /&gt;    (arg1,arg2) &amp;lt;- readArgs&lt;br /&gt;    src &amp;lt;- readFile arg1&lt;br /&gt;    let res = operate src&lt;br /&gt;    writeFile arg2 res&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Now, if fewer than two arguments are given, the program will use default file names instead of crashing.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Optional IO&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;So far we've only seen a static list of IO statements, executed in order. Using if, we can choose what IO to perform. For example, if the user enters no arguments we can tell them:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;main :: IO ()&lt;br /&gt;main = do&lt;br /&gt;    xs &amp;lt;- getArgs&lt;br /&gt;    if null xs then do&lt;br /&gt;        putStrLn "You entered no arguments"&lt;br /&gt;     else do&lt;br /&gt;        putStrLn ("You entered " ++ show xs)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;For optional IO you make the final statement of the do block an if, then under each branch continue the do. The only subtle point is that the else must be indented by one more space than the if. This caveat is widely considered to be a bug in the definition of Haskell, but for the moment, the extra space before the else is required.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Break Time&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;If you've gone from understanding no IO to this point in the tutorial, I suggest you take a break (a cookie is recommended). The IO presented above is all that imperative languages provide, and is a useful starting point. 
Just as functional programming provides much more powerful ways of working with functions by treating them as values, it also allows IO to be treated as values, which we explore in the rest of the tutorial.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Working with IO Values&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The next stage is to work with IO as values. Until now, all IO statements have been executed immediately, but we can also create variables of type IO. Using our title function from above we can write:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;main :: IO ()&lt;br /&gt;main = do&lt;br /&gt;    let x = title "Welcome"&lt;br /&gt;    x&lt;br /&gt;    x&lt;br /&gt;    x&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Instead of running the IO with &lt;tt&gt;x &amp;lt;-&lt;/tt&gt;, we have placed the IO value in the variable x, without running it. The type of x is IO (), so we can now write &lt;tt&gt;x&lt;/tt&gt; on a line to execute the action. By writing the x three times we perform the action three times.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Passing IO Arguments&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;We can also pass IO values as arguments to functions. In the previous example we ran the IO action three times, but how would we run it fifty times? We can write a function that takes an IO action, and a number, and runs the action that number of times:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;replicateM_ :: Int -&gt; IO () -&gt; IO ()&lt;br /&gt;replicateM_ n act = do&lt;br /&gt;    if n == 0 then do&lt;br /&gt;        return ()&lt;br /&gt;     else do&lt;br /&gt;        act&lt;br /&gt;        replicateM_ (n-1) act&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;This definition makes use of optional IO to decide when to stop, and recursion to continue performing the IO. 
We can now rewrite the previous example as:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;main :: IO ()&lt;br /&gt;main = do&lt;br /&gt;    let x = title "Welcome"&lt;br /&gt;    replicateM_ 3 x&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;In an imperative language the replicateM_ function is built in as a for statement, but the flexibility of Haskell allows us to define new control flow statements - a very powerful feature. The &lt;a href="http://haskell.org/ghc/docs/latest/html/libraries/base/Control-Monad.html#v:replicateM_"&gt;replicateM_&lt;/a&gt; function defined in &lt;a href="http://haskell.org/ghc/docs/latest/html/libraries/base/Control-Monad.html"&gt;Control.Monad&lt;/a&gt; is like ours, but more general, and can be used instead.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;IO in Structures&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;We've seen IO values being passed as arguments, so it's natural that we can also put IO in structures such as lists and tuples. The function sequence_ takes a list of IO actions, and executes each action in turn:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;sequence_ :: [IO ()] -&gt; IO ()&lt;br /&gt;sequence_ xs = do&lt;br /&gt;    if null xs then do&lt;br /&gt;        return ()&lt;br /&gt;     else do&lt;br /&gt;        head xs&lt;br /&gt;        sequence_ (tail xs)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;If there are no elements in the list then sequence_ stops, with &lt;tt&gt;return ()&lt;/tt&gt;. If there are elements in the list then sequence_ gets the first action (with &lt;tt&gt;head xs&lt;/tt&gt;) and executes it, then calls sequence_ on the remaining actions. As before, &lt;a href="http://haskell.org/ghc/docs/latest/html/libraries/base/Control-Monad.html#v:sequence_"&gt;sequence_&lt;/a&gt; is available in &lt;a href="http://haskell.org/ghc/docs/latest/html/libraries/base/Control-Monad.html"&gt;Control.Monad&lt;/a&gt;, but in a more general form. 
It is now simple to rewrite replicateM_ in terms of sequence_:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;replicateM_ :: Int -&gt; IO () -&gt; IO ()&lt;br /&gt;replicateM_ n act = sequence_ (replicate n act)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Pattern Matching&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;A much more natural definition of sequence_, rather than using null/head/tail, is to make use of Haskell's pattern matching. If there is exactly one statement in a do block, you can remove the do. Rewriting sequence_ we can eliminate the do after the equals sign, and the do after the then keyword.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;sequence_ :: [IO ()] -&gt; IO ()&lt;br /&gt;sequence_ xs =&lt;br /&gt;    if null xs then&lt;br /&gt;        return ()&lt;br /&gt;     else do&lt;br /&gt;        head xs&lt;br /&gt;        sequence_ (tail xs)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Now we can replace the if with pattern matching, without needing to consider the IO:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;sequence_ :: [IO ()] -&gt; IO ()&lt;br /&gt;sequence_ [] = return ()&lt;br /&gt;sequence_ (x:xs) = do&lt;br /&gt;    x&lt;br /&gt;    sequence_ xs&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Final Example&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;As a final example, imagine we wish to perform some operation on every file given at the command line. 
Using what we have already learnt, we can write:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;main :: IO ()&lt;br /&gt;main = do&lt;br /&gt;    xs &amp;lt;- getArgs&lt;br /&gt;    sequence_ (map operateFile xs)&lt;br /&gt;&lt;br /&gt;operateFile :: FilePath -&gt; IO ()&lt;br /&gt;operateFile x = do&lt;br /&gt;    src &amp;lt;- readFile x&lt;br /&gt;    writeFile (x ++ ".out") (operate src)&lt;br /&gt;&lt;br /&gt;operate :: String -&gt; String&lt;br /&gt;operate = ...&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;IO Design&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;A Haskell program usually consists of an outer IO shell calling pure functions. In the previous example main and operateFile are part of the IO shell, while operate and everything it uses are pure. As a general design principle, keep the IO layer small. The IO layer should concisely perform the necessary IO, then delegate to the pure part. Use of explicit IO in Haskell is necessary, but should be kept to a minimum - pure Haskell is where the beauty lies.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Where to go now&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;You should now be equipped to do all the IO you need. To become more proficient I recommend any of the following:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;Write lots of Haskell code.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Read chapters 8 and 9 of &lt;a href="http://www.cs.nott.ac.uk/~gmh/book.html"&gt;Programming in Haskell&lt;/a&gt; by &lt;a href="http://www.cs.nott.ac.uk/~gmh/"&gt;Graham Hutton&lt;/a&gt;. 
You should expect to spend about 6 hours thinking and contemplating on sections 8.1 to 8.4 (I recommend going to a hospital A&amp;E department with a minor injury).&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Read &lt;a href="http://www.haskell.org/haskellwiki/Monads_as_Containers"&gt;Monads as Containers&lt;/a&gt;, an excellent introduction to monads.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Look at the documentation on &lt;a href="http://www.haskell.org/haskellwiki/Monad_Laws"&gt;the monad laws&lt;/a&gt;, and find where I've used them in this tutorial.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Read through all the functions in &lt;a href="http://haskell.org/ghc/docs/latest/html/libraries/base/Control-Monad.html"&gt;Control.Monad&lt;/a&gt;, try to define them, and then use them when writing programs.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Implement and use a &lt;a href="http://hackage.haskell.org/packages/archive/mtl/1.1.0.2/doc/html/Control-Monad-State.html"&gt;state monad&lt;/a&gt;.&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-4165368562608175375?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/4165368562608175375/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=4165368562608175375' title='12 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/4165368562608175375'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/4165368562608175375'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2010/01/haskell-io-without-monads.html' title='Explaining Haskell IO without Monads'/><author><name>Neil 
Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>12</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-5305140815234759316</id><published>2009-11-23T11:58:00.004Z</published><updated>2009-11-23T12:24:30.815Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='ghc'/><title type='text'>Haskell DLL's on Windows</title><content type='html'>The current section of the GHC manual on &lt;a href="http://www.haskell.org/ghc/docs/latest/html/users_guide/win32-dlls.html"&gt;creating DLLs on Windows&lt;/a&gt; is fairly confusing to read, and has some bugs (e.g. &lt;a href="http://hackage.haskell.org/trac/ghc/ticket/3605"&gt;3605&lt;/a&gt;). Since I got tripped up by the current documentation, I offered to rewrite sections 11.6.2 and 11.6.3 (merging them in the process). Creating Windows DLLs with GHC is surprisingly easy, and my revised manual section includes an example which can be called from both Microsoft Word (using VBA) and C++. I've pasted the revised manual section as the rest of this blog post. I'll shortly be submitting it to the GHC team, so any feedback is welcome.&lt;br /&gt;&lt;br /&gt;&lt;hr/&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;11.6.2. Making DLLs to be called from other languages&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;This section describes how to create DLLs to be called from other languages, such as Visual Basic or C++. This is a special case of &lt;a href="http://www.haskell.org/ghc/docs/latest/html/users_guide/ffi-ghc.html#ffi-library"&gt;Section 8.2.1.2, "Making a Haskell library that can be called from foreign code"&lt;/a&gt;; we'll deal with the DLL-specific issues that arise below. 
Here's an example:&lt;br /&gt;&lt;br /&gt;Use foreign export declarations to export the Haskell functions you want to call from the outside. For example:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;-- Adder.hs&lt;br /&gt;{-# LANGUAGE ForeignFunctionInterface #-}&lt;br /&gt;module Adder where&lt;br /&gt;&lt;br /&gt;adder :: Int -&gt; Int -&gt; IO Int  -- gratuitous use of IO&lt;br /&gt;adder x y = return (x+y)&lt;br /&gt;&lt;br /&gt;foreign export stdcall adder :: Int -&gt; Int -&gt; IO Int&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Add some helper code that starts up and shuts down the Haskell RTS:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;// StartEnd.c&lt;br /&gt;#include &amp;lt;Rts.h&amp;gt;&lt;br /&gt;&lt;br /&gt;extern void __stginit_Adder(void);&lt;br /&gt;&lt;br /&gt;void HsStart()&lt;br /&gt;{&lt;br /&gt;   int argc = 1;&lt;br /&gt;   char* argv[] = {"ghcDll", NULL}; // argv must end with NULL&lt;br /&gt;&lt;br /&gt;   // Initialize Haskell runtime&lt;br /&gt;   char** args = argv;&lt;br /&gt;   hs_init(&amp;argc, &amp;args);&lt;br /&gt;&lt;br /&gt;   // Tell Haskell about all root modules&lt;br /&gt;   hs_add_root(__stginit_Adder);&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;void HsEnd()&lt;br /&gt;{&lt;br /&gt;   hs_exit();&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Here, &lt;tt&gt;Adder&lt;/tt&gt; is the name of the root module in the module tree (as mentioned above, there must be a single root module, and hence a single module tree in the DLL). Compile everything up:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;$ ghc -c Adder.hs&lt;br /&gt;$ ghc -c StartEnd.c&lt;br /&gt;$ ghc -shared -o Adder.dll Adder.o Adder_stub.o StartEnd.o&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Now the file Adder.dll can be used from other programming languages. 
Before calling any functions in Adder it is necessary to call &lt;tt&gt;HsStart&lt;/tt&gt;, and at the very end call &lt;tt&gt;HsEnd&lt;/tt&gt;.&lt;br /&gt;&lt;br /&gt;NOTE: It may appear tempting to use &lt;tt&gt;DllMain&lt;/tt&gt; to call &lt;tt&gt;hs_init&lt;/tt&gt;/&lt;tt&gt;hs_exit&lt;/tt&gt;, but this won’t work (particularly if you compile with &lt;tt&gt;-threaded&lt;/tt&gt;).&lt;br /&gt;&lt;br /&gt;&lt;b&gt;11.6.2.1. Using from VBA&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;An example of using Adder.dll from VBA is:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;Private Declare Function Adder Lib "Adder.dll" Alias "adder@8" _&lt;br /&gt;      (ByVal x As Long, ByVal y As Long) As Long&lt;br /&gt;&lt;br /&gt;Private Declare Sub HsStart Lib "Adder.dll" ()&lt;br /&gt;Private Declare Sub HsEnd Lib "Adder.dll" ()&lt;br /&gt;&lt;br /&gt;Private Sub Document_Close()&lt;br /&gt;HsEnd&lt;br /&gt;End Sub&lt;br /&gt;&lt;br /&gt;Private Sub Document_Open()&lt;br /&gt;HsStart&lt;br /&gt;End Sub&lt;br /&gt;&lt;br /&gt;Public Sub Test()&lt;br /&gt;MsgBox "12 + 5 = " &amp; Adder(12, 5)&lt;br /&gt;End Sub&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;This example uses the &lt;tt&gt;Document_Open&lt;/tt&gt;/&lt;tt&gt;Close&lt;/tt&gt; functions of Microsoft Word, but provided &lt;tt&gt;HsStart&lt;/tt&gt; is called before the first function, and &lt;tt&gt;HsEnd&lt;/tt&gt; after the last, then it will work fine.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;11.6.2.2. 
Using from C++&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;An example of using Adder.dll from C++ is:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;// Tester.cpp&lt;br /&gt;#include "HsFFI.h"&lt;br /&gt;#include "Adder_stub.h"&lt;br /&gt;#include &amp;lt;stdio.h&amp;gt;&lt;br /&gt;&lt;br /&gt;extern "C" {&lt;br /&gt;    void HsStart();&lt;br /&gt;    void HsEnd();&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;int main()&lt;br /&gt;{&lt;br /&gt;    HsStart();&lt;br /&gt;    // can now safely call functions from the DLL&lt;br /&gt;    printf("12 + 5 = %i\n", adder(12,5));&lt;br /&gt;    HsEnd();&lt;br /&gt;    return 0;&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;This can be compiled and run with:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;$ ghc -o tester Tester.cpp Adder.dll.a&lt;br /&gt;$ tester&lt;br /&gt;12 + 5 = 17&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Please give feedback in the comments.&lt;/i&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-5305140815234759316?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/5305140815234759316/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=5305140815234759316' title='12 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/5305140815234759316'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/5305140815234759316'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2009/11/haskell-dlls-on-windows.html' title='Haskell DLL&apos;s on Windows'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>12</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-471843912243762118</id><published>2009-11-16T20:55:00.004Z</published><updated>2009-11-23T21:18:59.713Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='hlint'/><title type='text'>Reviewing View Patterns</title><content type='html'>&lt;a href="http://www.haskell.org/ghc/docs/latest/html/users_guide/syntax-extns.html#view-patterns"&gt;View Patterns&lt;/a&gt; are an interesting extension to the pattern matching capabilities of Haskell, implemented in GHC 6.10 and above. After using view patterns in real world programs, including &lt;a href="http://community.haskell.org/~ndm/hlint/"&gt;HLint&lt;/a&gt;, I've come to like them. I use view patterns in 10 of the 27 modules in HLint.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;View Pattern Overview&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;My intuitive understanding of view patterns is given in my &lt;a href="http://community.haskell.org/~ndm/downloads/paper-deriving_a_relationship_from_a_single_example-04_sep_2009.pdf"&gt;Approaches and Applications of Inductive Programming 2009 paper&lt;/a&gt;, which describes the view pattern translation as:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;f (sort -&gt; min:ascending) = ...&lt;br /&gt;    ==&lt;br /&gt;f v_1 | min:ascending &amp;lt;- sort v_1 = ...&lt;br /&gt;    ==&lt;br /&gt;f v_1 | case v_2 of _:_ -&gt; True ; _ -&gt; False = ...&lt;br /&gt;    where  v_2 = sort v_1 ; min:ascending = v_2&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The view pattern on the first line sorts the list elements, then binds the lowest element to &lt;tt&gt;min&lt;/tt&gt; and the remaining elements to &lt;tt&gt;ascending&lt;/tt&gt;. If there are no elements in the list then the pattern will not match. 
This can be translated to a pattern guard, which can then be translated to a case expression. This translation does not preserve the scoping behaviour of the variables, but is sufficient for all my uses of view patterns. It is important to note that the translation from view patterns to pattern guards is fairly simple, and mainly eliminates one redundant intermediate variable. However, the translation from pattern guards to case expressions and guards is substantially harder.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;How I Use View Patterns&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;My uses of view patterns seem to fall into a few distinct categories. Here are some example code snippets (mainly from HLint), along with explanation.&lt;br /&gt;&lt;br /&gt;1) Complex/Nested Matching&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;uglyEta (fromParen -&gt; App f (fromParen -&gt; App g x)) (fromParen -&gt; App h y) = g == h&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Every operation/match pair in a pattern guard requires a separate pattern guard, whereas view patterns can be nested very naturally. Here the abstract syntax tree for expressions has brackets, and the &lt;tt&gt;fromParen&lt;/tt&gt; function unwraps any brackets to find the interesting term inside. View patterns allow us to perform nested matches, which would have required three separate pattern guards.&lt;br /&gt;&lt;br /&gt;2) Matching on a Different Structure&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;isAppend (view -&gt; App2 op _ _) = op ~= "++"&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The expression tree I use in HLint has lots of expressions which apply a function to two arguments - for example &lt;tt&gt;App (App (Var f) x) y&lt;/tt&gt; and &lt;tt&gt;InfixOp x f y&lt;/tt&gt;. 
I have a type class &lt;tt&gt;View&lt;/tt&gt; that maps expressions into the data type &lt;tt&gt;data App2 = NoApp2 | App2 String Exp Exp&lt;/tt&gt;, allowing easy matching on a range of expressions.&lt;br /&gt;&lt;br /&gt;3) Safe Normalisation&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;dismogrify (simplify -&gt; x) = .... x ....&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;While working with &lt;a href="http://community.haskell.org/~ndm/yhc/"&gt;Yhc Core&lt;/a&gt; for the &lt;a href="http://community.haskell.org/~ndm/catch/"&gt;Catch&lt;/a&gt; and &lt;a href="http://community.haskell.org/~ndm/supero/"&gt;Supero&lt;/a&gt; tools I often wanted to process a syntax tree after simplifying it. If you name the original tree &lt;tt&gt;x&lt;/tt&gt;, and the simplified tree &lt;tt&gt;y&lt;/tt&gt;, then it's an easy (and type-safe) mistake to use &lt;tt&gt;x&lt;/tt&gt; instead of &lt;tt&gt;y&lt;/tt&gt;. To avoid this I wrote:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;dismogrify bad_x = .... x ....&lt;br /&gt;   where x = simplify bad_x&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Using &lt;tt&gt;bad_x&lt;/tt&gt; in the expression makes the mistake easy for a human to spot. Using a view pattern makes the mistake impossible.&lt;br /&gt;&lt;br /&gt;4) Mapping&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;classify (Ident (getRank -&gt; x)) = ...&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Sometimes I want to take a variable in one domain, and work with it in another. In the above example &lt;tt&gt;getRank&lt;/tt&gt; converts a &lt;tt&gt;String&lt;/tt&gt; to a &lt;tt&gt;Rank&lt;/tt&gt; enumeration. Within the &lt;tt&gt;classify&lt;/tt&gt; function I only wish to work with the rank as an enumeration, so it's convenient to never bind the string. 
This pattern is similar to safe normalisation, but its purpose isn't safety - just making things a little neater.&lt;br /&gt;&lt;br /&gt;5) Abstraction&lt;br /&gt;&lt;br /&gt;The view pattern example in the GHC manual is all about abstraction. I have mainly used HLint in programs which don't use abstract data types, just algebraic data types which are intended to be manipulated directly. I don't think there are many data types which are both abstract and have a structural view, so I suspect this use will be less common (&lt;a href="http://hackage.haskell.org/packages/archive/containers/0.2.0.1/doc/html/Data-Sequence.html"&gt;Data.Sequence&lt;/a&gt; is the only type that comes to mind).&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Improvements I Suggest&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;I think there are three improvements that should be made to the view patterns in GHC 6.10.4:&lt;br /&gt;&lt;br /&gt;1) Warnings&lt;br /&gt;&lt;br /&gt;In GHC 6.10 all view patterns are incorrectly considered overlapping (see &lt;a href="http://hackage.haskell.org/trac/ghc/ticket/2395"&gt;bug #2395&lt;/a&gt;), so all users of view patterns need to supply &lt;tt&gt;-fno-warn-overlapping-patterns&lt;/tt&gt;. This problem has been fixed in GHC 6.12, which is great news.&lt;br /&gt;&lt;br /&gt;2) Scoping&lt;br /&gt;&lt;br /&gt;The current scoping behaviour seems undesirable:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;apply (f -&gt; y) = ...&lt;br /&gt;    where f = ...&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Here the &lt;tt&gt;f&lt;/tt&gt; in the view pattern isn't the &lt;tt&gt;f&lt;/tt&gt; bound at the &lt;tt&gt;where&lt;/tt&gt;. I suggest that the lhs of the &lt;tt&gt;-&gt;&lt;/tt&gt; can use variables from the &lt;tt&gt;where&lt;/tt&gt;, in a similar manner to pattern guards. 
(It's possible this suggestion is misguided, as the scoping rules can be quite subtle.)&lt;br /&gt;&lt;br /&gt;3) Implicit Patterns&lt;br /&gt;&lt;br /&gt;The original &lt;a href="http://hackage.haskell.org/trac/ghc/wiki/ViewPatterns"&gt;view patterns wiki document&lt;/a&gt; asks what should become of &lt;tt&gt;(-&gt; ...)&lt;/tt&gt;, and proposes it become &lt;tt&gt;(view -&gt; ...)&lt;/tt&gt;. I like this idea as HLint already contains 12 instances of &lt;tt&gt;(view -&gt; ...)&lt;/tt&gt;. The only question is which &lt;tt&gt;view&lt;/tt&gt; should be used? I think there are two possible answers:&lt;br /&gt;&lt;br /&gt;a) The &lt;tt&gt;view&lt;/tt&gt; currently in scope&lt;br /&gt;&lt;br /&gt;If the desugaring is simply to &lt;tt&gt;view&lt;/tt&gt;, then people can select their imports appropriately to choose their &lt;tt&gt;view&lt;/tt&gt; function. This proposal is similar to the rebindable syntax already supported, but in this case may be a legitimate default, due to several possible &lt;tt&gt;view&lt;/tt&gt; interpretations. If one day everyone starts using &lt;tt&gt;Data.View.view&lt;/tt&gt;, then the default could be switched. As an example (in combination with proposal 2) we could have:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;uglyEta (-&gt; App f (-&gt; App g x)) (-&gt; App h y) = g == h&lt;br /&gt;    where view = fromParen&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;b) &lt;tt&gt;Data.View.view&lt;/tt&gt;&lt;br /&gt;&lt;br /&gt;In HLint I have used:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;class View a b where&lt;br /&gt;    view :: a -&gt; b&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;I haven't needed any functional dependencies, as the matching always constrains the types sufficiently. I have mapped one source type (i.e. &lt;tt&gt;Exp&lt;/tt&gt;) to several matching types (&lt;tt&gt;App2&lt;/tt&gt; and &lt;tt&gt;App1&lt;/tt&gt;), but never mapped multiple source types onto one matching type. 
If I were to add a dependency it should be that &lt;tt&gt;b&lt;/tt&gt; uniquely determines &lt;tt&gt;a&lt;/tt&gt;, as usually &lt;tt&gt;b&lt;/tt&gt; will have a pattern on the RHS which will constrain &lt;tt&gt;b&lt;/tt&gt; already.&lt;br /&gt;&lt;br /&gt;I think my preference is for using &lt;tt&gt;Data.View.view&lt;/tt&gt;, primarily because all other Haskell syntax is bound to a fixed name, rather than using the name currently in scope. However, my opinions on functional dependencies should be taken with skepticism - I'm not normally a user of functional dependencies.&lt;br /&gt;&lt;br /&gt;4) Rejected Suggestions&lt;br /&gt;&lt;br /&gt;I do not support the idea of implicit view patterns without some leading syntax (see &lt;a href="http://hackage.haskell.org/trac/ghc/ticket/3583"&gt;bug 3583&lt;/a&gt;) - view patterns are nice, but I don't think they are important enough to be first-class, like they are in F# (note that F# interoperates with OO languages, so first-class view patterns are much more reasonable there).&lt;br /&gt;&lt;br /&gt;I also do not support the idea of &lt;a href="http://hackage.haskell.org/trac/ghc/wiki/ViewPatterns#ImplicitMaybe"&gt;implicit &lt;tt&gt;Maybe&lt;/tt&gt;&lt;/a&gt; in view patterns - &lt;tt&gt;Maybe&lt;/tt&gt; should not be special, and this suggestion doesn't seem to fit with the rest of Haskell.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Conclusion&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;View patterns are a nice enhancement to pattern guards, increasing their compositionality and reducing the need for redundant intermediate variables. I could live without view patterns, but I don't think I should have to - the design is good, and they fit in nicely with the language. 
As for pattern guards, I consider them an essential part of the Haskell language that really makes a substantial difference to some pieces of code that would otherwise be rather ugly.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Edit:&lt;/b&gt; Fix as per Christophe's comment.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-471843912243762118?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/471843912243762118/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=471843912243762118' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/471843912243762118'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/471843912243762118'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2009/11/reviewing-view-patterns.html' title='Reviewing View Patterns'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-3690295918968382169</id><published>2009-09-12T15:14:00.004+01:00</published><updated>2009-09-12T18:18:51.281+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='hlint'/><title type='text'>How I Use HLint</title><content type='html'>&lt;a href="http://community.haskell.org/~ndm/hlint/"&gt;HLint&lt;/a&gt; is a tool for automatically suggesting improvements to your Haskell code. 
This post describes how I use HLint, and provides some background on its development. Before reading this article, if you are an active Haskell programmer who has not yet tried out HLint, I suggest you perform the following steps:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;cabal update &amp;&amp; cabal install hlint&lt;br /&gt;cd &lt;i&gt;your-current-project&lt;/i&gt;&lt;br /&gt;hlint . --report&lt;br /&gt;# open report.html in your web browser&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The original purpose of HLint was to help teach beginners. When helping with the functional programming course at &lt;a href="http://www.cs.york.ac.uk/"&gt;York&lt;/a&gt;, I used to wander round the students, looking at their code, and suggesting improvements. After three years helping with the same course, I found myself regularly suggesting the same improvements. For example, the pattern &lt;tt&gt;if a then True else b&lt;/tt&gt; came up a lot, which can be written more succinctly as &lt;tt&gt;a || b&lt;/tt&gt;. Of course, having turned myself into a pattern recognition tool, the obvious step was to automate myself - and HLint is the result.&lt;br /&gt;&lt;br /&gt;I am no longer at a University, and so the way I use HLint has changed. Often on the Haskell Cafe mailing list people ask for code reviews - intermediate level Haskellers trying to gain knowledge from those around them. The suggestions resulting from a code review are often split into two categories. There are small-scale suggestions about things such as using a better library function, and large-scale suggestions about what the structure of the program should be. Often it is useful to tackle the small-scale issues, tidying and polishing what is already there, before investigating any large-scale issues. Unfortunately reviewers are often short of time, so they may not get round to making large-scale suggestions. 
The hope is that HLint can automate many of the small-scale suggestions, allowing clever people to use their time more effectively on the more complex problems.&lt;br /&gt;&lt;br /&gt;Another reason to use HLint is one of developer pride. Some developers do not react well to criticism, and take comments about their code in a very personal way. Worse still, if you declare that some small syntactic pattern is the "wrong way to do it", then you can inadvertently end up just pointing out failings. In contrast, if HLint is run first, then the human suggestions are typically deeper, and are design trade-offs that can be debated.&lt;br /&gt;&lt;br /&gt;HLint is not designed as a tool to fix existing code, but more as a tool to promote learning, thus pre-emptively fixing future code. I do not intend people to slavishly apply the hints given by HLint - each hint should be carefully considered. For example, the &lt;a href="http://darcs.net"&gt;darcs&lt;/a&gt; project uses HLint, but has decided that they are not interested in eta reduction hints, so have used HLint's ignoring facility. &lt;br /&gt;&lt;br /&gt;One use of HLint is to provide an easy mechanism to start participating in an open source project. One of the largest hurdles in project participation is writing your first patch. Many projects have different conventions and requirements, plus there is usually a large code base that needs to be learnt. A good first step might be to run HLint over the code. While many of the hints suggested by HLint might be design decisions, or minor issues, there are likely to be a few more unambiguous improvements. As a simple example, taking the &lt;a href="http://xmonad.org/"&gt;xmonad&lt;/a&gt; code base and applying HLint shows that the &lt;tt&gt;import Data.Maybe&lt;/tt&gt; statements in &lt;a href="http://code.haskell.org/xmonad/XMonad/Core.hs"&gt;XMonad\Core.hs&lt;/a&gt; could be combined. 
This would be a perfect first patch for a budding xmonad developer.&lt;br /&gt;&lt;br /&gt;HLint can be used in many ways, but my two golden rules for HLint usage are:&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;br /&gt;&lt;li&gt;Do not blindly apply the output of HLint&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Never review code that hasn't had HLint applied&lt;/li&gt;&lt;br /&gt;&lt;/ol&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-3690295918968382169?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/3690295918968382169/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=3690295918968382169' title='8 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/3690295918968382169'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/3690295918968382169'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2009/09/how-i-use-hlint.html' title='How I Use HLint'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>8</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-1826575836216565046</id><published>2009-06-16T00:14:00.005+01:00</published><updated>2009-06-16T00:29:09.869+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='firstify'/><category scheme='http://www.blogger.com/atom/ns#' term='derive'/><title type='text'>Draft paper on Derive, comments wanted</title><content type='html'>It's been a long time 
since I last blogged (about 3 months). Since then I've had a paper on &lt;a href="http://community.haskell.org/~ndm/firstify/"&gt;Firstify&lt;/a&gt; accepted into the &lt;a href="http://haskell.org/haskell-symposium/2009/"&gt;Haskell Symposium&lt;/a&gt; (I'll post the final version to my website shortly). I've also been writing a paper on &lt;a href="http://community.haskell.org/~ndm/derive/"&gt;Derive&lt;/a&gt; to go with my invited talk at &lt;a href="http://www.cogsys.wiai.uni-bamberg.de/aaip09/"&gt;Approaches and Applications of Inductive Programming&lt;/a&gt; (co-located with &lt;a href="http://www.cs.nott.ac.uk/~gmh/icfp09.html"&gt;ICFP&lt;/a&gt; this year). I have to submit a final version by the 22nd of June (6 days' time), but any comments on this draft would be gratefully received - either add them as comments to this post or send an email to &lt;tt&gt;ndmitchell AT gmail DOT com&lt;/tt&gt;.&lt;br /&gt;&lt;br /&gt;Download link: &lt;a href="http://community.haskell.org/~ndm/temp/derive_draft.pdf"&gt;http://community.haskell.org/~ndm/temp/derive_draft.pdf&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Title: Deriving a DSL from One Example&lt;br /&gt;&lt;br /&gt;Abstract: Given an appropriate domain specific language (DSL), it is possible to describe the relationship between Haskell data types and many generic functions, typically type class instances. While describing the relationship is possible, it is not always an easy task. There is an alternative -- simply give one example output for a carefully chosen input, and have the relationship derived.&lt;br /&gt;&lt;br /&gt;When deriving a relationship from only one example, it is important that the derived relationship is the intended one. We identify general restrictions on the DSL, and on the provided example, to ensure a level of predictability. We then apply these restrictions in practice, to derive the relationship between Haskell data types and generic functions. 
We have used our scheme in the Derive tool, where over 60% of type classes are derived from a single example.&lt;br /&gt;&lt;br /&gt;Home page: &lt;a href="http://community.haskell.org/~ndm/derive/"&gt;http://community.haskell.org/~ndm/derive/&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Darcs repo: &lt;a href="http://community.haskell.org/~ndm/darcs/derive"&gt;http://community.haskell.org/~ndm/darcs/derive&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The work presented in this paper will become the basis of Derive 2.0. Many thanks for any comments!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-1826575836216565046?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/1826575836216565046/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=1826575836216565046' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/1826575836216565046'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/1826575836216565046'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2009/06/draft-paper-on-derive-comments-wanted.html' title='Draft paper on Derive, comments wanted'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-1790264824398560607</id><published>2009-03-21T08:11:00.002Z</published><updated>2009-03-21T15:17:02.484Z</updated><category scheme='http://www.blogger.com/atom/ns#' 
term='uniplate'/><title type='text'>Concise Generic Queries</title><content type='html'>A few weeks ago David Miani asked &lt;a href="http://thread.gmane.org/gmane.comp.lang.haskell.cafe/54031"&gt;how to write concise queries over a data type&lt;/a&gt;. The answer is certainly generic programming, a technique that I feel is underused in the Haskell community. I suggested David look at &lt;a href="http://community.haskell.org/~ndm/uniplate/"&gt;Uniplate&lt;/a&gt;, but he found greater success with SYB. Sean Leather gave a solution using &lt;a href="http://splonderzoek.blogspot.com/2009/03/experiments-with-emgm-emacs-org-files.html"&gt;EMGM&lt;/a&gt;. One of the advantages of Uniplate is conciseness, so I decided to tackle the same problem and compare.&lt;br /&gt;&lt;br /&gt;A full description of the task, including data type definitions, is at &lt;a href="http://splonderzoek.blogspot.com/2009/03/experiments-with-emgm-emacs-org-files.html"&gt;Sean's blog&lt;/a&gt;. From a data type representing structured files (tables, headings, paragraphs) find a heading with a particular name then within that heading find a paragraph starting with "Description". The rest of this post contains solutions using Uniplate, EMGM (taken from Sean) and SYB (from David). The SYB solution is slightly different from the EMGM or Uniplate solutions, but they all do roughly the same generic operations. 
It is entirely possible that the EMGM/SYB solutions could be improved, but that is a job for other people.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Uniplate Solution&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The &lt;a href="http://community.haskell.org/~ndm/uniplate/"&gt;Uniplate&lt;/a&gt; solution is:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;projDesc :: String -&amp;gt; OrgFileP -&amp;gt; Maybe String&lt;br /&gt;projDesc name p = listToMaybe [y |&lt;br /&gt; OrgHeadingP _ x ys &amp;lt;- universeBi p, name == x,&lt;br /&gt; ParagraphP y &amp;lt;- universeBi ys, "Description" `isPrefixOf` y]&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The code can be read as:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;Line 1: Type signature, given a name and a file, return the paragraph if you find one&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Line 3: Find a heading with the right name&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Line 4: Find a paragraph below that heading, whose name starts with "Description"&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Line 2: Pick the paragraph&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;I find this code to be a clear, concise and simple description of the problem. The thought process to come up with the solution was as follows: You want to search, or perform a query. The first question is whether this is a deep (all nodes) or shallow (just the children) query - David doesn't say but the example seems to imply deep. If it's deep use &lt;tt&gt;universeBi&lt;/tt&gt;. Operations are combined with a list comprehension that finds an element, checks it has the necessary properties (the name), then performs more operations. 
The result is the code you see above.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;EMGM Solution&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Sean's solution can be found at &lt;a href="http://splonderzoek.blogspot.com/2009/03/experiments-with-emgm-emacs-org-files.html"&gt;his blog&lt;/a&gt;:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;projDesc :: String -&amp;gt; OrgFileP -&amp;gt; Maybe String&lt;br /&gt;projDesc name file = do&lt;br /&gt;  hdg &amp;lt;- G.firstr (headings name file)&lt;br /&gt;  para &amp;lt;- firstPara hdg&lt;br /&gt;  if para =~ "Description" then return para else Nothing&lt;br /&gt;&lt;br /&gt;headings :: String -&amp;gt; OrgFileP -&amp;gt; [OrgHeadingP]&lt;br /&gt;headings name = filter check . G.collect&lt;br /&gt;  where&lt;br /&gt;    check (OrgHeadingP _ possible _) = name == possible&lt;br /&gt;&lt;br /&gt;firstPara :: OrgHeadingP -&amp;gt; Maybe String&lt;br /&gt;firstPara hdg = paraStr =&amp;lt;&amp;lt; G.firstr (G.collect hdg)&lt;br /&gt;  where&lt;br /&gt;    paraStr (ParagraphP str) = Just str&lt;br /&gt;    paraStr _                = Nothing&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;This solution isn't bad, but is more verbose than the Uniplate solution. Perhaps it could be rewritten with list comprehensions? 
It seems that &lt;tt&gt;G.collect&lt;/tt&gt; is similar to &lt;tt&gt;universeBi&lt;/tt&gt; - although I am not sure.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;SYB Solution&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;David's SYB solution can be found &lt;a href="http://moonpatio.com/fastcgi/hpaste.fcgi/view?id=1778#a1778"&gt;here&lt;/a&gt; along with another solution using different combinators.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;eitherOr :: Either a b -&amp;gt; Either a b -&amp;gt; Either a b&lt;br /&gt;eitherOr x@(Right _) _ = x&lt;br /&gt;eitherOr  _ y  = y&lt;br /&gt;&lt;br /&gt;getP14Desc :: OrgElement -&amp;gt; Either ErrString String&lt;br /&gt;getP14Desc org = everything eitherOr (Left descError `mkQ` findDesc) =&amp;lt;&amp;lt;&lt;br /&gt;                 everything eitherOr (Left findError `mkQ` findP14) org&lt;br /&gt;    where&lt;br /&gt;      findP14 h@(Heading {headingName=name})&lt;br /&gt;          | name == "Project14" = Right h&lt;br /&gt;      findP14 _ = Left findError&lt;br /&gt;&lt;br /&gt;      findDesc (Paragraph {paragraphText=text})&lt;br /&gt;          | text =~ "Description" = Right text&lt;br /&gt;      findDesc _ = Left findError&lt;br /&gt;&lt;br /&gt;      descError = "Couldn't find description for project"&lt;br /&gt;      findError = "Couldn't find project."&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Summary&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The relative merits of each solution are highly subjective, but I believe the Uniplate solution is concise. The Uniplate solution is a simple translation of the problem, without any clever steps, so hopefully other users (who didn't write the library!) will be able to achieve similar results. The Uniplate solution required only one function from the Uniplate library, so has a small learning curve. 
Even if you don't choose Uniplate, generic programming techniques are very useful, and can make your code concise and robust.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-1790264824398560607?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/1790264824398560607/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=1790264824398560607' title='8 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/1790264824398560607'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/1790264824398560607'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2009/03/concise-generic-queries.html' title='Concise Generic Queries'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>8</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-17247452422248196</id><published>2009-03-09T21:40:00.005Z</published><updated>2009-03-09T21:52:44.131Z</updated><title type='text'>Website move</title><content type='html'>Today I spotted that I could no longer push to my darcs repos hosted at York University. A little more checking showed that my home page had also been removed - I guess that's what happens when you are no longer a student there (although a warning email before would have been nice...). 
So I am pleased to announce my new website address:&lt;br /&gt;&lt;br /&gt;&lt;big&gt;&lt;a href="http://community.haskell.org/~ndm/"&gt;http://community.haskell.org/~ndm/&lt;/a&gt;&lt;/big&gt;&lt;br /&gt;&lt;br /&gt;Thanks to the wondrous Haskell community for providing all the resources I needed to move my website with no human intervention at haste. Expect my darcs repos to move somewhere shortly too.&lt;br /&gt;&lt;br /&gt;I have now submitted the final bound copies of my thesis, and have &lt;a href="http://community.haskell.org/~ndm/thesis/"&gt;uploaded a copy&lt;/a&gt; to my website (I had uploaded it to York, but didn't get chance to announce it!). I should say a great thank you to everyone who helped with my work/thesis, in particular Colin Runciman for supervising me for six years, and Detlef Plump and Simon Peyton Jones for examining me and really helping improve the final document with their comments.&lt;br /&gt;&lt;br /&gt;The thesis has four content chapters, corresponding to Uniplate, Supero, Firstify and Catch. I have submitted a paper to ICFP 09 which expands/clarifies the Firstify work, which I'll upload as a draft shortly. 
For the other chapters, the version in the thesis is an improvement on the version in any papers I've published.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-17247452422248196?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/17247452422248196/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=17247452422248196' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/17247452422248196'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/17247452422248196'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2009/03/website-move.html' title='Website move'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-8727050060778324099</id><published>2009-02-22T21:05:00.005Z</published><updated>2009-02-22T21:26:23.999Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='hoogle'/><title type='text'>Hoogle package search</title><content type='html'>Recently on the Haskell mailing list there has been some discussion of which packages Hoogle searches by default. One person remarked that it was unfortunate that the network package isn't searched by default. There are lots of packages on Hackage, and Hoogle needs to decide how to cope with so much choice. 
There are a number of questions that I need to answer in Hoogle:&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;br /&gt;&lt;li&gt;What packages should Hoogle search by default? All of Hackage? The base libraries? Only the packages a user has installed? Only packages that make it in to the Haskell Platform?&lt;/li&gt;&lt;br /&gt;&lt;li&gt;What groups of packages should Hoogle have available? Each package individually? All packages which compile on Windows? All packages by a certain author? All packages whose minor version number is even?&lt;/li&gt;&lt;br /&gt;&lt;li&gt;What UI should Hoogle show? Should there be checkboxes for each OS's packages? Should there be a checkbox for each compiler/version? Should there be no UI but some documentation?&lt;/li&gt;&lt;br /&gt;&lt;/ol&gt;&lt;br /&gt;&lt;br /&gt;And these questions present a number of trade-offs:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;The packages have to be divided under sensible and clear lines - I don't want to (and shouldn't) arbitrate divisions like "good" or "popular".&lt;/li&gt;&lt;br /&gt;&lt;li&gt;The more packages you search, the less relevant the results will be.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;The fewer packages you search, the more chance that you miss something.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;The more UI that is added the more confusing things get.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;My development time for Hoogle derives Bounded, Finite and increasingly also derives Small.&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;Thoughts and suggestions are very welcome. I've set up a wiki page to track people's thoughts, please make your view and arguments known: &lt;a href="http://haskell.org/haskellwiki/Hoogle/Packages"&gt; http://haskell.org/haskellwiki/Hoogle/Packages&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;(As an aside, I recently found that dolphin friendly tuna is actually really harmful to the environment, far more harmful than dolphin unfriendly tuna. 
Read &lt;a href="http://southernfriedscientist.wordpress.com/2009/02/16/the-ecological-disaster-that-is-dolphin-safe-tuna/"&gt;more here&lt;/a&gt;.)&lt;/i&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-8727050060778324099?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/8727050060778324099/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=8727050060778324099' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/8727050060778324099'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/8727050060778324099'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2009/02/hoogle-package-search.html' title='Hoogle package search'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-8575289432112235262</id><published>2009-02-03T21:09:00.000Z</published><updated>2009-02-03T21:17:33.811Z</updated><title type='text'>Monomorphism and Defaulting</title><content type='html'>Haskell has some ugly corners - not many, but a few. One that many people consider exceptionally ugly is the monomorphism restriction. In this post I'm going to discuss three related issues - Constant Applicative Forms (CAFs), the monomorphism restriction and defaulting. 
But before we start, let's take a simple example.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Computing Pi&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Haskell already provides the &lt;tt&gt;&lt;a href="http://haskell.org/hoogle/?hoogle=pi"&gt;pi&lt;/a&gt;&lt;/tt&gt; function which represents the value of pi, but let's assume it didn't. Taking a quick look at &lt;a href="http://en.wikipedia.org/wiki/Pi#Calculating_.CF.80"&gt;Wikipedia&lt;/a&gt; we can see that one way of computing Pi is the &lt;a href="http://en.wikipedia.org/wiki/Leibniz_formula_for_pi"&gt;Gregory-Leibniz series&lt;/a&gt;. We can calculate pi as:&lt;br /&gt;&lt;br /&gt;pi = (4/1) + (-4/3) + (4/5) + (-4/7) + (4/9) + (-4/11) ...&lt;br /&gt;&lt;br /&gt;So let's write that as a Haskell program:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;pie = sum $ take 1000000 $ zipWith (/) (iterate negate 4) [1,3..]&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Here the constant 1000000 gives the accuracy of our approach; increasing this value will give higher precision. As it currently stands, the Haskell library says &lt;tt&gt;pi = 3.14159265358979&lt;/tt&gt; and our program says &lt;tt&gt;pie = 3.14159165358977&lt;/tt&gt;. Thirteen matching digits should be sufficient for most uses of pi :-)&lt;br /&gt;&lt;br /&gt;&lt;b&gt;CAFs&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The disadvantage of our &lt;tt&gt;pie&lt;/tt&gt; function is that (under Hugs) it takes about 4 seconds to evaluate. If we are performing lots of calculations with pi, calculating &lt;tt&gt;pie&lt;/tt&gt; each time will be a serious problem. CAFs are the solution!&lt;br /&gt;&lt;br /&gt;A CAF is a top-level constant, which doesn't take any arguments, and will be computed at most once per program execution. As a slight subtlety, if the constant has class constraints on it (i.e. is &lt;tt&gt;Num a =&gt; a&lt;/tt&gt;, instead of &lt;tt&gt;a&lt;/tt&gt;) then it isn't a CAF because the class constraints act like implicit arguments. 
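&lt;br /&gt;&lt;br /&gt;As a rough sketch of that subtlety, compare the two constants below. The names &lt;tt&gt;shared&lt;/tt&gt; and &lt;tt&gt;unshared&lt;/tt&gt; are made up for illustration, and an optimising compiler may well specialise the second definition and share it anyway, so treat this as a picture of the unoptimised behaviour rather than a guarantee:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;-- A CAF: no arguments and no class constraints, so the sum is&lt;br /&gt;-- computed at most once and then reused.&lt;br /&gt;shared :: Double&lt;br /&gt;shared = sum (map fromIntegral [1 .. 1000 :: Int])&lt;br /&gt;&lt;br /&gt;-- Not a CAF: the Num constraint behaves like a hidden argument,&lt;br /&gt;-- so each use may recompute the sum.&lt;br /&gt;unshared :: Num a =&gt; a&lt;br /&gt;unshared = sum (map fromIntegral [1 .. 1000 :: Int])&lt;br /&gt;&lt;br /&gt;main :: IO ()&lt;br /&gt;main = do&lt;br /&gt;    print (shared + shared)&lt;br /&gt;    print (unshared + unshared :: Double)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;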
Our &lt;tt&gt;pie&lt;/tt&gt; function above doesn't take any arguments, so is a CAF.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Defaulting&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;While &lt;tt&gt;pie&lt;/tt&gt; doesn't have any class constraints, the right-hand side of &lt;tt&gt;pie&lt;/tt&gt; does! Take a look in Hugs:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;Main&gt; :t sum $ take 1000000 $ zipWith (/) (iterate negate 4) [1,3..]&lt;br /&gt;:: (Enum a, Fractional a) =&gt; a&lt;br /&gt;&lt;br /&gt;Main&gt; :t pie&lt;br /&gt;:: Double&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The right-hand side works for any &lt;tt&gt;Enum&lt;/tt&gt; and &lt;tt&gt;Fractional&lt;/tt&gt; type, for example &lt;tt&gt;Float&lt;/tt&gt;, but &lt;tt&gt;pie&lt;/tt&gt; is restricted to &lt;tt&gt;Double&lt;/tt&gt;. The reason is the defaulting mechanism in Haskell - if a type can't be nailed down precisely, but is one of a handful of built-in classes, then it will default to a particular type. This feature is handy for working at an interactive environment, but can sometimes be a little unexpected.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Monomorphism restriction&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Without defaulting the compiler would infer the type of &lt;tt&gt;pie&lt;/tt&gt; as &lt;tt&gt;::(Enum a, Fractional a) =&gt; a&lt;/tt&gt;. However, such a definition would be rejected by the monomorphism restriction. The monomorphism restriction states that a function with no explicit arguments, but with class constraints, must be given a type annotation. This rejects functions like:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;snub = sort . nub&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;To fix the problem there are two solutions:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;snub i_hate_the_evil_mr = (sort . nub) i_hate_the_evil_mr&lt;br /&gt;&lt;br /&gt;snub :: Ord a =&gt; [a] -&gt; [a]&lt;br /&gt;snub = sort . 
nub&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;For a function like &lt;tt&gt;pie&lt;/tt&gt; only the second approach is applicable. The addition of dummy arguments to avoid the monomorphism restriction is sufficiently common that the &lt;a href="http://www.cs.york.ac.uk/~ndm/hlint/"&gt;HLint tool&lt;/a&gt; never suggests eta-reduction if the argument is named &lt;tt&gt;mr&lt;/tt&gt;.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Conclusion&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;So why was the monomorphism restriction first introduced? For a function with no explicit arguments, the programmer might think they had written a CAF, but class constraints may substantially degrade the performance. Defaulting reduces the number of cases where the monomorphism restriction would otherwise bite, but it is still useful to be aware of the ugly corners.&lt;br /&gt;&lt;br /&gt;There are proposals afoot to remove the monomorphism restriction and to increase the power of the default mechanism - hopefully both will be included in to Haskell'.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-8575289432112235262?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/8575289432112235262/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=8575289432112235262' title='13 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/8575289432112235262'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/8575289432112235262'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2009/02/monomorphism-and-defaulting.html' title='Monomorphism and Defaulting'/><author><name>Neil 
Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>13</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-7581404546979601962</id><published>2009-01-27T20:43:00.002Z</published><updated>2009-02-19T13:30:56.139Z</updated><title type='text'>Small scripts with Haskell</title><content type='html'>Normally I give blog posts detailing the fun, interesting or advanced stuff I do with Haskell. But that isn't a real representation of my programming life! Most of the time I am doing small scripts that do little tasks, so I thought I'd describe one of those. This post is written as Literate Haskell, which means you can save the whole contents as a .lhs file and run it in GHCi or Hugs.&lt;br /&gt;&lt;br /&gt;The task I had to complete was to take a directory of files, and for each file &lt;tt&gt;foo.txt&lt;/tt&gt; generate the files &lt;tt&gt;foo_m1.txt&lt;/tt&gt; to &lt;tt&gt;foo_m3.txt&lt;/tt&gt;, where each file is a block of lines from the original delimited by a blank line. i.e. given the file with the lines &lt;tt&gt;["","1","1","","2","","3"]&lt;/tt&gt;, the numbers &lt;tt&gt;"1"&lt;/tt&gt; would go in &lt;tt&gt;foo_m1.txt&lt;/tt&gt; etc.&lt;br /&gt;&lt;br /&gt;This blog post isn't how I actually wrote the original script - I didn't use literate Haskell (since I find it ugly), I didn't give explicit import lists (since they are needlessly verbose), I didn't give type signatures (but I should have) and I didn't split the IO and non-IO as well (but again, I should have). It is intended as a guide to the simple things you can easily do with Haskell. 
Now on to the code...&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;&gt; import System.FilePath(takeExtension, dropExtension, (&amp;lt;.&gt;), (&amp;lt;/&gt;))&lt;br /&gt;&gt; import System.Directory(getDirectoryContents)&lt;br /&gt;&gt; import Data.Char(isSpace)&lt;br /&gt;&gt; import Control.Monad&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;First, let's import some useful modules. To find more about a particular function just use &lt;a href="http://haskell.org/hoogle/"&gt;Hoogle&lt;/a&gt; and search for it, but a quick summary:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;takeExtension "foo.txt" = ".txt"&lt;br /&gt;dropExtension "foo.txt" = "foo"&lt;br /&gt;"foo" &amp;lt;.&amp;gt; "txt" = "foo.txt"&lt;br /&gt;"bar" &amp;lt;/&amp;gt; "foo.txt" = "bar/foo.txt"&lt;br /&gt;getDirectoryContents "C:\Windows" = running "dir C:\Windows" at the command prompt&lt;br /&gt;isSpace ' ' = True&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Every Haskell program starts with a &lt;tt&gt;main&lt;/tt&gt; function, which is an IO action. For this program, we are going to keep all the IO in main, and only use other pure functions. With most file processing applications it's best to read files from one directory, and write them to another. That way, if anything goes wrong, it's usually easy to recover. In this case we read from "data" and write to "res".&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;&gt; main :: IO ()&lt;br /&gt;&gt; main = do&lt;br /&gt;&gt;     files &lt;- getDirectoryContents "data"&lt;br /&gt;&gt;     forM_ files $ \file -&gt; when (takeExtension file == ".txt") $ do&lt;br /&gt;&gt;         src &lt;- readFile $ "data" &amp;lt;/&gt; file&lt;br /&gt;&gt;         forM_ (zip [1..] 
(splitFile src)) $ \(i,x) -&gt;&lt;br /&gt;&gt;              writeFile ("res" &amp;lt;/&amp;gt; dropExtension file ++ "_m" ++ show i &amp;lt;.&gt; "txt") x&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Or in some kind of pseudo-code:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;main =&lt;br /&gt;    set files to be the list of files in the directory "data"&lt;br /&gt;    for each file in files which has the extension ".txt"&lt;br /&gt;    {&lt;br /&gt;        set src to be the result of reading the file&lt;br /&gt;        for each numbered result of splitFile&lt;br /&gt;        {&lt;br /&gt;            write out the value from splitFile to the location "res/file_m#.txt"&lt;br /&gt;            where # is the 1-based index into the list of results&lt;br /&gt;        }&lt;br /&gt;    }&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;We can now move on to the pure bits left over. We want a function &lt;tt&gt;splitFile&lt;/tt&gt; that takes a file, and splits it in to three chunks for each of the blocks in the file. When processing text, often there will be stray blank lines, and the term "blank lines" will also apply to lines consisting only of spaces. The code is below:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;&gt; splitFile :: String -&gt; [String]&lt;br /&gt;&gt; splitFile xs = map (tabify . 
unlines) [s1,s2,s3]&lt;br /&gt;&gt;     where&lt;br /&gt;&gt;         xs2 = dropWhile null $ map (dropWhile isSpace) $ lines xs&lt;br /&gt;&gt;         (s1,_:rest) = break null xs2&lt;br /&gt;&gt;         (s2,_:s3) = break null $ dropWhile null rest&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;And now presented more as a list of steps:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;split the text in to lines&lt;/li&gt;&lt;br /&gt;&lt;li&gt;for each line drop all the leading spaces from it&lt;/li&gt;&lt;br /&gt;&lt;li&gt;drop all the leading blank lines&lt;/li&gt;&lt;br /&gt;&lt;li&gt;break on the first empty line, the bits before are chunk 1&lt;/li&gt;&lt;br /&gt;&lt;li&gt;drop all leading blank lines for the rest&lt;/li&gt;&lt;br /&gt;&lt;li&gt;break on the first empty line in the rest, before is chunk 2, after is chunk 3&lt;/li&gt;&lt;br /&gt;&lt;li&gt;for each of the chunks, put the lines back together, then tabify them&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;The tabify requirement was added after. The person decided that all continuous runs of spaces should be converted to tabs, so the file could better be loaded in to a spread sheet. 
Easy enough to add, just a simple bit of recursive programming:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;&gt; tabify (' ':xs) = '\t' : tabify (dropWhile (== ' ') xs)&lt;br /&gt;&gt; tabify (x:xs) = x : tabify xs&lt;br /&gt;&gt; tabify [] = []&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;And again in English:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;if you encounter a space, drop it and all successive spaces, and write out a tab&lt;/li&gt;&lt;br /&gt;&lt;li&gt;otherwise just continue onwards&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;Haskell is a great language for writing short scripts, and as the libraries improve it just keeps getting better.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-7581404546979601962?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/7581404546979601962/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=7581404546979601962' title='10 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/7581404546979601962'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/7581404546979601962'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2009/01/small-scripts-with-haskell.html' title='Small scripts with Haskell'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' 
src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>10</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-4214121603928813059</id><published>2009-01-18T18:00:00.002Z</published><updated>2009-01-18T18:16:15.287Z</updated><title type='text'>FsCheck changes</title><content type='html'>Kurt Schelfthout has just released &lt;a href="http://fortysix-and-two.blogspot.com/2009/01/announcing-fscheck-04.html"&gt;FsCheck 0.4&lt;/a&gt;, a tool similar to QuickCheck but for F#. While working at my internship for &lt;a href="http://www.credit-suisse.com/"&gt;Credit Suisse&lt;/a&gt; I spent a little bit of time modifying FsCheck to include automatic generators (so you don't have to describe how to generate arbitrary values) and failure shrinking (so the counter-examples are smaller). Both these changes have now been incorporated in to the main FsCheck tool. It is really nice to see the work being contributed back, and that big companies are taking the time to get the necessary legal clearance etc.&lt;br /&gt;&lt;br /&gt;I find shrinking to be a particularly potent feature. In one real-world task I struggled to debug a test failure for 8 hours, before shrinking was available. 
Attacking the same example with FsCheck and shrinking made the reason for the test failure immediately obvious.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-4214121603928813059?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/4214121603928813059/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=4214121603928813059' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/4214121603928813059'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/4214121603928813059'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2009/01/fscheck-changes.html' title='FsCheck changes'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-3561767299635558115</id><published>2008-12-11T12:03:00.002Z</published><updated>2008-12-11T12:16:05.466Z</updated><title type='text'>mapM, mapM_ and monadic statements</title><content type='html'>In my last post on F# I mentioned that &lt;tt&gt;do mapM f xs; return 1&lt;/tt&gt; caused a space leak, and that the programmer should have written &lt;tt&gt;mapM_&lt;/tt&gt;. I also proposed that monadic statements should work more like in F# where non-unit return values can't be ignored. 
Various people seemed to misunderstand both points, so I thought I'd elaborate.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;mapM as a Space Leak&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;First, I should clarify what I understand as a space leak. A space leak is not a memory leak in the C sense. A space leak is when a computation retains more live memory than necessary for some period of time. One sign of a possible space leak is that lots of memory is retained by garbage collection.&lt;br /&gt;&lt;br /&gt;Comparing &lt;tt&gt;mapM&lt;/tt&gt; and &lt;tt&gt;mapM_&lt;/tt&gt; on the following program:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;main = do&lt;br /&gt;    mapM* putChar (replicate 10000 'a')&lt;br /&gt;    return ()&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The &lt;tt&gt;mapM_&lt;/tt&gt; variant has a maximum heap residency of 2Kb, while the &lt;tt&gt;mapM&lt;/tt&gt; variant has 226Kb. Given an input list of length &lt;i&gt;n&lt;/i&gt;, the residency of &lt;tt&gt;mapM_&lt;/tt&gt; is &lt;i&gt;O(1)&lt;/i&gt;, while &lt;tt&gt;mapM&lt;/tt&gt; is &lt;i&gt;O(n)&lt;/i&gt;.&lt;br /&gt;&lt;br /&gt;The exact reasons for the space leak are quite detailed, and I'm not going to attempt to cover them. My intuition is that the return list is wrapped in the IO monad, and therefore can't be deallocated until the IO action finishes. In summary, unless you are going to use the end result of a monadic map, always use &lt;tt&gt;mapM_&lt;/tt&gt;.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Monadic Statements&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;In the above example it would be nice if the compiler had complained. You generated a value, but you didn't use it. Fortunately, it is a very easy fix - change the type of monadic bind &lt;tt&gt;(&gt;&gt;) :: Monad m =&gt; m a -&gt; m b -&gt; m b&lt;/tt&gt; to &lt;tt&gt;Monad m =&gt; m () -&gt; m b -&gt; m b&lt;/tt&gt;. Now, if a monadic statement generates a value that isn't &lt;tt&gt;()&lt;/tt&gt;, you get a type error. 
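&lt;br /&gt;&lt;br /&gt;One way to experiment with this idea without changing the Prelude is to define a restricted sequencing operator yourself. The sketch below uses a made-up name, &lt;tt&gt;&gt;&gt;!&lt;/tt&gt;, which is not part of any library, and simply gives the stricter type to the ordinary &lt;tt&gt;&gt;&gt;&lt;/tt&gt;:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;-- Hypothetical operator: like (&gt;&gt;), but the left action must return ().&lt;br /&gt;infixl 1 &gt;&gt;!&lt;br /&gt;(&gt;&gt;!) :: Monad m =&gt; m () -&gt; m b -&gt; m b&lt;br /&gt;a &gt;&gt;! b = a &gt;&gt; b&lt;br /&gt;&lt;br /&gt;main :: IO ()&lt;br /&gt;main = do&lt;br /&gt;    -- Accepted: mapM_ returns (), so nothing is silently discarded.&lt;br /&gt;    n &amp;lt;- mapM_ putChar "ok\n" &gt;&gt;! return (1 :: Int)&lt;br /&gt;    print n&lt;br /&gt;    -- Rejected if uncommented: mapM returns [()], which is not ().&lt;br /&gt;    -- mapM putChar "ok\n" &gt;&gt;! return (1 :: Int)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;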
The above examples with &lt;tt&gt;mapM&lt;/tt&gt; would be rejected by the type checker.&lt;br /&gt;&lt;br /&gt;But what if we really wanted to call &lt;tt&gt;mapM&lt;/tt&gt;? There are two options. The first is to bind the result, for example &lt;tt&gt;do _ &amp;lt;- mapM f xs; return 1&lt;/tt&gt;. The second option, which F# favours, is &lt;tt&gt;do ignore $ mapM f xs ; return 1&lt;/tt&gt;, with the auxiliary &lt;tt&gt;ignore :: Monad m =&gt; m a -&gt; m ()&lt;/tt&gt;. I prefer the second option, as it clearly states that you want to ignore the result of a computation. You could even write a rule &lt;tt&gt;ignore . mapM f == mapM_ f&lt;/tt&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-3561767299635558115?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/3561767299635558115/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=3561767299635558115' title='13 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/3561767299635558115'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/3561767299635558115'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2008/12/mapm-mapm-and-monadic-statements.html' title='mapM, mapM_ and monadic statements'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' 
src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>13</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-525131105787082299</id><published>2008-12-07T19:16:00.001Z</published><updated>2008-12-08T08:22:49.228Z</updated><title type='text'>F# from a Haskell perspective</title><content type='html'>I've recently started a full-time job at &lt;a href="http://www.standardchartered.com/"&gt;Standard Chartered&lt;/a&gt;. Before that I was doing an internship with &lt;a href="http://www.credit-suisse.com/"&gt;Credit Suisse&lt;/a&gt;, where I spent a reasonable amount of time doing F# programming. Before I started F# I had 6 years of Haskell experience, plenty of C# experience, but little exposure to ML. I've now had 3 months to experiment with F#, using an old version (the one before the latest Community Technology Preview) and here are my impressions.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://research.microsoft.com/fsharp/fsharp.aspx"&gt;F#&lt;/a&gt; is a functional language from Microsoft, previously a Microsoft Research language, which is moving towards a fully supported language. F# is based on ML, and some (perhaps many) ML programs will compile with F#. At the same time, F# has complete access to the .NET framework and can interoperate with languages such as C#. F# is a hybrid language - at one extreme you can write purely functional ML, and at the other extreme you can write imperative C#, just using a different syntax. F# seems to be designed as a practical language - it isn't elegant or small, but does interoperate very nicely with every .NET feature.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;The Language&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The F# language is based on ML, with the addition of indentation based layout, and many of the weaknesses in F# come from ML. 
F# is certainly more verbose than Haskell: in some places you need an extra keyword (often a &lt;tt&gt;let&lt;/tt&gt;); pattern matching is not nearly as complete; the indentation isn't as natural as Haskell. However there are some nice syntactic features in F# that are not in Haskell, including generalised list/array/sequence comprehensions and active patterns.&lt;br /&gt;&lt;br /&gt;The type checker in F# is powerful, but unpredictable. I often get surprised by where type annotations need to go, particularly when working with .NET object types. The tuple type is treated specially in many cases, and this also leads to surprise - inserting or removing a pair of brackets can affect the type checker. Much of this complexity is necessary to manage the interaction with .NET, but it does complicate the language. Unfortunately, even with the advanced type features in F#, there are no type classes. The lack of type classes precludes the standard implementation of things such as &lt;a href="http://www-users.cs.york.ac.uk/~ndm/uniplate/"&gt;Uniplate&lt;/a&gt; and &lt;a href="http://www.cs.chalmers.se/~rjmh/QuickCheck/"&gt;QuickCheck&lt;/a&gt;. However, F# does have some nice reflection capabilities, and often entirely generic implementations can be given using reflection. There is certainly an interesting design trade-off between reflection based operations and type classes, something I have looked at in the past and hope to explore again in future.&lt;br /&gt;&lt;br /&gt;F# is an impure language, which offers some practical benefits over Haskell, but also encourages a less functional style. In Haskell I sometimes work within a localised state monad - F# makes this much more natural. The impurity also allows simple interaction with .NET. Having programmed with an impure language I did find myself reaching for localised state much more often - and was occasionally tempted into using global state. In most cases, this state became problematic later. 
Before using F# I thought purity was a good thing; now I'm convinced that purity is a good thing but that impurity is often very useful!&lt;br /&gt;&lt;br /&gt;Haskell could learn some things from F#. Every statement in F# must either be bound to a value or evaluate to &lt;tt&gt;()&lt;/tt&gt;. In Haskell it is possible to write &lt;tt&gt;do mapM f xs; return 1&lt;/tt&gt;. Any experienced Haskell programmer should spot that the &lt;tt&gt;mapM&lt;/tt&gt; is a space leak (it should be &lt;tt&gt;mapM_&lt;/tt&gt;), but the type system doesn't enforce it. In F# the type system does. The change in Haskell is simple, and in my opinion, desirable.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;The Platform&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;F# is a fully fledged member of the .NET platform. You can write a class in VB.NET, write a derived class in F#, and then derive from that class in C#. All the languages can produce and consume the same libraries. This integration with .NET allows companies that already use the Microsoft tools to easily migrate - even on a function by function basis. However, the combination of both an imperative framework and a functional language sometimes leads to confusing choices. All of the standard .NET libraries work with arrays, but for a functional program the list is a more natural type. F# provides both, and it was never clear which I should use where, leading to lots of conversions. The .NET libraries are very powerful, but often are overly imperative. For example, the XSD libraries (Xml Schema Description) are very imperative - you have to create objects, mutate properties, then make calls. However, in Haskell, I probably wouldn't have had &lt;i&gt;any&lt;/i&gt; XSD support, certainly nothing as well-supported as in .NET.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;The Tool Chain&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The most impressive aspect of F# is the integration with the Visual Studio environment. 
F# contains a debugger, profiler, auto-completion, identifier lookup and many other tools. While other functional languages have some of these tools, the Visual Studio environment tends to have very refined and polished implementations. The integration with F# is sometimes a little fragile, or at least was in the version I was using, but the tools are already very powerful and are likely to continue to improve.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Overall&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The F# language isn't the most beautiful language ever, but it's not bad. The integration with .NET is incredible, and while this requires compromises in the language, the benefits are considerable. I still prefer Haskell as a language, but for many users the tool chain is a more important consideration, and here F# excels.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;This post was brought to you by &lt;tt&gt;Ctrl&lt;/tt&gt; and &lt;tt&gt;v&lt;/tt&gt;, as the computer I am currently using doesn't have a # key!&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Update:&lt;/b&gt; I'd recommend reading Vesa Karvonen's comment below - he has additional perspectives on F# from a more ML perspective.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-525131105787082299?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/525131105787082299/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=525131105787082299' title='25 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/525131105787082299'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/525131105787082299'/><link rel='alternate' type='text/html' 
href='http://neilmitchell.blogspot.com/2008/12/f-from-haskell-perspective.html' title='F# from a Haskell perspective'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>25</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-406256834135592873</id><published>2008-12-06T09:39:00.003Z</published><updated>2011-12-14T21:24:57.286Z</updated><title type='text'>Enabling Reply To All in Outlook</title><content type='html'>&lt;b&gt;Update:&lt;/b&gt; See &lt;a href="http://neilmitchell.blogspot.com/2011/12/enabling-reply-to-all-in-outlook.html"&gt;http://neilmitchell.blogspot.com/2011/12/enabling-reply-to-all-in-outlook.html&lt;/a&gt; for an updated version of this functionality.&lt;br /&gt;&lt;br /&gt;Some companies lock down the use of Outlook by disabling the Reply To All button. This makes it harder to manage email, and requires manually copying email addresses to get the same effect. But using a bit of Office VBA, it is possible to make a functioning Reply To All button. The following solution has been tested in Outlook 2003, but should work for older versions as well.&lt;br /&gt;&lt;br /&gt;First, enable macros in Outlook. Go to Tools, Macro, Security and select Medium or Low security.&lt;br /&gt;&lt;br /&gt;Second, add a Reply To All action. Go to Tools, Macro, Visual Basic Editor and put the following code in the text editor.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;Option Explicit&lt;br /&gt;&lt;br /&gt;Public Sub ReallyReplyAll()&lt;br /&gt;Dim o As MailItem&lt;br /&gt;Set o = Application.ActiveExplorer.Selection.Item(1)&lt;br /&gt;o.ReplyAll.Display&lt;br /&gt;End Sub&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Finally, add a toolbar button to invoke the action. 
Go to Tools, Customise, Commands, Macros, and drag and drop the command Project1.ThisOutlookSession.ReallyReplyAll on to the toolbar. You can put this command exactly where you used to have Reply To All, and give it the same icon/name.&lt;br /&gt;&lt;br /&gt;To test, select an email and click on the button you just added; it should do exactly what Reply To All would have done. There are some minor limitations to this method:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;br /&gt;The button will not disable itself when it isn't applicable, i.e. when there are no emails selected. You will still be able to click on the button, but it won't do anything.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;&lt;br /&gt;If you select a medium level of macro security, you will have to go through a security confirmation the first time you click Reply To All in an Outlook session.&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;If possible, try to educate the person in charge that Reply To All is perfectly good email etiquette, and that people should be trusted to use it responsibly. 
However, if that fails, the above method is a useful fallback.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-406256834135592873?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/406256834135592873/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=406256834135592873' title='15 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/406256834135592873'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/406256834135592873'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2008/12/enabling-reply-to-all-in-outlook.html' title='Enabling Reply To All in Outlook'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>15</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-3286800939331662255</id><published>2008-09-29T20:21:00.003+01:00</published><updated>2008-09-29T21:53:14.501+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='catch'/><category scheme='http://www.blogger.com/atom/ns#' term='uniplate'/><title type='text'>General Updates</title><content type='html'>It's been a little while since I last posted. I've recently got back from &lt;a href="http://www.icfpconference.org/icfp2008/"&gt;ICFP 2008&lt;/a&gt;, and quite a few people asked me what I was doing now. I've also got a few comments on a few other things. 
The following is a section of disjointed paragraphs on a variety of topics, both academic and personal.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Catch Talk&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;I gave a talk at the &lt;a href="http://haskell.org/haskell-symposium/2008/"&gt;Haskell Symposium&lt;/a&gt;, about Catch. A video of the talk is &lt;a href="http://video.google.com/videoplay?docid=8250544235079789504"&gt;now online&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Generics Talk&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Alexey &lt;a href="http://video.google.com/videoplay?docid=1269998691689979629"&gt;gave a talk&lt;/a&gt; about generic programming libraries at the Haskell Symposium. I was particularly interested in this talk as it is somewhat like a competition between libraries, where &lt;a href="http://www.cs.york.ac.uk/~ndm/uniplate/"&gt;Uniplate&lt;/a&gt; is one of the competitors. One thing I noticed is that the Uniplate version of the SYB example in the talk can be written as one single lexeme, namely &lt;tt&gt;uniplateBi&lt;/tt&gt;. The talk was much more about generics libraries, while Uniplate is probably more accurately described as a traversal library, so issues such as conciseness of code were left out. One thing I did disagree with from the talk was the assertion that Uniplate &lt;i&gt;requires&lt;/i&gt; Template Haskell and Data/Typeable deriving. In reality Uniplate requires neither, but if they are present, then you have the option of using them to write even less code.&lt;br /&gt;&lt;br /&gt;From a combination of the paper and the talk I think it's fair to conclude that if Uniplate does what you want, it's a pretty good choice. This fits well with the Uniplate philosophy of giving up a small amount of power, to allow a massive simplification, while still being powerful enough for most tasks.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;PhD/Work&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;I've had my PhD viva (passed with minor corrections), and have nearly finished making the minor corrections. 
I'll update my website with a revised copy of the thesis shortly. I'm currently working at &lt;a href="http://www.credit-suisse.com/"&gt;Credit Suisse&lt;/a&gt; on a three month internship. I'm not working on Haskell stuff, but instead am doing F# programming. To get a feel for some of the things that are done by Credit Suisse I recommend looking at Ganesh's ICFP talk/paper and Howard's CUFP talk. Disclaimer: Nothing I say on this blog, or anywhere public, has anything to do with Credit Suisse; these are my personal thoughts.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Personal Life&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;I've just moved to Cambridge, and got engaged to my girlfriend (now fiancee), Emily King. I'll be commuting to Credit Suisse for the next two months.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;My Libraries/Tools&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Now I'm working full-time (long hours plus a long commute), it's hard for me to put the same amount of time into updating and maintaining my Haskell libraries and tools. I will still be accepting patches and answering questions, but probably not fixing too many bugs at any great speed. I'm still maintaining my &lt;a href="http://code.google.com/p/ndmitchell/issues/list"&gt;bug tracker&lt;/a&gt;, so feel free to add bugs, fix bugs, or comment on bugs. If anyone has any particular interest in a tool, I'd consider taking on a co-maintainer to reduce some of the maintenance burden.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;&lt;a href="http://www.well-typed.com/"&gt;Well Typed&lt;/a&gt;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;There is now a Haskell consultancy, &lt;a href="http://www.well-typed.com/"&gt;Well Typed&lt;/a&gt;, comprising Duncan Coutts and Ian Lynagh. These are two very good Haskell hackers, who are now selling their knowledge and experience. Between them, they've had substantial experience with GHC, Cabal, Hackage, ByteString, TemplateHaskell and numerous Haskell libraries. 
They've also taught lots of students Haskell, and helped lots of beginners on IRC and mailing lists. If I want help with Haskell, or with the general infrastructure and tools, they are usually the first people I approach. I strongly recommend that anyone needing Haskell help in a commercial environment get in contact with them - they can help you get the most out of Haskell. Disclaimer: I haven't been asked to write this section, and haven't checked with Ian/Duncan first, but I do wish them luck!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-3286800939331662255?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/3286800939331662255/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=3286800939331662255' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/3286800939331662255'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/3286800939331662255'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2008/09/general-updates.html' title='General Updates'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-6465547669523148877</id><published>2008-08-28T18:02:00.001+01:00</published><updated>2008-08-28T18:04:57.700+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='hoogle'/><title type='text'>Running your own Hoogle on a Web 
Server</title><content type='html'>As promised, here is a guide on deploying Hoogle on a web server. Before doing so, you need to generate the necessary Hoogle databases, as &lt;a href="http://neilmitchell.blogspot.com/2008/08/hoogle-database-generation.html"&gt;described yesterday&lt;/a&gt;, and place them in the datadir configured with Cabal. Then:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;Move the &lt;tt&gt;hoogle&lt;/tt&gt; binary to a location where it can act as a CGI binary, perhaps changing its name to &lt;tt&gt;index.cgi&lt;/tt&gt;, if necessary. Configure the CGI program to run, possibly changing the program to be executable or adding settings somewhere.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Copy the files from &lt;tt&gt;src/res&lt;/tt&gt; in the &lt;a href="http://code.haskell.org/hoogle/"&gt;darcs repo&lt;/a&gt; into a &lt;tt&gt;res&lt;/tt&gt; directory located beside the binary.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Create a file &lt;tt&gt;log.txt&lt;/tt&gt; and give it global write permissions.&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;Now you should have Hoogle running on a web server! Some of the features, such as OpenSearch integration, won't work - but Hoogle should be usable. If anyone does get Hoogle running on a web server I'd love to hear about it - any feedback is appreciated. 
In particular, if there are any tweaks required please let me know.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-6465547669523148877?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/6465547669523148877/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=6465547669523148877' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/6465547669523148877'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/6465547669523148877'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2008/08/running-your-own-hoogle-on-web-server.html' title='Running your own Hoogle on a Web Server'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-4584511224016417397</id><published>2008-08-27T10:06:00.002+01:00</published><updated>2008-08-27T10:46:35.258+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='hoogle'/><title type='text'>Hoogle Database Generation</title><content type='html'>&lt;i&gt;Brief Announcement:&lt;/i&gt; A new release of the &lt;a href="http://hackage.haskell.org/cgi-bin/hackage-scripts/package/hoogle"&gt;Hoogle command line&lt;/a&gt; is out, including bug fixes and additional features. 
Upgrading is recommended.&lt;br /&gt;&lt;br /&gt;Two interesting features of Hoogle 4 are working with multiple function databases (from multiple packages), and running your own web server. Neither of these features is fully developed yet, and both may change in their use, but they can be used with care. This post covers how to generate your own databases, and how the web version databases are generated. Tomorrow I'm going to post on how to run your own Hoogle web server, but you'll need to generate your databases first! I'm going to walk through all the steps to create a database from the &lt;a href="http://www.cs.york.ac.uk/~ndm/filepath/"&gt;filepath library&lt;/a&gt;, as an example.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Hoogle Databases&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;A Hoogle database is a set of searchable things, including text and type searching, and has a ".hoo" extension. A database may include the definitions from one package, or from multiple packages. Typically the Hoogle databases installed would include one database for each package (i.e. base.hoo, filepath.hoo), a default database (default.hoo) comprising all the standard search items, and any number of custom databases (all.hoo) which comprise different combinations of the other databases.&lt;br /&gt;&lt;br /&gt;When using Hoogle, adding &lt;tt&gt;+name&lt;/tt&gt; will include the given database in the search list, and &lt;tt&gt;-name&lt;/tt&gt; will exclude the given package from the search. By default, Hoogle will use default.hoo, but if any &lt;tt&gt;+name&lt;/tt&gt; commands are given then those databases will be used instead.&lt;br /&gt;&lt;br /&gt;Hoogle looks for databases in the current directory, in the data directory specified by Cabal, and in any &lt;tt&gt;--include&lt;/tt&gt; directories passed at the command line.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Step 1: Creating a Textbase&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;A Textbase is a textual representation of a function database. 
To generate a textbase you need to install the darcs version of &lt;a href="http://haskell.org/haddock/"&gt;Haddock&lt;/a&gt;, then use &lt;tt&gt;runhaskell Setup haddock --hoogle&lt;/tt&gt; on your package. For filepath, this will create the file &lt;tt&gt;dist/doc/html/filepath/filepath.txt&lt;/tt&gt;, which is a textbase.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Step 2: Converting a Textbase to a Database&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;To convert a textbase to a database use the command &lt;tt&gt;hoogle --convert=filepath.txt&lt;/tt&gt; in the appropriate folder. If a package depends on any other packages, then adding &lt;tt&gt;+package&lt;/tt&gt; will allow Hoogle to use the dependencies to generate a more accurate database. In the case of filepath, which depends on base, we use &lt;tt&gt;hoogle --convert=filepath.txt +base&lt;/tt&gt;. This command requires base.hoo to be present.&lt;br /&gt;&lt;br /&gt;Adding the dependencies is not strictly necessary, but will allow Hoogle to generate a more accurate database. For example, the base package defines &lt;tt&gt;type String = [Char]&lt;/tt&gt;; without the &lt;tt&gt;+base&lt;/tt&gt; flag this type synonym would not be known to Hoogle.&lt;br /&gt;&lt;br /&gt;We now have filepath.hoo, which can be used as a search database.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Step 3: Combining Databases&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;To generate a database comprising both filepath and base, type &lt;tt&gt;hoogle --output=default.hoo --combine=filepath.hoo --combine=base.hoo&lt;/tt&gt;. 
By combining databases you allow easy access to common groups of packages, and searching all these packages at once becomes faster than listing each database separately.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Web Version Databases&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The web version uses the Hackage tarballs to generate documentation for most of its databases, but also has three custom databases:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;&lt;b&gt;base&lt;/b&gt; - the base package is just too weird, and isn't even on Hackage. A darcs version and some tweaking is required.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;b&gt;keyword&lt;/b&gt; - the keyword database is a list of the keywords in Haskell, and is taken from the web page &lt;a href="http://haskell.org/haskellwiki/Keyword"&gt;on the wiki&lt;/a&gt;.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;b&gt;hackage&lt;/b&gt; - the hackage database is a list of all the packages on Hackage, indexed only by the package name.&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;All the code for generating the web version databases is found in &lt;tt&gt;data/generate&lt;/tt&gt; in the Hoogle darcs repo at &lt;a href="http://code.haskell.org/hoogle"&gt;http://code.haskell.org/hoogle&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Future Improvements&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;There are two database related tasks that still need to be done: Cabal integration and indexing all of Hackage.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://code.google.com/p/ndmitchell/issues/detail?id=80"&gt;Bug 80:&lt;/a&gt; In the future I would like Hoogle databases to be generated by Cabal automatically on installing a package. Unfortunately, I don't have the time to implement such a feature currently, and even if I did implement it, I'm unlikely to ever use it. If anyone wants to work on this, please get in contact. 
This is mainly a project working with Cabal.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://code.google.com/p/ndmitchell/issues/detail?id=79"&gt;Bug 79:&lt;/a&gt; The other work is to index all the packages on Hackage. The problem here is generating the textbases; once they have been created the rest is fairly simple. However, to run Haddock 2 over a package requires that the package builds, and that all the dependencies are present. Unfortunately my machine is not powerful enough to cope with the number of packages on Hackage. Hopefully at some point the machinery that builds Haddock documentation for Hackage will also generate textbases; however, in the meantime, if someone wants to take on the task of generating textbases for Hackage, please get in contact.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Bug Tracker&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;I'm not working on Hoogle full-time anymore, so am using my &lt;a href="http://code.google.com/p/ndmitchell/issues/list?q=proj:Hoogle"&gt;bug tracker&lt;/a&gt; to keep track of outstanding issues. In order to interact more effectively with my bug tracker, you might want to read &lt;a href="http://code.google.com/p/ndmitchell/wiki/Issues"&gt;this guide&lt;/a&gt;. 
It describes how to vote for bugs etc.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-4584511224016417397?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/4584511224016417397/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=4584511224016417397' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/4584511224016417397'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/4584511224016417397'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2008/08/hoogle-database-generation.html' title='Hoogle Database Generation'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-3137352460120603538</id><published>2008-08-20T15:18:00.003+01:00</published><updated>2008-08-20T15:58:53.253+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='hoogle'/><title type='text'>Hoogle New Features</title><content type='html'>I've now finished my Hoogle Summer of Code work, though I still intend to continue working on Hoogle when I get the chance. Before the coding period expired, I was able to add a number of new features to Hoogle. 
These features are all available at Hoogle, under &lt;a href="http://haskell.org/hoogle/"&gt;http://haskell.org/hoogle/&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;More Compact Text Searching&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The old text search feature was very fast, using an on disk trie to navigate around the possible matches. The downside to this trie was the space it consumed: about half the database was devoted to it. Fortunately, I came up with an alternative way to get fast text searching (albeit slightly slower), in a much more compact form.&lt;br /&gt;&lt;br /&gt;Much smaller database files also mean much faster database generation, as the time spent in the IO routines is the main bottleneck.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Faster IO routines&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;I rewrote the underlying binary layer in Hoogle, to make it faster. It's not as fast as I would like, and I think that moving to memory-mapped files is probably a good idea. With these improvements, along with the compact text searching, I am able to generate databases in about 2 seconds (compared to about 20 seconds before).&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Database Restricted Searches&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Hoogle has been able to run database restricted searches for some time, but now the databases contain enough information to make it practical. By adding &lt;tt&gt;+package&lt;/tt&gt; or &lt;tt&gt;-package&lt;/tt&gt; to the search you can include or exclude certain packages. For example, to find out which map functions are in the containers package try &lt;a href="http://haskell.org/hoogle/?q=map+%2Bcontainers"&gt;map +containers&lt;/a&gt;. To find out which map functions are &lt;i&gt;not&lt;/i&gt; in the containers or bytestring packages try &lt;a href="http://haskell.org/hoogle/?q=map+-containers+-bytestring"&gt;map -containers -bytestring&lt;/a&gt;. 
I have also split out the GHC.* modules from base, so if you want to find some unboxed types in GHC's libraries try &lt;a href="http://haskell.org/hoogle/?q=%23+%2Bghc"&gt;# +ghc&lt;/a&gt;. Note that not all the documentation links work from the GHC modules; I am still trying to fix this.&lt;br /&gt;&lt;br /&gt;By default Hoogle searches the following packages: array, base, bytestring, cabal, containers, directory, filepath, haskell-src, hunit, keyword, mtl, parallel, parsec, pretty, process, quickcheck, random, stm, template-haskell, time, xhtml.&lt;br /&gt;&lt;br /&gt;The "ghc" package is also available if specified with &lt;tt&gt;+ghc&lt;/tt&gt; and includes the GHC.* modules of base only.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Hoogle 3&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;I have now replaced the default Hoogle with Hoogle 4, but have copied Hoogle 3 to &lt;a href="http://haskell.org/hoogle/3"&gt;http://haskell.org/hoogle/3&lt;/a&gt;. Unfortunately, it doesn't yet work, as I need some admin help. But it will in the next few days, I hope. The only reason I can think of for using Hoogle 3 is &lt;a href="http://haskell.org/gtk2hs/"&gt;Gtk2hs&lt;/a&gt; library searching, which I do want to add to Hoogle 4 when possible.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Give Me Feedback&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;There are quite a lot of enhancements to Hoogle that I still want to make. I have tried to list all these improvements in my &lt;a href="http://code.google.com/p/ndmitchell/issues/list"&gt;bug tracker&lt;/a&gt;. If you find a bug, or want some feature, open an issue. If you have a particular interest in a bug, you can star it, to be informed of its progress and to indicate to me that you care.&lt;br /&gt;&lt;br /&gt;I'm particularly interested in two pieces of feedback:&lt;br /&gt;&lt;br /&gt;&lt;i&gt;I don't use Hoogle 4 because ...&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;Do you use any type/name search engine? Do you want to still use Hoogle 3? 
Do you use &lt;a href="http://holumbus.fh-wedel.de/hayoo/hayoo.html"&gt;Hayoo&lt;/a&gt;? If you use something else, what feature draws you to it? What do you dislike about Hoogle 4?&lt;br /&gt;&lt;br /&gt;&lt;i&gt;I use Hoogle 4, but my life would be nicer if ...&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;There are many things which affect Hoogle 4 users that I'm not aware of. If you open a bug saying what annoys you (or leave a comment and I'll do it for you) then I can keep track of this information. Even if you don't necessarily see any way to fix the problems, I'd still like to know them.&lt;br /&gt;&lt;br /&gt;Thanks to everyone who has given feedback on Hoogle so far, it has been very useful.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-3137352460120603538?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/3137352460120603538/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=3137352460120603538' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/3137352460120603538'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/3137352460120603538'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2008/08/hoogle-new-features.html' title='Hoogle New Features'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' 
src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-5361554469918141920</id><published>2008-08-15T20:17:00.002+01:00</published><updated>2008-08-15T20:18:51.115+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='soc'/><category scheme='http://www.blogger.com/atom/ns#' term='hoogle'/><title type='text'>GSoC Hoogle: Week 12</title><content type='html'>This week I've been trying to get Hoogle 4 to the point where it can replace Hoogle 3. This is the final official week of Google Summer of Code, but I'm planning to continue hacking Hoogle next week, and then as time allows after that.&lt;br /&gt;&lt;br /&gt;The priority this week was getting the documentation links working. The problem was not with Hoogle - displaying the links is trivial - but ensuring that Cabal + Haddock + Hoogle + random build scripts combine to generate the correct databases. This work involved lots of little changes in lots of places, but is now working properly. Included in this work is dependency tracking of packages (so that all packages using base know that String = [Char] etc), and merging multiple databases to create a single one.&lt;br /&gt;&lt;br /&gt;After the Hoogle database was generated correctly, I started looking at using some of the additional information present. I have now added Haddock documentation inline in the search results. If the documentation is too long to fit comfortably, Hoogle uses AJAX wizzy-ness (or more accurately, DHTML) to allow the user to expand and show all the documentation. I suspect that this will eliminate many cases of the user actually following to the Haddock webpages. This feature is fairly new, and I have pushed it out because it's useful - there are still many small improvements that need to be made.&lt;br /&gt;&lt;br /&gt;This week I also spent some time attempting to generate documentation for all the Hackage libraries. 
I had some success, but the computer I am currently using is years old and lacks the necessary processing power. I will tackle this at some point in the future, once I have purchased a new machine (which should be quite soon).&lt;br /&gt;&lt;br /&gt;With all these changes, I find Hoogle 4 to be significantly more usable than Hoogle 3. Please give it a try, and give feedback. At this point I'm particularly interested in any issues that would cause you to use Hoogle 3 instead of Hoogle 4.&lt;br /&gt;&lt;br /&gt;Hoogle 3: &lt;a href="http://haskell.org/hoogle"&gt;http://haskell.org/hoogle&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Hoogle 4: &lt;a href="http://haskell.org/hoogle/beta"&gt;http://haskell.org/hoogle/beta&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;If there are no major issues, I will be replacing Hoogle 3 with Hoogle 4 as the standard Hoogle sometime next week.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Next week:&lt;/b&gt; I will no longer be doing Google Summer of Code :-) I plan to refine some of the existing bits of Hoogle, and ensure that anything I haven't done is in a bug tracker for later.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;User visible changes:&lt;/b&gt; The web search engine now gives Haddock links and displays Haddock documentation inline.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-5361554469918141920?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/5361554469918141920/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=5361554469918141920' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/5361554469918141920'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/5361554469918141920'/><link 
rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2008/08/gsoc-hoogle-week-12.html' title='GSoC Hoogle: Week 12'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-5274716142436632996</id><published>2008-08-11T23:54:00.004+01:00</published><updated>2008-08-12T00:09:13.923+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='soc'/><category scheme='http://www.blogger.com/atom/ns#' term='hoogle'/><title type='text'>GSoC Hoogle: Week 11</title><content type='html'>This week I've been releasing lots. Hoogle 4 is finally starting to come together, and should be a worthy replacement for Hoogle 3 very shortly. Rather than go into detail about the past week, I'm just going to give some of the bullet points:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;I have &lt;a href="http://hackage.haskell.org/cgi-bin/hackage-scripts/package/hoogle"&gt;released 4 versions&lt;/a&gt; of the command line version of Hoogle, available on Hackage. Many bugs have been spotted by some very useful testers, and improvements have been made.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;I have released a &lt;a href="http://haskell.org/hoogle/beta/"&gt;web version&lt;/a&gt; of Hoogle 4, and encourage feedback.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;I have started to update the &lt;a href="http://haskell.org/haskellwiki/Hoogle"&gt;wiki Manual&lt;/a&gt;, which now contains some details of Hoogle's query syntax.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;I gave a talk at AngloHaskell 2008, which is &lt;a href="http://www.wellquite.org/anglohaskell2008/"&gt;available online&lt;/a&gt;, as slides and an audio stream. 
All of the other talks were excellent and are well worth listening to.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;I have started to build Hoogle documentation for all of Hackage. The machine I'm doing this on is very slow, so it's not a quick process!&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Next week:&lt;/b&gt; I'm hoping to work on generating better Hoogle databases, including a Hoogle database for the whole of Hackage. I also have a number of bugs to fix.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;User visible changes:&lt;/b&gt; Users can download and use Hoogle, and the web interface is online.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-5274716142436632996?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/5274716142436632996/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=5274716142436632996' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/5274716142436632996'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/5274716142436632996'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2008/08/gsoc-hoogle-week-11.html' title='GSoC Hoogle: Week 11'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' 
src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-1662792436216792641</id><published>2008-08-05T22:40:00.003+01:00</published><updated>2008-08-05T23:59:05.631+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='hoogle'/><title type='text'>Hoogle 4.0 web client preview</title><content type='html'>Since releasing a &lt;a href="http://neilmitchell.blogspot.com/2008/08/hoogle-40-release-beta-command-line.html"&gt;command line version&lt;/a&gt; of Hoogle 4 yesterday, I've had some useful feedback from a number of people. As a result, I have added a few &lt;a href="http://code.google.com/p/ndmitchell/issues/list"&gt;bugs to the bug tracker&lt;/a&gt;, and fixed a few mistakes in the searching and ranking. The &lt;a href="http://hackage.haskell.org/cgi-bin/hackage-scripts/package/hoogle"&gt;Hoogle on Hackage&lt;/a&gt; is currently 4.0.0.3 and is a recommended upgrade to all early testers.&lt;br /&gt;&lt;br /&gt;I've now written a web interface to Hoogle 4, which has been uploaded to &lt;a href="http://haskell.org/hoogle/beta/"&gt;http://haskell.org/hoogle/beta/&lt;/a&gt;. This web interface is primarily so people can test searching/ranking without installing anything. 
There are a number of limitations:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;The links to documentation do not work - this is the most severe problem, and probably stops people permanently changing to the new version.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;The Haddock documentation is not present.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Some database entries are duplicates.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;The Lambdabot says feature is missing.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;The Suggestion feature is incomplete.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;The AJAX style client features are not present.&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;The first three issues are fixed in Hoogle, but need various support through Haddock and Cabal to work. Other than these limitations, I am very interested in hearing what people think. As before, particularly regressions from Hoogle 3 or poor results/ranking.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-1662792436216792641?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/1662792436216792641/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=1662792436216792641' title='8 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/1662792436216792641'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/1662792436216792641'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2008/08/hoogle-40-web-client-preview.html' title='Hoogle 4.0 web client preview'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>8</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-16451925465827371</id><published>2008-08-04T18:54:00.004+01:00</published><updated>2008-08-04T19:30:09.427+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='hoogle'/><title type='text'>Hoogle 4.0 release (beta, command line)</title><content type='html'>I am pleased to announce Hoogle 4.0, &lt;a href="http://hackage.haskell.org/cgi-bin/hackage-scripts/package/hoogle"&gt;available on Hackage&lt;/a&gt;. A couple of things to note:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;This is a release of the command-line version only. It will have identical searching abilities to the web-based version, which I'm about to write.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;It currently only searches the same packages as Hoogle 3 (the final release will search more).&lt;/li&gt;&lt;br /&gt;&lt;li&gt;It currently doesn't support the &lt;tt&gt;--info&lt;/tt&gt; flag as previously described (problems with Haddock, not with Hoogle).&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Walkthrough: Installation&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;If you have cabal-install available, it should be as simple as:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;$ cabal update &amp;&amp; cabal install hoogle&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Otherwise, follow the standard Cabal/Hackage guidelines. Hoogle depends on about 4 packages on Hackage which are not available with a standard GHC install, so these will need to be built.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Walkthrough: A few searches&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Here are some example searches. I have used &lt;tt&gt;--count=5&lt;/tt&gt; to limit the number of results displayed. 
If you are using a terminal with ANSI escape codes I recommend also passing &lt;tt&gt;--color&lt;/tt&gt; to enable colored output.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;$ hoogle map --count=5&lt;br /&gt;Prelude map :: (a -&gt; b) -&gt; [a] -&gt; [b]&lt;br /&gt;Data.ByteString map :: (Word8 -&gt; Word8) -&gt; ByteString -&gt; ByteString&lt;br /&gt;Data.IntMap map :: (a -&gt; b) -&gt; IntMap a -&gt; IntMap b&lt;br /&gt;Data.IntSet map :: (Int -&gt; Int) -&gt; IntSet -&gt; IntSet&lt;br /&gt;Data.List map :: (a -&gt; b) -&gt; [a] -&gt; [b]&lt;br /&gt;&lt;br /&gt;$ hoogle "(a -&gt; b) -&gt; [a] -&gt; [b]" --count=5&lt;br /&gt;Prelude map :: (a -&gt; b) -&gt; [a] -&gt; [b]&lt;br /&gt;Data.List map :: (a -&gt; b) -&gt; [a] -&gt; [b]&lt;br /&gt;Control.Parallel.Strategies parMap :: Strategy b -&gt; (a -&gt; b) -&gt; [a] -&gt; [b]&lt;br /&gt;Prelude fmap :: Functor f =&gt; (a -&gt; b) -&gt; f a -&gt; f b&lt;br /&gt;Control.Applicative &lt;$&gt; :: Functor f =&gt; (a -&gt; b) -&gt; f a -&gt; f b&lt;br /&gt;&lt;br /&gt;$ hoogle Data.Map.map --count=5&lt;br /&gt;Data.Map map :: (a -&gt; b) -&gt; Map k a -&gt; Map k b&lt;br /&gt;Data.Map data Map k a&lt;br /&gt;module Data.Map&lt;br /&gt;Data.Map mapAccum :: (a -&gt; b -&gt; (a, c)) -&gt; a -&gt; Map k b -&gt; (a, Map k c)&lt;br /&gt;Data.Map mapAccumWithKey :: (a -&gt; k -&gt; b -&gt; (a, c)) -&gt; a -&gt; Map k b -&gt; (a, Map k c)&lt;br /&gt;&lt;br /&gt;$ hoogle "Functor f =&gt; (a -&gt; b) -&gt; f a -&gt; f b" --count=5&lt;br /&gt;Prelude fmap :: Functor f =&gt; (a -&gt; b) -&gt; f a -&gt; f b&lt;br /&gt;Control.Applicative &lt;$&gt; :: Functor f =&gt; (a -&gt; b) -&gt; f a -&gt; f b&lt;br /&gt;Control.Monad fmap :: Functor f =&gt; (a -&gt; b) -&gt; f a -&gt; f b&lt;br /&gt;Control.Monad.Instances fmap :: Functor f =&gt; (a -&gt; b) -&gt; f a -&gt; f b&lt;br /&gt;Data.Traversable fmapDefault :: Traversable t =&gt; (a -&gt; b) -&gt; t a -&gt; t b&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;How you can 
help&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;I've released a command line version of the search to solicit feedback. I'm interested in all comments, but especially ones of the form:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;I prefer the command line version of Hoogle 3 because ...&lt;/li&gt;&lt;br /&gt;&lt;li&gt;When I search for ... I would expect result ... to appear, or to appear above result ...&lt;/li&gt;&lt;br /&gt;&lt;li&gt;I was hoping for the feature ...&lt;/li&gt;&lt;br /&gt;&lt;li&gt;It takes too long when I ...&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;I'm going to be accumulating Hoogle 4 bugs in &lt;a href="http://code.google.com/p/ndmitchell/issues/list"&gt;my bug tracker&lt;/a&gt;, or by email (&lt;a href="http://www-users.cs.york.ac.uk/~ndm/contact/"&gt;http://www-users.cs.york.ac.uk/~ndm/contact/&lt;/a&gt;) - whichever you find more convenient.&lt;br /&gt;&lt;br /&gt;Now I'm going to start work on the Web search :-)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-16451925465827371?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/16451925465827371/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=16451925465827371' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/16451925465827371'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/16451925465827371'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2008/08/hoogle-40-release-beta-command-line.html' title='Hoogle 4.0 release (beta, command line)'/><author><name>Neil 
Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-4209023695899494330</id><published>2008-08-03T14:41:00.003+01:00</published><updated>2008-08-03T14:54:42.747+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='soc'/><category scheme='http://www.blogger.com/atom/ns#' term='hoogle'/><title type='text'>GSoC Hoogle: Week 10</title><content type='html'>This week I've been in Bristol, and am just about to head off to the &lt;a href="http://www.bristol.gov.uk/ccm/content/Leisure-Culture/Arts-Entertainment/edf-energy-bristol-harbour-festival-2007.en"&gt;Harbour Festival&lt;/a&gt;. Next week I'm heading off to &lt;a href="http://www.haskell.org/haskellwiki/AngloHaskell/2008"&gt;AngloHaskell 2008&lt;/a&gt;, and will be talking about Hoogle type searching on the Saturday.&lt;br /&gt;&lt;br /&gt;This week has been type search, yet again. There were issues with algorithmic complexity, combinatorial explosions and other fun stuff. However, it's now finished. The type search is now fast enough (you can run Hoogle in Hugs against the core libraries) and gives good results. Rather than describe type searching, it's easier to give an example. Searching for &lt;tt&gt;(a -&gt; b) -&gt; [a] -&gt; [b]&lt;/tt&gt; in Hoogle 3 gives:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;Prelude.map :: (a -&gt; b) -&gt; [a] -&gt; [b]&lt;br /&gt;Data.List.map :: (a -&gt; b) -&gt; [a] -&gt; [b]&lt;br /&gt;Control.Parallel.S... 
parMap :: Strategy b -&gt; (a -&gt; b) -&gt; [a] -&gt; [b]&lt;br /&gt;Prelude.scanr :: (a -&gt; b -&gt; b) -&gt; b -&gt; [a] -&gt; [b]&lt;br /&gt;Data.List.scanr :: (a -&gt; b -&gt; b) -&gt; b -&gt; [a] -&gt; [b]&lt;br /&gt;Prelude.scanl :: (a -&gt; b -&gt; a) -&gt; a -&gt; [b] -&gt; [a]&lt;br /&gt;Data.List.scanl :: (a -&gt; b -&gt; a) -&gt; a -&gt; [b] -&gt; [a]&lt;br /&gt;Prelude.concatMap :: (a -&gt; [b]) -&gt; [a] -&gt; [b]&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;But in Hoogle 4 gives:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;Prelude map :: (a -&gt; b) -&gt; [a] -&gt; [b]&lt;br /&gt;Data.List map :: (a -&gt; b) -&gt; [a] -&gt; [b]&lt;br /&gt;Prelude fmap :: Functor f =&gt; (a -&gt; b) -&gt; f a -&gt; f b&lt;br /&gt;Control.Applicative &lt;$&gt; :: Functor f =&gt; (a -&gt; b) -&gt; f a -&gt; f b&lt;br /&gt;Control.Monad fmap :: Functor f =&gt; (a -&gt; b) -&gt; f a -&gt; f b&lt;br /&gt;Control.Monad.Instances fmap :: Functor f =&gt; (a -&gt; b) -&gt; f a -&gt; f b&lt;br /&gt;Control.Applicative liftA :: Applicative f =&gt; (a -&gt; b) -&gt; f a -&gt; f b&lt;br /&gt;Data.Traversable fmapDefault :: Traversable t =&gt; (a -&gt; b) -&gt; t a -&gt; t b&lt;br /&gt;Control.Monad liftM :: Monad m =&gt; (a1 -&gt; r) -&gt; m a1 -&gt; m r&lt;br /&gt;Control.Parallel.Strategies parMap :: Strategy b -&gt; (a -&gt; b) -&gt; [a] -&gt; [b]&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;I think the new results are better. For more details, come to the AngloHaskell talk.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Next Week:&lt;/b&gt; I want to release a public beta of Hoogle 4 in command line form. I want to start on the web search engine and tweak the ranking algorithm. 
I'll also be writing up type search in the form of a presentation.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;User Visible Changes:&lt;/b&gt; Type search works well and fast.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-4209023695899494330?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/4209023695899494330/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=4209023695899494330' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/4209023695899494330'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/4209023695899494330'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2008/08/gsoc-hoogle-week-10.html' title='GSoC Hoogle: Week 10'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-6253628027256126478</id><published>2008-07-24T13:48:00.003+01:00</published><updated>2008-07-24T13:57:32.602+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='soc'/><category scheme='http://www.blogger.com/atom/ns#' term='hoogle'/><title type='text'>GSoC Hoogle: Week 9</title><content type='html'>I'm off camping for the next weekend in a couple of hours, so this is my early weekly summary. 
From next week, for a week and a half, I'll actually have an SSH connection so expect to see 200+ patches flow into the Hoogle repo in a few days.&lt;br /&gt;&lt;br /&gt;This week I've been rewriting the type search. I spent 3 days writing code, type checking it, but not actually having enough written to run it. Late last night I finished the code, and this morning I debugged it. Amazingly (although actually quite commonly for Haskell) it worked with only minor tweaks. I now have a type search which should scale to large databases and provide fast and accurate searches.&lt;br /&gt;&lt;br /&gt;All the basic tests work, and I can generate a Hoogle database for the array library. I still can't generate a Hoogle database for the base library, due to a stack overflow, but I think the cause of the stack overflow has changed and should be easy to debug.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Next Week:&lt;/b&gt; A public beta of the command line version is now overdue, and hopefully will happen next week. 
I aim to finish the actual search side of Hoogle, and move on to the web interface.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;User Visible Changes:&lt;/b&gt; Type search works again, mostly.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-6253628027256126478?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/6253628027256126478/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=6253628027256126478' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/6253628027256126478'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/6253628027256126478'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2008/07/gsoc-hoogle-week-9.html' title='GSoC Hoogle: Week 9'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-8037894516132038671</id><published>2008-07-20T23:27:00.003+01:00</published><updated>2008-07-20T23:45:37.141+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='soc'/><category scheme='http://www.blogger.com/atom/ns#' term='hoogle'/><title type='text'>GSoC Hoogle: Week 8</title><content type='html'>This week I've been travelling quite a bit, and rather busy with other things. 
Hopefully next week I'll be able to focus more time on Hoogle!&lt;br /&gt;&lt;br /&gt;This week I fleshed out the final part of type search, including support for instances and alpha renaming of variables. After having implemented all the bits in the type search, I tried to convert the base libraries - and it failed, taking up too much time/memory to feasibly finish.&lt;br /&gt;&lt;br /&gt;The type search is based around the idea of having nodes in a graph representing types, and then moving between these nodes, at a cost. In order to avoid a blow-up in the number of nodes in the graph, types are alpha-normalised and then alpha-renaming is performed afterwards. Instead of having 3 type nodes for &lt;tt&gt;(a,b)&lt;/tt&gt;, &lt;tt&gt;(c,d)&lt;/tt&gt; and &lt;tt&gt;(a,a)&lt;/tt&gt; there is just one named &lt;tt&gt;(a,b)&lt;/tt&gt; and 3 sets of alpha-renamings. All is good.&lt;br /&gt;&lt;br /&gt;However, once you introduce instance restrictions, the types blow up. For example, from the type node &lt;tt&gt;a&lt;/tt&gt;, you can move to &lt;tt&gt;Eq a =&gt; a&lt;/tt&gt;, &lt;tt&gt;Ord a =&gt; a&lt;/tt&gt;, &lt;tt&gt;Show a =&gt; a&lt;/tt&gt; etc. The large (but feasible) number of type nodes, combined with even a small number of class names, gives a huge number of nodes. In fact, for every type variable in a node there are 2^n possible instance contexts it could take. All is bad.&lt;br /&gt;&lt;br /&gt;Fortunately there is a solution - move instance checking outside the type graph. This makes the number of nodes feasible, and should work fairly well. It also has a few other benefits, including slightly better scoring and a simpler implementation in a few places. I also came up with a strategy for moving the cost associated with alpha-renaming into the graph search, which further simplifies things.&lt;br /&gt;&lt;br /&gt;Of course, all this work takes time, so overall progress is slower than I would have liked. 
However, the results so far are promising, and the problems of scale seem to have been successfully addressed. The problem of fast and accurate type searching is hard, but hopefully Hoogle 4 will have a scalable solution that should be useful.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Next week:&lt;/b&gt; I want to finish the implementation of type searching, and check it works on the full base libraries. A release would be good, although may take place early in the following week.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;User visible changes:&lt;/b&gt; Creating a database for the base library will now fail with a stack overflow. Hopefully next week's changes will fix this!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-8037894516132038671?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/8037894516132038671/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=8037894516132038671' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/8037894516132038671'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/8037894516132038671'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2008/07/gsoc-hoogle-week-8.html' title='GSoC Hoogle: Week 8'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' 
src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-8908382609408584278</id><published>2008-07-11T23:46:00.003+01:00</published><updated>2008-07-12T00:03:48.091+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='soc'/><category scheme='http://www.blogger.com/atom/ns#' term='hoogle'/><title type='text'>GSoC Hoogle: Week 7</title><content type='html'>This week I've continued to improve the type searching, and generated Hoogle databases for the core libraries. I'm away from a computer all weekend until Tuesday evening, which has happened every &lt;a href="http://www.icfpcontest.org/"&gt;ICFP contest&lt;/a&gt; for the last 3 years.&lt;br /&gt;&lt;br /&gt;I've substantially refactored the type searching, basing it on a proper abstract Graph data type. Now that the mechanisms for dealing with type search and graph traversal are separate, it is much easier to express clearly what type search is doing. I've also fleshed out the type searching code so that it can accurately perform searches with all the necessary features. There are still a number of tasks to do before the type searching code is finished, but each is a fairly discrete unit of work with well-understood problems.&lt;br /&gt;&lt;br /&gt;The other challenge for the week has been generating Hoogle databases for the core libraries - the base library and all the other libraries GHC ships with a release. With these libraries in place, it is feasible to use Hoogle to perform useful queries. The libraries are generated using a combination of Cabal, Haddock and Hoogle. I've made changes in both the Haddock and Hoogle layers so that the full base libraries can now be processed.&lt;br /&gt;&lt;br /&gt;In order to deal with the full base libraries there are numerous GHC extensions that must be supported. 
In particular, Hoogle now supports multi-parameter type classes, higher-ranked types, type operators, unboxed types, unboxed tuples and NDP style arrays. All of these features are translated down into Haskell 98 types, but most closely approximate their behaviour in GHC, and can be used in searches.&lt;br /&gt;&lt;br /&gt;Throughout the week I've been profiling the database creation code in Hoogle. The databases for the core libraries come to about 4.5Mb, and are highly optimised for performing searches - often at the cost of making them harder to create. I've halved the time to create databases during the week, using profiling to direct improvements. Processing the core libraries takes 60 seconds, which is certainly an acceptable timeframe, but could always be faster. Currently the biggest culprit in the profile is the &lt;tt&gt;hPutByte&lt;/tt&gt; function:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;hPutByte :: Handle -&gt; Int -&gt; IO ()&lt;br /&gt;hPutByte hndl i = hPutChar hndl $ chr i&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Since the underlying databases are written using lots of &lt;tt&gt;hTell&lt;/tt&gt; and &lt;tt&gt;hSeek&lt;/tt&gt; commands, it is not possible to use something like the &lt;tt&gt;Data.Binary&lt;/tt&gt; library. However, if anyone has any suggestions on how to improve performance they would be gratefully received.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Next week:&lt;/b&gt; I want to finish off the remaining type search features, and then package up a command line release for Hackage. Hopefully Hoogle 4 will be ready for initial use by early testers.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;User visible changes:&lt;/b&gt; Type search is more robust, but still not fully featured. Database creation is faster and more robust. 
You can search the base libraries.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-8908382609408584278?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/8908382609408584278/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=8908382609408584278' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/8908382609408584278'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/8908382609408584278'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2008/07/gsoc-hoogle-week-7.html' title='GSoC Hoogle: Week 7'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-8880686691704898747</id><published>2008-07-06T20:48:00.003+01:00</published><updated>2008-07-06T21:03:21.505+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='soc'/><category scheme='http://www.blogger.com/atom/ns#' term='hoogle'/><title type='text'>GSoC Hoogle: Week 6</title><content type='html'>This week I've been tackling type searching. I have just (in the last few minutes) got my first type search to work. 
At the moment type search is very limited, but all the ideas and scaffolding are in place, so work should now proceed relatively quickly.&lt;br /&gt;&lt;br /&gt;In all previous versions of Hoogle, type searching was &lt;i&gt;O(n)&lt;/i&gt;, where &lt;i&gt;n&lt;/i&gt; is the number of functions in the database. Hoogle compared the type search to each possible answer, computed a closeness score, then at the end wrote out the closest matches. This meant that before the first answer could be given, all functions had to be checked, i.e. the time for the first answer was &lt;i&gt;O(n)&lt;/i&gt;. As the Hoogle database is about to get massively bigger, this approach is insufficient.&lt;br /&gt;&lt;br /&gt;The new version of Hoogle is much cleverer. It works by exploring a graph, following similar ideas to &lt;a href="http://en.wikipedia.org/wiki/Dijkstra's_algorithm"&gt;Dijkstra's algorithm&lt;/a&gt;, to reach more suitable results first. Typically, the best answers will be given without any search of the graph, and then as the graph is searched more results will appear with lower closeness. With the new scheme the complexity is &lt;i&gt;O(m)&lt;/i&gt;, where &lt;i&gt;m&lt;/i&gt; is the number of results you want. I hope at some point after the SoC is finished to describe the algorithm properly, so others can understand it, and hopefully improve upon it.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Next week:&lt;/b&gt; Finishing off type searching, so it supports all the features planned. Build system work, and potentially a cabal pre-release.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;User visible changes:&lt;/b&gt; Type search works to some degree, but not perfectly. 
Database debugging options (conversion and dumping to a text file) have been added.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-8880686691704898747?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/8880686691704898747/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=8880686691704898747' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/8880686691704898747'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/8880686691704898747'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2008/07/gsoc-hoogle-week-6.html' title='GSoC Hoogle: Week 6'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-1524166190363551181</id><published>2008-06-26T19:10:00.003+01:00</published><updated>2008-06-26T19:22:38.511+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='soc'/><category scheme='http://www.blogger.com/atom/ns#' term='hoogle'/><title type='text'>GSoC Hoogle: Week 5</title><content type='html'>This week I was going to tackle type searching, but then realised I'm going to spend 6 hours on Friday on a train (hence the weekly update on Thursday), so can spend that time productively working on paper tackling type search. 
So instead of type search, I worked on a few other pieces, some of which make type search easier:&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Haddock Database Generation&lt;/i&gt; More patches to get better output from Haddock. The code now handles class methods properly, and deals with some FFI bits.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Lazy Name Searching&lt;/i&gt; Searching for a name is now fairly lazy. When searching for a name, Hoogle can return the prefix of the results without doing too much computation to calculate all the results. This work is useful in its own right, but very necessary for the type searching, and can be reused.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Hoogle --info&lt;/i&gt; The biggest feature added this week is the &lt;tt&gt;--info&lt;/tt&gt; flag. When this flag is given, Hoogle picks the first result and gives more details, including any Haddock documentation associated with the function. For example:&lt;br /&gt;&lt;br /&gt;&lt;tt&gt;&lt;br /&gt;$ hoogle +tagsoup openurl --info&lt;br /&gt;Text.HTML.Download openURL :: String -&gt; IO String&lt;br /&gt;&lt;br /&gt;This function opens a URL on the internet. Any http:// prefix is ignored.&lt;br /&gt;&lt;br /&gt;&gt; openURL "www.haskell.org/haskellwiki/Haskell"&lt;br /&gt;&lt;br /&gt;Known Limitations:&lt;br /&gt;&lt;br /&gt;* Only HTTP on port 80&lt;br /&gt;* Outputs the HTTP Headers as well&lt;br /&gt;* Does not work with all servers&lt;br /&gt;&lt;br /&gt;It is hoped that a more reliable version of this function will be placed in a new HTTP library at some point! &lt;br /&gt;&lt;/tt&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Next week:&lt;/b&gt; Type searching! 
See last week for a description of what I hope to achieve.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;User visible changes:&lt;/b&gt; The &lt;tt&gt;--info&lt;/tt&gt; flag now exists.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-1524166190363551181?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/1524166190363551181/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=1524166190363551181' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/1524166190363551181'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/1524166190363551181'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2008/06/gsoc-hoogle-week-5.html' title='GSoC Hoogle: Week 5'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-727343212937094724</id><published>2008-06-22T14:42:00.004+01:00</published><updated>2008-06-22T14:56:34.950+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='soc'/><category scheme='http://www.blogger.com/atom/ns#' term='hoogle'/><title type='text'>GSoC Hoogle: Week 4</title><content type='html'>This week I've stayed in one place, and had lots of opportunity to get on with Hoogle. 
I've done a number of different things this week:&lt;br /&gt;&lt;br /&gt;&lt;i&gt;More on Haddock databases&lt;/i&gt; I fixed a number of issues with the Haddock generated Hoogle information. These patches have been submitted back to Haddock.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Binary Defer library&lt;/i&gt; I merged the binary defer library into the Hoogle sources, and modified it substantially. Some of the modifications were thanks to suggestions from the Haskell community, particularly David Roundy. The library is now more robust, and is being used as a solid foundation to build the rest of Hoogle on top of.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Text Searching&lt;/i&gt; You can now search for words, even multiple words, and the search will be performed. The text searching uses efficient data structures, scales excellently, and returns better results first.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Suggestions&lt;/i&gt; These improvements were detailed &lt;a href="http://neilmitchell.blogspot.com/2008/06/hoogle-4-new-features.html"&gt;earlier in the week&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Next week:&lt;/b&gt; Type searching. I have various ideas on how to go about this, but it is the most tricky part of the whole project. I hope to come up with the perfect solution by the end of the week, but if not, will come up with something good enough for Hoogle 4 then revise it after the Summer is over (it could easily suck in a whole Summer of time if I am not careful!). Much of the low-level infrastructure is already present, so it is just the search algorithm.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;User visible changes:&lt;/b&gt; Text searching works. 
A session with Hoogle as it currently stands:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;&gt; cabal haddock --hoogle&lt;br /&gt;-- generates tagsoup.txt&lt;br /&gt;&gt; hoogle --convert=tagsoup.txt&lt;br /&gt;Generating Hoogle database&lt;br /&gt;Written tagsoup.hoo&lt;br /&gt;&gt; hoogle +tagsoup is open --color&lt;br /&gt;Text.HTML.TagSoup.Type &lt;b&gt;is&lt;/b&gt;Tag&lt;b&gt;Open&lt;/b&gt; :: Tag -&gt; Bool&lt;br /&gt;Text.HTML.TagSoup.Type &lt;b&gt;is&lt;/b&gt;Tag&lt;b&gt;Open&lt;/b&gt;Name :: String -&gt; Tag -&gt; Bool&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-727343212937094724?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/727343212937094724/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=727343212937094724' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/727343212937094724'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/727343212937094724'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2008/06/gsoc-hoogle-week-4.html' title='GSoC Hoogle: Week 4'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-4642009999357425993</id><published>2008-06-18T22:37:00.003+01:00</published><updated>2008-06-19T09:35:12.930+01:00</updated><category 
scheme='http://www.blogger.com/atom/ns#' term='hoogle'/><title type='text'>Hoogle 4 New Features</title><content type='html'>I'm still developing Hoogle 4, and there are many things that don't work (such as searching for types and the web version). However, it's starting to come together, and I'm beginning to implement new features that aren't in Hoogle 3. Today I've implemented two useful features.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Multi Word Text Search&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;In Hoogle 3, if you entered "is just" it would be treated as a type search, exactly the same as "m a". Now, it will search for "is" and search for "just" and intersect the results. This seems to be something that people often try, so hopefully will make Hoogle more intuitive.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Intelligent Suggestions&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Hoogle 3 tries to give suggestions, for example if I search for "a -&gt; maybe a" it will helpfully suggest "a -&gt; Maybe a". Unfortunately it's not that clever. If your search term contains a type variable (starting with a lower-case letter), which is more than one letter, it will suggest you try the capitalised version. For example, "(fst,snd) -&gt; snd" will suggest "(Fst,Snd) -&gt; Snd", which isn't very helpful.&lt;br /&gt;&lt;br /&gt;The new mechanism uses knowledge about the types, arities and constructors present in the Hoogle database. 
Some examples:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;"Just a -&gt; a"  ===&gt; "Maybe a -&gt; a"&lt;br /&gt;"a -&gt; Maybe"   ===&gt; "a -&gt; Maybe b"&lt;br /&gt;"a -&gt; MayBe a" ===&gt; "a -&gt; Maybe a"&lt;br /&gt;"a -&gt; maybe a" ===&gt; "a -&gt; Maybe a"&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-4642009999357425993?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/4642009999357425993/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=4642009999357425993' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/4642009999357425993'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/4642009999357425993'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2008/06/hoogle-4-new-features.html' title='Hoogle 4 New Features'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-77112479075721572</id><published>2008-06-15T13:44:00.003+01:00</published><updated>2008-06-15T13:56:01.628+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='darcs'/><title type='text'>darcs over FTP</title><content type='html'>I'm currently unable to access SSH, and suspect this situation will persist for most of the Summer. 
Most of my darcs repos are behind SSH, so this presents a problem. I've been looking for a way to work with darcs over FTP, and have managed to get it going on Windows. The following are instructions for (1) me when I forget them and (2) any Windows users who want to follow the same path. If you are a Linux user, then similar information is available from &lt;a href="http://www.riffraff.info/2007/6/5/using-darcs-with-ftp-and-without-ssh"&gt;this blog post&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Step 1: Install Sitecopy&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Go to &lt;a href="http://dennisbareis.com/freew32.htm"&gt;http://dennisbareis.com/freew32.htm&lt;/a&gt; and download and install &lt;tt&gt;SITECPY&lt;/tt&gt;.&lt;br /&gt;&lt;br /&gt;Add "C:\Program Files\SITECOPY" to your path.&lt;br /&gt;&lt;br /&gt;Set the &lt;tt&gt;%HOME%&lt;/tt&gt; environment variable to "C:\Home".&lt;br /&gt;&lt;br /&gt;Open up a command line and type:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;c:\&gt; mkdir home&lt;br /&gt;c:\&gt; cd home&lt;br /&gt;c:\home&gt; mkdir .sitecopy&lt;br /&gt;c:\home&gt; echo . &gt; .sitecopyrc&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Step 2: Prepare the FTP site&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Go to the FTP site, and create a directory. 
In my particular example, I created &lt;tt&gt;darcs/hoogle&lt;/tt&gt; so I could mirror the Hoogle repo.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Step 3: Configure Sitecopy&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Edit the file "c:\home\.sitecopyrc" to contain:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;site          hoogle&lt;br /&gt;   server     ftp.york.ac.uk&lt;br /&gt;   username   ndm500&lt;br /&gt;   local      C:\Neil\hoogle&lt;br /&gt;   remote     web/darcs/hoogle&lt;br /&gt;   port       21&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Obviously, substituting in your relevant details.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Step 4: Initialise Sitecopy&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Type:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;sitecopy --init hoogle&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;darcs push&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Now to do a darcs push, you can type:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;sitecopy --update hoogle&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The first copy will take a long time, but subsequent copies should be a lot faster.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;darcs pull&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;After all this, you can either pull using FTP, or if your FTP is also a web site, you can pull over http. 
For example:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;darcs get http://www-users.york.ac.uk/~ndm500/darcs/hoogle/&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-77112479075721572?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/77112479075721572/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=77112479075721572' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/77112479075721572'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/77112479075721572'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2008/06/darcs-over-ftp.html' title='darcs over FTP'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-6643110200132745635</id><published>2008-06-15T13:25:00.004+01:00</published><updated>2008-06-15T13:44:46.426+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='soc'/><category scheme='http://www.blogger.com/atom/ns#' term='haddock'/><category scheme='http://www.blogger.com/atom/ns#' term='hoogle'/><title type='text'>GSoC Hoogle: Week 3</title><content type='html'>This week I've travelled a further 600 miles by train, but am now starting to get settled for the Summer, and down to work on Hoogle.&lt;br /&gt;&lt;br /&gt;My main focus this week has been getting Haddock to generate Hoogle 
databases. For Haddock 0.8 I added in a &lt;tt&gt;--hoogle&lt;/tt&gt; flag to generate Hoogle databases, and a similar &lt;tt&gt;--hoogle&lt;/tt&gt; flag to Cabal. Unfortunately, for Haddock 2.0, the feature was removed as most of the code got rewritten. Now I've added the feature back, making extensive use of the GHC API to reduce the amount of custom pretty-printing required, and to support more Haskell features. The code has been added to the development Haddock branch, and will be present in the next release.&lt;br /&gt;&lt;br /&gt;Most of the challenge was working with the GHC API. It's certainly a powerful body of code, but suffers from being inconsistent in various places and poorly documented. I mainly worked with the code using &lt;tt&gt;:i&lt;/tt&gt; to view the API. I got bitten by various problems such as the &lt;tt&gt;&lt;a href="http://darcs.haskell.org/ghc/compiler/utils/Outputable.lhs"&gt;Outputable&lt;/a&gt;&lt;/tt&gt; module exporting useful functions such as &lt;tt&gt;mkUserStyle :: QueryQualifies -&gt; Depth -&gt; PprStyle&lt;/tt&gt;, but not exporting any functions that can create a &lt;tt&gt;Depth&lt;/tt&gt; value, and therefore not actually being usable. If Hoogle and Haddock could be used over the GHC API, it would substantially improve the development experience!&lt;br /&gt;&lt;br /&gt;I've also worked more on defining the database format. I am about to start work on the implementation today. 
I've also added a few more command line flags, but mainly as placeholders.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Next week:&lt;/b&gt; Database creation and text searches (looking back I see some similarity to last week!)&lt;br /&gt;&lt;br /&gt;&lt;b&gt;User visible changes:&lt;/b&gt; &lt;tt&gt;haddock --hoogle&lt;/tt&gt; now works.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-6643110200132745635?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/6643110200132745635/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=6643110200132745635' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/6643110200132745635'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/6643110200132745635'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2008/06/gsoc-hoogle-week-3.html' title='GSoC Hoogle: Week 3'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-232783859478539307</id><published>2008-06-09T08:58:00.002+01:00</published><updated>2008-06-09T09:18:11.194+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='soc'/><category scheme='http://www.blogger.com/atom/ns#' term='hoogle'/><category scheme='http://www.blogger.com/atom/ns#' term='derive'/><title type='text'>GSoC Hoogle: Week 2</title><content type='html'>This week I submitted my 
PhD thesis, emptied my entire rented house of furniture, spent £96 on petrol, drove (or was driven) 400 miles, travelled a similar distance by train, have been to the north of Scotland and am currently working on a borrowed Mac in London. Needless to say, it's been rather busy - but now all the excitement is over and I should be able to focus properly on Hoogle.&lt;br /&gt;&lt;br /&gt;In the last week I've been focusing on the database, the store of all the function names and type signatures, so a very critical piece of information. I want to support fast searching, which doesn't slow down as the number of known functions increases - a nasty property of the current version. For text searching, the trie data structure has this nice property, and can deal with searching for substrings. For fuzzy type searching, things are a lot more complex. However, I think I have an algorithm which is fast (few operations), accurate (gives better matches), scalable (independent of the number of functions in the database) and lazy (returns the best results first). The idea is to have a graph of function results, and then navigate this graph to find the best match. &lt;br /&gt;&lt;br /&gt;Most of the database work has been theoretical, but I have done some coding. In particular, I have started on the database creation code, and polished the flag argument interaction code some more. Part of the development required the &lt;a href="http://www-users.cs.york.ac.uk/~ndm/derive/"&gt;Derive&lt;/a&gt; tool, and in doing this work I noticed a few deficiencies. In particular, if you run Windows and run derive over a UNIX line-ending file, the tool will generate a Windows line-ending file. This problem, and a few others, are now fixed.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Next week:&lt;/b&gt; Database creation and searching. 
I want text searches to work by the end of the week.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;User visible changes:&lt;/b&gt; The &lt;tt&gt;--help&lt;/tt&gt; flag prints out information on the arguments.&lt;br /&gt;&lt;br /&gt;PS. I was looking forward to seeing some blog posts from the other Haskell summer of code students on the Haskell Planet. If any Haskell GSoC student does have a blog, please ask for it to be included!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-232783859478539307?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/232783859478539307/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=232783859478539307' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/232783859478539307'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/232783859478539307'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2008/06/gsoc-hoogle-week-2.html' title='GSoC Hoogle: Week 2'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-4706635847040048732</id><published>2008-06-01T16:28:00.002+01:00</published><updated>2008-06-01T16:42:11.439+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='soc'/><category scheme='http://www.blogger.com/atom/ns#' term='hoogle'/><title type='text'>GSoC Hoogle: Week 1</title><content type='html'>I 
started my &lt;a href="http://neilmitchell.blogspot.com/2008/04/summer-of-code-2008.html"&gt;Google Summer of Code&lt;/a&gt; project on &lt;a href="http://haskell.org/hoogle/"&gt;Hoogle&lt;/a&gt; at the beginning of this week. In my initial application I promised to make my weekly updates via blog, so here is the first week's report:&lt;br /&gt;&lt;br /&gt;I've only done about half a week's work on Hoogle this week, because I'm handing in my PhD thesis early next week, and because I'm moving house on Wednesday. I spent 14 hours on Saturday moving furniture, and many more hours than that on my thesis! I should be fully devoted to GSoC by the middle of next week.&lt;br /&gt;&lt;br /&gt;Despite all the distractions, I did manage to start work on Hoogle. I created a new project for Hoogle at the &lt;a href="http://community.haskell.org"&gt;community.haskell.org&lt;/a&gt; site, and an associated darcs repo at &lt;a href="http://code.haskell.org/hoogle"&gt;http://code.haskell.org/hoogle&lt;/a&gt;. I've done a number of things on Hoogle:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;Improved the developer documentation in some places&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Reorganised the repo, moving away dead files&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Worked on command line flags, parsing them etc.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Added a framework for running regression tests&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Organised the command line/CGI division&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;I've started work from the front, and am intending to first flesh out an API and command line client, then move on to the web front end. The biggest change from the current implementation of Hoogle will be that there is one shared binary, which will be able to function in a number of modes. These modes will include running as a web server, as a command line version, as an interactive (Hugs/GHCi style) program, documentation location etc. 
This will allow easier installation, and let everyone host their own web-based Hoogle without much effort.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Next week:&lt;/b&gt; I hope to move towards the command line client and central Hoogle database structure. I also hope to chat to the Haddock 2 people, and try and get some integration similar to Haddock 1's &lt;tt&gt;--hoogle&lt;/tt&gt; flag.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;User visible changes:&lt;/b&gt; Hoogle 4 as it currently stands is unable to run searches, although &lt;tt&gt;hoogle --test&lt;/tt&gt; will run some regression tests.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-4706635847040048732?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/4706635847040048732/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=4706635847040048732' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/4706635847040048732'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/4706635847040048732'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2008/06/gsoc-hoogle-week-1.html' title='GSoC Hoogle: Week 1'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-4913976034839480592</id><published>2008-05-22T21:05:00.003+01:00</published><updated>2008-05-22T21:28:21.386+01:00</updated><category 
scheme='http://www.blogger.com/atom/ns#' term='tagsoup'/><title type='text'>Interactive TagSoup parsing</title><content type='html'>I've written quite a few programs using the &lt;a href="http://www-users.cs.york.ac.uk/~ndm/tagsoup/"&gt;tagsoup library&lt;/a&gt;, but have never really used the library interactively. Today I was wondering how many packages on Hackage use all lower case names, compared to those starting with an initial capital. This sounds like a great opportunity to experiment! The rest of this post is a GHCi transcript, with my comments on what I'm doing prefixed with &lt;tt&gt;--&lt;/tt&gt; characters.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;$ &lt;b&gt;ghci&lt;/b&gt;&lt;br /&gt;GHCi, version 6.8.2: http://www.haskell.org/ghc/  :? for help&lt;br /&gt;Loading package base ... linking ... done.&lt;br /&gt;&lt;i&gt;-- load some useful packages&lt;/i&gt;&lt;br /&gt;Prelude&gt; &lt;b&gt;:m Text.HTML.TagSoup Text.HTML.Download Data.List Data.Char Data.Maybe&lt;/b&gt;&lt;br /&gt;Prelude Data.Maybe Data.Char Data.List Text.HTML.Download Text.HTML.TagSoup&gt;&lt;br /&gt;&lt;i&gt;-- ouch, that prompt is a bit long - we can use :set prompt to shorten it&lt;br /&gt;-- side note: I actually supplied the patch for set prompt :)&lt;/i&gt;&lt;br /&gt;   &lt;b&gt;:set prompt "Meep&gt; "&lt;/b&gt;&lt;br /&gt;&lt;i&gt;-- let's download the list of packages&lt;/i&gt;&lt;br /&gt;Meep&gt; &lt;b&gt;src &amp;lt;- openURL "http://hackage.haskell.org/packages/archive/pkg-list.html"&lt;/b&gt;&lt;br /&gt;... 
src scrolls past the screen ...&lt;br /&gt;&lt;i&gt;-- parse the file, dropping everything before the packages&lt;/i&gt;&lt;br /&gt;Meep&gt; &lt;b&gt;let parsed = dropWhile (~/= "&amp;lt;h3&gt;") $ parseTags src&lt;/b&gt;&lt;br /&gt;&lt;i&gt;-- grab the list of packages&lt;/i&gt;&lt;br /&gt;Meep&gt; &lt;b&gt;let packages = sort [x | a:TagText x:_ &amp;lt;- tails parsed, a ~== "&amp;lt;a href&gt;"]&lt;/b&gt;&lt;br /&gt;&lt;i&gt;-- now we can query the list of packages&lt;/i&gt;&lt;br /&gt;Meep&gt; &lt;b&gt;length packages&lt;/b&gt;&lt;br /&gt;648&lt;br /&gt;Meep&gt; &lt;b&gt;length $ filter (all isLower) packages&lt;/b&gt;&lt;br /&gt;320&lt;br /&gt;Meep&gt; &lt;b&gt;length $ filter ('_' `elem`) packages&lt;/b&gt;&lt;br /&gt;0&lt;br /&gt;Meep&gt; &lt;b&gt;length $ filter ('-' `elem`) packages&lt;/b&gt;&lt;br /&gt;165&lt;br /&gt;Meep&gt; &lt;b&gt;length $ filter (any isUpper . dropWhile isUpper) packages&lt;/b&gt;&lt;br /&gt;100&lt;br /&gt;Meep&gt; &lt;b&gt;length $ filter (isPrefixOf "hs" . 
map toLower) packages&lt;/b&gt;&lt;br /&gt;47&lt;br /&gt;Meep&gt; &lt;b&gt;length $ filter (any isDigit) packages&lt;/b&gt;&lt;br /&gt;37&lt;br /&gt;Meep&gt; &lt;b&gt;reverse $ sort $ map (\(x:xs) -&gt; (1 + length xs,x)) $ group $ sort $ concat packages&lt;/b&gt;&lt;br /&gt;[(484,'e'),(374,'a'),(346,'r'),(336,'s'),(335,'t'),(306,'i'),(272,'l'),(248,'c')&lt;br /&gt;,(247,'n'),(240,'o'),(227,'p'),(209,'h'),(185,'-'),(171,'m'),(159,'d'),(126,'g')&lt;br /&gt;,(112,'b'),(96,'u'),(87,'y'),(78,'k'),(76,'f'),(74,'x'),(58,'S'),(53,'H'),(35,'w')&lt;br /&gt;,(33,'v'),(29,'q'),(29,'L'),(27,'A'),(26,'F'),(24,'D'),(23,'C'),(22,'T'),(16,'P')&lt;br /&gt;,(16,'M'),(16,'I'),(16,'G'),(13,'B'),(12,'W'),(12,'3'),(12,'2'),(10,'O'),(9,'R')&lt;br /&gt;,(9,'1'),(8,'z'),(8,'j'),(8,'E'),(7,'X'),(7,'U'),(7,'N'),(6,'Y'),(6,'V'),(5,'J')&lt;br /&gt;,(4,'Q'),(4,'5'),(4,'4'),(3,'Z'),(3,'8'),(3,'6'),(1,'9')]&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;We can see that loads of packages use lowercase, lots of packages use upper case, quite a few use CamelCase, quite a few start with "hs", none use "_", but lots use "-". 
The final query figures out which is the most common letter in hackage packages, and rather unsurprisingly, it roughly follows the frequency of English letters.&lt;br /&gt;&lt;br /&gt;TagSoup and GHCi make a potent combination for obtaining and playing with webpages.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-4913976034839480592?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/4913976034839480592/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=4913976034839480592' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/4913976034839480592'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/4913976034839480592'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2008/05/interactive-tagsoup-parsing.html' title='Interactive TagSoup parsing'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-3126048236804323086</id><published>2008-05-18T16:58:00.002+01:00</published><updated>2008-05-18T18:02:29.396+01:00</updated><title type='text'>Haskell and Performance</title><content type='html'>There has been lots of discussion on the Haskell mailing lists about the speed of Haskell. There are many conflicting opinions, and lots of different advice. 
Some of the information on Haskell's performance is written as a sales pitch, some is based on outdated knowledge. Since I've been working on optimisation for a while, I thought I'd try and give a snapshot of Haskell performance. Most of the following is personal opinion, and others could quite validly disagree. Since GHC is the best performing Haskell compiler, I have used Haskell to mean GHC with the -O2 flag throughout.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;High-level Haskell is not as fast as C&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;If you write Haskell in a standard manner, it is unlikely to perform as fast as C. In most cases, linked-lists are slower than arrays. Laziness is more expensive than strictness. The Haskell code will almost always be shorter, and more concise, since it will abstract over low-level detail. But by writing that low-level detail in the C code, you are likely to produce faster code.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Low-level Haskell is competitive with C&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;If you use GHC, with unboxed operations, written in a low-level style, you can obtain similar performance to C. The Haskell won't be as nice as it was before, but will still probably express fewer details than the C code. Writing in such a low-level manner requires more knowledge of Haskell, and is probably something that a beginner should not be attempting. However, for a critical inner loop, low-level Haskell is a very attractive option.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Haskell's Code Generator is weak&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The back end assembly generator in GHC is a weak link, but improvements are being carried out. After this work has been finished, it is likely that low-level Haskell will be able to produce nearly identical assembly code to C.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Some Haskell libraries are poorly optimised&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Some of the central Haskell libraries have functions which are badly optimised. 
For example, the MTL library is known to perform poorly. The &lt;tt&gt;words&lt;/tt&gt; and &lt;tt&gt;isSpace&lt;/tt&gt; functions in the base library aren't very good. These issues are being addressed over time; the Binary and ByteString libraries have already fixed two holes. A new implementation of &lt;tt&gt;inits&lt;/tt&gt; has been contributed. Over time, more issues will be identified and fixed, improving the speed of all code.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Haskell's multi-threaded performance is amazing&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;A lot of clever people have done a lot of clever work on making multi-threaded programming in Haskell both simple and fast. While low-level speed matters for general programming, for multi-threaded programming there are lots of much higher-level performance considerations. Haskell supports better abstraction, and can better optimise at this level, outperforming C.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Reading the Core is not easy&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;A standard piece of advice to people trying to optimise Haskell is to read the Core - the low-level functional language used as an intermediate form in the compiler. While Core provides much useful information about what optimisations were performed, it isn't easy to read, and takes a lot of practice. Some effort has been made to make reading Core easier, but I still wouldn't recommend it for beginners.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Optimisation without profiling is pointless&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;People often want to make programs run faster. In general, this activity is a waste of time. I recently wrote a program for the HCI group at my university, which takes 10 minutes to run, and requires 4GB of RAM, on a very expensive machine. I haven't even bothered to profile the program, because I have better things to do. 
Unless the speed of something actually makes a difference, you should not be spending excessive effort on optimisation.&lt;br /&gt;&lt;br /&gt;If you have determined that the program in question is running too slowly, then profile. After profiling, you can usually identify some small part of the program that needs optimisation. Too often there is a focus on speeding up something that is not slow enough to make a difference.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;The trend is for higher-level optimisation&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;As time goes by, higher-level programs keep getting faster and faster. The ByteString work allows programmers to write high-level programs that are competitive with C. Performance enhancements are being made to the compiler regularly: pointer tagging, constructor specialisation etc. are all helping to improve things. Longer-term projects such as Supero and NDP are showing some nice results. Optimisation is a difficult problem, but progress is being made, allowing programs to be written at a higher level.&lt;br /&gt;&lt;br /&gt;My goal is that one day Haskell programs will be written in a very declarative, high-level style - and outperform C at the same time. 
I think this goal is obtainable, albeit some way in the future.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-3126048236804323086?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/3126048236804323086/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=3126048236804323086' title='14 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/3126048236804323086'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/3126048236804323086'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2008/05/haskell-and-performance.html' title='Haskell and Performance'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>14</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-7446931250616736343</id><published>2008-05-10T07:12:00.004+01:00</published><updated>2008-05-10T07:43:43.739+01:00</updated><title type='text'>Bad strictness</title><content type='html'>Haskell has one primitive construct for enforcing strictness, &lt;tt&gt;seq :: a -&gt; b -&gt; b&lt;/tt&gt;. The idea is that the first argument is evaluated to weak-head normal form (WHNF), then the second argument is evaluated to WHNF and returned. 
WHNF is reduction until the outermost bit is available - either a function value, or an outer constructor.&lt;br /&gt;&lt;br /&gt;You can model the behaviour by introducing an &lt;tt&gt;evaluate&lt;/tt&gt; function, in a lower-level language, and showing how to perform reduction:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;evaluate (seq a b) = do&lt;br /&gt;    a' &lt;- evaluate a&lt;br /&gt;    b' &lt;- evaluate b&lt;br /&gt;    return b'&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The &lt;tt&gt;evaluate&lt;/tt&gt; function must return an evaluated argument, and it wants to return &lt;tt&gt;b&lt;/tt&gt; which is not already evaluated, so it must make a recursive call. The &lt;tt&gt;evaluate&lt;/tt&gt; function for &lt;tt&gt;id&lt;/tt&gt;, which simply returns its argument, is:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;evaluate (id a) = do&lt;br /&gt;    a' &lt;- evaluate a&lt;br /&gt;    return a'&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Notice that even though &lt;tt&gt;id&lt;/tt&gt; does "nothing", it still has to evaluate its argument. Of course, &lt;tt&gt;evaluate (id x)&lt;/tt&gt; is the same as &lt;tt&gt;evaluate x&lt;/tt&gt;, so &lt;tt&gt;id&lt;/tt&gt; performs no additional work.&lt;br /&gt;&lt;br /&gt;Haskell is lazy, so if an expression has already been evaluated, then the &lt;tt&gt;evaluate&lt;/tt&gt; call will be incredibly cheap, and just return the previous result.  So let's consider the result of calling &lt;tt&gt;seq&lt;/tt&gt; with the same argument twice.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;evaluate (seq a a) = do&lt;br /&gt;    a' &lt;- evaluate a&lt;br /&gt;    return a'&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;This time the second evaluation of &lt;tt&gt;a&lt;/tt&gt; is skipped, as &lt;tt&gt;a&lt;/tt&gt; is already evaluated. We can easily see that evaluation of &lt;tt&gt;seq a a&lt;/tt&gt; is exactly equivalent to &lt;tt&gt;a&lt;/tt&gt;. 
This means that any code which writes &lt;tt&gt;a `seq` a&lt;/tt&gt; is &lt;i&gt;wrong&lt;/i&gt;.  There is plenty of this code around, and one example (which prompted me to write this) is on slide 15 of &lt;a href="http://www.realworldhaskell.org/blog/2008/05/09/slides-from-last-nights-bayfp-talk/"&gt;this talk&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;The other classic &lt;tt&gt;seq&lt;/tt&gt; related mistake is &lt;tt&gt;id $! x&lt;/tt&gt;. The &lt;tt&gt;($!)&lt;/tt&gt; operator is for strict application, and is defined:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;f $! x = x `seq` f x&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;For the particular instance of &lt;tt&gt;id $! x&lt;/tt&gt;, we obtain &lt;tt&gt;x `seq` id x&lt;/tt&gt;. Of course, all that &lt;tt&gt;id x&lt;/tt&gt; does is evaluate &lt;tt&gt;x&lt;/tt&gt;, so again there is no change from just writing &lt;tt&gt;x&lt;/tt&gt;.&lt;br /&gt;&lt;br /&gt;There are valid uses of &lt;tt&gt;seq&lt;/tt&gt;, but any time you see either of the following constructs, you &lt;i&gt;know&lt;/i&gt; the programmer got it wrong:&lt;br /&gt;&lt;br /&gt;&lt;div style="border:2px solid red; background-color: #fdd; text-align:center;font-family:monospace;"&gt;x `seq` x&lt;br/&gt;id $! 
x&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-7446931250616736343?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/7446931250616736343/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=7446931250616736343' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/7446931250616736343'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/7446931250616736343'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2008/05/bad-strictness.html' title='Bad strictness'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-5415954487663801694</id><published>2008-04-21T23:44:00.003+01:00</published><updated>2008-04-21T23:53:59.800+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='soc'/><category scheme='http://www.blogger.com/atom/ns#' term='hoogle'/><title type='text'>Summer of Code 2008</title><content type='html'>This year I am going to be participating in the &lt;a href="http://code.google.com/soc/2008/"&gt;Google Summer of Code&lt;/a&gt; as a student, for the &lt;a href="http://code.google.com/soc/2008/haskell/about.html"&gt;haskell.org organisation&lt;/a&gt;, on the &lt;a href="http://code.google.com/soc/2008/haskell/appinfo.html?csaid=896489B404EA57D4"&gt;Haskell API Search as an interface to Hackage&lt;/a&gt; 
project - aka &lt;a href="http://haskell.org/hoogle/"&gt;Hoogle 4&lt;/a&gt;. I will be mentored by &lt;a href="http://www.cs.chalmers.se/~d00nibro/"&gt;Niklas Broberg&lt;/a&gt;, author of various tools including &lt;a href="http://www.cs.chalmers.se/~d00nibro/haskell-src-exts/"&gt;Haskell Source Extensions&lt;/a&gt;, which is already used by Hoogle. My project link gives the summary I gave for the project, but below I've posted the interesting bits from my full application. I am going to be posting my progress at least once a week once the project phase starts (about 6 weeks time). I welcome any comments!&lt;br /&gt;&lt;br /&gt;&lt;hr/&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;What is the goal of the project you propose to do?&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;There are two main goals:&lt;br /&gt;&lt;br /&gt;1) Make Hoogle more useful to the community, along the same path as it is currently used.&lt;br /&gt;&lt;br /&gt;2) Make Hoogle suitable to use as the standard interface to Hackage.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Can you give some more detailed design of what precisely you intend to achieve?&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;# Removal of all bugs&lt;br /&gt;&lt;br /&gt;Hoogle 3 has a number of embarrassing bugs, some of which are not easily fixed. The nastiest of these is to do with monads, which are horribly mistreated. Since I now know the underlying issues which have caused a problem with Hoogle 3, things like higher-kinded type classes can be solved in a more principled manner.&lt;br /&gt;&lt;br /&gt;# Support for some Haskell type extensions&lt;br /&gt;&lt;br /&gt;Hoogle 3 does not deal with multi-parameter type classes. I would like to support a variety of type system extensions, primarily by mapping them on to Haskell 98 equivalent types.&lt;br /&gt;&lt;br /&gt;# Faster searching&lt;br /&gt;&lt;br /&gt;The current implementation is O(n) in the number of functions in the library, where the constant factor is absolutely massive. 
I wish to make text searching O(s), where s is the length of the search string, and have an incredibly low constant overhead -- using a lazy file-based trie.&lt;br /&gt;&lt;br /&gt;The type searching also needs a massive speed up. I have some ideas on how to proceed, but it is a difficult problem! I will spend a small amount of time investigating this problem, but may have to use a simpler algorithm, rather than delay the rest of the project.&lt;br /&gt;&lt;br /&gt;# Better deployment&lt;br /&gt;&lt;br /&gt;Currently there is Hoogle for the base libraries, and a special (very hacked) version that supports Gtk2hs only. I have received several requests for custom Hoogle instances for tools such as XMonad, Ycr2js, wxHaskell etc. The new Hoogle will make deployment of individual versions for specific packages easy.&lt;br /&gt;&lt;br /&gt;# Support for multiple packages&lt;br /&gt;&lt;br /&gt;I wish to support searching through every package on Hackage at once. This requires a massive speed up in the searching algorithms.&lt;br /&gt;&lt;br /&gt;# Generalised text searching&lt;br /&gt;&lt;br /&gt;By searching both function names, and also cabal package descriptions, Hoogle can be much more useful in finding packages, as opposed to individual functions.&lt;br /&gt;&lt;br /&gt;# Better Design&lt;br /&gt;&lt;br /&gt;Hoogle 3 is a web application, with a hacked on command line program. Hoogle 4 will be a central API which can be reused from any IDE tools, and also used to build the web interface and the command line application.&lt;br /&gt;&lt;br /&gt;# Generalised interface to all of Cabal&lt;br /&gt;&lt;br /&gt;Hopefully all the above goals will result in a tool that is suitable to be an interface to Cabal.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;What deliverables do you think are reasonable targets? 
Can you outline an approximate schedule of milestones?&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;I would plan to release a beta of Hoogle 4 approximately half way through the project, as a web application. Much of the initial design has been done, so this is primarily hacking time.&lt;br /&gt;&lt;br /&gt;I would then hope to complete the final hackage integration for the second half. This stage will require discussion with the cabal people, and will be a combination of design, implementation and server administration/setup.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;In what ways will this project benefit the wider Haskell community? &lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Hoogle is already of use to the community, but has never seen a final release, and has a reasonable number of known bugs. This project would produce a polished version of a tool for which we already know there is huge demand.&lt;br /&gt;&lt;br /&gt;Hackage is working well, and gaining new packages every day. As the number of packages increases, the interface to hackage must be updated to handle this volume. Discussions with some of the hackage/cabal team seem to suggest that a search interface is the way forward. 
By making Hackage easier to use, everyone benefits.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-5415954487663801694?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/5415954487663801694/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=5415954487663801694' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/5415954487663801694'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/5415954487663801694'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2008/04/summer-of-code-2008.html' title='Summer of Code 2008'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-3201719101251968153</id><published>2008-04-12T12:20:00.003+01:00</published><updated>2008-04-12T12:27:49.246+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='darcs'/><category scheme='http://www.blogger.com/atom/ns#' term='paper'/><title type='text'>darcs Feature Request (Part II)</title><content type='html'>I &lt;a href="http://neilmitchell.blogspot.com/2008/01/darcs-feature-request.html"&gt;previously requested&lt;/a&gt; a feature for darcs. I always pull from an http repo, and push over SSH. 
I have to push using &lt;tt&gt;--no-set-default&lt;/tt&gt; and type the SSH repo in full, which I automate with a &lt;a href="http://darcs.haskell.org/packages/filepath/push.bat"&gt;.bat file&lt;/a&gt; in each repo.&lt;br /&gt;&lt;br /&gt;Today I noticed that darcs has &lt;tt&gt;_darcs/prefs/repos&lt;/tt&gt;, which seems to list the repos that darcs has used. In one of my typical repo files, I have an http entry and an SSH entry. To get darcs to behave the way I want, all I need to do is push using the first non-http repo in that list.&lt;br /&gt;&lt;br /&gt;I have implemented my version of the darcs push command inside my paper tool; the code is all &lt;a href="http://www.cs.york.ac.uk/fp/darcs/paper/Paper/Push.hs"&gt;online here&lt;/a&gt;. Now I can delete all my push.bat scripts, and just type &lt;tt&gt;paper push&lt;/tt&gt; from any darcs repo. As an added bonus, I now don't need to change to the root directory to perform a push.&lt;br /&gt;&lt;br /&gt;It would be really nice if someone could incorporate this feature into the main darcs codebase. However, I'm quite happy using my paper tool for now. 
I certainly don't have time to patch, or even build darcs :-)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-3201719101251968153?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/3201719101251968153/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=3201719101251968153' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/3201719101251968153'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/3201719101251968153'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2008/04/darcs-feature-request-part-ii.html' title='darcs Feature Request (Part II)'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-3181172235163347391</id><published>2008-04-09T22:26:00.002+01:00</published><updated>2008-04-09T22:39:55.644+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='tagsoup'/><title type='text'>TagSoup Parsing: Dictionary Extraction</title><content type='html'>I've just read &lt;a href="http://www.haskell.org/sitewiki/images/0/0a/TMR-Issue10.pdf"&gt;issue 10&lt;/a&gt; of &lt;a href="http://www.haskell.org/haskellwiki/The_Monad.Reader"&gt;The Monad.Reader&lt;/a&gt;. It's a great issue, including a tutorial on using the new GHCi debugger, and how to write an efficient Haskell interpreter in Haskell. 
The running example for the GHCi debugger is parsing the &lt;a href="http://computerdictionary.tsf.org.za/dictionary/terms/computerdictionary-all.html"&gt;computer dictionary&lt;/a&gt; and extracting descriptions from keywords, using the &lt;a href="http://www.cs.york.ac.uk/~ndm/tagsoup/"&gt;TagSoup&lt;/a&gt; library. The article starts with an initial version of the extraction code, then fixes some mistakes using the debugger present in GHCi. The code was written to teach debugging, not as a demonstration of TagSoup. This post explains how I would have written the program.&lt;br /&gt;&lt;br /&gt;The original program is written in a low-level style. To search for a keyword, the program laboriously traverses through the file looking for the keyword, much like a program in a modern imperative language might. But Haskell programmers can do better. We can separate the task: first parsing the keyword/description pairs into a list; then searching the list. Lazy evaluation will combine these separate operations to obtain something just as efficient as the original. By separating the concerns, we can express each at a higher level, reducing the search function to a simple &lt;tt&gt;lookup&lt;/tt&gt;. It also gives us more flexibility for the future, allowing us to potentially reuse the parsing functions.&lt;br /&gt;&lt;br /&gt;I have fixed a number of other bugs in the code, and my solution is:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;module Main where&lt;br /&gt;import Text.HTML.TagSoup&lt;br /&gt;import Data.Maybe&lt;br /&gt;&lt;br /&gt;main = do&lt;br /&gt;    putStr "Enter a term to search for: "&lt;br /&gt;    term &lt;- getLine&lt;br /&gt;    html &lt;- readFile "Dictionary.html"&lt;br /&gt;    let dict = parseDict $ parseTags html&lt;br /&gt;    putStrLn $ fromMaybe "No match found." $ lookup term dict&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;parseDict :: [Tag] -&gt; [(String,String)]&lt;br /&gt;parseDict = map parseItem &lt;br /&gt;          . 
sections (~== "&amp;lt;dt&amp;gt;")&lt;br /&gt;          . dropWhile (~/= "&amp;lt;div class=glosslist&amp;gt;")&lt;br /&gt;&lt;br /&gt;parseItem xs = (innerText a, unwords $ words $ innerText b)&lt;br /&gt;    where (a,b) = break (~== "&amp;lt;dd&amp;gt;") (takeWhile (~/= "&amp;lt;/dd&amp;gt;") xs)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Instead of searching for a &lt;i&gt;single&lt;/i&gt; keyword, I parse &lt;i&gt;all&lt;/i&gt; keywords using &lt;tt&gt;parseDict&lt;/tt&gt;. The &lt;tt&gt;parseDict&lt;/tt&gt; function first skips over the gunk at the top of the file, then finds each definition, and parses it. The &lt;tt&gt;parseItem&lt;/tt&gt; function spots where tags begin and end, and takes the text from inside. The &lt;tt&gt;unwords $ words&lt;/tt&gt; expression is a neat trick for normalising the spacing within an arbitrary string.&lt;br /&gt;&lt;br /&gt;This revised program is shorter than the original, I find it easier to comprehend, and it provides more functionality with fewer bugs. 
The TagSoup library provides a robust base to work from, allowing concise expression of HTML/XML extraction programs.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-3181172235163347391?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/3181172235163347391/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=3181172235163347391' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/3181172235163347391'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/3181172235163347391'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2008/04/tagsoup-parsing-dictionary-extraction.html' title='TagSoup Parsing: Dictionary Extraction'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-663270179019449648</id><published>2008-04-08T18:12:00.006+01:00</published><updated>2008-04-10T13:00:05.801+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='tagsoup'/><title type='text'>Optional Parameters in Haskell</title><content type='html'>I use optional parameters in my &lt;a href="http://www-users.cs.york.ac.uk/~ndm/tagsoup/"&gt;TagSoup library&lt;/a&gt;, but it seems not to be a commonly known trick, as someone recently asked if the relevant line was a syntax error. 
So, here is how to pass optional parameters to a Haskell function.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Optional Parameters in Other Languages&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Optional parameters exist in a number of other languages, and come in a variety of flavours. Ada and Visual Basic both provide named and positional optional parameters. For example, given the definition:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;Sub Foo(Optional b As Boolean = True, Optional i As Integer = 0, Optional s As String = "Hello")&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;We can make the calls:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;Call Foo(s := "Goodbye", b := False)&lt;br /&gt;Call Foo(False, 1)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;In the first case we give named parameters; in the second we give all the parameters up to a certain position.&lt;br /&gt;&lt;br /&gt;In some languages, such as &lt;a href="http://www.cs.york.ac.uk/ftpdir/reports/YCST-2007-15.pdf"&gt;GP&lt;sup&gt;+&lt;/sup&gt;&lt;/a&gt;, you can say which parameters should take their default values:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;Call Foo(_, 42, _)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Optional Parameters in Haskell&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Haskell doesn't have built-in optional parameters, but using the record syntax, it is simple to encode named optional parameters.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;data Foo = Foo {b :: Bool, i :: Integer, s :: String}&lt;br /&gt;defFoo = Foo True 0 "Hello"&lt;br /&gt;&lt;br /&gt;foo :: Foo -&gt; IO ()&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Now we can pass arguments by name, for example:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;foo defFoo{s = "Goodbye", b = False}&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;This syntax takes the value &lt;tt&gt;defFoo&lt;/tt&gt;, and replaces the fields &lt;tt&gt;s&lt;/tt&gt; and &lt;tt&gt;b&lt;/tt&gt; with the associated values. 
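&lt;br /&gt;&lt;br /&gt;As a rough, self-contained sketch of the record approach (the &lt;tt&gt;deriving Show&lt;/tt&gt; clause, the body of &lt;tt&gt;foo&lt;/tt&gt; and the &lt;tt&gt;main&lt;/tt&gt; function are assumptions added purely for illustration), a complete program might look like:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;module Main where&lt;br /&gt;&lt;br /&gt;-- The options record; deriving Show is an assumption, added so the values can be printed&lt;br /&gt;data Foo = Foo {b :: Bool, i :: Integer, s :: String} deriving Show&lt;br /&gt;&lt;br /&gt;-- The default value for every field&lt;br /&gt;defFoo :: Foo&lt;br /&gt;defFoo = Foo True 0 "Hello"&lt;br /&gt;&lt;br /&gt;-- Hypothetical body: just print the options that were received&lt;br /&gt;foo :: Foo -&gt; IO ()&lt;br /&gt;foo opts = print opts&lt;br /&gt;&lt;br /&gt;main :: IO ()&lt;br /&gt;main = do&lt;br /&gt;    foo defFoo                            -- every field takes its default&lt;br /&gt;    foo defFoo{s = "Goodbye", b = False}  -- override two fields by name&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Record update binds more tightly than function application, so no brackets are needed around &lt;tt&gt;defFoo{...}&lt;/tt&gt;.&lt;br /&gt;&lt;br /&gt;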
Using a type class, we can abstract this slightly:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;class Def a where&lt;br /&gt;    def :: a&lt;br /&gt;&lt;br /&gt;instance Def Foo where&lt;br /&gt;    def = defFoo&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Now we can make all functions taking default arguments use the &lt;tt&gt;def&lt;/tt&gt; argument as a basis, allowing type inference and type classes to choose the correct default type. Even still, optional parameters in Haskell are not quite as neat as in other languages, but the other features of Haskell mean they are required less often. &lt;br /&gt;&lt;br /&gt;This technique has been used in TagSoup, particularly for the &lt;a href="http://hackage.haskell.org/packages/archive/tagsoup/0.4/doc/html/Text-HTML-TagSoup-Parser.html"&gt;&lt;tt&gt;parseTagOptions&lt;/tt&gt;&lt;/a&gt; function. I've also seen this technique used in &lt;a href="http://www.cs.york.ac.uk/fp/cpphs/"&gt;cpphs&lt;/a&gt; with the &lt;a href="http://hackage.haskell.org/packages/archive/cpphs/1.5/doc/html/Language-Preprocessor-Cpphs.html#v%3ArunCpphs"&gt;&lt;tt&gt;runCpphs&lt;/tt&gt;&lt;/a&gt; function.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-663270179019449648?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/663270179019449648/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=663270179019449648' title='11 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/663270179019449648'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/663270179019449648'/><link rel='alternate' type='text/html' 
href='http://neilmitchell.blogspot.com/2008/04/optional-parameters-in-haskell.html' title='Optional Parameters in Haskell'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>11</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-1264977951949316329</id><published>2008-04-01T11:33:00.005+01:00</published><updated>2008-04-01T12:36:03.501+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='catch'/><title type='text'>Automated Proof Reading</title><content type='html'>I've been spending the last few months writing papers, which requires lots and lots of proof reading. Yesterday &lt;a href="http://www-users.cs.york.ac.uk/~shackell/"&gt;Tom&lt;/a&gt; shared one trick he'd used to help with his thesis. He took his thesis, ran detex over it, then used a text-to-speech program to read it back to him. This trick ensures you pick up all the subtle things like repeated words, and forces you to listen to every word, not skim reading bits. 
I thought this was a great idea, so I implemented something similar in my paper preparation tool (release forthcoming): typing &lt;tt&gt;paper talk&lt;/tt&gt; reads the paper back to you, converting the LaTeX in a sensible way, to make it as readable as possible.&lt;br /&gt;&lt;br /&gt;Here is the result of running &lt;tt&gt;paper talk&lt;/tt&gt; on the introduction section of my Catch paper:&lt;br /&gt;&lt;br /&gt;&lt;center&gt;&lt;br /&gt;&lt;object width="320" height="266" class="BLOG_video_class" id="BLOG_video-b741558237592552" classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"&gt;&lt;param name="movie" value="http://www.youtube.com/get_player"&gt;&lt;param name="bgcolor" value="#FFFFFF"&gt;&lt;param name="allowfullscreen" value="true"&gt;&lt;param name="flashvars" value="flvurl=http://v1.nonxt2.googlevideo.com/videoplayback?id%3Db741558237592552%26itag%3D5%26app%3Dblogger%26ip%3D0.0.0.0%26ipbits%3D0%26expire%3D1329895798%26sparams%3Did,itag,ip,ipbits,expire%26signature%3D63D0813514EEC008D4ED8DDC98964EE3C97C3986.3A4572FEEA6C2F6E1EE26F9A826BB329D13E28BA%26key%3Dck1&amp;amp;iurl=http://video.google.com/ThumbnailServer2?app%3Dblogger%26contentid%3Db741558237592552%26offsetms%3D5000%26itag%3Dw160%26sigh%3D1_INMjsdFGqdUVOpCvtg6lAZXxA&amp;amp;autoplay=0&amp;amp;ps=blogger"&gt;&lt;embed src="http://www.youtube.com/get_player" type="application/x-shockwave-flash"width="320" height="266" 
bgcolor="#FFFFFF"flashvars="flvurl=http://v1.nonxt2.googlevideo.com/videoplayback?id%3Db741558237592552%26itag%3D5%26app%3Dblogger%26ip%3D0.0.0.0%26ipbits%3D0%26expire%3D1329895798%26sparams%3Did,itag,ip,ipbits,expire%26signature%3D63D0813514EEC008D4ED8DDC98964EE3C97C3986.3A4572FEEA6C2F6E1EE26F9A826BB329D13E28BA%26key%3Dck1&amp;iurl=http://video.google.com/ThumbnailServer2?app%3Dblogger%26contentid%3Db741558237592552%26offsetms%3D5000%26itag%3Dw160%26sigh%3D1_INMjsdFGqdUVOpCvtg6lAZXxA&amp;autoplay=0&amp;ps=blogger"allowFullScreen="true" /&gt;&lt;/object&gt;&lt;br /&gt;&lt;/center&gt;&lt;br /&gt;&lt;br /&gt;The audio is a little clearer before being compressed for upload, but still has a very clunky feel. It is surprisingly useful - I made about 12 minor changes as a result. It has some oddities, for example ML becomes millilitre, and some pause issues with brackets, which the preprocessor sorts out. The entire paper takes 49 minutes to read, but I think I will be doing this with all papers from now on.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-1264977951949316329?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='enclosure' type='video/mp4' href='http://www.blogger.com/video-play.mp4?contentId=b741558237592552&amp;type=video%2Fmp4' length='0'/><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/1264977951949316329/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=1264977951949316329' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/1264977951949316329'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/1264977951949316329'/><link rel='alternate' type='text/html' 
href='http://neilmitchell.blogspot.com/2008/04/automated-proof-reading.html' title='Automated Proof Reading'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-5964975995963545672</id><published>2008-03-30T23:16:00.003+01:00</published><updated>2008-03-30T23:28:11.131+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='tagsoup'/><title type='text'>Toddler's play with HTML in Haskell</title><content type='html'>I just read a blog article entitled &lt;a href="http://therning.org/magnus/archives/341"&gt;Kid's play with HTML in Haskell&lt;/a&gt;, where the author extracts some information from an HTML document, using the &lt;a href="http://www.fh-wedel.de/~si/HXmlToolbox/"&gt;Haskell XML Toolbox&lt;/a&gt;. I have an alternative XML/HTML library, &lt;a href="http://www-users.cs.york.ac.uk/~ndm/tagsoup/"&gt;TagSoup&lt;/a&gt;, so I decided to implement their problem with my library.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;The Problem&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Given an HTML file, extract all hyperlinks to mp3 files.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;In TagSoup&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;[mp3 | TagOpen "a" atts &lt;- parseTags txt&lt;br /&gt;     , ("href",mp3) &lt;- atts&lt;br /&gt;     , takeExtension mp3 == ".mp3"]&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The code is a list comprehension. The first line says use TagSoup to parse the text, and pick all "a" links. The second line says pick all "href" attributes from the tag you matched. 
The final line uses the FilePath library to check the extension is mp3.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;A Complete Program&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The above fragment is all the TagSoup logic, but to match the interface of the original code exactly, we can wrap it up like so:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;import System.FilePath&lt;br /&gt;import System.Environment&lt;br /&gt;import Text.HTML.TagSoup&lt;br /&gt;&lt;br /&gt;main = do&lt;br /&gt;   [src] &lt;- getArgs&lt;br /&gt;   txt &lt;- readFile src&lt;br /&gt;   mapM_ putStrLn [mp3 | TagOpen "a" atts &lt;- parseTags txt&lt;br /&gt;                       , ("href",mp3) &lt;- atts&lt;br /&gt;                       , takeExtension mp3 == ".mp3"]&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Summary&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;If you have a desire to quickly get a bit of information out of some XML/HTML page, TagSoup may be the answer. It isn't intended to be a complete HTML framework, but it does nicely optimise fairly common patterns of use.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-5964975995963545672?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/5964975995963545672/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=5964975995963545672' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/5964975995963545672'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/5964975995963545672'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2008/03/toddlers-play-with-html-in-haskell.html' title='Toddler&apos;s play with HTML in Haskell'/><author><name>Neil 
Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-825918600723258965</id><published>2008-03-11T18:45:00.004Z</published><updated>2008-03-11T19:05:29.151Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='ada'/><title type='text'>Poor Ada Error Message</title><content type='html'>I have been demonstrating on the York University "Algorithms and Data Structures" course for 4 years now. As part of the course, first year students learn Ada. A lot of the error messages are really bad - but over time I've created a mental mapping between the message and the cause. I am now fairly fluent at recognising what mistake a student has made, given the exercise they are attempting and the error message. But yesterday I encountered a brand new misleading error message.&lt;br /&gt;&lt;br /&gt;The error message was:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;bad.adb:12:22: actual for "N" must be a variable&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;As always, the most useful thing in the error message is the line number. I read enough of the error message to check whether it's a parse error, type error or something else, then head for the line mentioned. (I follow this same tactic in all languages, not just Ada.)&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;12:     Add_Cell(I, Next(N));&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Nothing obviously wrong about this statement, so I read the error message. It seems to want &lt;tt&gt;N&lt;/tt&gt; to be a variable. But I already know that &lt;tt&gt;N&lt;/tt&gt; &lt;i&gt;is&lt;/i&gt; a variable, or at the very least a parameter, so this condition seems to be met. 
&lt;br /&gt;&lt;br /&gt;Next step is to head to the definitions of &lt;tt&gt;Next&lt;/tt&gt; and &lt;tt&gt;Add_Cell&lt;/tt&gt;, to see if they can shed some light on the situation.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;function Next(N: List) return List is ...&lt;br /&gt;procedure Add_Cell(I: Integer; N: in out List) is ...&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The &lt;tt&gt;in out&lt;/tt&gt; in &lt;tt&gt;Add_Cell&lt;/tt&gt; can be read as "pass as a pointer". Aha, maybe the error message is complaining that the second argument to &lt;tt&gt;Add_Cell&lt;/tt&gt; can't be made a pointer, as it's a return value from a function. That would explain it, and indeed, that turned out to be the cause of the problem. But back to the error message, what was it trying to tell us?&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;bad.adb:12:22: actual for "N" must be a variable&lt;br /&gt;12:     Add_Cell(I, Next(N));&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;When the error message refers to &lt;tt&gt;N&lt;/tt&gt; it isn't talking about the variable &lt;tt&gt;N&lt;/tt&gt; I can see, but the second argument of &lt;tt&gt;Add_Cell&lt;/tt&gt;, which is also called &lt;tt&gt;N&lt;/tt&gt;. If the function being called was in a separate library, it would have been even harder to understand. A more helpful error message might have been:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;bad.adb:12:22: expression passed as the second argument to Add_Cell must be a variable&lt;br /&gt;    Found: Next(N)&lt;br /&gt;    Expected: A variable&lt;br /&gt;    Reason: Second argument of Add_Cell is declared "in out"&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;In general compiler error messages should be in terms of the line where the error resides, not requiring a large amount of global knowledge. The error can be resolved, but without help from the message. 
All compilers have bad error messages in some circumstances, but this one seems almost malicious!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-825918600723258965?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/825918600723258965/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=825918600723258965' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/825918600723258965'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/825918600723258965'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2008/03/poor-ada-error-message.html' title='Poor Ada Error Message'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-7071959736035929654</id><published>2008-03-10T16:16:00.003Z</published><updated>2008-03-10T16:59:07.222Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='yhc'/><title type='text'>Sorting At Speed</title><content type='html'>&lt;a href="http://en.wikipedia.org/wiki/Sorting_algorithm"&gt;Sorting&lt;/a&gt; is currently a hot topic within the Haskell community. 
Christopher brought it up in a &lt;a href="http://www.nabble.com/(flawed-)-benchmark-:-sort-td15817832.html"&gt;recent thread&lt;/a&gt; on the mailing list, and this weekend I ended up spending several hours looking at sort routines.&lt;br /&gt;&lt;br /&gt;I was browsing through the &lt;a href="http://www.haskell.org/haskellwiki/Yhc"&gt;Yhc&lt;/a&gt; standard libraries, as one does on the weekend, and was drawn to Yhc's sort function. It had some undesirable characteristics for one of the projects I was working on, so I wondered if other Haskell systems used different implementations. I checked GHC, and discovered their sort was different. In general, when Yhc and GHC have different implementations of a standard library function, the GHC one is better tuned for performance. I decided to replace the Yhc sort function with the GHC one, but before doing so, thought a quick performance test was in order. So I came up with something simple:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;test = do&lt;br /&gt;    src &lt;- readFile "Sort.hs"&lt;br /&gt;    print $ ordered $ sort $ sort $ reverse $ sort src&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The file "Sort.hs" was simply the source code to the program. The code sorts the contents of this file, then reverses it, sorts it and sorts it again. This means that the program performs one sort over semi-random data, one over reverse-ordered data and one over ordered-data. These are some fairly standard cases that should be checked. This test is not a comprehensive benchmark, but a nice quick indicator.&lt;br /&gt;&lt;br /&gt;I ran the Yhc sort function against the GHC version, and was shocked to find that the Yhc code was twice as fast. I ran the benchmark under Yhc, GHC and Hugs (using reduction count in Hugs), and in all cases the performance was doubled. 
I was not expecting this result!&lt;br /&gt;&lt;br /&gt;The code for the GHC sort is:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;mergesort :: (a -&gt; a -&gt; Ordering) -&gt; [a] -&gt; [a]&lt;br /&gt;mergesort cmp = mergesort' cmp . map (:[])&lt;br /&gt;&lt;br /&gt;mergesort' :: (a -&gt; a -&gt; Ordering) -&gt; [[a]] -&gt; [a]&lt;br /&gt;mergesort' cmp [] = []&lt;br /&gt;mergesort' cmp [xs] = xs&lt;br /&gt;mergesort' cmp xss = mergesort' cmp (merge_pairs cmp xss)&lt;br /&gt;&lt;br /&gt;merge_pairs :: (a -&gt; a -&gt; Ordering) -&gt; [[a]] -&gt; [[a]]&lt;br /&gt;merge_pairs cmp [] = []&lt;br /&gt;merge_pairs cmp [xs] = [xs]&lt;br /&gt;merge_pairs cmp (xs:ys:xss) = merge cmp xs ys : merge_pairs cmp xss&lt;br /&gt;&lt;br /&gt;merge :: (a -&gt; a -&gt; Ordering) -&gt; [a] -&gt; [a] -&gt; [a]&lt;br /&gt;merge cmp [] ys = ys&lt;br /&gt;merge cmp xs [] = xs&lt;br /&gt;merge cmp (x:xs) (y:ys)&lt;br /&gt; = case x `cmp` y of&lt;br /&gt;        GT -&gt; y : merge cmp (x:xs)   ys&lt;br /&gt;        _  -&gt; x : merge cmp    xs (y:ys)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The function works by splitting the list into one element lists, resulting in each basic list being ordered. These lists are then merged in pairs until a single list is left. For example, given the input &lt;tt&gt;"sort"&lt;/tt&gt;:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;"s"  "o"  "r"  "t"&lt;br /&gt;  "os"      "rt"&lt;br /&gt;      "orst"&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;We first split each character into its own list, then merge adjacent pairs. This code corresponds to the standard &lt;a href="http://en.wikipedia.org/wiki/Merge_sort"&gt;merge sort&lt;/a&gt;. 
But instead of making each initial list a single element, we could use sequences of increasing elements, for example using the &lt;tt&gt;&lt;a href="http://blog.jbapple.com/2008/01/extra-type-safety-using-polymorphic.html"&gt;risers&lt;/a&gt;&lt;/tt&gt; function:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;risers :: Ord a =&gt; [a] -&gt; [[a]]&lt;br /&gt;risers [] = []&lt;br /&gt;risers [x] = [[x]]&lt;br /&gt;risers (x:y:etc) = if x &lt;= y then (x:s):ss else [x]:(s:ss)&lt;br /&gt;     where (s:ss) = risers (y:etc)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Now, if we apply &lt;tt&gt;risers "sort"&lt;/tt&gt; we get &lt;tt&gt;["s","ort"]&lt;/tt&gt;. We can now follow the same merge procedure as before:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;"s"    "ort"&lt;br /&gt;  "orst"&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Instead of doing 3 merges, we have done only 1. Given the input &lt;tt&gt;"abcd"&lt;/tt&gt; the effect would have been even more dramatic. We can refine this scheme further, by detecting both ascending and descending chains of elements in the initial list. This technique is used by Yhc, and is based on code originally written by &lt;a href="http://augustss.blogspot.com/"&gt;Lennart Augustsson&lt;/a&gt;. Knowing the original source of the code, my shock at the performance benefits offered by the Yhc version has decreased substantially.&lt;br /&gt;&lt;br /&gt;The GHC sort function should probably be replaced by the one from the Yhc libraries. This would offer increased performance, especially in the case of ordered or reverse-ordered lists. The asymptotic complexity of the two sorts means that there &lt;i&gt;must&lt;/i&gt; exist some value of &lt;tt&gt;n&lt;/tt&gt; such that &lt;tt&gt;sort [1..n]&lt;/tt&gt; runs faster in Yhc than GHC (assuming sufficient stack/heap for both). 
I wonder whether &lt;tt&gt;Int32&lt;/tt&gt; is capable of expressing such a value...&lt;br /&gt;&lt;br /&gt;&lt;i&gt;A side note:&lt;/i&gt; I have been playing with the &lt;tt&gt;risers&lt;/tt&gt; function for several years. I've used it as an example of pattern match checking, both specialised to &lt;tt&gt;Int&lt;/tt&gt; and on a general &lt;tt&gt;Ord&lt;/tt&gt; class. I've used it for supercompiling. It's appeared in blog posts, a TFP paper (Mitchell+Runciman 2007), a Haskell Workshop paper (Xu 2006) etc. I originally had this example suggested from a functional programming exam paper, but only today at lunch did I discover its true origins. The &lt;tt&gt;risers&lt;/tt&gt; function had originally been the first-step in a merge sort!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-7071959736035929654?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/7071959736035929654/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=7071959736035929654' title='10 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/7071959736035929654'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/7071959736035929654'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2008/03/sorting-at-speed.html' title='Sorting At Speed'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' 
src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>10</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-5934591440795745710</id><published>2008-03-04T11:01:00.000Z</published><updated>2008-03-04T11:57:17.222Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='supero'/><title type='text'>Lazy Evaluation: Strict vs. Speculative</title><content type='html'>For the last few days I have been thinking about how to write a low-level program optimiser, based on the ideas from &lt;a href="http://www-users.cs.york.ac.uk/~ndm/supero/"&gt;Supero&lt;/a&gt;. Supero works at the level of lazy Core expressions, but actual hardware works on a sequence of strict instructions. The possible idea is to translate the lazy expressions to strict sequences, then borrow the ideas from supercompilation once more. In particular I have been looking at the &lt;a href="http://citeseer.ist.psu.edu/boquist96grin.html"&gt;GRIN&lt;/a&gt; approach, which defines such a set of instructions.&lt;br /&gt;&lt;br /&gt;The GRIN work is very clever, and has many ideas that I would like to reuse. However, the one aspect that gave me slight concern is the complexity. A GRIN program requires the use of several analysis passes, and many many transformation rules. While this approach is perfectly acceptable, one of the goals of the Supero work is to make the optimisation process simpler -- comprising a few simple but powerful rules.&lt;br /&gt;&lt;br /&gt;I will first explain how strictness works, then how my speculative approach works. Readers who already know about unboxing are encouraged to skip to the speculative section.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Strictness&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;When doing low-level compilation, one of the most important stages is strictness analysis, and the associated unboxing. 
To take the example of the &lt;tt&gt;factorial&lt;/tt&gt; function in Haskell:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;factorial :: Int -&gt; Int&lt;br /&gt;factorial n = if n &gt; 0 then n * factorial (n-1) else 1&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Here it is easy to see that the &lt;tt&gt;factorial&lt;/tt&gt; function always evaluates &lt;tt&gt;n&lt;/tt&gt;. We can also use our knowledge of the definition of &lt;tt&gt;Int&lt;/tt&gt;:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;data Int = Int# I#&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Where &lt;tt&gt;I#&lt;/tt&gt; is an actual machine integer (possibly stored in a register), and &lt;tt&gt;Int#&lt;/tt&gt; is a lazy box surrounding it. Since we know that &lt;tt&gt;factorial&lt;/tt&gt; will always unwrap our &lt;tt&gt;n&lt;/tt&gt;, we can pass the &lt;tt&gt;n&lt;/tt&gt; around without the &lt;tt&gt;Int#&lt;/tt&gt; box. I have made all the conversions from &lt;tt&gt;I#&lt;/tt&gt; to &lt;tt&gt;Int&lt;/tt&gt; explicit using an &lt;tt&gt;Int#&lt;/tt&gt;, but have left all the unboxings implicit. The operators &lt;tt&gt;&amp;gt;#&lt;/tt&gt; etc. are simply unboxed and strict variants of the standard operators.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;factorial# :: Int# -&gt; Int&lt;br /&gt;factorial# n# = if n# &gt;# 0 then n# *# factorial (Int# n# - 1) else 1&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Also, since we know &lt;tt&gt;factorial&lt;/tt&gt; is strict in its first argument, we can evaluate the first argument to the recursive call strictly. Applying all these optimisations, we can now write:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;factorial# :: Int# -&gt; Int&lt;br /&gt;factorial# n# = if n# &gt;# 0 then n# *# factorial# (n# -# 1) else 1&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;We have removed the explicit boxing in the recursive call, and work entirely with unboxed integers. Now &lt;tt&gt;factorial&lt;/tt&gt; is entirely strict. 
We can even write a wrapper around our strict version, to provide a lazy interface matching the original.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;factorial :: Int -&gt; Int&lt;br /&gt;factorial n = factorial# n#&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;I have used &lt;tt&gt;n#&lt;/tt&gt; to denote the unboxing of &lt;tt&gt;n&lt;/tt&gt;. Now &lt;tt&gt;factorial&lt;/tt&gt; looks like it did before, but operates much faster, on unboxed integers.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Speculative&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;I would like to not include a strictness analyser in my optimiser, or if it is included, have it be the result of a series of transformations -- without explicit "stop and analyse" then "use the results" stages. As part of my thoughts on this, I was trying to consider how to optimise &lt;tt&gt;factorial&lt;/tt&gt; without invoking the strictness analyser.&lt;br /&gt;&lt;br /&gt;The speculative transformation I have defined first generates &lt;tt&gt;factorial#&lt;/tt&gt; - I have left out the details of &lt;i&gt;why&lt;/i&gt; it decides to.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;factorial :: Int -&gt; Int&lt;br /&gt;factorial n = if n &gt; 0 then n * factorial (n-1) else 1&lt;br /&gt;&lt;br /&gt;factorial# :: Int# -&gt; Int&lt;br /&gt;factorial# n# = if n# &gt;# 0 then n# *# factorial (Int# n# - 1) else 1&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;This step is entirely safe - we have defined &lt;tt&gt;factorial#&lt;/tt&gt;, but we have not written a wrapper that invokes it, even in the recursive case. The &lt;tt&gt;factorial#&lt;/tt&gt; function is equivalent to &lt;tt&gt;factorial&lt;/tt&gt; if the initial argument was evaluated. We have transformed &lt;tt&gt;factorial#&lt;/tt&gt; using only local knowledge, at the point. We can also transform &lt;tt&gt;factorial&lt;/tt&gt;, replacing any uses of &lt;tt&gt;n&lt;/tt&gt; which are guaranteed to come after &lt;tt&gt;n&lt;/tt&gt; is evaluated, with &lt;tt&gt;(Int# n#)&lt;/tt&gt;. 
This transformation is merely reusing the knowledge we have gained unwrapping &lt;tt&gt;n&lt;/tt&gt;:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;factorial n = if n &gt; 0 then Int# n# * factorial (Int# n# - 1) else 1&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Now we promote any primitive operations on only unboxed values. Given &lt;tt&gt;(-)&lt;/tt&gt;, it is cheaper to evaluate the subtraction than to store a lazy thunk to the function.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;factorial n = if n &gt; 0 then Int# n# * factorial (Int# (n# -# 1)) else 1&lt;br /&gt;&lt;br /&gt;factorial# n# = if n# &gt;# 0 then n# *# factorial (Int# (n# -# 1)) else 1&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;We can now use our knowledge that if we know an argument to a function is already evaluated, we can call the strict variant (this corresponds closely to &lt;a href="http://research.microsoft.com/~simonpj/papers/spec-constr/index.htm"&gt;constructor specialisation&lt;/a&gt;):&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;factorial n = if n &gt; 0 then n# *# factorial# (n# -# 1) else 1&lt;br /&gt;&lt;br /&gt;factorial# n# = if n# &gt;# 0 then n# *# factorial# (n# -# 1) else 1&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;We can also replace the &lt;tt&gt;*&lt;/tt&gt; in &lt;tt&gt;factorial&lt;/tt&gt; with &lt;tt&gt;*#&lt;/tt&gt; as we know we will have to evaluate the result of a function. Now we have ended up with a fast inner loop, operating only on unboxed integers. We have not required strictness information to make any transformation.&lt;br /&gt;&lt;br /&gt;One way of viewing the difference between strictness and this transformation is the flow of information. In strictness, the caller is informed that a particular argument will be evaluated. In speculative, the caller informs the callee that an argument has already been evaluated. 
These two concepts are not the same, and while they overlap, there are instances where they differ considerably.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Strict vs. Speculative&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Consider the following example:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;strict :: Int -&gt; Int&lt;br /&gt;strict x = x `seq` lazy x (x-1) (x+1)&lt;br /&gt;&lt;br /&gt;lazy :: Int -&gt; Int -&gt; Int -&gt; Int&lt;br /&gt;lazy a b c = if a == 0 then b else c&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Here the &lt;tt&gt;lazy&lt;/tt&gt; function is strict in &lt;tt&gt;a&lt;/tt&gt;, but not either of &lt;tt&gt;b&lt;/tt&gt; or &lt;tt&gt;c&lt;/tt&gt;. A strictness analyser would generate a variant of &lt;tt&gt;lazy&lt;/tt&gt; with only the first argument unboxed. In contrast the speculative variant will determine that &lt;tt&gt;x-1&lt;/tt&gt; and &lt;tt&gt;x+1&lt;/tt&gt; should be evaluated, and pass unboxed values in all arguments of &lt;tt&gt;lazy&lt;/tt&gt;, even though &lt;tt&gt;lazy&lt;/tt&gt; may not evaluate &lt;tt&gt;b&lt;/tt&gt; or &lt;tt&gt;c&lt;/tt&gt;.&lt;br /&gt;&lt;br /&gt;To see this behaviour in GHC, it helps to make &lt;tt&gt;lazy&lt;/tt&gt; recursive:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;module Temp where&lt;br /&gt;&lt;br /&gt;strict :: Int -&gt; Int&lt;br /&gt;strict x = x `seq` lazy x (x+1) (x-1)&lt;br /&gt;&lt;br /&gt;lazy :: Int -&gt; Int -&gt; Int -&gt; Int&lt;br /&gt;lazy a b c = if a == 0 then lazy b b b else c&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Now run with the options &lt;tt&gt;ghc Temp.hs -c -O2 -ddump-simpl&lt;/tt&gt;, and you will see the &lt;tt&gt;lazy&lt;/tt&gt; variant has type &lt;tt&gt;lazy :: Int# -&gt; Int -&gt; Int -&gt; Int&lt;/tt&gt;.&lt;br /&gt;&lt;br /&gt;These thoughts are still very preliminary, and there are a number of unanswered questions:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;What is the overlap between strict and speculative?&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Can both variants be 
combined? (almost certainly yes)&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Is speculative really simpler?&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Is speculative sufficient?&lt;/li&gt;&lt;br /&gt;&lt;li&gt;What are the performance benefits of speculative?&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-5934591440795745710?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/5934591440795745710/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=5934591440795745710' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/5934591440795745710'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/5934591440795745710'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2008/03/lazy-evaluation-strict-vs-speculative.html' title='Lazy Evaluation: Strict vs. Speculative'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-2946662469752574405</id><published>2008-02-29T17:17:00.000Z</published><updated>2008-02-29T18:13:52.970Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='hoogle'/><title type='text'>Hoogle 3 Security Bug</title><content type='html'>I recently released &lt;a href="http://haskell.org/hoogle/"&gt;Hoogle 3.1&lt;/a&gt;, in response to a security bug spotted by Tillmann Rendel. 
Checking back through my records, I found that Botje on #haskell had previously spotted the same issue, but at the time I hadn't noticed it was a security bug. The security implications of the bug are very low, and it could not be used to cause any real harm.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;The Bug&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The bug is that in the Hoogle web interface, user supplied data may end up being shown to the user without escaping. For example, searching for the &lt;a href="http://haskell.org/hoogle/?q=1"&gt;number 1&lt;/a&gt; results in an error message, which says "Parse Error: Unexpected character '1'". Unfortunately, in Hoogle 3, if that search string had been "1&amp;lt;u&amp;gt;2&amp;lt;/u&amp;gt;", then the result would have been "Parse Error: Unexpected character '1&lt;u&gt;2&lt;/u&gt;'" - i.e. the number 2 would be underlined. If you try this same example in Hoogle 3.1, you do not get any formatting, and see the entered tags.&lt;br /&gt;&lt;br /&gt;The bug could be provoked in several places:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;The error message, on a parse error.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;The input box, after a search had been entered.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;As the string listed as what the user searched for.&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;To perturb the input box would require entering a quote character ("), and to perturb the other instances would require an opening angle bracket (&amp;lt;).&lt;br /&gt;&lt;br /&gt;&lt;b&gt;The Severity&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;I am fairly sure the severity of this bug is "incredibly low". As a result of entering a malicious query, the attacker could cause the page displayed to contain whatever was desired. However, the Hoogle online interface has no privileges beyond that of a normal web page, and so can't actually do anything evil. 
The bug does not permit any supplied code to be executed on the server.&lt;br /&gt;&lt;br /&gt;There is only one malicious use I could think of: browser redirects. Sometimes evil companies will send out spam mail, with links such as "click here to go to example.com and order Viagra". One anti-spam measure is to reject all emails linking to a particular domain name. By crafting a URL, it was possible for a link to Hoogle to redirect to another domain, thus making it appear that the initial link was to a trusted website. The spam recipient still goes to the original page, but it may defeat their spam filters.&lt;br /&gt;&lt;br /&gt;Checking the server logs for Hoogle shows that no one ever actually exploited the flaw to perform a redirect, or even to insert a &amp;lt;script&amp;gt; tag - the first step to any such exploit.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;The Fix&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;I had to make two fixes to the code. I use &lt;a href="http://www.cs.chalmers.se/~d00nibro/haskell-src-exts/"&gt;Haskell Source Extensions&lt;/a&gt; to generate most of the HTML shown in Hoogle. As part of that, I have a &lt;tt&gt;ToXML&lt;/tt&gt; class that automatically converts values to an XML representation, which is then rendered. The &lt;tt&gt;ToXML&lt;/tt&gt; instance for &lt;tt&gt;String&lt;/tt&gt; did not escape special HTML characters; now it does. I wrote the &lt;tt&gt;ToXML&lt;/tt&gt; instances myself, instead of relying on those supplied in the associated &lt;a href="http://www.cs.chalmers.se/~d00nibro/hsp/"&gt;HSP&lt;/a&gt;, and thus introduced the bug.&lt;br /&gt;&lt;br /&gt;The only other code that generates HTML uses a formatted string type, which can represent hyperlinks and various formatting, and can be rendered as either console escape characters or as HTML. Since this part of the code was written before moving to Haskell Source Extensions, it generates raw strings. 
This generating code was also patched to escape certain characters.&lt;br /&gt;&lt;br /&gt;As a result of using libraries and abstractions, it wasn't necessary to fix each of the security flaws one by one; it was enough to fix the interface to the library. In doing so, I have much more confidence that all the security flaws have been tackled once and for all, and that they will not recur.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Is Haskell Insecure?&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Enhanced security is one of the many advantages that Haskell offers. It is not possible to overrun a buffer and conduct stack smashing attacks on a Haskell program. Passing query strings will not overwrite global variables, and escaping cannot cause user code to be executed on the server. However, when Haskell code generates HTML, it is not immune from code injection attacks on the client side.&lt;br /&gt;&lt;br /&gt;In the beginning Hoogle did not use any HTML generation libraries. As I have slowly moved towards Haskell Source Extensions, I have benefited from better guarantees about well-formed HTML. By creating appropriate abstractions, and dealing with concerns like escaping at the right level, and enforcing these decisions with appropriate types, the number of places to introduce a security bug is lowered. 
Hopefully Hoogle will not fall victim to such a security problem in future.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-2946662469752574405?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/2946662469752574405/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=2946662469752574405' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/2946662469752574405'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/2946662469752574405'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2008/02/hoogle-3-security-bug.html' title='Hoogle 3 Security Bug'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-3869017144793319590</id><published>2008-02-28T00:02:00.000Z</published><updated>2008-02-28T00:31:53.646Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='hoogle'/><category scheme='http://www.blogger.com/atom/ns#' term='cabal'/><title type='text'>Adding data files using Cabal</title><content type='html'>&lt;a href="http://www.haskell.org/cabal/"&gt;Cabal&lt;/a&gt; is the standard method of packaging Haskell programs and libraries for release. One problem I've encountered more than once is that adding data files to a Cabal built project is not as easy as it could be. 
I'm not entirely sure why - having just added data file support to &lt;a href="http://www.haskell.org/hoogle/"&gt;Hoogle&lt;/a&gt;, it wasn't excessively painful, but I still came out of the experience feeling slightly bruised. To help others (and my future self), I thought I'd write down the details while they are still freshly spinning round my head.&lt;br /&gt;&lt;br /&gt;Let's assume we start with an existing Cabal project, with an associated .cabal file. In the root directory of the project we have readme.txt and data.txt. The readme file contains a basic introduction to the user, and the data file contains some data that the program needs to access at runtime.&lt;br /&gt;&lt;br /&gt;We first modify the .cabal file to add the following lines in the top section:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;Extra-Source-Files:&lt;br /&gt;    readme.txt&lt;br /&gt;Data-Files:&lt;br /&gt;    data.txt&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The &lt;tt&gt;Extra-Source-Files&lt;/tt&gt; section tells Cabal to put the files in the release tarball, but nothing more - for a readme this behaviour is perfect. The &lt;tt&gt;Data-Files&lt;/tt&gt; section tells Cabal that the following files contain data which the program will want to access at runtime. Data files include things like big tables, the hoogle function search database, graphics/game data files for games, UI description files for GUIs, etc.&lt;br /&gt;&lt;br /&gt;Now that we have added the data file to Cabal's control, Cabal will automatically manage it for us. It will be added to the source tarball, and will be installed somewhere appropriate on the user's system, following operating system guidelines. The only question is where Cabal has put the file. To figure this out, Cabal generates a &lt;tt&gt;Paths_hoogle&lt;/tt&gt; module (change the project name as appropriate) which it links in with the program. 
The Paths module provides the function:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;getDataFileName :: FilePath -&gt; IO FilePath&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;At runtime, to find the data file, we can simply call &lt;tt&gt;getDataFileName "data.txt"&lt;/tt&gt;, and Cabal will tell us where the data file resides.&lt;br /&gt;&lt;br /&gt;The above method works well after a program has been installed, but is harder to work with while developing a program. To alleviate these problems, we can add our own Paths module to the program, for example:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;module Paths_hoogle where&lt;br /&gt;&lt;br /&gt;getDataFileName :: FilePath -&gt; IO FilePath&lt;br /&gt;getDataFileName = return&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Place this module alongside all the other modules. While developing the program our hand-created Paths module will be invoked, which says the data is always in the current directory. When doing a Cabal build, Cabal will choose its custom generated Paths module over ours, and we get the benefits of Cabal managing our data.&lt;br /&gt;&lt;br /&gt;Cabal's support for data files, and extra source files, is very useful. 
It doesn't take much work to make use of the provided facilities, and it will help to ensure that users of your program on all operating systems get the style of installation they were expecting.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-3869017144793319590?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/3869017144793319590/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=3869017144793319590' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/3869017144793319590'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/3869017144793319590'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2008/02/adding-data-files-using-cabal.html' title='Adding data files using Cabal'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-8446419372702220687</id><published>2008-01-26T15:33:00.000Z</published><updated>2008-01-26T15:37:07.719Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='catch'/><category scheme='http://www.blogger.com/atom/ns#' term='supero'/><title type='text'>Safety and Optimisation: joinMaybes'</title><content type='html'>In a recent &lt;a href="http://conal.net/blog/posts/a-handy-generalized-filter/"&gt;blog post by Conal&lt;/a&gt;, he introduced the &lt;tt&gt;joinMaybes'&lt;/tt&gt; function, defined 
as:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;filterMP' :: MonadPlus m =&gt; (a -&gt; Bool) -&gt; m a -&gt; m a&lt;br /&gt;filterMP' p = (&gt;&gt;= f)&lt;br /&gt; where&lt;br /&gt;   f a | p a       = return a&lt;br /&gt;       | otherwise = mzero&lt;br /&gt; &lt;br /&gt;joinMaybes' :: MonadPlus m =&gt; m (Maybe a) -&gt; m a&lt;br /&gt;joinMaybes' = liftM fromJust . filterMP' isJust&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;He laments that the use of &lt;tt&gt;isJust&lt;/tt&gt; and &lt;tt&gt;fromJust&lt;/tt&gt; means that his code will run slower (having two &lt;tt&gt;Just&lt;/tt&gt; tests), and that an automated checker such as &lt;a href="http://www-users.cs.york.ac.uk/~ndm/catch/"&gt;Catch&lt;/a&gt; won't be able to check it successfully. Fortunately, Catch can check the code perfectly, and &lt;a href="http://www-users.cs.york.ac.uk/~ndm/supero/"&gt;Supero&lt;/a&gt; can optimise the code perfectly. As such, this simple definition is perfectly fine from all points of view. I'm going to go through the checking with Catch in some detail, and if anyone wants, I'll post another article on the optimisation with Supero.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Checking With Catch&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;To simplify things, I'm going to work only in the [] monad, so here is a new variant of the code:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;filterMP' :: (a -&gt; Bool) -&gt; [a] -&gt; [a]&lt;br /&gt;filterMP' p = concatMap f&lt;br /&gt; where&lt;br /&gt;   f a | p a       = [a]&lt;br /&gt;       | otherwise = []&lt;br /&gt; &lt;br /&gt;joinMaybes' :: [Maybe a] -&gt; [a]&lt;br /&gt;joinMaybes' = map fromJust . filterMP' isJust&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Catch would remove the dictionaries before starting, so would accept the original code unmodified. The first thing Catch would do is reduce this fragment to first-order. 
The end translation would be:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;filterMP_isJust x = concatMap_f x&lt;br /&gt;concatMap_f [] = []&lt;br /&gt;concatMap_f (x:xs) = (if isJust x then [x] else []) ++ concatMap_f xs&lt;br /&gt;&lt;br /&gt;joinMaybes x = map_fromJust (filterMP_isJust x)&lt;br /&gt;map_fromJust [] = []&lt;br /&gt;map_fromJust (Just x:xs) = x : map_fromJust xs&lt;br /&gt;map_fromJust (Nothing:xs) = error "Pattern match error"&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;I've also taken the liberty of inlining the otherwise, and used pattern matching rather than &lt;tt&gt;case&lt;/tt&gt; expressions. Catch will take care of those details for us, but the code is a little easier to follow without them in. Now Catch can begin the checking process.&lt;br /&gt;&lt;br /&gt;Catch first decides that if &lt;tt&gt;map_fromJust&lt;/tt&gt; is passed a list matching the pattern &lt;tt&gt;(Nothing:_)&lt;/tt&gt;, it will crash, and annotates the precondition of &lt;tt&gt;map_fromJust&lt;/tt&gt; as being either the input list is &lt;tt&gt;[]&lt;/tt&gt; or &lt;tt&gt;(Just _:_)&lt;/tt&gt;. It then spots the recursive call within &lt;tt&gt;map_fromJust&lt;/tt&gt;, and determines that the revised precondition should be that the input list is a list, of any length, whose elements are all &lt;tt&gt;Just&lt;/tt&gt; constructed (we call this condition P).&lt;br /&gt;&lt;br /&gt;Having determined the precondition on &lt;tt&gt;map_fromJust&lt;/tt&gt;, it uses that within &lt;tt&gt;joinMaybes&lt;/tt&gt;. Catch transforms the condition P, trying to find the precondition on &lt;tt&gt;filterMP_isJust&lt;/tt&gt; to ensure the postcondition P holds. By examining each branch, Catch determines that under &lt;i&gt;all&lt;/i&gt; circumstances the postcondition will hold, therefore the precondition is just true. 
Given that &lt;tt&gt;filterMP_isJust&lt;/tt&gt; always satisfies the precondition of &lt;tt&gt;map_fromJust&lt;/tt&gt;, it is clear that &lt;tt&gt;joinMaybes&lt;/tt&gt; never crashes.&lt;br /&gt;&lt;br /&gt;Catch can generate the above proof automatically, showing the above function is safe.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-8446419372702220687?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/8446419372702220687/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=8446419372702220687' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/8446419372702220687'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/8446419372702220687'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2008/01/safety-and-optimisation-joinmaybes.html' title='Safety and Optimisation: joinMaybes&apos;'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-5123737845417039226</id><published>2008-01-25T14:37:00.000Z</published><updated>2008-02-19T23:45:32.068Z</updated><title type='text'>Functional Flow Control</title><content type='html'>In normal programming languages, there are many keywords for &lt;i&gt;flow control&lt;/i&gt;, such as &lt;tt&gt;for&lt;/tt&gt;, &lt;tt&gt;while&lt;/tt&gt; etc. 
These flow control keywords encode a particular pattern of iteration, such as looping over a range (in the case of &lt;tt&gt;for&lt;/tt&gt;) or continuing until some condition holds (&lt;tt&gt;while&lt;/tt&gt;). Imperative programming languages continue to add more iteration keywords: both C#/Java have introduced some form of &lt;tt&gt;for-each&lt;/tt&gt;; Python/C# have &lt;tt&gt;yield&lt;/tt&gt;; Ada has many variants.&lt;br /&gt;&lt;br /&gt;Haskell doesn't have these iteration keywords, but instead relies on recursion. This choice, when coupled with a few other Haskell ingredients, makes it much more powerful. Take for example the task of looping over a sequence, adding 1 to each element:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;// In C&lt;br /&gt;for (int i = 0; i &lt; n; i++)&lt;br /&gt;    list[i]++;&lt;br /&gt;&lt;br /&gt;-- In Haskell&lt;br /&gt;incList (x:xs) = 1+x : incList xs&lt;br /&gt;incList [] = []&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;I've used C mutating an array, and Haskell allocating a new list, simply because that would be the natural thing to do in each language. However, the great thing about higher-order functions is that we can now go back and abstract the flow control in Haskell, giving us:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;incList = map (+1)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The above function maps over a list, incrementing each element. People have identified a common pattern (iterating over a list) and rather than baking it into the language with a keyword such as &lt;tt&gt;iterate-over-list&lt;/tt&gt;, a library function can provide the operation. 
It is very important that &lt;tt&gt;map&lt;/tt&gt; is not special in any way, and can simply be defined as:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;map f (x:xs) = f x : map f xs&lt;br /&gt;map f [] = []&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The great advantage is that rather than being restricted to a limited range of flow control operators that someone somewhere decided upon, we can add new ones. Let's take another example, that of summing a list:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;// in C&lt;br /&gt;int total = 0;&lt;br /&gt;for (int i = 0; i &lt; n; i++)&lt;br /&gt;   total += list[i];&lt;br /&gt;&lt;br /&gt;-- in Haskell&lt;br /&gt;sum [] = 0&lt;br /&gt;sum (x:xs) = x + sum xs&lt;br /&gt;&lt;br /&gt;-- or using the built in foldl&lt;br /&gt;sum = foldl (+) 0&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;In Haskell there is a standard library function &lt;tt&gt;foldl&lt;/tt&gt; which iterates over a list using an accumulator, managing updates to the accumulator for you, and setting an initial value. In C there is no such operator, so the more general purpose &lt;tt&gt;for&lt;/tt&gt; is used.&lt;br /&gt;&lt;br /&gt;But these examples are very common, so C's &lt;tt&gt;for&lt;/tt&gt; keyword has provided most of the control flow. However, sometimes you need more exotic flow control, which the authors of the language did not think of including. 
Take the example of computing a fixed point of a function &lt;tt&gt;f&lt;/tt&gt; on the value &lt;tt&gt;x&lt;/tt&gt;:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;// In C&lt;br /&gt;int x;&lt;br /&gt;while (1) {&lt;br /&gt;   int x2 = f(x);&lt;br /&gt;   if (x == x2) break;&lt;br /&gt;   x = x2;&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;-- In Haskell&lt;br /&gt;fix f x = if x == x2 then x else fix f x2&lt;br /&gt;   where x2 = f x&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Here the Haskell version shows its power: instead of having defined a particular instance for a particular &lt;tt&gt;f&lt;/tt&gt; and a particular type of value &lt;tt&gt;x&lt;/tt&gt;, in Haskell we have basically defined &lt;tt&gt;fix&lt;/tt&gt; as a new form of flow control.&lt;br /&gt;&lt;br /&gt;In C we were still able to define something, but it was much harder. Now consider the following example that I was working on yesterday. I have an algorithm which has 4 stages: &lt;tt&gt;lambda&lt;/tt&gt;, &lt;tt&gt;simplify&lt;/tt&gt;, &lt;tt&gt;inline&lt;/tt&gt;, &lt;tt&gt;specialise&lt;/tt&gt;. Each stage must be run in turn, but if any stage changes something, then we restart from the beginning. For example, we apply &lt;tt&gt;lambda&lt;/tt&gt;, then &lt;tt&gt;simplify&lt;/tt&gt; - if something changes we restart at &lt;tt&gt;lambda&lt;/tt&gt;. We only finish once all the stages have been run without changing anything. In Haskell this code is simple:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;fixList orig x = f orig x&lt;br /&gt;   where&lt;br /&gt;      f [] x = x&lt;br /&gt;      f (a:as) x = if x == x2 then f as x else f orig x2&lt;br /&gt;            where x2 = a x&lt;br /&gt;&lt;br /&gt;fixList [lambda,simplify,inline,specialise]&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;I define a new function called &lt;tt&gt;fixList&lt;/tt&gt;, which provides an abstraction of flow control. The actual operations have been well isolated from this structure. I considered how to express this in structured C and drew a complete blank. 
My best guess is:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;int x, x2;&lt;br /&gt;&lt;br /&gt;begin:&lt;br /&gt;x = x2;&lt;br /&gt;&lt;br /&gt;x2 = lambda(x)    ; if (x != x2) goto begin;&lt;br /&gt;x2 = simplify(x)  ; if (x != x2) goto begin;&lt;br /&gt;x2 = inline(x)    ; if (x != x2) goto begin;&lt;br /&gt;x2 = specialise(x); if (x != x2) goto begin;&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;It might be possible to change some of this flow control into a macro, but I can think of no clean abstraction. Haskell is a great language for building abstractions; flow control is just one highly useful instance.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;PS: C Issues&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Some of the code in this article isn't quite as nice as I'd wanted it to be, and isn't really a fair comparison. The array processing code in C relies on having defined &lt;tt&gt;n&lt;/tt&gt; to be the length of the list, and having that tracked separately. The fixed point definition in C works over &lt;tt&gt;int&lt;/tt&gt; to get a nice equality test, but that is merely a limitation of the language not having a standardized way to do equality. 
The C code could use function pointers, but in reality &lt;tt&gt;inline&lt;/tt&gt; takes an extra argument so is used as a closure in Haskell - and besides, that would hardly be the standard way of coding C.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-5123737845417039226?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/5123737845417039226/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=5123737845417039226' title='13 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/5123737845417039226'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/5123737845417039226'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2008/01/functional-flow-control.html' title='Functional Flow Control'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>13</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-8018445893950772941</id><published>2008-01-15T21:01:00.000Z</published><updated>2008-01-15T21:22:35.938Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='darcs'/><title type='text'>darcs Feature Request</title><content type='html'>I use &lt;a href="http://darcs.net/"&gt;darcs&lt;/a&gt; all the time. I have 15 repos &lt;a href="http://www-users.cs.york.ac.uk/~ndm/downloads/"&gt;listed on my website&lt;/a&gt;, and at least an additional 10 that haven't made that list yet. 
The main reason I use version control is for synchronising files between the various computers I use (at least 4 on a regular basis). Of course, having the version control features available as well is an added bonus. I used to use CVS, and its Windows graphical companion, Tortoise CVS, which is a fantastic tool. However, as the Haskell community has migrated to darcs it has become more convenient to standardise on one single tool.&lt;br /&gt;&lt;br /&gt;All my darcs repos are available behind ssh (for pushing to) and behind http (for pulling from). So for example, to pull from the &lt;a href="http://www-users.cs.york.ac.uk/~ndm/filepath/"&gt;filepath library&lt;/a&gt;, I use the http accessible version:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;darcs pull http://darcs.haskell.org/packages/filepath/&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;And to push my changes I do:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;darcs push neil@darcs.haskell.org:/home/darcs/packages/filepath&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The reason for pulling using http is that this operation doesn't require a public key to be active, doesn't fail if SSH is blocked (which it is in at least one of the locations I regularly visit), and is much quicker.&lt;br /&gt;&lt;br /&gt;One feature of darcs is that it saves the last repo address used in a prefs file, so after typing the long &lt;tt&gt;darcs pull&lt;/tt&gt; command the first time, in future I can just type:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;darcs pull&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Much handier! Unfortunately, if I then do a long-winded push command, it overrides the default repo, and future pulls would go from the ssh path. There is a solution; instead, type:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;darcs push --no-set-default neil@darcs.haskell.org:/home/darcs/packages/filepath&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Ouch! Now that command line is getting a bit of a mouthful. 
As a result, I pasted this line into a file &lt;a href="http://darcs.haskell.org/packages/filepath/push.bat"&gt;push.bat&lt;/a&gt;, and committed that to the repo. Now I can do the simple &lt;tt&gt;darcs pull&lt;/tt&gt; and a push is as short as:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;push&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;But this isn't ideal. It's a bit sad that one of the core Haskell libraries contains a batch script which is platform specific and only works for people with shared keys on my Haskell account (which is now about 10 people...). I'd love to remove this script, which brings me on to...&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Feature Request: have darcs allow a different default for push and pull commands.&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;One way of engineering this would be for darcs to store them entirely separately. Another option would be for pull to use the last repo url, and push to use the last repo that &lt;i&gt;wasn't&lt;/i&gt; http (since you can't usually push over http). 
There may be other options that will achieve the same effect.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7094652-8018445893950772941?l=neilmitchell.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://neilmitchell.blogspot.com/feeds/8018445893950772941/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=7094652&amp;postID=8018445893950772941' title='9 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/8018445893950772941'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7094652/posts/default/8018445893950772941'/><link rel='alternate' type='text/html' href='http://neilmitchell.blogspot.com/2008/01/darcs-feature-request.html' title='darcs Feature Request'/><author><name>Neil Mitchell</name><uri>http://www.blogger.com/profile/13084722756124486154</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://www-users.cs.york.ac.uk/~ndm/elements/my-photo.jpg'/></author><thr:total>9</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7094652.post-7578438463661240371</id><published>2008-01-11T17:25:00.000Z</published><updated>2008-01-11T18:10:57.120Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='rant'/><title type='text'>British Telecom and Evil Sales Tactics</title><content type='html'>&lt;i&gt;Not a Haskell post, just a rant about how evil British Telecom (BT) are.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;When I got home this Christmas I was welcomed with a router sitting near the computer, and asked to "install it". This perplexed me - when I last left my parents' house I had already set up broadband, with a wireless router, and it was all working perfectly. 
I chose to go with BT for my parents, because BT should be the simplest choice - they already have a BT bill, and a BT line, so adding broadband should be simple. Also, if anything goes wrong, then it has to be a problem with BT.&lt;br /&gt;&lt;br /&gt;The package I signed my parents up to came with a really rubbish little USB modem, which I replaced with a nice wireless router with 4 wired ports from &lt;a href="http://www.ebuyer.co.uk/"&gt;Ebuyer&lt;/a&gt;. It also had a 20Gb/month limit, with warnings if we went over the limit. Based on talking to my brother (who still lives at home, and is the largest internet user), I suspect we transfer around 15Gb/month.&lt;br /&gt;&lt;br /&gt;However, while I was away at Uni, someone from BT phoned up the house. Unfortunately, my Dad answered. My Dad is not qualified to deal with computers. If he wants to "visit" a website he writes down the address on a post-it note and places it near the computer, to be printed out. The person on the end of the phone told my Dad that we could upgrade our package, making it 4 times faster and £6/month cheaper, and not changing anything else. He agreed, and entered into an 18-month contract with this new plan.&lt;br /&gt;&lt;br /&gt;As part of this new plan, we got a wireless router, which is currently sitting in a cupboard somewhere (it's inferior to the £40 one from Ebuyer in every way). We also got upgraded from 2Mb/s to 8Mb/s, a "significant increase in speed". However, in practice, that makes absolutely no difference. I tried explaining this to people at BT, and they resolutely claimed it was 4 times faster. The real kick in the teeth though is that this new package is limited to 8Gb/month, and any additional usage is charged. This is particularly dangerous: if for example a trojan ends up on the machine, it could run up expensive bills. 
An 8Gb limit is most definitely not "the same" as our original plan, which was chosen based on what we actually needed.&lt;br /&gt;&lt;br /&gt;Over Christmas I had the fun job of trying to sort this out. I rang up BT several times, spent a lot of time listening to bad music and got repeatedly transferred between departments. If the largest telecom company in Britain can't run a call centre, who can? If someone lies to you while sel
