Monday, January 20, 2014

How to use Haddock

Haddock, Haskell's documentation generator of choice, is the de facto means of communicating the purpose of a module or function in the Haskell ecosystem. Despite this, it is not covered in Learn You a Haskell, and is mentioned only in passing in Real World Haskell. Even so, being able to write Haddock is as crucial as being able to write Javadoc or Doxygen. Because GHC is exposed as a library, Haddock can determine the type signatures of your library's functions by parsing its code. So, what is left up to the programmer?

Documenting a Module
The first step is to document the module you're working on. In the rendered documentation, the Portability, Stability, and Maintainer fields all come from the module documentation, as do the Module, Copyright, and License fields. They are generated by a multi-line comment with the special Haddock character, "|", at the front. That's a pipe, not a 1 or l.


The multi-line comment to generate that information in the document is:
{- |
   Module      :  OpenSSL.Digest.ByteString
   Copyright   :  (c) 2010 by Peter Simons
   License     :  BSD3

   Maintainer  :  simons@cryp.to
   Stability   :  provisional
   Portability :  portable

   Wrappers for "OpenSSL.Digest" that supports 'ByteString'.
 -}
Each field name sits on the left, followed by a colon and then the field's value. Stability and Portability are subjective. If your module's API changes frequently, or you know that it will change in the foreseeable future, mark the Stability field as 'Unstable'. If your module depends on other Haskell code not included in a default GHC installation, on the Foreign Function Interface, or on a feature only found in a very recent or very old version of GHC, mark your module as Non-portable, and possibly give a reason why. For instance, in one of my modules, I have the line "Portability :   Portable (standalone - ghc)", meaning that it does not rely on any other modules within its own namespace, nor on any module not included in a default GHC installation.
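For contrast, here is a rough sketch of what a header might look like for a module at the other end of the scale: one whose API is still changing and which leans on a GHC extension. The module name, author, and e-mail address are invented for this example, and the wording of the Stability and Portability values is just one reasonable choice, since Haddock treats them as free-form text.

```haskell
{-# LANGUAGE LambdaCase #-}
{- |
   Module      :  Data.Example.Parity
   Copyright   :  (c) 2014 by Jane Doe
   License     :  BSD3

   Maintainer  :  jane@example.org
   Stability   :  unstable
   Portability :  non-portable (requires the LambdaCase extension)

   A hypothetical module used only to illustrate the header fields.
 -}
module Data.Example.Parity (describeParity) where

-- | Names the parity of an integer.
describeParity :: Int -> String
describeParity = \case
  n | even n    -> "even"
    | otherwise -> "odd"
```

The language pragma goes above the header comment, and the header comment goes directly above the module keyword.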

Documenting your Function, Typeclass, Method, Instance, etc.
This is the last type of documentation you need to add to your Haskell module to bring its documentation up to speed. A Haddock comment is a single-line comment whose opening (--) is followed by a space and then a pipe (|). It sits above the declaration it documents, usually on the line directly before the function's type signature (if one is included).

Take this trivial function:

rev :: [a] -> [a]
rev [] = []
rev (x:xs) = rev xs ++ [x]

GHC already knows the type signature is [a] -> [a], because you told it. For good measure, it would know even if you hadn't, thanks to type inference. Because of this, there's no need to describe the types of data a function can receive. You need only discuss the purpose of the function. Here is the function rewritten with a Haddock comment:

-- | Reverses a finite list. 
rev :: [a] -> [a]
rev [] = []
rev (x:xs) = rev xs ++ [x]
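The same idea carries over to the other declarations named in this section's heading. The following is a minimal sketch, not taken from any real library: Shape, HasArea, area, and scale are all hypothetical names made up for illustration. It shows "-- |" placed before a data type, a typeclass, a method, and an instance, plus "-- ^", which documents the item that comes before it and is handy for constructors and individual function arguments.

```haskell
-- | A two-dimensional shape. (Hypothetical type, for illustration only.)
data Shape
  = Circle Double       -- ^ A circle, given its radius.
  | Rect Double Double  -- ^ A rectangle, given its width and height.

-- | Things that have a measurable area.
class HasArea a where
  -- | Computes the area, in square units.
  area :: a -> Double

-- | Areas follow the usual geometric formulas.
instance HasArea Shape where
  area (Circle r) = pi * r * r
  area (Rect w h) = w * h

-- | Scales a shape uniformly. See also 'area'.
scale
  :: Double  -- ^ The scale factor.
  -> Shape   -- ^ The shape to scale.
  -> Shape   -- ^ The scaled shape.
scale k (Circle r) = Circle (k * r)
scale k (Rect w h) = Rect (k * w) (k * h)
```

Wrapping an identifier in single quotes, as with 'area' above, asks Haddock to turn it into a hyperlink to that identifier's documentation, just as the module header earlier did with 'ByteString'.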

Conclusion
Now you see that it's really super easy (and, dare I say, fun?) to document your Haskell code. However, some people believe that "Good code speaks for itself", and that this idiom holds especially true for ML derivatives with type signatures. Indeed, superfluous comments can clutter your code more than they help. What's the etiquette for commenting?

I will leave that "as an exercise for the reader", as many Haskell bloggers often do for various topics. My personal preference is to always include Haddock comments in code that will be parsed and documented for me anyway. If you maintain a package on Hackage, the popular repository for Haskell, put a bit of Haddock in your code, even for self-explanatory functions. If you're just going to upload it to GitHub and forget about it, there's little point in using Haddock, unless you're a stickler for good documentation. Use good judgement on when and when not to use this powerful tool.
