Hakyll Setup

Posted on August 13, 2016

As people with two eyes might have noticed, this blag is powered by Hakyll, an open-source static content generator written in Haskell.

Hakyll configurations are basically (and by that I mean literally) just Haskell programs. And, since Haskell programs can be written literately, I thought publishing my Hakyll configuration would be useful for other aspiring Haskell blagists.

{-# LANGUAGE OverloadedStrings #-}

Hakyll makes heavy use of overloaded strings, a GHC language extension that gives string literals (i.e. "these things") the type IsString a => a, rather than the default String (that is, [Char]). The same extension is commonly used with other libraries, such as bytestring and text.
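To make the effect of the extension concrete, here is a minimal sketch. The Glob type is a hypothetical stand-in for Hakyll types such as Pattern and Identifier; it is not part of Hakyll, and only base is assumed.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Data.String (IsString (..))

-- | Hypothetical stand-in for a Hakyll type such as Pattern (not Hakyll's own).
newtype Glob = Glob String
    deriving (Show, Eq)

-- | With this instance, a string literal can be used wherever a Glob is expected.
instance IsString Glob where
    fromString = Glob

-- | The literal below is elaborated to: fromString "posts/*"
postsGlob :: Glob
postsGlob = "posts/*"

main :: IO ()
main = print postsGlob  -- prints: Glob "posts/*"
```

This is why rules later in this post can say match "posts/*" without any explicit conversion.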

import Data.Monoid (mappend, (<>))
import Hakyll
import Text.Pandoc.Options
import qualified Data.Set as S

Here, we import the necessary modules for working with Hakyll in conjunction with Pandoc.

Pandoc talks to Hakyll through what is called a compiler. Compilers can also be used to filter files through arbitrary programs, compress CSS, and more. Compiler is a monad built on top of IO: running a compiler performs an IO action. The Compiler newtype is defined as

newtype Compiler a = Compiler
    { unCompiler :: CompilerRead -> IO (CompilerResult a)
    }

Like all good Monads (and all Monads, since we live in the post-AMP world), it also has instances of Functor and Applicative.
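For readers who have not written these instances by hand, here is a rough sketch of what they look like for a type of this shape. MiniCompiler is a hypothetical simplification: it replaces CompilerRead with a plain FilePath and drops CompilerResult entirely, so it is not how Hakyll itself defines things.

```haskell
module Main where

-- | Hypothetical, stripped-down cousin of Compiler: a function from a
-- read-only environment (here just a file path) to an IO action.
newtype MiniCompiler a = MiniCompiler
    { runMini :: FilePath -> IO a
    }

instance Functor MiniCompiler where
    fmap f (MiniCompiler g) = MiniCompiler (fmap f . g)

instance Applicative MiniCompiler where
    pure x = MiniCompiler (\_ -> pure x)
    MiniCompiler f <*> MiniCompiler g = MiniCompiler (\env -> f env <*> g env)

instance Monad MiniCompiler where
    MiniCompiler g >>= k = MiniCompiler (\env -> g env >>= \a -> runMini (k a) env)

-- | Read the environment, as a reader monad would.
askPath :: MiniCompiler FilePath
askPath = MiniCompiler pure

-- | A tiny pipeline chained with >>=, in the style used later in this post.
describe :: MiniCompiler String
describe = askPath >>= \p -> pure ("compiling " ++ p)

main :: IO ()
main = runMini describe "posts/hello.md" >>= putStrLn
```

The real Compiler carries more machinery (dependency tracking, snapshots, error handling), but the >>= chains in the rules below compose in the same spirit.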

readerOpts :: ReaderOptions
readerOpts = def { readerExtensions = pandocExtensions

For our Markdown reader, we enable all of the Pandoc extensions. This enables things such as footnotes, metadata blocks, embedded TeX maths, syntax highlighting, and most important of all: Literate Haskell support. (In fact, this post is written in Literate Haskell!)

                 , readerSmart = True }

We also enable smart punctuation to produce better output.

writerOpts :: WriterOptions
writerOpts = def { writerHighlight = True
                 , writerHtml5 = True }

Our writer produces HTML5 with syntax highlighting. When creating a standalone document, Pandoc includes the necessary CSS for syntax highlighting along with the output HTML. Since we are not producing a standalone document, it is our responsibility to supply the CSS for highlighting code.

theCompiler = pandocCompilerWith readerOpts writerOpts

The pandocCompilerWith function represents a compiler that calls out to Pandoc with the given reader and writer options.

rssfeed :: FeedConfiguration
rssfeed
  = FeedConfiguration { feedTitle = "Hydraz' Blag: Latest articles"
                      , feedDescription = "The latest rambles from a madman's mind"
                      , feedAuthorName = "Matheus de Alcantara"
                      , feedAuthorEmail = "matheus.de.alcantara@gmail.com"
                      , feedRoot = "https://hydraz.club"
                      }

This constant represents the configuration of the generated RSS feed, containing things such as the title, description, author name and email, along with the feed root. This is then used by the renderRss function to generate a feed.xml for our blog.

conf :: Configuration
conf = def { destinationDirectory = ".site"
           , storeDirectory       = ".store"
           , tmpDirectory         = ".store/tmp"
           , deployCommand        = "./sync" }

We override some values in the default Hakyll configuration, since using _ for directories that should be hidden is plain stupid. So, we replace it with a dot (.).

main :: IO ()
main = hakyllWith conf $ do

The hakyllWith function takes a configuration and a Rules action and produces an IO (). There is also the hakyll function, which passes the default configuration to hakyllWith. Since we override some values in the default configuration, we cannot use that.

   match "static/*" $ do
       route   idRoute
       compile copyFileCompiler

This rule specifies that anything in the static directory should be copied to the output as-is, with no modifications.

   match "css/*" $ do
       route   idRoute
       compile compressCssCompiler

While we could copy CSS verbatim, just like the static files, Hakyll provides a way to do CSS compression. The amount of CSS that we currently have is minimal, but this might change in the future.

   match (fromList ["static/about.md", "static/contact.md"]) $ do
       route   $ setExtension "html"
       compile $ theCompiler
           >>= loadAndApplyTemplate "templates/default.html" defaultContext
           >>= relativizeUrls

Here, we specify that the special about.md and contact.md files should become HTML files, and have the following sequence of transformations applied to them:

  1. Compile Markdown to HTML, using our Pandoc compiler,
  2. Apply the templates/default.html template,
  3. Relativize URLs, so that absolute URLs starting from the root of the website (those beginning with /) become URLs relative to the location of the page. This lets the site be deployed under any directory.
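The third step can be sketched for a single URL as follows. The names siteRootFor and relativize are hypothetical helpers written for this illustration; Hakyll's own relativizeUrls operates on every URL found in the rendered HTML, which this sketch does not attempt.

```haskell
module Main where

import Data.List (intercalate, isPrefixOf)

-- | Path from a page back to the site root: one ".." per directory level.
-- Hypothetical helper for illustration only.
siteRootFor :: FilePath -> String
siteRootFor page =
    case length (filter (== '/') page) of
        0 -> "."
        n -> intercalate "/" (replicate n "..")

-- | Rewrite a root-absolute URL (leading '/') so that it is relative to the
-- given page. Protocol-relative ("//host/...") and other URLs are untouched.
relativize :: FilePath -> String -> String
relativize page url
    | "//" `isPrefixOf` url = url
    | "/"  `isPrefixOf` url = siteRootFor page ++ url
    | otherwise             = url

main :: IO ()
main = do
    putStrLn (relativize "posts/hello.html" "/css/default.css")  -- ../css/default.css
    putStrLn (relativize "index.html" "/css/default.css")        -- ./css/default.css
```

Because every link is rewritten relative to the page it appears on, the generated site does not care whether it is served from the domain root or from a subdirectory.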

Sidenote: While the OverloadedLists extension and the associated IsList typeclass do use a method named fromList, the Pattern type that match expects as its first parameter is not an instance of IsList. However, if it were, the above snippet could be rewritten as

   match ["static/about.md", "static/contact.md"] $ do

Which is much more elegant.

   match "posts/*" $ do
       route $ setExtension "html"
       compile $ theCompiler
           >>= loadAndApplyTemplate "templates/post.html"    postCtx
           >>= saveSnapshot "content"
           >>= loadAndApplyTemplate "templates/default.html" postCtx
           >>= relativizeUrls

Much in the same way that a composed pipeline is necessary for the special Markdown paths, it is also necessary for posts. These steps, apart from applying the default template, also apply the post template, which is used only for posts. We also save a snapshot of the content before applying the default template. This is later used while rendering our RSS feed.

   create ["archive.html"] $ do

Here, instead of using a match rule, we use a create rule. This is useful when a file should be generated automatically, which is the case here.

       route idRoute
       compile $ do
           posts <- recentFirst =<< loadAll "posts/*"

First, we load the posts and sort them with the most recent posts first.

           let archiveCtx =
                   listField "posts" postCtx (return posts) <>
                   constField "title" "Archives"            <>
                   defaultContext

Then, we create a Context for our archives. Contexts are used when applying a template to a file, as a keen-eyed reader might have noticed. In this context, the title field is set to a constant value of Archives. We also have a listField that is generated from our loaded posts.
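How contexts combine with <> can be pictured with a much simplified model. Ctx and constF below are hypothetical: a real Hakyll Context also knows about the item being rendered and runs inside Compiler, and this sketch assumes a GHC recent enough (8.4 or later) that Semigroup is in the Prelude.

```haskell
module Main where

import Control.Applicative ((<|>))

-- | Hypothetical, simplified context: a lookup from field name to value.
newtype Ctx = Ctx { lookupField :: String -> Maybe String }

-- | Combining two contexts tries the left one first, then the right one.
instance Semigroup Ctx where
    Ctx f <> Ctx g = Ctx (\key -> f key <|> g key)

instance Monoid Ctx where
    mempty = Ctx (const Nothing)

-- | A context that knows exactly one field (in the spirit of constField).
constF :: String -> String -> Ctx
constF name value = Ctx (\key -> if key == name then Just value else Nothing)

-- | Earlier fields shadow later ones with the same name.
archiveLike :: Ctx
archiveLike = constF "title" "Archives" <> constF "title" "Fallback" <> constF "author" "someone"

main :: IO ()
main = do
    print (lookupField archiveLike "title")   -- Just "Archives"
    print (lookupField archiveLike "author")  -- Just "someone"
    print (lookupField archiveLike "date")    -- Nothing
```

In this model, putting defaultContext last means the more specific fields listed before it win whenever names clash.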

           makeItem ""
               >>= loadAndApplyTemplate "templates/archive.html" archiveCtx
               >>= loadAndApplyTemplate "templates/default.html" archiveCtx
               >>= relativizeUrls

We apply the archive and default templates to the generated file with the archive context, instead of the post context or the default context, while also relativizing URLs.

   match "index.html" $ do
       route idRoute
       compile $ do
           posts <- recentFirst =<< loadAll "posts/*"

Much like with the archives, we start by loading the posts, which go into a special context for our index.html file. This file, however, is not automatically generated: it is instead templated from an on-disk definition.

           let indexCtx =
                   listField "posts" postCtx (return posts) <>

We once again have a list of posts loaded into the posts template variable,

                   constField "title" "Home"                <>
                   defaultContext

Except, in this case, the constant title field is, as one would expect, Home.


           getResourceBody
               >>= applyAsTemplate indexCtx
               >>= loadAndApplyTemplate "templates/default.html" indexCtx
               >>= relativizeUrls

We treat the body of index.html itself as a template and apply it with the index context, then apply the default template (with the same context). Once again, URLs are relativized.

   match "templates/*" $ compile templateBodyCompiler

We compile everything in the templates/ directory with the templateBodyCompiler, a compiler which strips the metadata header from the file while reading it as a template.

   create ["feed.xml"] $ do
     route idRoute
     compile $ do
       let feedCtx = postCtx <> bodyField "description"
       posts <- (take 10 <$>) . recentFirst =<< loadAllSnapshots "posts/*" "content"
       renderRss rssfeed feedCtx posts

The last thing we do while inside our Rules monad is to create a feed.xml file with our rssfeed settings. This is achieved by loading all the content snapshots from the posts, then taking the 10 most recent. A description field is added to the context based on the contents of the body.

postCtx :: Context String
postCtx =
    dateField "date" "%B %e, %Y" <> defaultContext

The postCtx we use is the default context plus a date field called date, formatted using the %B %e, %Y string. These format strings follow the same conventions as the Unix date utility. For this post, the %B %e, %Y format generates the string August 13, 2016.
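The format string can be tried outside of Hakyll with the time library that ships with GHC. This is a small sketch assuming time 1.5 or later; formatPostDate is a hypothetical helper, and it formats a calendar date given directly rather than one read from a post.

```haskell
module Main where

import Data.Time.Calendar (fromGregorian)
import Data.Time.Format (defaultTimeLocale, formatTime)

-- | Format a calendar date using the same format string as postCtx.
-- Hypothetical helper for illustration only.
formatPostDate :: Integer -> Int -> Int -> String
formatPostDate year month day =
    formatTime defaultTimeLocale "%B %e, %Y" (fromGregorian year month day)

main :: IO ()
main = putStrLn (formatPostDate 2016 8 13)  -- August 13, 2016
```

Here %B is the full month name, %e is the day of the month padded with a space to two characters, and %Y is the four-digit year.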

Hakyll is, though sparsely documented, a really powerful tool for generating static websites. In less than 100 lines of Haskell (94, to be precise) we can automatically generate a blog from nothing but Markdown files and a couple of templates.

And thanks to the wide variety of formats Pandoc supports, Markdown is not the only option: reStructuredText and AsciiDoc posts would be possible, and even stranger things, such as Office documents (docx files) and OpenDocument Text files.

PS: This post is my actual site.hs, now site.lhs. Grab a runnable copy here.