First (pelican) post

Jamie Lentin

After a long period of inactivity, this website has finally been updated once more. The content from the old website is still available at https://old.shuttlthread.com, and old blog post URLs will automatically redirect there.

Why pelican?

Pelican is a static site generator, meaning every page is generated ahead of time instead of the HTML being built dynamically for each request. That is a pretty dramatic contrast with the original site, which was based on Plone, a full-featured enterprise CMS.

The obvious question: why the switch? As good as Plone is as a CMS, it's not a great fit for the target user base (i.e. me).

  • I work with a text editor and git, so Pelican matches what I'm used to.
  • User management isn't very useful in this case.
  • I get to write ReST-based content rather than editing raw HTML (which gets laborious pretty quickly).

A static site generator like Pelican provides just enough functionality to do all of the above, without adding attack surface for features I don't want. Besides, all the cool kids use static site generators nowadays.

Another question could be: if Pelican is simple enough, why didn't I roll my own, given that I already did for my personal website, http://jamie.lentin.co.uk? In that case, the website was originally hand-crafted HTML with a deep structure rather than blog-esque navigation, which I wanted to preserve as closely as possible. Using docutils and XSLT in my own script worked out to be the easiest way of doing this, but it's still a non-trivial amount of code. For this site, the structure Pelican offers is pretty much exactly what I wanted already.

Static archive of the Plone site

I'd already decided to take the BBC News approach of preserving the original look and feel of older content, as opposed to migrating blog entries into the new system. We can use wget to spider the site, then rewrite absolute links to be root-relative and replace duplicate files with symlinks:

# Mirror the live site, including the assets each page needs
wget \
  --mirror \
  --page-requisites \
  --default-page=index.html \
  --adjust-extension \
  --continue \
  -e robots=off \
  -D "${DOMAIN}" \
  --reject 'search*,@@search*' \
  "http://${DOMAIN}/search_icon.png" \
  "http://${DOMAIN}/spinner.gif" \
  "http://${DOMAIN}"

# Rewrite absolute links to the old domain as root-relative links
find "${DOMAIN}" -name '*.html' \
    -exec sed -Ei 's/https?:\/\/'"${DOMAIN}"'\/*/\//g' {} \;

# Replace duplicate files with symlinks
rdfind -makesymlinks true "${DOMAIN}/"
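The sed step is the part most likely to need tweaking, so here is a minimal Python sketch of the same rewrite, handy for checking what happens to a given link before running it over the whole mirror. The function name make_root_relative and the domain example.org are placeholders of mine, not part of the original script, and unlike the sed expression it escapes the dots in the domain.

```python
import re


def make_root_relative(html: str, domain: str) -> str:
    """Turn absolute http(s) links to `domain` into root-relative ones.

    The scheme, the domain and any slashes directly after it are replaced
    by a single slash. re.escape() keeps the dots in the domain literal.
    """
    pattern = re.compile(r"https?://" + re.escape(domain) + r"/*")
    return pattern.sub("/", html)


if __name__ == "__main__":
    # example.org is a placeholder domain, used for illustration only
    sample = '<a href="http://example.org/blog/first-post">First post</a>'
    print(make_root_relative(sample, "example.org"))
```

Links to other hosts are left untouched, which is what keeps external references in the archived pages working.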

At this point, we have HTML pages in a directory structure, e.g. blog.html, but we need NGINX to serve blog.html when a request for /blog comes in:

location / {
    autoindex off;
    try_files $uri $uri.html $uri/index.html =404;
}
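To make the lookup order concrete, here is a rough Python model of what that try_files line does. It is only a sketch: resolve and the sample file set are hypothetical names of mine, the "filesystem" is just a set of paths, and the real NGINX behaviour has more subtleties (directories, for one).

```python
from typing import Optional, Set


def resolve(uri: str, files: Set[str]) -> Optional[str]:
    """Return the first candidate that exists, in try_files order.

    None stands for the final =404 fallback.
    """
    candidates = [
        uri,                                 # $uri
        uri + ".html",                       # $uri.html
        uri.rstrip("/") + "/index.html",     # $uri/index.html, slashes normalised
    ]
    for candidate in candidates:
        if candidate in files:
            return candidate
    return None


if __name__ == "__main__":
    # A made-up mirror layout, for illustration only
    mirror = {"/index.html", "/blog.html", "/blog/first-post.html", "/logo.png"}
    for request in ("/blog", "/blog/first-post", "/logo.png", "/", "/missing"):
        print(request, "->", resolve(request, mirror))
```

The point is that /blog never needs to exist as a file: the second candidate picks up blog.html, so the old extension-free URLs keep working.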

Configuring Pelican

There are plenty of guides to this, and pelican-quickstart can get you most of the way there anyway, so I thought I'd just highlight a few points:

Site structure / hiding .html extensions

Two things upset me about Pelican's default site structure:

  • The blog was at the root and pages were nested in a sub-directory; I wanted it the other way around.
  • Page URLs ended in .html, which feels like a leaky abstraction to me.

Chipped Prism's blog post figured this out for me. Any file location, and the URL used for that file in content, can be altered using a pair of _URL and _SAVE_AS settings in pelicanconf.py. So to put blog posts under posts/ and pages at the root, I can do:

INDEX_SAVE_AS = 'posts.html'
ARTICLE_URL = 'posts/{date:%Y}-{date:%m}-{date:%d}-{slug}/'
ARTICLE_SAVE_AS = ARTICLE_URL.rstrip('/') + '.html'
PAGE_URL = '{slug}/'
PAGE_SAVE_AS = PAGE_URL.rstrip('/') + '.html'
AUTHOR_URL = 'author/{slug}'
AUTHOR_SAVE_AS = AUTHOR_URL.rstrip('/') + '.html'
CATEGORY_URL = 'category/{name}/'
CATEGORY_SAVE_AS = CATEGORY_URL.rstrip('/') + '.html'
TAG_URL = 'tag/{name}/'
TAG_SAVE_AS = TAG_URL.rstrip('/') + '.html'

Then I can make a static page called index.html to use as the site root.
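To see what these templates produce, they can be expanded by hand. As far as I can tell Pelican fills them in with ordinary Python string formatting against each article's metadata; assuming that, a quick sketch with made-up metadata (the date and slug below are purely illustrative) looks like this:

```python
from datetime import datetime

# Same templates as in pelicanconf.py
ARTICLE_URL = 'posts/{date:%Y}-{date:%m}-{date:%d}-{slug}/'
ARTICLE_SAVE_AS = ARTICLE_URL.rstrip('/') + '.html'

# Hypothetical article metadata, for illustration only
metadata = {'date': datetime(2015, 1, 9), 'slug': 'example-post'}

article_url = ARTICLE_URL.format(**metadata)
article_save_as = ARTICLE_SAVE_AS.format(**metadata)

print(article_url)      # posts/2015-01-09-example-post/
print(article_save_as)  # posts/2015-01-09-example-post.html
```

Deriving each _SAVE_AS value from its _URL means the two can never drift apart: the link has no extension, and the file on disk is the same path with .html appended.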

Also note that the _URL expressions don't have .html extensions, whereas the files generated still do. try_files can fix this:

# Remove any trailing slashes
rewrite ^(.+)/+$ $1 permanent;

# Block direct access to any .html files (for cleaner URLs)
location ~ \.html$ {
    internal;
}

location / {
    # @try_old is a named location defined elsewhere in the config (not shown here)
    try_files $uri $uri.html $uri/index.html @try_old;
}
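The three rules interact, so here is a small Python model of how a request flows through them. It is a simplification under my own assumptions: handle and the sample file set are hypothetical, the filesystem is a plain set of paths, and the @try_old fallback is only reported by name because its definition isn't shown here.

```python
import re
from typing import Set, Tuple

# Same pattern as the rewrite rule: something, then one or more trailing slashes
TRAILING_SLASHES = re.compile(r"^(.+)/+$")


def handle(uri: str, files: Set[str]) -> Tuple[str, str]:
    """Return (action, target) for a request, applying the rules in order."""
    # rewrite ... permanent: /posts/ is redirected to /posts
    match = TRAILING_SLASHES.match(uri)
    if match:
        return "redirect", match.group(1)
    # The .html location is internal, so direct requests are refused
    if uri.endswith(".html"):
        return "not found", uri
    # try_files: the file itself, then with .html appended, then a directory index
    for candidate in (uri, uri + ".html", uri.rstrip("/") + "/index.html"):
        if candidate in files:
            return "serve", candidate
    # Nothing matched: hand over to the named fallback location
    return "fallback", "@try_old"


if __name__ == "__main__":
    # A made-up output directory, for illustration only
    output = {"/index.html", "/posts.html", "/theme/css/main.css"}
    for request in ("/posts/", "/posts", "/posts.html", "/", "/old-entry"):
        print(request, "->", handle(request, output))
```

Note that the bare root / does not match the rewrite pattern, since it needs at least one character before the trailing slash, so the home page is served without a redirect.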

Theme

After being inspired by jvanz's theme, I decided to make my own based on Hyde. This was mostly a case of taking the simple theme and applying the Hyde classes in the right places.

Also useful: http://mygeekdaddy.net/2015/01/09/never-change-your-pelican-footer-again/

The picture is a modified version of hilos by Carol C.

Deployment

The site is deployed to a virtual machine using git push. NGINX is pointed at the checkout on the server and serves the content from there.

Previously one couldn't push to the checked-out branch of a non-bare repository (i.e. one with a checkout as well as a .git directory). However, with Git 2.3 or later I can do the following on the checkout:

git config receive.denyCurrentBranch updateInstead

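Thus deployment, assuming my origin is of the form (hostname):(path), is done with the following (the $$ is how a Makefile recipe escapes $; use a single $ when typing this into a shell directly):

git push origin && ssh $$(git remote get-url origin | sed 's/:/ make -C /')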
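The sed trick is terse, so here is the same transformation spelled out in Python. This is just a sketch to show what ends up on the ssh command line: remote_make_command is a hypothetical helper of mine, and it assumes the scp-style (hostname):(path) origin described above rather than an ssh:// URL.

```python
from typing import List


def remote_make_command(origin_url: str) -> List[str]:
    """Build the ssh argument list for an origin of the form host:path."""
    # Like sed 's/:/ make -C /', only the first colon is replaced
    host_and_command = origin_url.replace(":", " make -C ", 1)
    # The shell splits the substituted text on whitespace; split() does the same
    return ["ssh"] + host_and_command.split()


if __name__ == "__main__":
    # example.org and the path are placeholders, for illustration only
    print(remote_make_command("example.org:/srv/www/site"))
```

In other words, the remote URL doubles as the deployment target: the host becomes the ssh destination and the path becomes the directory in which make runs on the server.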