Notes about migrating to Hugo

I've moved my blog from Wordpress.com to Hugo. I wrote some of my thoughts on why I made the switch, ideas, improvements for the design, and custom tooling for my editing workflow.

I’ve moved my blog to a new blogging platform, this time from Wordpress.com to Hugo. These are the notes I took along the way: why I made the switch, how the migration went, the design improvements I made, and the custom tooling I built for my editing workflow.

Dynamic vs static hosted platform

Wordpress.com is a dynamic hosted platform, which means that each blog post is generated dynamically. This has its own advantages and disadvantages, which I won’t go into here. Hugo is a static site generator written in Go. The implementation language usually doesn’t matter much, but in Hugo’s case it makes a difference: Hugo can be compiled into a single binary, so I can use it without fiddling with Ruby, Python, or Node.js dependencies. Another benefit for me is that I can preview my blog locally without an internet connection.

Wordpress.com is a hosted service provided by Automattic that makes using Wordpress as easy as pie. It also comes with really good features out of the box. You might wonder why I decided not to continue with it. For me, it came down to the following:

  • Lack of a good editor. Even when I wrote a post somewhere else, I had to use the online editor for small changes once it was published, and every time it would mess up the HTML. I tried to be patient and kept using it because I wanted to support Wordpress, but at some point it became too much of a hassle.
  • Lack of customization. After a while, the built-in CSS customization wasn’t enough for me. I wanted to change the HTML layout and much more, but that wasn’t allowed unless I upgraded from the premium plan to the business plan.

So in general, I started using Wordpress for its simplicity and because I didn’t have to maintain it. However, this simplicity also put constraints on me that became a burden. What is the point of using it if I can’t use it in a meaningful way?

I decided that it was time to move on.

Extracting data out of Wordpress.com

Because I was using Wordpress.com (a paid hosted service from Automattic), I didn’t have access to the raw data (MySQL). This meant I had to export my blog posts as an XML file, which can be done easily from the settings. The XML contains the blog posts with all their metadata, such as tags and categories.

Once you have the XML data, the next step is to convert it to a format that Hugo understands. There are several ways to do it:

  1. Write a script that parses the XML data and creates the markdown files manually
  2. Use the Wordpress.org to Hugo exporter (https://github.com/SchumacherFM/wordpress-to-hugo-exporter) by setting up a local Wordpress setup, importing the XML and running the plugin
  3. Transform the XML file into a Jekyll directory (via exitwp) and then use Hugo’s Jekyll importer to create a new Hugo blog from scratch

The first option would take a lot of time and knowledge, so it wasn’t worth it. At the time of the migration I didn’t know about the second option (I learned about it later). So I chose the third option.

exitwp is written in Python. You put your exported XML file into its wordpress-xml directory and run python exitwp.py. This produces a Jekyll-compatible folder named build.

Once you have this, the next step is to use Hugo’s import. This is done by calling the command:

hugo import jekyll build/jekyll myblogfolder/

After that you have a ready-to-go Hugo structure inside the folder myblogfolder. However, you’re not finished yet, because the content is not fully compatible with Hugo. The following needs to be fixed:

  1. Images need to be downloaded and put under the static/images folder
  2. Wordpress’s custom markup (e.g. [caption]![](https://example.com/foo.jpg) A picture[/caption]) needs to be converted into Hugo’s shortcodes (e.g. {{< figure src="/images/foo.jpg" caption="A picture" >}})

For this, I wrote a very hacky Go script that iterates over all markdown files, checks for images, downloads them, and converts the Wordpress syntax to Hugo shortcodes. Even with this script it didn’t work 100% perfectly (I said it was hacky), and I had to intervene manually. It took some time until all my blog posts were cleaned up.
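To give an idea of the conversion step, here is a minimal Go sketch. It is not the script from the gist. The regular expression, the convertCaptions name and the rule that the image ends up under /images/ with its original file name are assumptions for this example, and captions that contain double quotes would need extra handling.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
	"regexp"
)

// captionRe matches [caption ...]![alt](image-url) caption text[/caption].
// Group 1 is the image URL, group 2 the caption text.
var captionRe = regexp.MustCompile(
	`(?s)\[caption[^\]]*\]\s*!\[[^\]]*\]\(([^)\s]+)\)\s*(.*?)\s*\[/caption\]`)

// convertCaptions rewrites every Wordpress caption block in md into a
// Hugo figure shortcode that points at /images/<file name>.
func convertCaptions(md string) string {
	return captionRe.ReplaceAllStringFunc(md, func(match string) string {
		sub := captionRe.FindStringSubmatch(match)
		name := path.Base(sub[1])
		// Prefer the URL path so that query strings such as ?w=600 are dropped.
		if u, err := url.Parse(sub[1]); err == nil && u.Path != "" {
			name = path.Base(u.Path)
		}
		return fmt.Sprintf(`{{< figure src="/images/%s" caption="%s" >}}`, name, sub[2])
	})
}

func main() {
	in := `[caption]![](https://example.com/foo.jpg) A picture[/caption]`
	fmt.Println(convertCaptions(in))
}
```

Running a function like this over every file in content/ covers the common case; the odd posts it misses still need to be fixed by hand.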

Here is a link to a gist with part of the Go script I wrote. Do not run it directly without understanding what it does. I’m only putting it here to give an idea of how this can be done:

https://gist.github.com/fatih/2141f0ab201f55fb37f4104649ea6577

Also note that the image downloading part is missing from the gist (I’m not sure why, I must have deleted it). But it’s fairly easy to add: all you have to do is make an http.Get() request for the url variable and then use io.Copy to copy resp.Body into a file, which streams the response to disk.

Hosting Hugo pages

Now that we have a working Hugo directory and can produce the static pages, it is time to choose a platform to host them. The easiest way is to set up a Droplet, install caddy, nginx, or similar, and just serve the files. But I wanted something that I don’t have to maintain and constantly monitor, so I asked on Twitter.

After some research and the answers I got on Twitter, I settled on Netlify for the following reasons:

  • Support for HTTPS and custom domain
  • Excellent integration with various VCS (in my case my private Github repo)
  • Customizable builds based on master
  • Previews of builds via Github PRs
  • Simplicity and excellent company vision

Netlify just works. My site was up in minutes and I couldn’t believe how simple it was. Now I understand why people like it so much. At its core, Netlify works by setting up a webhook in your Github repo and listening for pushes and PRs. Once it detects a change, it builds your site.

You tell Netlify which command to run when it detects a change in the repo (a pull request, or a push/merge to a specific branch). I’ve set it up to use hugo as the build command and public as the output directory of the build (i.e. our static pages). Netlify runs this command to build the pages and then serves them from the custom domain you set up. It has many other features that I won’t go into, just check it out.

One problem I have with them is their pricing strategy. All this awesomeness is free (how can this be a problem?).

I strongly believe a valuable product should charge people instead of relying on VC money. I really like it and can see myself giving them $5-10 monthly. But their next pricing tier from the free plan is $45. I’m not sure what they want to achieve with this strategy.

Netlify settings

I’ve set Hugo’s baseURL to "/" in the configuration file (config.toml). This allows all URLs on localhost to resolve to the correct page. If I set it to "https://arslan.io", all URLs would point to my live website, which would break all local links.

For example, the homepage might have a link to http://arslan.io/2017/11/23/blue-bottle-in-japan/, but on localhost this needs to resolve to localhost:1313/2017/11/23/blue-bottle-in-japan/. Here localhost:1313 is the address served when you run hugo server, which builds all the pages and then starts serving them locally.

Going back to Netlify: it has something called “Deploy Previews”. These are previews of your latest site build that you can see before pushing to production (merging or pushing the change to master). They have unique domain names, such as https://deploy-preview-2--jonny-brown-0c9f53.netlify.com. This is excellent because you can check your website before publishing it to production.

However, in this staging environment the URLs need to resolve to this unique domain. Because these domains change constantly, I can’t add them to baseURL. To fix this, I’ve added the following netlify.toml file:

[context.production.environment]
  HUGO_VERSION = "0.30"
  HUGO_BASEURL = "https://arslan.io/"

[context.deploy-preview.environment]
  HUGO_VERSION = "0.30"

This makes sure that Hugo builds the pages with the correct baseURL. In production it is "https://arslan.io/", whereas in the staging environment (deploy-preview) it falls back to the default from config.toml, which is "/". This means that all links resolve against whatever domain is serving the page. For example, https://arslan.io/about/ in production becomes localhost:1313/about/ on localhost and https://deploy-preview-2--jonny-brown-0c9f53.netlify.com/about/ in a deploy preview.
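To make the difference concrete, here is a small Go sketch. It is not part of Hugo or Netlify. It only uses net/url to mimic how a browser resolves a link against the address of the page it appears on, and the resolve helper is a name made up for this example. A root-relative link such as /about/ follows the host that served the page, while an absolute link always points at the live site.

```go
package main

import (
	"fmt"
	"net/url"
)

// resolve returns the absolute URL a browser would request for link when
// the page containing it was served from pageURL.
func resolve(pageURL, link string) (string, error) {
	base, err := url.Parse(pageURL)
	if err != nil {
		return "", err
	}
	ref, err := url.Parse(link)
	if err != nil {
		return "", err
	}
	return base.ResolveReference(ref).String(), nil
}

func main() {
	hosts := []string{
		"http://localhost:1313/",
		"https://deploy-preview-2--jonny-brown-0c9f53.netlify.com/",
		"https://arslan.io/",
	}
	// The first link is what a baseURL of "/" produces, the second is what
	// a hard-coded production baseURL produces.
	links := []string{"/about/", "https://arslan.io/about/"}

	for _, link := range links {
		for _, host := range hosts {
			abs, err := resolve(host, link)
			if err != nil {
				fmt.Println("error:", err)
				continue
			}
			fmt.Printf("%-26s on %-60s -> %s\n", link, host, abs)
		}
	}
}
```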

arslan.io customizations

I made a couple of customizations to my blog. I used the excellent hyde theme as my base theme, but as you can see the result looks dramatically different. That’s because I’ve changed everything under the hood. Some of the notable changes:

  • Removed the sidebar and added a header
  • Added two pages, about and archive
  • Added featured images next to the summaries
  • Various kinds of CSS improvements to fit it to my own liking

Some of these deserve more explanation.

One notable customization is “featured images with summaries”. For this I’ve added the following code to the index.html (homepage) layout:

<div class="post-thumbnail">
    {{ if .Params.featured_image }}
    <div class="post-thumbnail-image-box">
        <a href="{{ .Permalink }}">
            <img{{ with .Params.featured_image }} class="post-thumbnail-image" src="{{ . }}"{{ end }} alt="{{ .Title }}">
        </a>
    </div>
    {{ end }}

    <div class="post-thumbnail-entry">
        <h1 class="post-title">
            <a href="{{ .Permalink }}">{{ .Title }}</a>
        </h1>
        <span class="post-date">{{ .Date.Format "January 2, 2006" }}</span>

        {{ if .Params.description }}
        <p class="post-thumbnail-desc">{{ .Params.description }} <a href="{{ .RelPermalink }}">Read More…</a></p>
        {{ end }}
    </div>
</div>

What’s important here are the .Params.featured_image and .Params.description parameters. These are custom parameters I use in the front matter of my blog posts. For example, one of my recent reviews has this:

+++
author = "Fatih Arslan"
comments = true
date = 2017-11-23T07:45:38Z
title = "Blue Bottle in Japan"
slug = "blue-bottle-in-japan"
url = "/2017/11/23/blue-bottle-in-japan/"
draft = false
featured_image = "/images/blue-bottle-in-japan-1.jpg"
description = "For a while, I knew Blue Bottle was interested to invest into Japan. Their CEO James Freeman was inspired from a small Kissaten (old Japanese Coffee shop) when he opened his first Blue Bottle Coffee shop"
+++

This then produces a little box on the homepage (with some CSS improvements of course).

Twitter Card

I use Twitter a lot, so I want the links to https://arslan.io that I share on Twitter to be displayed nicely. Twitter parses the <head> of each link for certain meta tags. If you provide them, Twitter displays your link in a more visual way.

First I've created a new partial called twitter-card.html and put it under layouts/partials/twitter-card.html. The content of the partial is:

<meta name="twitter:title" content="{{ .Title }}"/>
<meta name="twitter:description" content="{{ with .Description }}{{ . }}{{ else }}{{if .IsPage}}{{ .Summary }}{{ else }}{{ with .Site.Params.description }}{{ . }}{{ end }}{{ end }}{{ end -}}"/>
{{- if .Params.featured_image -}}
<meta name="twitter:card" content="summary_large_image"/>
<meta name="twitter:image" content="https://arslan.io{{ .Params.featured_image }}"/>
{{ else -}}
<meta name="twitter:card" content="summary"/>
{{- end -}}
{{ with .Site.Social.twitter -}}
<meta name="twitter:site" content="@{{ . }}"/>
<meta name="twitter:creator" content="@{{ . }}"/>
{{- end }}

And then I added this partial to layouts/partials/head.html:

<head>
  ...
  {{ partial "twitter-card.html" . }}
</head>

This partial then produces the following meta tags. For the homepage it'll produce:

<meta name="twitter:title" content="Fatih Arslan"/>
<meta name="twitter:description" content="My thoughts about Programming, Coffee, Bags and various other stuff"/>
<meta name="twitter:card" content="summary"/>
<meta name="twitter:site" content="@fatih"/>
<meta name="twitter:creator" content="@fatih"/>

On Twitter this renders as a simple summary card.

But for a post (such as "Blue Bottle in Japan") it produces the following:

<meta name="twitter:title" content="Blue Bottle in Japan"/>
<meta name="twitter:description" content="For a while, I knew Blue Bottle was interested to invest into Japan. Their CEO James Freeman was inspired from a small Kissaten (old Japanese Coffee shop) when he opened his first Blue Bottle Coffee shop"/>
<meta name="twitter:card" content="summary_large_image"/>
<meta name="twitter:image" content="https://arslan.io/images/blue-bottle-in-japan-1.jpg"/>
<meta name="twitter:site" content="@fatih"/>
<meta name="twitter:creator" content="@fatih"/>

This renders as a more visual card on Twitter.

Twitter provides a great card validator that I've used to validate the various twitter cards.

Lastly, Hugo already provides an internal template that you can use without adding any code to your website. All you have to do is add the following line (instead of {{ partial "twitter-card.html" . }}):

{{ template "_internal/twitter_cards.html" . }} 

However, this is not customizable. That's why I copied it and adapted it to my needs. For example, I'm using the .Params.featured_image field for my Twitter cards.

Markdown to Hugo's custom markdown format

Hugo prepends a block of metadata to the blog post itself, called the front matter. It describes the blog post with various kinds of information, such as title, slug, featured image, and tags. A bare markdown post doesn't have this.

Second, Hugo also lets you extend the markdown syntax with shortcodes. When Hugo reads a blog post written in markdown, it renders these shortcodes specially. For example, you can embed a Github Gist, a Youtube video, or a tweet with a single line, which makes it really easy to embed social and media links. You can even create your own shortcodes. I use the figure shortcode a lot to add captions to images:

{{< figure src="/images/tombihn-71-of-911.jpg" caption="A fully packed Tri-Star" >}}

This then renders to an HTML <figure> element with the image and its caption.

I use Vim or Ulysses to write my blog posts. Lately I've been using Ulysses more, as I like the functionality and UX it provides. However, Ulysses doesn't understand what a shortcode or front matter is.

To solve this and make my life easier every time I create a new blog post, I've written a small Go tool. It takes a directory of images with a bare markdown file and then does the following:

  • Parses the directory for images and markdown and extracts as much information as needed (such as the post title and images)
  • Creates a front matter from scratch based on this information
  • Converts the bare markdown into Hugo-compatible markdown with shortcodes
  • Copies the post to content/posts with the current date
  • Moves all images from the source directory to the static/images folder with a unique name

The tool is idempotent, so you can run it multiple times and it'll try its best to update the existing file. If no file exists, it creates one. If anyone is interested, here is the Gist link:

https://gist.github.com/fatih/d85ec4bf41e6925b7e738d8d3cb46140

Be aware that it's highly customized for my own blog and workflow. It doesn't have any tests. It's just something I hacked together in an evening, and it works fine for me. As with the other script, I'm adding it to show how things can be done. You probably want to download it and change it yourself. Save it as scripts/md-to-hugo.go and then run it, pointing it at the directory where the images and your markdown file reside:

 go run scripts/md-to-hugo.go -dir my_markdown_dir

What's Next

Hugo is fun and very powerful. However, it also needs constant maintenance and custom tooling if you want to shape it the way you like. After the initial investment things got easier though. My website is still not perfect. Some of the things I want to do are:

  • I'm currently using 2048px wide pictures everywhere. Those pictures are big and should probably be resized accordingly (both for blog posts and featured images).
  • I never liked seeing only small excerpts of articles in RSS feeds. Currently Hugo is configured to show only a summary. I need to change my feed settings so they show the full content.
  • A better commenting platform. Disqus is not minimal enough for my needs. Something lighter that fits the design of this site would be better.

If you made it this far, let me know how you use Hugo yourself. Feel free to share any feedback or tips and tricks you think are valuable. Thanks for reading.