Blog Rebuild

Like a phoenix from the ashes, this blog has been reborn! I thought it might be interesting to talk about the new architecture, as well as the process I went through to migrate it.

TL;DR I used:

  • Hugo to build the site
  • Azure Static Web App for hosting
  • Cloudflare to deal with the top level domain redirection
  • A custom tool called html2md to convert my old HTML content to markdown.

Objectives

  1. Use a static site generator. As much as I like CMSs, they are overkill for the frequency at which I post. Not having to maintain server-side logic and a database is also a win.
  2. Keep all the old content, including dealing with redirects.
  3. If possible, use this as an opportunity to try new technologies.

Static Site Generation

I decided to use Hugo, having had some success with it recently while building the new LIFTI v2 documentation. I used the Clarity theme with some minor customizations, such as swapping out Google Tag Manager for Application Insights to track page views. You can have a look around the site’s code in the GitHub repo.

There are lots of resources out there on creating a site based on Hugo so I won’t go into details here. The Hugo Quick Start is a good example.

Migrating Content With html2md

If all the content on my old site had been in markdown, this would have been trivial. Unfortunately that wasn’t the case. To understand why, here’s a potted history of goatly.net:

  • 2004 Born on Windows Live Spaces (MSN Spaces at that time?)
  • 2010 Windows Live Spaces shuts - Migrated all the content to an Umbraco CMS build
  • 2012 Migrated all the content to an Orchard CMS build

The main takeaway here is that at no point in its history did this blog ever have anything fancy like markdown; it was all badly formatted HTML, tables without a proper thead, etc., transformed multiple times for different platforms.

To help with the conversion process I built a tool called html2md that allowed me to:

  1. Download and convert the raw HTML of my live site
  2. Download any linked images
  3. Update image link addresses
  4. Extract Front Matter metadata from the page, writing it to the header of the converted markdown.
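
As a rough illustration of step 4, here is a minimal Python sketch of writing extracted metadata as a front matter header. It is not html2md's implementation (html2md is a .NET tool and its internals aren't shown here): the helper names `render_front_matter` and `prepend_front_matter`, and the YAML-style layout, are assumptions made for the example. The fields (title, date, author, tags) mirror the ones requested in the conversion script further down.

```python
import json


def render_front_matter(title, date, author, tags):
    """Build a YAML-style front matter header for a converted post.

    Hypothetical helper for illustration; html2md's real output may differ.
    """
    lines = [
        "---",
        # json.dumps yields a double-quoted, escaped string, which YAML accepts
        f"title: {json.dumps(title)}",
        f"date: {json.dumps(date)}",
        f"author: {json.dumps(author)}",
    ]
    if tags:
        lines.append("tags:")
        lines.extend(f"  - {json.dumps(tag)}" for tag in tags)
    else:
        # An explicit empty list avoids a null "tags" value
        lines.append("tags: []")
    lines.append("---")
    return "\n".join(lines) + "\n"


def prepend_front_matter(markdown_body, **metadata):
    """Return the markdown document with its front matter header on top."""
    return render_front_matter(**metadata) + "\n" + markdown_body
```

Quoting every value through `json.dumps` is a deliberately blunt choice: titles scraped from old HTML can contain colons or quotes, and a quoted string sidesteps most of the ways those would break the header.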

Publishing html2md as a dotnet tool allowed me to build a single script to download and process all the content. I used this process to refine the output from html2md as I went. You can still see the resulting script in the repo, but here’s a snippet:

    # Old URL paths to migrate
    $pages = @(
        "2006/7/20/Easy-way-to-enter-GUIDs-for-WiX-scripts.aspx",
        ...
        "fixing-nuget-errors-after-removing-microsoft-bcl-build",
        "using-dynacache-with-simpleinjector"
    )

    # Expand each page into a "-u <url>" argument pair
    $Urls = @($pages | ForEach-Object { @( "-u", "http://goatly.net/$_") })

    # Options shared by every page
    $Html2mdArgs = @(
        "-o",
        ".\content\post\",
        "-i",
        ".\static\images\post\",
        "--image-path-prefix",
        "/images/post/",
        "--it",
        "//article[@class='blog-post content-item']",
        "--et",
        "header,//h2[@class='comment-count'],//ul[@class='comments'],//div[@id='comments']",
        "--code-language-class-map",
        "xml:xml,sh_csharp:csharp",
        "--front-matter-data",
        "title://article/header/h1",
        "--front-matter-data",
        "date://div[@class='metadata']/div[@class='published']:Date",
        "--front-matter-data",
        "author:{{'Mike Goatly'}}",
        "--front-matter-data-list",
        "tags://p[@class='tags']/a",
        "--logging",
        "Debug"
    )

    # Append the URL arguments and run the tool
    $Html2mdArgs = $Html2mdArgs + $Urls

    & 'html2md.exe' $Html2mdArgs

Hosting

I initially tried hosting the content in an Azure Storage Static Website, but without fronting that with a CDN and paying for a load of redirect rules, mapping the old URLs to the new ones wasn’t going to work out.

Instead, I opted for the new Azure Static Web Apps service, which is currently in preview. The benefits of this were:

  • Pre-canned deployment process. I just push changes to GitHub and an automatically configured GitHub action builds the site and publishes it to Azure. 🧙🏾‍♂️🎉
  • A really simple way to statically declare server-side redirects using a routes.json file:
      {
          "routes":[
              {
                  "route": "/2006/7/20/Easy-way-to-enter-GUIDs-for-WiX-scripts.aspx",
                  "serve": "/post/easy-way-to-enter-guids-for-wix-scripts",
                  "statusCode": 301
              },
              ...
          ]
      }
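
With dozens of old URLs, a file like this is easier to generate than to write by hand. The Python sketch below shows one way that might look. The slug rule in `new_path_for` (take the last path segment, drop `.aspx`, lowercase it, prefix `/post/`) is an assumption inferred from the single example above, and both function names are hypothetical; any posts that were renamed during the migration would need their own explicit mapping.

```python
import json
from posixpath import basename


def new_path_for(old_path):
    """Map an old URL path to its new /post/ slug.

    Assumed rule, inferred from one example: last path segment,
    minus a trailing .aspx, lowercased.
    """
    slug = basename(old_path.rstrip("/"))
    if slug.lower().endswith(".aspx"):
        slug = slug[: -len(".aspx")]
    return "/post/" + slug.lower()


def build_routes(old_paths):
    """Build the routes.json structure: one 301 redirect per old path."""
    return {
        "routes": [
            {"route": old, "serve": new_path_for(old), "statusCode": 301}
            for old in old_paths
        ]
    }


if __name__ == "__main__":
    old_paths = ["/2006/7/20/Easy-way-to-enter-GUIDs-for-WiX-scripts.aspx"]
    print(json.dumps(build_routes(old_paths), indent=4))
```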

Top level domain handling

One of the things that Static Web Apps can’t currently handle is top level domains (aka naked or apex domains). Previously my blog was primarily hosted at https://goatly.net, so I needed a way to redirect traffic from there to the www subdomain. Fortunately, there’s a documented workaround that uses Cloudflare to proxy requests for the top level domain and redirect them to www.

Summary

That’s the journey of how this new iteration of my blog came into being. Hopefully there are some interesting pointers in there for others who want to undertake something similar!