Migrating the blog to a new domain

redirects and custom error pages in Apache

During summer vacation, I have been using some of my extra free time to do a little housekeeping on this blog. While researching various changes I wanted to make around here, I became reacquainted with the world of the indieweb. I had been vaguely aware of the indieweb for a number of years, but never really felt motivated to look into it beyond the core principles for which it is well known: ownership and control of your data and online identity. Going down that rabbit hole a little further, though, exposed me to a lot of interesting concepts, technologies, and design principles that inspired me to scale up my blog housekeeping to a full-on renovation.

The low-hanging fruit in this undertaking was migrating to a new domain that would more suitably represent my identity online. When I started this blog, it was a very scant page with little more than a stagit instance hosting my dotfiles. At that stage, the blog was hosted as a subdirectory (namu.blue/~mieum) in the traditional pubnix fashion. Once I got my PhD and started teaching at university, I decided to reinvent the blog as a kind of online portfolio of my work and as a space for publishing course materials. At that point, the blog graduated to a subdomain (dsm.namu.blue), which I happily used for a couple of years. I liked this address because it was nearly identical to my work email address (dsm@namu.blue). I had opted for a subdomain rather than just using namu.blue because, at the time, it was a multi-user server hosting a few other people. But after reading more about the indieweb and the way online identity is conceived there, I started to weigh the potential benefits of graduating my blog one step further to a root-level domain; but which?

I own another domain, treeblue.space, but I decided to keep that one available for a project I have been incubating for a while. I ultimately decided on giraffleur.org because it quite literally represents my identity: my name in Korean translates to "flower giraffe". I also like it because it is a whimsical, playful take on my name rather than just something like myrealname.com. So here we are.

Redirecting traffic to giraffleur.org

The problem is that, even though this is not a high-volume blog with a large readership, its former address (dsm.namu.blue) has been published and archived in various publications of mine. As long as those publications exist, someone along the way will be led to the old domain. To migrate to the new domain gracefully, I needed to redirect all of that traffic to my current domain. How does one go about this with Apache (the web server this blog uses)?

There is really nothing to it. In fact, all it takes is adding a line like this to your Virtual Host config:

RedirectMatch 301 ^/(.*) https://newdomain.org/$1

It is important to note several things here. First, this should be in your actual config file, not a .htaccess file in your DocumentRoot. You could put this directive there, but Apache reads the main config once at startup, whereas it has to look for and read .htaccess files on every request; keeping the directive in your Virtual Host config means requests are processed faster. The second thing to note is that the directive is RedirectMatch, not just Redirect. A RedirectMatch directive allows us to capture the path of the page being requested with a regular expression (^/(.*)) and append it to the new domain to which the request will be redirected (/$1). Last, the status code 301 is significant, because it tells the client that the resource has been moved permanently.
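
For context, here is a minimal sketch of where that line might sit in the old domain's Virtual Host. It assumes the old site is dsm.namu.blue answering plain HTTP on port 80 and the new one is giraffleur.org; the request path in the comment is a made-up example, and a real config will have more in it than this.

```apache
# Sketch of a Virtual Host for the OLD domain (assumed: plain HTTP on port 80).
<VirtualHost *:80>
    ServerName dsm.namu.blue

    # Capture everything after the leading slash and re-attach it to the new
    # domain, e.g. /posts/hello -> https://giraffleur.org/posts/hello
    # (/posts/hello is a hypothetical path used only for illustration).
    RedirectMatch 301 ^/(.*) https://giraffleur.org/$1
</VirtualHost>
```

If the old domain was also served over HTTPS, its port 443 Virtual Host would need the same RedirectMatch line, along with a certificate that is still valid for the old name.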

Custom error pages using Hugo and Apache

Redirecting traffic to my new domain this way works perfectly. But in the process of renovating the blog, I decided that I wanted to evict a few pages and subdirectories that were just adding clutter and baggage to the blog (and its configuration in Hugo). If someone attempts to access those pages, they will be served Apache’s default 404 error page, since the resource no longer exists on the server. This is fine, but it would be nice to inform clients that the resource is not just missing, but that it has in fact been deleted permanently. To do this, we need to configure Apache to return a 410 (Gone) status code for the resources that have been intentionally removed. And while we’re at it, why not make a custom error page that is consistent with the blog’s theme?

Configuring Apache to return a specific error code for a specific resource is as simple as adding something like this to your Virtual Host config:

Redirect 410 /thesis
Redirect 410 /courses

Of course, /thesis and /courses would be replaced with the file or folder you want clients to know is gone forever.
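
As a small variation, and assuming the same two paths, mod_alias also accepts the keyword gone in place of the number 410. The sketch below spells out what the prefix matching covers; the example paths in the comments and the .pdf pattern in the last line are purely hypothetical, there only to illustrate the behavior and the regex counterpart.

```apache
# "gone" is the keyword form of status 410; no target URL is given.
Redirect gone /thesis
Redirect gone /courses

# Redirect matches by prefix on whole path segments, so a request for
# /thesis/chapter-1 also returns 410, while /thesis-notes does not.

# Hypothetical pattern-based variant: any URL path ending in .pdf.
RedirectMatch gone "\.pdf$"
```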

Now we want to have Apache serve our custom ErrorDocument rather than Apache’s stock one. Just add this (or something similar) to your Virtual Host config:

ErrorDocument 410 /410.html
ErrorDocument 404 /404.html

This tells Apache to use these files in the site’s DocumentRoot as error pages when returning 410 and 404 error codes to clients. But how do we generate these in Hugo so that they can be templated and remain consistent with the site’s theme? To have these error pages generated and copied to the root of your web directory, you must create a corresponding markdown file in the content folder of your Hugo project directory. For example:

content/
├── _index.md
├── 404.md
├── 410.md
├── about.md
├── categories
├── colophon.md
├── faqs.md
├── feeds.md
├── posts
├── series
└── tags

But as you may know, individual pages such as these will not be rendered as single HTML files, but as subdirectories in the form of 404/index.html. This is not what we want, so how do we change that behavior? The trick is to flip a few switches in the metadata of these source markdown files. For example, the YAML frontmatter of my 404.md looks like this:

---
title: Page Not Found
noindex: true
layout: error
errorcode: 404
url: "404.html"
list: never
---

Here url: "404.html" tells Hugo that we want this to be a standalone page, not a subdirectory. noindex: true prevents this page from being included in the sitemap, and list: never keeps it from turning up in page listings on our site (because it is an error page). layout: error tells Hugo to render this page using the custom layout template error.html, and errorcode is a custom variable used to render that template. When rendered, the page looks like this: 404.html.

Note that when you access 404.html on my site, the address bar in your browser will show https://giraffleur.org/404.html, but when the error page gets served as an actual 404 error response, the address bar shows the URL you tried to access (such as https://giraffleur.org/no-one-home). This is because when you navigate to https://giraffleur.org/404.html you are asking for a page that does exist (even though that page is a 404 error page), so Apache answers with a normal 200 response. This may feel weird at first, but it is exactly the behavior you want. Trying to get Apache to serve a 404 for the 404 error page is an exercise in hubris I cannot be bothered to oblige.
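
Putting the Apache pieces from this section together, the new domain's Virtual Host might look roughly like the sketch below. The DocumentRoot path is a hypothetical placeholder for wherever Hugo's output gets deployed, and the TLS directives that a port 443 host needs are left out, so treat this as a fragment rather than a complete config.

```apache
<VirtualHost *:443>
    ServerName giraffleur.org
    # Hypothetical path: wherever the contents of Hugo's public/ folder end up.
    DocumentRoot /var/www/giraffleur.org

    # SSLEngine and certificate directives are omitted from this sketch.

    # Intentionally removed sections answer with 410 Gone.
    Redirect 410 /thesis
    Redirect 410 /courses

    # Themed error pages, generated by Hugo into the DocumentRoot.
    ErrorDocument 410 /410.html
    ErrorDocument 404 /404.html
</VirtualHost>
```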

With all of that said and done, my new internet home, https://giraffleur.org, is live and happily receiving redirects from my old domain. If for some reason you find this page while trying to solve similar problems and have questions, feel free to send them my way.
