Rewriting URLs in nginx to avoid dead links

We’ve recently migrated Pagefault from using Jekyll to Hugo, and with it some things have changed.
… (things we’ve wanted to change for a while anyways)

One of these things happened to be our post URLs; they used to be these long, unwieldy things:

https://pagefault.se/life/2019/03/14/how-i-stop-hurrying/

All the associated categories, the post date, etc. ended up in there. Someone coming to our site likely doesn’t care when a post is from (the content of the article will likely date it horribly anyways, like me writing about D and now writing way more Rust instead…), and simpler is always nicer when possible.

Either way, we decided to have post titles be unique and adopted this format instead:

https://pagefault.se/post/article-title/

The problem is that after moving to the new format, all old links would be dead, and personally I hate dead links.
Thus, we are presented with a conundrum, and we happen to have a very try_files-shaped hammer [1].

First Attempts

Initially I tried to use named locations [2] to solve the issue, with something like:

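location / {
  # serve the file or directory if it exists, otherwise fall back to @missing
  try_files $uri $uri/ @missing;
}

location @missing {
  # capture the last path segment and redirect to it under /post/
  # (the replacement starts with $scheme, so nginx sends a redirect to the client)
  rewrite \/(?<end>[^\/]*)\/*$ $scheme://$host/post/$1 last;
}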

The big problem you run into here, if that @missing handler matches most things, is an infinite rewrite loop. I spent quite a while fighting this, looking into whether you could check if the file you were doing the rewrite for actually existed, which turned out to be way more involved than I thought it would be.
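To make the loop concrete: as far as I can tell, a request for a post that doesn’t exist under /post/ fails try_files, lands in @missing, and gets redirected to a /post/ URL that fails try_files all over again. One way to break the cycle, sketched here as an assumption and not what we ended up deploying, is to give /post/ its own location that never falls through to @missing. The /post/ prefix comes from our new URL format; the rest of the layout is illustrative.

```nginx
# Requests under /post/ are answered directly and never reach @missing,
# so a missing post yields a 404 instead of yet another redirect.
location /post/ {
    try_files $uri $uri/ =404;
}

# Everything else still falls back to the old-link handler.
location / {
    try_files $uri $uri/ @missing;
}

location @missing {
    # Redirect to the last path segment, relocated under /post/.
    rewrite /(?<end>[^/]+)/*$ /post/$end/ redirect;
    # No last segment to redirect to: give up.
    return 404;
}
```

This costs an extra location block, and every old link takes one redirect before it is served.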

… Why did it turn out to be way more involved, you ask? Well, it turns out nginx configurations don’t really want to let you do things like bind the results of things to variables, or even have multiple conditions in an if statement, leading to hacks like the one by jrom.

Revelations

It wasn’t until I saw an example of someone passing more than two arguments to try_files that it occurred to me the whole regex could go in the main location match, and the attempt to find the now relocated post could just be another argument to try_files, using the capture from the location regex.

Let’s have a look at what that ended up looking like:

# this is here so that old links which looked like:
#  /programming/errors/android/2019/08/14/feeling-like-an-idiot-again/
# but now look like:
#  /post/feeling-like-an-idiot-again/
# still work! (also sorry)
location ~ \/(?<title>[^\/]*)\/*$ {
 try_files
   $uri
   $uri/
   /post/$title/
   =404;
}

With a named capture there for the post title, it doesn’t even look that bad!

… Maybe the title is now a little misleading, but the purpose is the same!