Shortening individual archive URIs
For a long while I’ve wanted to make my archive URIs a bit shorter, a bit less crufty. As most of you are very well aware, I take this whole archiving thing very seriously and have put a lot of thought and effort into coming up with the best long-term solution, while trying not to break past efforts along the way.
A little background
A few years ago I put together a post titled Future-proof your URIs, where I explained how to do just that using Movable Type. A couple of years later, when I moved to WordPress, I obviously wanted the same thing and wrote up a little piece called Maintaining URIs between Movable Type and WordPress. The purpose of that post was to give users a way (after figuring it out for myself) to make the [now future-proofed] MT URIs play nice with WordPress’s permalink structure, which uses hyphens, instead of underscores, in the post slug. Because mod_rewrite limits you to nine back-references, a single generic rule can’t handle posts with longer titles (i.e., no post with more than nine words in the title; the actual number is likely much smaller once you account for the back-references needed for other parts of the URI, like the year, month, and day). My method created individual rewrite conditions and rules for each post; I suspect that the only other way to get around the back-reference limitation is to write some backend code to modify the address each time one of the older URIs is accessed.
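To give a rough idea of what that backend alternative might look like, here is a minimal Python sketch. It is not what I actually run. It assumes the old MT URIs have the shape /archives/YYYY/MM/DD/post_name, and the names OLD_MT_URI and hyphenate_slug are made up for illustration. Because it replaces every underscore in one pass, the number of words in the title doesn't matter.

```python
import re

# Assumed shape of an old Movable Type URI: /archives/YYYY/MM/DD/post_name
# (the slug uses underscores; an optional trailing slash is tolerated).
OLD_MT_URI = re.compile(r"^(/archives/[0-9]{4}/[0-9]{2}/[0-9]{2}/)([0-9a-z_]+)/?$")


def hyphenate_slug(path):
    """Return the hyphenated form of an old underscore URI.

    Returns None when the path is not an old-style URI or has no
    underscores in its slug, i.e. when no redirect is needed.
    """
    match = OLD_MT_URI.match(path)
    if match is None or "_" not in match.group(2):
        return None
    prefix, slug = match.groups()
    # One replace call handles any number of words, so the nine
    # back-reference limit never comes into play.
    return prefix + slug.replace("_", "-")


if __name__ == "__main__":
    print(hyphenate_slug("/archives/2004/05/12/a_rather_long_post_title"))
    # prints /archives/2004/05/12/a-rather-long-post-title
```

Whatever serves the request would then send a 301 redirect to the returned path, and pass the request through untouched when the function returns None.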
Removing some of the cruft
When I speak of cruft, I’m referring to superfluous delimiters that simply aren’t needed to uniquely identify a post. For example, until a few days ago, my individual archive URIs looked like this:
/archives/year/month/day/post-name
It’s obvious that archives and the day reference are not needed. Depending on how you title your posts, you might think that the month is too much as well, but if there is any chance that you’ll give two posts from the same year the same title, then you should probably leave the month in there (I also like it there because it adds a bit more granularity to the URI and lets you, or whoever, get a better idea of when the post was actually published).¹ So, with these two elements removed, the new URIs would look like this:
/year/month/post-name
To get the URIs into this format I had to pay special attention to past schemes so as not to break them. I needed all past links to resolve to the new structure: no one who links to me using a now-defunct, but once-active, URI scheme should ever receive a 404 error. After spending some time in the lab, I came up with the following .htaccess condition and rule:
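# Match archives/YYYY/MM/DD/slug; capture YYYY/MM as %1 and the optional slug as %2
RewriteCond %{REQUEST_URI} archives/([0-9]{4}/[0-9]{2})/[0-9]{2}/?([_0-9a-z-]+)?
# Permanently (301) redirect to YYYY/MM/slug and stop processing further rules
RewriteRule .* %1/%2 [R=301,L]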
As you may or may not be able to see, that code removes archives and the day reference from the link. Because the rule issues a 301 redirect rather than rewriting internally, the user sees the modified link in their address bar. Because I put this rule below the rules I created to maintain the URIs between MT and WP, even the older links correctly map to the shorter URI.
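If you want to check the pattern against a few sample URIs before touching a live .htaccess file, a small Python sketch like the one below can stand in for mod_rewrite. It reuses the same regular expression and only approximates Apache's behavior. The function name shorten is made up, and the leading slash on the result assumes the weblog is served from the site root.

```python
import re

# Same pattern as the RewriteCond; mod_rewrite matches it anywhere in
# the request URI, which re.search imitates.
PATTERN = re.compile(r"archives/([0-9]{4}/[0-9]{2})/[0-9]{2}/?([_0-9a-z-]+)?")


def shorten(request_uri):
    """Return the redirect target for an old URI, or None if the condition fails."""
    match = PATTERN.search(request_uri)
    if match is None:
        return None  # the RewriteCond fails, so no redirect is issued
    year_month = match.group(1)      # %1 in the RewriteRule
    slug = match.group(2) or ""      # %2; empty when the URI has no slug
    # Assumption: the site lives at the root, so the target starts with "/".
    return "/" + year_month + "/" + slug


if __name__ == "__main__":
    print(shorten("/archives/2005/03/14/post-name"))  # prints /2005/03/post-name
    print(shorten("/archives/2005/03/14/"))           # prints /2005/03/
    print(shorten("/2005/03/post-name"))              # prints None
```

The last call is the one worth noticing: a URI already in the new format doesn't match, so the rule can't redirect in a loop.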
Lastly, I changed the permalink structure within WP to comport with the new, less-crufty URIs.
FOOTNOTES
¹ Keep in mind that WordPress will automatically rename the post slug (to something like post-name-2) if there are identical titles within the weblog.