Managing legacy links with .htaccess

An .htaccess file is nothing more than a simple text file, but one that offers a vast amount of power to anyone running a web site on an Apache server. These tiny files are the hidden, and generally unsung, heroes of web designers, offering them the chance to quickly and easily extend the functionality of their web sites.

Unfortunately, working with .htaccess files can be a little involved - we’ll need to make use of a plain text editor (to create or edit the file) as well as an FTP application to move the file onto our server - but the benefits are well worth it.

What can .htaccess do for me?
.htaccess is the cornerstone for a lot of cool things on an Apache server including adding error documents to your site, password protecting folders, adding dynamic includes so that pages can pull in content from other files, blocking rogue visitors, redirecting visitors, and preventing other sites from stealing your bandwidth by ‘hot linking’ your content.

For the sake of this overview we’ll concentrate on a single, but very important, aspect of .htaccess - redirection.

Redirecting users
Freeway is great at managing your links for you. Built into the application is a link manager that silently and seamlessly handles all of the internal links in the document and makes sure that they all point to their intended destinations. If you change the name of a page, an anchor, or a frame reference, the link manager will automatically update all of the links that point to that item. Unlike in a lot of other web design applications, it is very hard to create dead-end links in Freeway. The application can also check links to external sites with the check button in the Edit URLs dialog (Edit>URLs… menu).

What Freeway can’t handle for us, but .htaccess can, are legacy links from sites that point to now non-existent or moved pages.

Imagine you are redesigning a popular web site and in the process you decide to update the site structure so that the information is gathered in a more logical fashion. What was once a contact page at http://www.example.com/contact.html, for example, is now http://www.example.com/about-us/contact/. Obviously you’ll want to keep all of the web traffic that arrives from other sites and from search engines, and send it to the new page location. One way to do this is to redirect users from the old page to the new page using either a script or a meta-refresh tag. The Timed Redirect Action, for example, does just this. Although this solution works, the old page has to stay on the server, and none of the linking sites are told that the page has moved, so they will keep sending visitors to the old address regardless of where the new content is.
This is where our .htaccess file steps in. A single line of text in our .htaccess file will not only redirect these visitors instantly but will also send back an HTTP 301 response, which tells browsers and search engines that the page has permanently moved.

Before we get started with our .htaccess file let’s spend a few minutes looking at how many sites link into our site and what pages they are linking to. There are a number of tools and services you can use to track these links but for the sake of this example we’re going to use Yahoo’s Site Explorer tools.
Open up your web browser and head on over to Yahoo. You will almost certainly get automatically redirected to a regional Yahoo site unless you are in the USA but don’t worry, the Site Explorer tools are available wherever you are. In the Web search bar at the top of the screen enter site: and the address of the site you want to track the links for.

For example:
site:www.example.com

Press the Web search button and you should get redirected to Yahoo’s Site Explorer, where you’ll see a list of the pages in your site that other web sites link to. If you want to see which pages link to your site, simply click on the Inlinks button at the top of the page and change the settings to Show inlinks ‘From all Pages’ to ‘Entire Site’.

Now that we have a good idea of which pages in our site are linked to, we can compare this list to our new site structure and create an .htaccess file to seamlessly redirect users to the new content.

Creating the .htaccess file
To create our .htaccess file we’re going to need a plain text editor. TextEdit, which ships with Mac OS X, is fine, although if you use it make sure that it is in plain text mode (Make Plain Text from the Format menu). A better all-round text editor for this sort of work is Text Wrangler. Not only is it a very powerful text editing tool, it is also free. It even contains a built-in FTP tool, meaning you can save your .htaccess file directly to your server.

Start up Text Wrangler and select Open from FTP/SFTP Server from the File menu. Enter your site FTP log-in details, make sure that the Show files starting with “.” option is selected and finally click on the Connect button. If all goes well you should see a bunch of files or folders that all live on your server. Navigate to the home folder for your site (sometimes this is called public_html or htdocs). If you see an existing .htaccess file in the list we’ll open that and add our own redirection commands. If you don’t see a file, don’t worry, we’ll create one in the next step.

If you need to create an .htaccess file, simply select File>New>Text Document. Otherwise, select the existing .htaccess document in the FTP Browser window and press the Open button. When working with an existing .htaccess file, be very careful not to remove or edit anything that is already there unless you know what you are doing, as it is very easy to mess things up.

For our example we want to move users from the old page:
http://www.example.com/contact.html

to the new one:
http://www.example.com/about-us/contact/

Here’s our magical .htaccess line:
redirect 301 /contact.html /about-us/contact/

Note: If your file path contains spaces (/about us/contact/ for example) then you’ll need to wrap the path in straight double quotes (not the curly quotes a word processor might substitute) like this:
redirect 301 /contact.html "/about us/contact/"
Generally spaces in URLs like this aren’t a great idea so are best avoided if possible.
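
A real redesign usually moves more than one page, so the file tends to hold a short list of redirects rather than a single line. The sketch below shows how that might look. Every path other than /contact.html is made up for illustration, and it assumes the server has mod_alias enabled and allows these directives in .htaccess files.

```apache
# Legacy links from the old site structure (hypothetical paths)

# A single page that has moved
Redirect 301 /contact.html /about-us/contact/

# A whole folder that has moved: anything under /news/ is sent to the
# same path under /about-us/news/
Redirect 301 /news /about-us/news

# The same kind of rule with a full URL as the target
Redirect 301 /prices.html http://www.example.com/shop/prices/

# A pattern: /products/NAME.html becomes /shop/NAME/
RedirectMatch 301 ^/products/(.+)\.html$ /shop/$1/
```

Redirect matches on the start of the path, so the folder rule also catches every page inside /news/. RedirectMatch uses a regular expression instead, which is handy when the new address is built from part of the old one. Older Apache versions may expect a full URL as the target, so the third form is worth trying if a target that starts with a slash does not work on your server.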

That’s it! Now save the file. If you opened an existing .htaccess file directly from the server then Text Wrangler will upload it for you automatically. Alternatively if you created a new document simply use Text Wrangler’s Save to FTP/SFTP Server option under the File menu to save the file directly to your server.

You can now jump back over to your web browser and test your new URLs. Enter the URL of the old file (http://www.example.com/contact.html) and the server should magically, and quietly, take you directly to the new page (http://www.example.com/about-us/contact/).

With the .htaccess file in place, any search engines that come by and re-index your site should see the 301 response for the requested page and update their directories accordingly.
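
Apache also accepts a keyword in place of the status number, and it has a separate status for pages that were removed without a replacement. The lines below are a small sketch of both; /old-offer.html is a hypothetical path, and for any one page you would use either the number or the keyword, not both.

```apache
# "permanent" is the keyword form of 301
Redirect permanent /contact.html /about-us/contact/

# 410 Gone: the page was removed on purpose and has no new address,
# so no target is given
Redirect gone /old-offer.html
```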

If you fancy reading more about .htaccess and redirects head on over to the official Apache documentation where you’ll find details on everything related to the topic.