Sunday, May 19, 2024

Apache Redirect Example

Why would we want our web server to redirect web requests to a new URL? There are several reasons. Here are some of them:

  • We want HTTP requests redirected to the HTTPS version of our website.
  • Redirect www requests to our non-www website.
  • Perhaps the weightiest reason is SEO. If a search engine crawler hits a broken link, that URL may drop out of the index. So if you have a webpage with a high PageRank value, you'll want to keep it, and you don't want the crawler running into a broken link. With a redirect in place, the crawler not only notes the new URL but also transfers the PageRank value from the old URL to the new one, so the SEO rank already achieved is preserved.
  • A short and easy to remember URL can redirect to a long and complex one, or vice versa.
  • We want to redirect similarly named domains to our main website (e.g. example.net redirects to example.com, common misspellings are redirected to the main site, etc.)

Apache Redirect Setup

This example was done on a Windows 11 machine with Windows Subsystem for Linux (WSL) running Ubuntu 22.04. Our web server is Apache v2.4. I will not be demonstrating how to install WSL and Ubuntu. I'm going to assume that you already have something similar set up. It is also assumed that the reader knows their way around Linux (e.g. vi, nano, sudo, etc.). This post demonstrates how to configure Apache2 to perform redirects. You should already have a working Apache2 install before continuing.

Plain and Simple URL Redirect

This plain and simple URL redirect approach is good if you only need to redirect a few URLs. We can easily add it to the Apache2 configuration file. But first, let's see it break. On a fresh Apache2 install, your /var/www/html directory shouldn't have anything named broken in it, so going to http://localhost/broken returns a 404 Not Found.

Let's add a redirect in the Apache configuration file (i.e. /etc/apache2/sites-available/000-default.conf). Below is the syntax and the line to add inside the VirtualHost block.

  
<VirtualHost *:80>
  ... snipped ...
  # Redirect 301 [old URL] [new URL]
  Redirect 301 /broken /
  ... snipped ...
</VirtualHost>
  

Restart Apache with sudo service apache2 restart. Open your browser's developer tools and tick Disable cache in the Network tab before hitting http://localhost/broken. You should see a 301 response for /broken, followed by a request for the root resource.

As you can see in the Network logs, Apache answered with a 301 and pointed our browser to the root resource. Take note of the Transferred column, untick Disable cache, and hit http://localhost/broken again.

Did you see the difference? Take a look at the Transferred column. The browser is pulling cached data. Alright, let's change 301 to 302 in the Apache configuration file and restart. For the first call, tick Disable cache and hit the /broken resource again. What do you see in the Network logs?

OK, now the Status is 302. Again, take note of the Transferred column. Untick Disable cache and hit the /broken URL again. What do you see now?

Interesting, the 302 row downloaded data from the web server. It didn't pull from the browser cache. Why? How? The HTTP 301 status code means the resource has moved permanently. Since the browser already knows the old URL will keep pointing to the same place, it reuses the cached response. The HTTP 302 status code means the resource has moved temporarily. Since the move is temporary, the browser asks the server again because the original URL could become available again. This is also why a 301 is more desirable from an SEO perspective. Since it is a permanent redirect, the search engine crawler will transfer the PageRank value to the new URL, thus preserving the SEO ranking.
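To make the 301 versus 302 choice easier to read in the configuration, mod_alias also accepts keywords in place of the numeric codes. The fragment below is a minimal sketch that reuses the /broken path from this example; treat it as an illustration, not a recommended production setup.

```apache
<VirtualHost *:80>
  # "permanent" sends a 301, same as: Redirect 301 /broken /
  Redirect permanent /broken /

  # "temp" sends a 302, which is also what Redirect uses when no status is given.
  # Use it instead of the line above while the move is only temporary.
  # Redirect temp /broken /
</VirtualHost>
```

Only one of the two directives should be active for a given path, which is why the second one is commented out here.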

Complex Apache Redirect

For production systems with more complex redirect needs, the Apache module mod_rewrite is a common choice. It is fast, powerful, and can manipulate URLs with regular-expression rules. In this example, the rules are defined in a .htaccess file. The following command enables mod_rewrite on Ubuntu: sudo a2enmod rewrite. Then restart Apache for the change to take effect. You can see which modules are enabled by looking at the /etc/apache2/mods-enabled directory. Likewise, /etc/apache2/mods-available tells you which modules are available.

  
jpllosa@localhost:/etc/apache2/mods-enabled$ ls
access_compat.load  authn_file.load  autoindex.load  env.load        mpm_event.load    rewrite.load
alias.conf          authz_core.load  deflate.conf    filter.load     negotiation.conf  setenvif.conf
alias.load          authz_host.load  deflate.load    mime.conf       negotiation.load  setenvif.load
auth_basic.load     authz_user.load  dir.conf        mime.load       reqtimeout.conf   status.conf
authn_core.load     autoindex.conf   dir.load        mpm_event.conf  reqtimeout.load   status.load
  

The next step is to allow the use of .htaccess files. Replace the Redirect 301 /broken / directive in /etc/apache2/sites-available/000-default.conf with the block below and restart Apache.

  
<VirtualHost *:80>
  ... snipped ...
  <Directory /var/www/html/>
    AllowOverride All
  </Directory>
  ... snipped ...
</VirtualHost>
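AllowOverride All lets .htaccess files override every kind of directive. If the .htaccess file will only hold rewrite rules, a narrower setting should be enough, since the mod_rewrite directives fall under the FileInfo override class. A sketch of that variation, assuming nothing else in the .htaccess file needs other override classes:

```apache
<VirtualHost *:80>
  <Directory /var/www/html/>
    # FileInfo covers RewriteEngine, RewriteCond and RewriteRule in .htaccess
    AllowOverride FileInfo
  </Directory>
</VirtualHost>
```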
  

Then create the .htaccess file in /var/www/html with the contents below. You don't need to restart Apache for changes to this file to take effect.

  
RewriteEngine on
RewriteRule ^broken$ http://localhost [R=301]
  

The above will redirect requests for http://localhost/broken back to http://localhost. You should get the same result as the plain and simple 301 redirect demonstrated earlier. The first line turns on the rewrite engine. The second line holds the regular expression to match and what to replace the URL with. The syntax is RewriteRule [pattern] [substitution] [flags]. The pattern is a regular expression matched against the requested URL path. The substitution can be a full file system path, a web path relative to the root directory, or an absolute URL. The flags are optional. In this example, R=301 forces a 301 redirect.
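The pattern can also capture part of the requested path and reuse it in the substitution. The sketch below assumes two hypothetical paths, old-blog and blog, that are not part of this example's setup. $1 refers to whatever the first pair of parentheses matched, and the L flag stops processing further rules once this one applies.

```apache
RewriteEngine on

# Hypothetical paths for illustration:
# /old-blog/my-post is redirected to /blog/my-post
RewriteRule ^old-blog/(.*)$ /blog/$1 [R=301,L]
```

Note that in a .htaccess file the pattern is matched without the leading slash, which is why it starts with ^old-blog and not ^/old-blog.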

Often, a RewriteRule is preceded by one or more RewriteCond directives. These specify the conditions under which the RewriteRule takes effect. The syntax is RewriteCond [test string] [condition] [flags]. The test string is typically a server variable in the format %{VARIABLE_NAME}. The condition can be a regular expression, a string comparison, or a file/path test. Flags are optional (e.g. NC, which means ignore case).
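Two of the reasons listed at the start of this post can be sketched with RewriteCond. The fragment below is meant for the same kind of .htaccess file and uses example.com as a placeholder domain. It also assumes HTTPS is already configured on the server, which this post does not cover.

```apache
RewriteEngine on

# Redirect www.example.com to example.com.
# NC makes the host name comparison case-insensitive.
RewriteCond %{HTTP_HOST} ^www\.example\.com$ [NC]
RewriteRule ^(.*)$ https://example.com/$1 [R=301,L]

# Redirect plain HTTP requests to HTTPS on the same host.
# %{HTTPS} is "on" for SSL/TLS connections and "off" otherwise.
RewriteCond %{HTTPS} !=on
RewriteRule ^(.*)$ https://%{HTTP_HOST}/$1 [R=301,L]
```

Each RewriteCond applies only to the RewriteRule that immediately follows it, so the two pairs above work independently of each other.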

Apache Redirect Example Summary

There you have it. A simple way and a complex way of redirecting traffic to your website. You can now rest easy knowing that your PageRank value is preserved with some magical Apache redirect configurations. Happy redirecting.
