Custom 404

Was returning 200 OK

grandpa

7:06 pm on Sep 15, 2008 (gmt 0)

I recently added some sites to my Webmaster Tools dashboard, and was setting the preferred domain to mydomain.com rather than www.mydomain.com. I had a custom 404 page, and to my surprise the tool told me that my custom 404 page was returning a status of 200 OK. What!? How could that be? I had this bit of code at the top of my custom 404 page:


<?php
header("HTTP/1.0 404 Not Found");
header("Status: 404 Not Found");
?>
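
For context, a page like this is normally wired up with an ErrorDocument directive, which the post does not show. A minimal sketch, assuming the page lives at /404.php in the document root (that path and placement are assumptions, not details from the post):

```apache
# Local path (leading slash, no scheme): Apache serves the page internally
# and keeps the 404 status on the response.
ErrorDocument 404 /404.php

# Avoid a full URL here. With a remote-style URL Apache answers with a
# redirect, so the client never receives the original 404 status.
# ErrorDocument 404 http://mydomain.com/404.php
```

With the local-path form the header() calls in the page are redundant but harmless.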

Someone is messing with me, I think. So I decided to check the server headers for http://www.mydomain.com/404.php

What I saw was a 301 Permanent redirect, from www.mydomain.com to mydomain.com, followed by a 200 OK at mydomain.com. The custom page was completely overlooked.

Now it was time to take a closer look at my .htaccess file. Here's the bit where I told Apache to rewrite everything away from the www subdomain:


RewriteCond %{HTTP_HOST} ^www.mydomain.com [NC]
RewriteRule ^(.*)$ [mydomain.com...] [R=301]

Racking my brain (an easy task these days), I looked at this code. The problem had to be there. Perhaps I needed to tell Apache that this instruction was to be the last one processed, so I added the L flag.


RewriteCond %{HTTP_HOST} ^www.mydomain.com [NC]
RewriteRule ^(.*)$ [mydomain.com...] [R=301,L]

Nothing doing. Time to dig deeper.

Aha! I wasn't carrying over the variable, in this case the requested path (the error page). This is a simple matter of adding whatever (.*) captured to the end of the target URL, as $1.


RewriteCond %{HTTP_HOST} ^www.mydomain.com [NC]
RewriteRule ^(.*)$ [mydomain.com...] [R=301,L]

Now my custom 404 page displays the information I want, and actually returns a 404 error code. Life is good.
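
The forum software has shortened the target URL in the three snippets above, so the actual change is not visible there. A minimal sketch of the before and after, assuming the full target was http://mydomain.com/ and that the rules sit in a .htaccess file in the document root (both are assumptions; the original URL is truncated in the thread):

```apache
RewriteEngine On

# Before (assumed): no back-reference in the target, so every www request,
# including /404.php, is redirected to the home page, which answers 200 OK.
# RewriteCond %{HTTP_HOST} ^www.mydomain.com [NC]
# RewriteRule ^(.*)$ http://mydomain.com/ [R=301,L]

# After (assumed): $1 carries the requested path over to the bare domain,
# so /404.php is redirected to http://mydomain.com/404.php.
RewriteCond %{HTTP_HOST} ^www.mydomain.com [NC]
RewriteRule ^(.*)$ http://mydomain.com/$1 [R=301,L]
```

With the second form, a header check on http://www.mydomain.com/404.php should show the 301 pointing at the same path on mydomain.com, where the header() calls in the page supply the 404 status.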

jdMorgan

7:44 pm on Sep 15, 2008 (gmt 0)

Minor tweaks: Always escape literal periods in patterns by preceding them with a "\", and no need to anchor a ".*" pattern standing alone:

RewriteCond %{HTTP_HOST} ^www\.example\.com [NC]
RewriteRule (.*) http://example.com/$1 [R=301,L]

An un-escaped period in a regular-expressions pattern is interpreted as a single-character wildcard, meaning "match any single character."
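
To illustrate the difference, here is a small sketch. The extra host names in the comments are made-up examples, and the active rule is the same one shown above:

```apache
RewriteEngine On

# Unescaped (shown commented out): each "." matches any one character, so
# this condition is also true for hosts such as www-example.com or
# wwwxexample.com, not only www.example.com.
# RewriteCond %{HTTP_HOST} ^www.example.com [NC]

# Escaped: "\." matches only a literal period.
RewriteCond %{HTTP_HOST} ^www\.example\.com [NC]
RewriteRule (.*) http://example.com/$1 [R=301,L]
```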

Now you can look into canonicalizing index.php, and all that other fun stuff... :)

Jim

grandpa

8:00 pm on Sep 15, 2008 (gmt 0)

Thanks Jim. The fine-tweaks are duly noted.

Note to self: Next time I leave a job with 4 years of fine tuning the .htaccess file, send self a copy of said file.

>> canonicalizing index.php, and all that other fun stuff

There's more!?! :)

jdMorgan

12:13 am on Sep 16, 2008 (gmt 0)

Lots more. Of course, some sites don't have much of a problem with legacy links or malicious linkers, and so don't need all of it.

Some of the things we've discussed recently (and repeatedly) are redirecting

  • from example.com/<any_directory_or_subdirectory>/index.html to example.com/<any_directory_or_subdirectory>/
  • from //images/logo.gif to /images/logo.gif (note double slashes)
  • and from /images//logo.gif to /images/logo.gif
  • to remove spurious query strings when the page doesn't even use them (e.g. a static HTML page)
  • to remove trailing slashes from extensionless filenames
  • to add missing trailing slashes to directory names
  • to remove spurious characters from the end of URLs (often added by forum/blog auto-linking routines or because of typos), for example "GET /foo.html." with the trailing period actually included in the URL
  • to remove spurious characters from the end of URLs caused by improperly-closed link tags in HTML code, for example "GET /bar.php>a%20really%20useful%20page%20here</a>" where the closing quote was left off the URL in the <a href> tag.
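
A few of the redirects in this list can be sketched in .htaccess form. This is a rough illustration, not the rules from the threads mentioned: it assumes Apache 2.x with mod_rewrite enabled, a .htaccess file in the document root, and example.com as the canonical host. It covers only the index.html, query-string, and trailing-period items, and the patterns would need testing against a real site.

```apache
RewriteEngine On

# /<any_directory>/index.html -> /<any_directory>/
# THE_REQUEST is checked so that only index.html as typed by the client
# redirects, not Apache's own internal DirectoryIndex lookup (avoids a loop).
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.html\ HTTP/
RewriteRule ^(([^/]+/)*)index\.html$ http://example.com/$1 [R=301,L]

# Drop a query string from static .html pages that never use one.
# The trailing "?" on the target discards the original query string.
RewriteCond %{QUERY_STRING} .
RewriteRule ^(.+\.html)$ http://example.com/$1? [R=301,L]

# Strip trailing periods, e.g. /foo.html. -> /foo.html
RewriteRule ^(.*[^.])\.+$ http://example.com/$1 [R=301,L]
```

Each rule issues its own 301, so a URL with several problems at once may take more than one hop to reach its canonical form.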

But wait, there's even more! Basically, there are so many I can't even think of them all at once.

Jim

g1smd

9:06 am on Sep 16, 2008 (gmt 0)

We listed them in a thread just a few weeks back to save further brain ache, jd!

[webmasterworld.com...]