RewriteRule Forwarding 404 Pages?

Why does RewriteRule work even if 404?

         

war59312

8:41 am on Oct 28, 2007 (gmt 0)

10+ Year Member



Hello,

I find it very strange that a RewriteRule still takes effect even if the page is a 404.

For example:

RewriteRule ^archives/([0-9]{4})/([0-9]{1,2})/([0-9]{1,2})/?$ /index.php?year=$1&monthnum=$2&day=$3 [QSA,L]

Say you browse to:

http://www.example.com/archives/2006/06/55/

Your browser will send a 404 header response but you are still taken to index.php instead of my 404 page. :( Why?

Of course the code above works just fine for valid pages. Very strange, you would think this would not work this way.

Any thoughts?

Will

jdMorgan

3:20 pm on Oct 28, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You are confusing URLs and files.

The normal server response is a 404 if the URL does not resolve to an existing file.

By rewriting all of your "archives" URLs to your script, you are in effect saying, "all URLs of this format resolve to the existing file index.php."

If you rewrite URLs to a script, then it is up to that script to look in your database and determine whether the content associated with the requested URL exists or not. If it does not, then the script can and should output a 404-Not Found HTTP response header, a 410-Gone response, or any other response you choose.

> ...browser will send a 404 header response but you are still taken to index.php.

Your server sends this response; your browser only displays it.

Jim

war59312

7:36 am on Oct 29, 2007 (gmt 0)

10+ Year Member



OK, well, I am trying to use this with WordPress.

I figured out how to make index.php display an error when the requested content does not exist. But I'd rather have it point to the real 404 page instead. Any tips?

jdMorgan

11:44 am on Oct 29, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Then to re-phrase:

If you rewrite URLs to WordPress, then it is up to WordPress to look in your database and determine whether the content associated with the requested URL exists or not. If it does not, then WordPress can and should output a 404-Not Found HTTP response header, a 410-Gone response, or any other response you choose.

Once Apache enters the content-handler API phase (to run a script such as WordPress), the default Apache error-handling is no longer available, and the script becomes responsible for all error-handling.

You might want to look for a WordPress plugin that detects missing content and returns an error response code; having done so, it could then "include" your custom 404 error page as an in-line file.

Do not use redirection to accomplish this. If you do, the client will see the redirect response code instead of the 404 response code. Search engines will therefore not recognize that the originally-requested URL resolves to non-existent content. Check your work using the "Live HTTP Headers" extension to Firefox, or any other accurate server header checker.
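The same caution applies to Apache's own error handling for URLs that are not rewritten to a script. As a rough sketch, assuming your custom error page lives at the hypothetical path /errors/404.html: give ErrorDocument a local URL-path rather than a full http:// URL. With a full URL, Apache sends a redirect to the error page and the client never sees the original 404 status.

```apache
# Hypothetical path; adjust to wherever your 404 page actually lives.
# A local URL-path is served in place, so the 404 status is preserved.
ErrorDocument 404 /errors/404.html

# Avoid this form: a full URL makes Apache redirect the client,
# which replaces the 404 status with a redirect status.
# ErrorDocument 404 http://www.example.com/errors/404.html
```

This only covers errors that Apache itself detects. Once a request has been handed to the script, the script still has to send the status.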

Jim

war59312

6:03 am on Oct 30, 2007 (gmt 0)

10+ Year Member



All right, thanks, and yes, that is what I was doing and how I was 100% sure it was sending a 404 response.

I simply did not understand that the script does not "let go" when a 404 is received, but yes I see why not. Thanks for the explanation.

jdMorgan

1:00 pm on Oct 30, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> I simply did not understand that the script does not "let go" when a 404 is received

That's actually an excellent way of describing the misunderstanding, and probably easier to understand than my approach of trying to explain the various Apache API phases. So thanks for posting that; I'll try to remember that phrasing the next time this question comes up.

To further clarify, the reason you don't get Apache's own 404 handling is that as far as the Apache 'exists' checking is concerned, the URL does exist, because the URL is rewritten to "/index.php" and "/index.php" does exist.

The "year=y&monthnum=m&day=d" query string parameters (the "GET data") attached to that URL are not part of the URL path that Apache maps to a file, and are not meaningful to Apache in any way; they only have meaning to the script, and do not affect whether the script exists or not. Apache has no way to find out that a database lookup inside the script has failed because one or more of those values does not return a valid record, and so the script itself must handle all such error conditions.
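If you want Apache itself to reject values that can never be a date, one option is to tighten the pattern so those URLs are not rewritten at all. A rough sketch based on the rule from the first post, assuming no later catch-all rule (such as the usual WordPress rewrite to index.php) picks the request up instead:

```apache
# Months 1-12 and days 1-31 only, with an optional leading zero.
# /archives/2006/06/55/ no longer matches, so it is not rewritten
# and falls through to Apache's normal file check and 404 handling.
RewriteRule ^archives/([0-9]{4})/(0?[1-9]|1[0-2])/(0?[1-9]|[12][0-9]|3[01])/?$ /index.php?year=$1&monthnum=$2&day=$3 [QSA,L]
```

This filters only impossible numbers. A well-formed date with no posts, or something like 02/31, still reaches the script, which must return the 404 itself.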

Be sure to check the response code headers returned by your script as described above.

Jim

war59312

4:12 am on Nov 4, 2007 (gmt 0)

10+ Year Member



Thanks again for all your help.

I got it working great now. :)