Forum Moderators: phranque


Redirecting bad urls to the homepage

using mod_rewrite and php

         

AnonyMouse

3:02 pm on Apr 8, 2005 (gmt 0)

10+ Year Member



I'm using mod_rewrite and php on my site, with the php parsing urls. Works great for valid urls (i.e. urls for which I have corresponding data) but for invalid urls (e.g. generated by crappy spiders or user entry) it results in errors all over the page.

What I would like to do, in the case of an invalid url, is:
1. Return a 404 status e.g. header("HTTP/1.0 404 Not Found");
2. Redirect the browser to the homepage
3. Stop any further processing!

My understanding is that I can send a header("Location: [example.com...] but this results in a 302 status. I'm fairly sure I don't want a 302 status (or a 301 for that matter), as the page being requested never existed - so it should be a 404 error. I'm also a little wary of Google then seeing my homepage content under multiple pages and getting hit for duplicate content!

Any tips on how to redirect but keep the 404 status, or whether I should just use a 301?
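
For reference, the 302 mentioned above is PHP's default: sending a Location header sets the status to 302 unless a 3xx (or 201) status has already been set, and browsers only follow Location on 3xx responses, so a redirect and a 404 cannot be combined in one response. A minimal sketch of how the redirect status could be chosen explicitly, assuming a reasonably modern PHP; the helper names redirect_status and send_redirect are made up for illustration:

```php
<?php
// Hypothetical helper (not from the thread): pick the status sent with a Location header.
function redirect_status(int $requested): int
{
    // Anything outside 3xx falls back to 302, PHP's default for Location.
    return ($requested >= 300 && $requested <= 399) ? $requested : 302;
}

// Hypothetical helper: redirect and stop. The third argument of header()
// sets the response code explicitly.
function send_redirect(string $url, int $status = 302): void
{
    header('Location: ' . $url, true, redirect_status($status));
    exit; // stop any further processing
}
```

Calling send_redirect('/', 301) would issue a permanent redirect to the homepage; asking for 404 here silently becomes 302, which is why the redirect route cannot deliver a "not found" status.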

TheDoctor

3:18 pm on Apr 8, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Don't redirect if the URL doesn't exist. People will think they've made a mistake, or you've made an error, and keep trying to reinput the original URL. When they accept they can't get there, they'll go away in frustration.

Better to have a custom 404 page that

  1. explains that the URL doesn't exist
  2. displays the site map so that the visitor can find out where they ought to go to find their information (preferably in one click)

AnonyMouse

4:02 pm on Apr 8, 2005 (gmt 0)

10+ Year Member



Good point!

Okay, in which case, how do I redirect the browser to this custom 404 page? For some reason, just using

header("HTTP/1.0 404 Not Found");

doesn't seem to actually send the browser to any 404 page at the moment!

I'd also better check if I can have a custom 404 page...though I guess if I can redirect, then it can be to a page of my own choosing?

TheDoctor

9:34 am on Apr 9, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You need to have the following in your .htaccess file:


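ErrorDocument 404 /errorpages/404.html

The path should be the local URL-path of your 404 error page. Use a local path rather than a full http:// URL: with a full URL, Apache sends the browser a redirect to that page, so the client never sees the 404 status. Make sure you think seriously about the design of this page. From what you say, it could be important to you.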

You should probably also have the following in your .htaccess file, in order to catch other standard errors:


ErrorDocument 401 /errorpages/401.html
ErrorDocument 403 /errorpages/403.html
ErrorDocument 500 /errorpages/500.html

AnonyMouse

1:46 pm on Apr 9, 2005 (gmt 0)

10+ Year Member



Thanks Doc (sorry, had to say that!), I'll give that a try, sounds like the solution I'm looking for. Yes, the error page certainly needs thought - most likely a search page plus a "page not found" message. Thx again.

AnonyMouse

10:59 am on Apr 11, 2005 (gmt 0)

10+ Year Member



Hmmm, still haven't cracked this - how can I force the server to throw a 404 error? At the moment, the logic is:

if $path exists
then generate page
else throw 404 error

However, using the php header function - header("HTTP/1.0 404 Not Found") - for that last part doesn't appear to force a 404 error.

Reading the php site on this topic, it might be interpreted as saying that this directive would appear *ON* the 404 page to indicate that it is the 404 error page, rather than being used to force a 404 error in the first place. But that interpretation is open to question, and I'm not clear...

So, I'm back to the original question - how do I force a 404 error, using php?!?

AnonyMouse

11:34 am on Apr 11, 2005 (gmt 0)

10+ Year Member



Thought I'd better reply to my own post, having found the answer!

By using PHP to send the header, I'm in fact bypassing Apache, which means I have to generate the page within the php itself. I.e.

if $path exists
then generate page
else generate 404 errorpage

Using the header() function to indicate that it is a 404 page, of course!
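
The logic above could be sketched roughly as follows. This is only an illustration under assumptions, not the poster's actual code: the $pages array, the resolve_page function and the page text are hypothetical, and http_response_code() (available in newer PHP versions) stands in for the header("HTTP/1.0 404 Not Found") call used in the thread.

```php
<?php
// Hypothetical data: the paths this site can actually serve.
$pages = [
    '/'        => 'Welcome to the homepage.',
    '/widgets' => 'All about widgets.',
];

/**
 * Decide what to send for a request path.
 * Returns [status code, HTML body].
 */
function resolve_page(string $path, array $pages): array
{
    if (array_key_exists($path, $pages)) {
        return [200, '<h1>' . htmlspecialchars($pages[$path]) . '</h1>'];
    }
    // Unknown path: the script builds the error page itself, because
    // Apache's ErrorDocument is not consulted once PHP handles the request.
    $body = '<h1>Page not found</h1>'
          . '<p>No page exists at ' . htmlspecialchars($path) . '.</p>'
          . '<p><a href="/">Home</a></p>';
    return [404, $body];
}

// Only emit a response under a web server, so the function above
// can also be exercised from the command line.
if (PHP_SAPI !== 'cli') {
    $path = parse_url($_SERVER['REQUEST_URI'] ?? '/', PHP_URL_PATH) ?: '/';
    [$status, $body] = resolve_page($path, $pages);
    http_response_code($status); // must be called before any output
    echo $body;
    exit;                        // stop any further processing
}
```

Keeping the decision in one function that returns both the status and the body means the status is always set before anything is echoed, which avoids the half-rendered page full of errors described at the start of the thread.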