htaccess 404 not working

         

internetheaven

12:12 pm on Jun 29, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I have:

ErrorDocument 404 http://www.example.co.uk
Options +FollowSymLinks
RewriteEngine on
RewriteCond %{HTTP_HOST} ^example.co.uk [NC]
RewriteRule ^(.*)$ http://www.example.co.uk/$1 [L,R=301]

in my htaccess file which means that:

http://www.example.co.uk/ghdit
and
http://www.example.co.uk/nosuchpage.html

send out a 404 and go to the main page but:

http://www.example.co.uk/?page3&id=5

doesn't redirect, it puts out a 200 server header and displays the information from http://www.example.co.uk

How do you get that URL type to output a 404 and redirect? For some reason Yahoo has indexed that and many other similar URLs into their database even though they don't exist.

Thanks
Mike

jdMorgan

2:49 pm on Jun 29, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



To be clear: Do you use query strings on any pages on this site? If you do, we'll need to know how to tell a "good query string" from a "bad query string."

Also, the syntax of your ErrorDocument directive is incorrect, and the result will be that even for the "404s" that appear to be working now, the actual response will be a 302 redirect. This is a dangerous problem with respect to search engine listings. The correct syntax is:


ErrorDocument 404 /

Furthermore, it's not a good idea to just "throw" requests for missing pages to your index page. A better practice is to create a real 404 error page, explaining that the requested page is missing or has been intentionally removed, and offering links to major sections of your site, to a site map, and to the home page. A meta-refresh after ten to twenty seconds can be used to allow people to read the page, but deliver them to the home page if they are confused and take no action.

If you throw people to the home page with no explanation, they'll likely try their bad link or bookmark several more times, get frustrated, and leave.

Let us know whether your site uses query strings, as discussed above.

Jim

[edited by: jdMorgan at 2:50 pm (utc) on June 29, 2007]

Marcia

3:17 pm on Jun 29, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"doesn't redirect, it puts out a 200 server header and displays the information from http://www.example.co.uk"

Well naturally, why wouldn't it, when you told the server to return your homepage as the custom error document? Do you really, really, REALLY want your site's homepage to return a 404?

internetheaven

6:41 pm on Jun 29, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"doesn't redirect, it puts out a 200 server header and displays the information from http://www.example.co.uk"
"Well naturally, why wouldn't it, when you told the server to return your homepage as the custom error document?"

That doesn't make sense and that is obviously not what is happening. Every other error page gives out a 404 except for ones that have co.uk/?whatever as the wrong URL.

If I change the error document to www.example.co.uk/error.html or whatever, it still displays the main page under the URL http://www.example.co.uk/?page3&id=5

The choice of error document isn't the issue as far as I can tell. The htaccess file does not see http://www.example.co.uk/?page3&id=5 (something which does not exist, with parameters we don't use) as an error. It sees everything else as an error, just not ?parametersthatdon'tmatchanything

Rurne

9:21 pm on Jul 18, 2007 (gmt 0)

10+ Year Member



You're right. That's not what's happening here.

Here's what is.

Everything after the question mark is considered a query string from a GET operation. (i.e., all that fun junk that gets sent from a <form method="GET"> element). So what exactly does this mean?

Your web server sees this as a request for http://www.example.co.uk/ and passes the query string to that page as input values. Since your index page exists, the server returns a 200, and the page silently ignores all those parameters.

Simply put, you cannot force a 404 to happen, because this is a legal (albeit obnoxious) call to http://www.example.co.uk/

jdMorgan

11:28 pm on Jul 18, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Well actually you *can* force a 404, but since I've been told I don't know what I'm talking about, I probably shouldn't say how.

The syntax of the ErrorDocument directive in the initial post above is incorrect, and will return a 302-Found server response code, as can be confirmed with any decent server error checker such as the Live HTTP Headers extension for Firefox/Mozilla browsers, and as clearly described in the Apache ErrorDocument documentation.

Unwelcome query strings can either be removed with a 301-Permanent redirect to the correct URL, or a 404 can be forced by rewriting those requests to a non-existent *filepath*. But without knowing the answer to the critical question I asked above, using either method is potentially quite dangerous to search engine listings.
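
As a rough sketch of those two approaches, and only under the assumption that no legitimate URL on the site uses a query string on the home page (the open question in this thread), the .htaccess rules might look something like this. The path "/no-such-page-404" is a made-up name for illustration, not anything from the original configuration.

```apache
RewriteEngine on

# Assumption: the home page never takes a legitimate query string.
# If some query strings are valid, add RewriteCond lines to exclude them first.

# Option 1: strip the query string with a 301 redirect to the bare home page.
# The trailing "?" in the substitution discards the original query string.
RewriteCond %{QUERY_STRING} .
RewriteRule ^$ http://www.example.co.uk/? [R=301,L]

# Option 2 (use instead of option 1, not alongside it): force a 404 by
# rewriting the request to a path that does not exist.
# "/no-such-page-404" is a hypothetical name; any path with no file behind it will do.
#RewriteCond %{QUERY_STRING} .
#RewriteRule ^$ /no-such-page-404? [L]
```

In .htaccess context the pattern ^$ matches only a request for the site root, so other pages are left alone. Either option should be checked with a header checker before relying on it.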

Jim

Rurne

4:47 am on Jul 19, 2007 (gmt 0)

10+ Year Member



Interesting perspective you have there, jd. I fail to see any comment as a direct reply to yours stating such.

Regardless, the question as to whether query strings are in use does stand open. My point is that it'd be fairly bad practice to try to rewrite these if you can't tell that these are being fed to whatever you've specified in your DirectoryIndex directive as input parameters. You're likely to break your application which would be more detrimental than Yahoo! creating fictitious indexing (of which this is the first I've heard without someone else SEO-bombing you) that points to your DirectoryIndex.

g1smd

12:01 pm on Jul 19, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



An ErrorDocument directive that specifies a full URL, including the domain, always returns a 302 code, not the expected 404.

To get the proper 404 response, the ErrorDocument directive must specify only a local URL-path to the error document, starting with a slash, with no protocol or domain.

Start with fixing that before you move on to more advanced topics.
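
For illustration, the difference between the two forms looks roughly like this. The file name /error.html is borrowed from the example mentioned earlier in the thread and is only an assumption about what the error page would be called.

```apache
# Local URL-path (starts with a slash): Apache serves this document
# itself and keeps the 404 status on the response.
ErrorDocument 404 /error.html

# Full URL: Apache answers with a redirect to that address instead,
# so the client never sees a 404. Shown commented out, for contrast only.
#ErrorDocument 404 http://www.example.co.uk/error.html
```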