ErrorDocument 404 http://www.example.co.uk
Options +FollowSymLinks
RewriteEngine on
RewriteCond %{HTTP_HOST} ^example.co.uk [NC]
RewriteRule ^(.*)$ http://www.example.co.uk/$1 [L,R=301]
The above is in my .htaccess file, which means that:
http://www.example.co.uk/ghdit
and
http://www.example.co.uk/nosuchpage.html
send out a 404 and go to the main page but:
http://www.example.co.uk/?page3&id=5
doesn't redirect, it puts out a 200 server header and displays the information from http://www.example.co.uk
How do you get that URL type to output a 404 and redirect? For some reason Yahoo has indexed that and many other similar URLs into their database even though they don't exist.
Thanks
Mike
Also, the syntax of your ErrorDocument directive is incorrect, and the result will be that even for the "404s" that appear to be working now, the actual response will be a 302 redirect. This is a dangerous problem with respect to search engine listings. The correct syntax is:
ErrorDocument 404 /
If you throw people to the home page with no explanation, they'll likely try their bad link or bookmark several more times, get frustrated, and leave.
Let us know whether your site uses query strings or not, as discussed above.
Jim
[edited by: jdMorgan at 2:50 pm (utc) on June 29, 2007]
"doesn't redirect, it puts out a 200 server header and displays the information from http://www.example.co.uk"
Well naturally, why wouldn't it, when you told the server to return your homepage as the custom error document?
That doesn't make sense and that is obviously not what is happening. Every other error page gives out a 404 except for ones that have co.uk/?whatever as the wrong URL.
If I change the error document to www.example.co.uk/error.html or whatever it still displays the main page under the url http://www.example.co.uk/?page3&id=5
The choice of error document isn't the issue as far as I can tell. The htaccess file does not see http://www.example.co.uk/?page3&id=5 (something which does not exist, with parameters we don't use) as an error. It sees everything else as an error, just not ?parametersthatdon'tmatchanything
Here is.
Everything after the question mark is considered a query string from a GET operation. (i.e., all that fun junk that gets sent from a <form method="GET"> element). So what exactly does this mean?
Your webserver sees this as a call for http://www.example.co.uk/, and then tries to feed it the query string as input values. Since your index page is available, you're going to get the 200, and the index page silently ignores all those form parameters.
Simply put, you cannot force a 404 to happen, because this is a legal (albeit obnoxious) call to http://www.example.co.uk/
The syntax of the ErrorDocument directive in the initial post above is incorrect, and will return a 302-Found server response code, as can be confirmed with any decent server header checker such as the Live HTTP Headers extension for Firefox/Mozilla browsers, and as clearly described in the Apache ErrorDocument documentation.
Unwelcome query strings can either be removed with a 301-Permanent redirect to the correct URL, or a 404 can be forced by rewriting those requests to a non-existent *filepath*. But without knowing the answer to the critical question I asked above (whether your site uses query strings at all), using either method is potentially quite dangerous to search engine listings.
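As a rough sketch of those two options, assuming the site really uses no query strings anywhere (if any script does, these rules would break it). Option A is active, Option B is commented out, and only one should be enabled. The path /no-such-file-for-404 is a hypothetical placeholder and must not exist on the server.

```apache
RewriteEngine on

# Option A: strip any query string with a 301 redirect to the clean URL.
# The condition matches only when the query string is non-empty.
RewriteCond %{QUERY_STRING} .
# The trailing "?" on the substitution discards the original query string.
RewriteRule ^(.*)$ http://www.example.co.uk/$1? [R=301,L]

# Option B (alternative to A): force a 404 for the home page when it is
# requested with a query string, by rewriting to a path that does not exist.
# RewriteCond %{QUERY_STRING} .
# RewriteRule ^$ /no-such-file-for-404 [L]
```

Option A keeps visitors and search engines on a working page. Option B tells them the URL is gone, which only shows up as a real 404 once the ErrorDocument directive points at a local path.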
Jim
Regardless, the question as to whether query strings are in use does stand open. My point is that it'd be fairly bad practice to try to rewrite these if you can't tell whether they are being fed to whatever you've specified in your DirectoryIndex directive as input parameters. You're likely to break your application, which would be more detrimental than Yahoo! creating fictitious index entries (of which this is the first I've heard without someone else SEO-bombing you) that point to your DirectoryIndex.
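To get the proper 404 response, the ErrorDocument directive must specify only the local URL-path of the error document to show the user (starting with a slash), not a full http:// URL. A full URL makes the server send a redirect to the client instead of the 404.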
Start with fixing that before you move on to more advanced topics.
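For reference, a minimal corrected version of the file from the first post might look like the following. This is a sketch only: /error.html is an assumed filename for a dedicated error page that would need to exist in the document root, and the escaped dots in the hostname pattern are a small tightening that the thread itself does not discuss.

```apache
# Local URL-path: the server delivers this page itself and keeps the 404 status.
# /error.html is a hypothetical filename, not something from the original post.
ErrorDocument 404 /error.html

Options +FollowSymLinks
RewriteEngine on

# Redirect the bare domain to the www hostname.
# Dots are escaped so they match literal dots rather than any character.
RewriteCond %{HTTP_HOST} ^example\.co\.uk [NC]
RewriteRule ^(.*)$ http://www.example.co.uk/$1 [L,R=301]
```

With this in place, a header checker should show 404 (not 302) for http://www.example.co.uk/nosuchpage.html, while http://www.example.co.uk/?page3&id=5 will still return 200 until the query-string question is settled.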