Here's what I'd love for my site:
1. Redirecting 404s using an absolute path, i.e. [blah_blah...], so I can be sure none of M$'s lovely ideas keep my custom page from being shown
2. Preventing robots from indexing my no-longer-existing pages (even though requests for them will be redirected to my custom error page)
----
Would it be enough just by:
1. Placing the whole domain+page url in .htaccess
2. A robots.txt file disallowing the custom 404 page
?
Welcome to WebmasterWorld [webmasterworld.com]!
I'm not sure I understood your question, but I'll try to help.
Doing a 404 redirect with a canonical URL (that is, a full URL including the domain) will cause you problems, because the server status code returned for the missing page will be 302, not 404. ErrorDocuments must be specified as local documents to avoid this problem. See the note in the Apache ErrorDocument documentation [httpd.apache.org]. Also, a full URL may not help with the "friendly error message" problem in IE5 anyway. To avoid that, just make your 404 document longer than 512 bytes.
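To make the difference concrete, here is a minimal .htaccess sketch of the two forms. The path /errors/404.html and the host www.example.com are placeholders I made up for illustration, not anything from your site:

```apache
# Local path (hypothetical location): Apache serves this document internally,
# and the client still receives the original 404 status.
ErrorDocument 404 /errors/404.html

# Full URL (www.example.com is a placeholder): Apache sends the client a
# redirect to this address instead, so the 404 status is lost.
# Left commented out because this is the form to avoid.
# ErrorDocument 404 http://www.example.com/errors/404.html
```

Whichever local document you point to, keep it larger than 512 bytes so IE shows it instead of its own "friendly" page.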
I think you're saying you want to respond to search engine robot requests for non-existent files with a 404-Not Found response, but redirect other visitors to a script that will handle redirecting them. Is this right?
If so, you could use mod_rewrite to detect non-existent files (See RewriteCond -F and -U options) and then redirect regular browsers to your script, leaving the robots to be handled by the regular 404 or 410 handler. This method is not very efficient, since each HTTP request will result in a second subrequest to determine if the file exists. Also, detecting robots by user-agent will make the mod_rewrite code fairly large, and you will have to maintain it as new search engines or robots appear.
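A rough sketch of that idea follows, under several assumptions: the script name /redirector.php and its goto parameter are hypothetical, the robot list is illustrative and far from complete, and I have swapped in the plain filesystem tests -f and -d rather than the subrequest-based -F and -U, which avoids the extra subrequest but only checks for real files and directories on disk:

```apache
RewriteEngine On

# Leave known robots alone (illustrative, incomplete list) so they fall
# through to the normal 404 or 410 handling.
RewriteCond %{HTTP_USER_AGENT} !(googlebot|slurp|msnbot) [NC]

# Only act when the request maps to neither an existing file nor a directory.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d

# Send everyone else to a hypothetical handler script, passing the
# requested path along in the query string.
RewriteRule ^(.*)$ /redirector.php?goto=$1 [R=302,L]
```

The handler script itself must exist on disk, otherwise the rule would match its own target and loop.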
HTH,
Jim
Classic .htaccess:
ErrorDocument 404 /errors/error404.php?goto=some_page.php
(error404.php > 512 bytes)
Not-found pages return a 404 while showing the custom page. So far so good.
The following code does a JavaScript redirect and will hopefully help someone:
<script>setTimeout(function () { alert('Redirecting to $descripcion...'); document.location.replace('/$goto'); }, 7000);</script>
where goto comes from the query string as shown. (I redirect via .htaccess to the index.php inside the directory where the missing file was requested.)
I hope bots will ignore these non-existing pages, since I give back a 404.