404 Error page for specific IP or Agent

krack

6:28 pm on Feb 22, 2006 (gmt 0)

10+ Year Member



I'm having problems creating an .htaccess file that will take a specific agent or IP and give it a "legit" 404 in the header info -- is that something that is possible?

So far I'm stuck on:

Options all
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^msnbot/.* [NC,OR]
RewriteCond %{REMOTE_ADDR} ^72\.14\.192\.[8-9]$
RewriteRule?

I'm a bit confused about how to put the last part together so that msnbot gets a "proper" 404 reply for any page it requests.

Thanks in advance.

jdMorgan

5:24 am on Feb 23, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You can force a 404 for all URL requests by adding

RewriteRule .* /some_file_that_does_not_exist.lmth [L]

after your RewriteConds.
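
Putting that together with your conditions, the whole block would look something like this (a sketch reusing your msnbot and IP patterns as-is; adjust them to whatever you actually want to match):

Options +FollowSymLinks
RewriteEngine On
# Match msnbot by User-Agent, or requests from 72.14.192.8 / 72.14.192.9
RewriteCond %{HTTP_USER_AGENT} ^msnbot/.* [NC,OR]
RewriteCond %{REMOTE_ADDR} ^72\.14\.192\.[8-9]$
# Rewrite every matching request to a URL that doesn't exist, so the server answers 404
RewriteRule .* /some_file_that_does_not_exist.lmth [L]

Note that mod_rewrite in .htaccess needs FollowSymLinks enabled; your "Options all" provides that too, but +FollowSymLinks is all it actually requires.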

Alternatively, you can return a 410-Gone status for HTTP/1.1 clients using


RewriteRule .* - [G]
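
In the block above you'd just swap the last line for that rule; [G] implies [L], so nothing else changes. Either way, you can check what status a given agent actually receives by faking its User-Agent, e.g. with curl (www.example.com standing in for your own host):

curl -I -A "msnbot/1.0 (+http://search.msn.com/msnbot.htm)" http://www.example.com/anypage.html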

Jim

krack

7:50 pm on Feb 23, 2006 (gmt 0)

10+ Year Member



Just so I understand correctly -- if "cherrypicker" is looking for a number of different pages and I do as suggested, then ANY and ALL pages will show 404 or 410?
Is that correct? That is what I'm looking to do.
Any page that linkwalker or cherrypicker requests will produce a "valid" 404 response, so that they will take it out of their index.

Thanks Jim,
Krack

jdMorgan

8:14 pm on Feb 23, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Yes, it will 404 all requests, subject to your additional RewriteCond restrictions. However, I'd suggest that you exclude robots.txt and any custom error page URLs that you may have. Example:

RewriteCond %{REQUEST_URI} !^/(robots\.txt|my_404\.html)$

robots.txt should always be fetchable -- it's just a Web convention that if you're going to block a robot, you should tell it so. And if you block access to the custom error page needed to handle an error, the server will go into a loop: trying to handle the error generates another error, trying to handle that one generates yet another, and so on.
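
So a complete .htaccess block with that exclusion in place might look like this (a sketch -- my_404.html stands in for whatever file your ErrorDocument actually points at, and the agent names are just the ones mentioned in this thread):

Options +FollowSymLinks
RewriteEngine On
# Never block robots.txt or the custom error page itself
RewriteCond %{REQUEST_URI} !^/(robots\.txt|my_404\.html)$
# Then target the unwanted agents (or the IP range)
RewriteCond %{HTTP_USER_AGENT} ^(msnbot|linkwalker|cherrypicker) [NC,OR]
RewriteCond %{REMOTE_ADDR} ^72\.14\.192\.[8-9]$
RewriteRule .* - [G]

The conditions combine as "not an excluded URL" AND ("unwanted agent" OR "unwanted address"), which is what you want here.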

Jim