Mokita - 12:27 pm on Sep 11, 2011 (gmt 0)


No, I haven't seen the behaviour you describe (as yet) on any of the sites under our control.

However, not long ago I couldn't understand why Yahoo were suddenly spidering the Images folder on one site, even though they had been banned from it (via robots.txt) since day 1 (as they are on all our sites).

When I looked into it more closely, I found that the robots.txt on that one site had somehow become scrambled: all the rules had ended up on one continuous line. Since robots.txt is read one directive per line, that rendered it absolutely useless.
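To illustrate why a collapsed file stops protecting anything, here is a minimal sketch using Python's standard-library robots.txt parser. The file contents, the /images/ path and the helper name `blocks` are assumptions made up for the example (the real robots.txt isn't shown in this thread), and Python's parser is only a stand-in for however Yahoo's crawler actually reads the file.

```python
from urllib.robotparser import RobotFileParser


def blocks(robots_txt: str, agent: str, path: str) -> bool:
    """Return True if this robots.txt text disallows `path` for `agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, path)


# Hypothetical well-formed file: one directive per line.
GOOD = "User-agent: Slurp\nDisallow: /images/\n"

# The same directives run together on one continuous line (assumed form of the scrambling).
SCRAMBLED = " ".join(GOOD.splitlines())

if __name__ == "__main__":
    print("well-formed file blocks Slurp:", blocks(GOOD, "Slurp", "/images/logo.png"))
    print("scrambled file blocks Slurp:", blocks(SCRAMBLED, "Slurp", "/images/logo.png"))
```

With the directives on one line, the parser sees a single User-agent value with no Disallow rule attached, so nothing is blocked. Running a check like this against the live file now and then would catch the problem early.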

That experience taught me to have a backup method on all sites, via an .htaccess file placed in the images folder itself:

# Return 403 Forbidden to the listed crawlers for files with these extensions.
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} (bing|googlebot|msn|slurp|Yahoo) [NC]
RewriteRule .*\.(gif|jpg|jpeg|pdf|png|swf)$ - [F]

Add/subtract User-Agents according to your wishes.
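Before changing the User-Agent list, it can help to try the patterns offline. The following is a rough sketch in Python that mirrors the two regular expressions from the directives above; the function name `is_forbidden` and the sample User-Agent strings are illustrative assumptions, and it only approximates the matching, it doesn't run Apache.

```python
import re

# Same alternation as the RewriteCond; its [NC] flag maps to re.IGNORECASE.
BOT_PATTERN = re.compile(r"(bing|googlebot|msn|slurp|Yahoo)", re.IGNORECASE)

# Same pattern as the RewriteRule; it has no [NC], so extensions are case-sensitive.
FILE_PATTERN = re.compile(r".*\.(gif|jpg|jpeg|pdf|png|swf)$")


def is_forbidden(user_agent: str, path: str) -> bool:
    """Return True when both patterns match, i.e. the request would get a 403."""
    return bool(BOT_PATTERN.search(user_agent)) and bool(FILE_PATTERN.search(path))


if __name__ == "__main__":
    slurp = "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
    browser = "Mozilla/5.0 (Windows NT 6.1) Firefox/6.0"
    for agent, path in [(slurp, "logo.png"), (browser, "logo.png"), (slurp, "LOGO.PNG")]:
        print(path, "|", agent[:30], "|", is_forbidden(agent, path))
```

One thing this makes visible: because only the condition carries [NC], a file with an upper-case extension such as LOGO.PNG is not covered by the rule as written.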


Thread source: http://www.webmasterworld.com/search_engine_spiders/4360952.htm
Brought to you by WebmasterWorld: http://www.webmasterworld.com