Yahoo! Crawlers - A response from Yahoo! Search


Mokita - 6:17 am on Jul 18, 2006 (gmt 0)


The Yahoo! China Slurp was indeed following 'Slurp' user-agent rules in preference to 'Slurp China' rules. The Yahoo! China team have corrected that, so China Slurp will now observe its own specific rules instead of Slurp rules.

Yahoo! Slurp China has just violated our robots.txt, which contains both of the following entries (to be sure one of them will work!):

User-agent: Yahoo! Slurp China
User-agent: Slurp China
Disallow: /
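
For anyone wanting to sanity-check a record like the one above, here is a minimal Python sketch of how a crawler might choose which robots.txt record applies to it. It assumes the crawler picks the record whose User-agent value is the longest case-insensitive substring of its own name, which is one common convention and not Yahoo!'s confirmed matching logic. The function names and the second robots.txt (MIXED_EXAMPLE) are hypothetical and exist only for illustration.

```python
# Sketch only: the "longest matching User-agent value wins" rule is an assumption,
# not Yahoo!'s documented behaviour.

ROBOTS_FROM_POST = """\
User-agent: Yahoo! Slurp China
User-agent: Slurp China
Disallow: /
"""

# Hypothetical file with separate rules for Slurp and Slurp China.
MIXED_EXAMPLE = """\
User-agent: Slurp
Disallow: /private/

User-agent: Slurp China
Disallow: /
"""


def parse_robots(text):
    """Split robots.txt text into records of (user-agent names, rules)."""
    records, agents, rules = [], [], []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if rules:  # a User-agent line after rules starts a new record
                records.append((agents, rules))
                agents, rules = [], []
            agents.append(value.lower())
        elif field in ("allow", "disallow") and agents:
            rules.append((field, value))
    if agents:
        records.append((agents, rules))
    return records


def select_rules(records, crawler_name):
    """Return the rules of the most specific record matching crawler_name."""
    name = crawler_name.lower()
    best, best_len = None, -1
    for agents, rules in records:
        for agent in agents:
            if agent == "*":
                length = 0  # wildcard is the least specific match
            elif agent in name:
                length = len(agent)
            else:
                continue
            if length > best_len:
                best, best_len = rules, length
    return best


def is_allowed(text, crawler_name, path):
    """True if crawler_name may fetch path under the selected record."""
    rules = select_rules(parse_robots(text), crawler_name)
    if rules is None:
        return True  # no record applies, so nothing is disallowed
    verdict, longest = True, -1
    for field, value in rules:
        # The longest matching path prefix decides; an empty value matches nothing.
        if value and path.startswith(value) and len(value) > longest:
            verdict, longest = (field == "allow"), len(value)
    return verdict


print(is_allowed(ROBOTS_FROM_POST, "Yahoo! Slurp China", "/"))
print(is_allowed(MIXED_EXAMPLE, "Yahoo! Slurp", "/index.html"))
```

Under this assumed rule, a crawler that identifies itself as "Yahoo! Slurp China" is blocked from / by either User-agent line in the record above, while a record written for plain "Slurp" would apply to it only when no more specific record exists.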

Logged:

lj910157.inktomisearch.com - - [18/Jul/2006:14:08:13 +1000] "GET /robots.txt HTTP/1.0" 200 1815 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp China; [misc.yahoo.com.cn...]
lj910193.inktomisearch.com - - [18/Jul/2006:14:08:23 +1000] "GET / HTTP/1.0" 403 - "-" "Mozilla/5.0 (compatible; Yahoo! Slurp China; [misc.yahoo.com.cn...]
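
The two entries above can also be checked programmatically. The following Python sketch parses access-log lines in the format shown and lists requests a crawler made after it had already fetched /robots.txt. The regular expression, the function names and the field names are my own assumptions about this log layout, and the user-agent field is read to the end of the line because the entries above are truncated. Matching is done on the user-agent string, since the two requests came from different inktomisearch.com hosts.

```python
import re
from datetime import datetime

# The two log lines from the post, with the truncated user-agent left as-is.
LOG_LINES = [
    'lj910157.inktomisearch.com - - [18/Jul/2006:14:08:13 +1000] "GET /robots.txt HTTP/1.0" 200 1815 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp China; [misc.yahoo.com.cn...]',
    'lj910193.inktomisearch.com - - [18/Jul/2006:14:08:23 +1000] "GET / HTTP/1.0" 403 - "-" "Mozilla/5.0 (compatible; Yahoo! Slurp China; [misc.yahoo.com.cn...]',
]

# Assumed layout: host, two ignored fields, [time], "request", status, size,
# "referer", then the user-agent running to the end of the line.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referer>[^"]*)" "(?P<agent>.*)'
)


def parse_line(line):
    """Parse one log line into a dict, or return None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["agent"] = entry["agent"].rstrip('"')  # closing quote, if present
    return entry


def requests_after_robots(entries, agent_substring):
    """Requests by a matching agent made after it fetched /robots.txt."""
    seen_robots, hits = False, []
    for entry in entries:
        if agent_substring not in entry["agent"]:
            continue
        if entry["path"] == "/robots.txt":
            seen_robots = True
        elif seen_robots:
            hits.append(entry)
    return hits


entries = [e for e in (parse_line(line) for line in LOG_LINES) if e is not None]
for hit in requests_after_robots(entries, "Slurp China"):
    print(hit["host"], hit["time"].isoformat(), hit["path"], hit["status"])
```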

Luckily I still had mod_rewrite blocking it.

Yahoo_Mike: Would you please confirm the correct syntax for blocking Yahoo! Slurp China via robots.txt? Thanks.

--
P.S. I wrote to Yahoo! via the help form on their site more than two weeks ago, asking the same thing. I haven't had any reply, not even an automated one.


Thread source: http://www.webmasterworld.com/search_engine_spiders/3006509.htm
Brought to you by WebmasterWorld: http://www.webmasterworld.com