Naughty Yahoo User Agents


GaryK - 5:26 pm on Jun 9, 2006 (gmt 0)


I'm amazed at how quickly Y! responded to my initial list. I'll paraphrase what I was told so as not to violate WW's TOS.

The Y! China team admits that the mp3spider does not yet observe robots.txt. In the meantime they will accept requests to be excluded from the crawl if you send an e-mail to cn-search-devel@yahoo-inc.com. They are also being pushed hard to start respecting robots.txt.

I've also been told that all user agents with Slurp in them respect robots.txt. If any of them do not, please post as much detail as possible so it can be investigated and corrected. I've been told that someone from Inktomi will be looking at the individual threads I referenced.

I'm told that "Yahoo! Slurp DE", "Yahoo! Slurp China" and "Yahoo! Slurp" do recognize distinct User-Agent rules if provided.

Apparently Yahoo! Slurp DE is the crawler for a (D)irectory (E)ngine service that crawls preferred content explicitly listed by Yahoo! Search content service partners.

Slurp DE will respect robots.txt rules for User-Agent: Slurp DE or User-Agent: Yahoo! Slurp DE. If neither of those user agents is listed, Slurp DE will obey User-Agent: Slurp.

Yahoo! Slurp China also obeys robots.txt rules for User-Agent: Slurp China or User-Agent: Yahoo! Slurp China. Again, if there is no explicit Slurp China rule it will follow the more generic User-Agent: Slurp rule.
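
For anyone who wants to sanity-check their own robots.txt against the fallback described above, here is a rough Python sketch. It is only my reading of what was reported, not Yahoo's code. The function names (parse_groups, select_group), the sample paths, and the order in which the two specific names are tried are all assumptions. The sketch stops at the generic Slurp rule because nothing above says what happens when that is missing too.

```python
def parse_groups(robots_txt):
    """Map each lower-cased User-Agent value to its list of Disallow paths."""
    groups = {}
    current_agents = []
    reading_rules = False
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if reading_rules:
                # A User-Agent line after rules starts a new group.
                current_agents = []
                reading_rules = False
            current_agents.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field == "disallow":
            reading_rules = True
            if value:  # an empty Disallow means "nothing is disallowed"
                for agent in current_agents:
                    groups[agent].append(value)
    return groups


def select_group(crawler_names, groups):
    """Return the first name (most specific first) that has its own group."""
    for name in crawler_names:
        if name.lower() in groups:
            return name.lower()
    return None


# Hypothetical robots.txt: a China-specific rule plus the generic Slurp rule.
ROBOTS_TXT = """\
User-Agent: Slurp China
Disallow: /music/

User-Agent: Slurp
Disallow: /private/
"""

# Names each crawler is reported to honor, most specific first (order assumed).
SLURP_DE = ["Yahoo! Slurp DE", "Slurp DE", "Slurp"]
SLURP_CHINA = ["Yahoo! Slurp China", "Slurp China", "Slurp"]

groups = parse_groups(ROBOTS_TXT)
print(select_group(SLURP_CHINA, groups))  # slurp china
print(select_group(SLURP_DE, groups))     # slurp
```

With this sample file, Slurp China picks up its own rule while Slurp DE, which has no explicit entry, drops back to the generic Slurp rule.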

If the above is not the case please post as many details as you can about the offense so it can be investigated and corrected.
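
If you want to gather those details from your logs, something like the following Python sketch might help. It assumes the Apache "combined" log format and a simple prefix match on disallowed paths. The sample log lines, IP addresses, and user-agent strings are made up for illustration, and find_violations is a hypothetical helper name.

```python
import re

# Apache "combined" log format is an assumption; adjust for other formats.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def find_violations(log_lines, disallowed_prefixes, agent_token="Slurp"):
    """Return records where an agent containing agent_token fetched a disallowed path."""
    hits = []
    for line in log_lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that do not fit the assumed format
        record = match.groupdict()
        if agent_token.lower() not in record["agent"].lower():
            continue
        if any(record["path"].startswith(p) for p in disallowed_prefixes):
            hits.append(record)
    return hits


# Made-up sample lines; the user-agent strings are illustrative only.
SAMPLE_LOG = [
    '192.0.2.10 - - [09/Jun/2006:17:20:01 +0000] "GET /private/a.html HTTP/1.0" '
    '200 512 "-" "ExampleUA (compatible; Yahoo! Slurp China)"',
    '192.0.2.11 - - [09/Jun/2006:17:21:15 +0000] "GET /public/b.html HTTP/1.0" '
    '200 734 "-" "ExampleUA (compatible; Yahoo! Slurp)"',
    '192.0.2.12 - - [09/Jun/2006:17:22:40 +0000] "GET /private/c.html HTTP/1.1" '
    '200 901 "-" "ExampleBrowser/1.0"',
]

hits = find_violations(SAMPLE_LOG, ["/private/"])
for hit in hits:
    print(hit["time"], hit["ip"], hit["path"], hit["agent"])
```

Only the first sample line is reported: the second request is to an allowed path, and the third does not come from a Slurp user agent. The timestamp, IP, path, and full user-agent string are the kind of detail worth posting.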

I was pleasantly surprised to see them admit that with Y! growing so fast it's hard to maintain consistent central control over every division of the company. They're very much aware of this problem and are working hard to correct it.

I know it might sound like they're trying to placate us, but having worked in a large corporation for 20 years (I'm now retired) I can tell you it's often hard to coordinate policies and enforcement across all departments.

A directive might come down from higher up, but it's up to each department to implement the directive as they understand it.

Ideally there's someone in charge of oversight, but it often takes complaints about lack of adherence to the directive before anything is done about it.

Now that Y! is aware of these problems from a group of experienced webmasters I hope they'll do something about it.

And frankly, all we can do is hope this will be the case. IMO it's good that Y! is willing to listen to our complaints and appears to be attempting to do something to address them.

I hope someone from Y! or Inktomi will show up here and try to work with us. That might be unrealistic, but a guy can hope, can't he? :)

Finally, for now at least, and in the interest of full disclosure, my contact at Y! Engineering has offered to send me a Y! shirt. I think that's very nice of him and I accepted the offer.


Thread source: http://www.webmasterworld.com/search_engine_spiders/3276.htm
Brought to you by WebmasterWorld: http://www.webmasterworld.com