First visit, went straight to a sub-dir file:
rlx-2-2-10.labs.corp.yahoo.com - - [06/Apr/2006:19:43:32 -0700]
"GET /dir/file.html HTTP/1.1" [...] "-" "Yahoo! Mindset"
More info about this here [mindset.research.yahoo.com] (Yahoo) and here [askdavetaylor.com] (Ask Dave Taylor).
Alas, Yet Another Yahoo bot/crawler/spider/whatever that doesn't ask for robots.txt.
Ban-worthy in my book.
And "Yahoo! Mindset" still doesn't ask for robots.txt:
q02.yrl.dcn.yahoo.com - - [09/Apr/2006:15:24:31 -0700]
"GET / HTTP/1.1" [...] "-" "Yahoo! Mindset"
q02.yrl.dcn.yahoo.com - - [09/Apr/2006:15:24:31 -0700]
"GET / HTTP/1.1" [...] "-" "Yahoo! Mindset"
(Two separate, completely identical hits.)
Also using "Yahoo! Mindset" and also not asking:
rlx-2-2-10.labs.corp.yahoo.com (see my initial post, above)
rlx-2-2-2.labs.corp.yahoo.com
Besides 'regular' Slurp (as opposed to 'China' Slurp), Yahoo has waaaay too many UAs (and inktomisearch, yahoo, and who-knows-what-all domains) for me to even begin tracking whether some other Y! UA that does retrieve robots.txt might be sharing the info with this new one.
IMHO, Yahoo knows how to 'do' robots.txt. They're just of a Mindset not to.
: )
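
For anyone who wants to run the same check on their own logs, here is a rough Python sketch that groups hits by user agent and lists the UAs that never requested /robots.txt. It assumes the Apache "combined" log format, which the excerpts above appear to follow with the middle fields elided. The status codes and byte counts in the sample lines are placeholders, and "ExampleBot/1.0" on crawler.example.com is a made-up well-behaved bot added for contrast; none of those values come from real logs. The function names are my own.

```python
import re
from collections import defaultdict

# Apache "combined" log format (assumed; the posted excerpts elide status/bytes).
LOG_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

def robots_report(lines):
    """Map each user agent to its hit count, hosts, and whether it asked for robots.txt."""
    report = defaultdict(lambda: {"hits": 0, "hosts": set(), "asked": False})
    for line in lines:
        m = LOG_RE.match(line.strip())
        if not m:
            continue  # skip lines that do not fit the assumed format
        entry = report[m.group("agent")]
        entry["hits"] += 1
        entry["hosts"].add(m.group("host"))
        # Ignore any query string when comparing the requested path.
        if m.group("path").split("?", 1)[0] == "/robots.txt":
            entry["asked"] = True
    return dict(report)

def non_askers(lines):
    """Return the sorted user agents that never requested /robots.txt."""
    return sorted(ua for ua, e in robots_report(lines).items() if not e["asked"])

# Illustrative lines only: "200 1234" / "200 56" are placeholder status and size
# values, and ExampleBot/1.0 is hypothetical.
SAMPLE = [
    'rlx-2-2-10.labs.corp.yahoo.com - - [06/Apr/2006:19:43:32 -0700] '
    '"GET /dir/file.html HTTP/1.1" 200 1234 "-" "Yahoo! Mindset"',
    'q02.yrl.dcn.yahoo.com - - [09/Apr/2006:15:24:31 -0700] '
    '"GET / HTTP/1.1" 200 1234 "-" "Yahoo! Mindset"',
    'crawler.example.com - - [09/Apr/2006:15:30:00 -0700] '
    '"GET /robots.txt HTTP/1.1" 200 56 "-" "ExampleBot/1.0"',
    'crawler.example.com - - [09/Apr/2006:15:30:05 -0700] '
    '"GET / HTTP/1.1" 200 1234 "-" "ExampleBot/1.0"',
]

if __name__ == "__main__":
    print(non_askers(SAMPLE))
```

Note that this only tells you a given UA string never asked. It cannot tell you whether a different Yahoo UA fetched robots.txt and passed the rules along, which is exactly the tracking problem described above.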