Yahoo! Mindset

Didn't ask for robots.txt

Pfui

5:02 am on Apr 7, 2006 (gmt 0)

UA: Yahoo! Mindset
Host: rlx-2-2-10.labs.corp.yahoo.com

First visit, went straight to a sub-dir file:

rlx-2-2-10.labs.corp.yahoo.com - - [06/Apr/2006:19:43:32 -0700]
"GET /dir/file.html HTTP/1.1" [...] "-" "Yahoo! Mindset"

More info about this here [mindset.research.yahoo.com] (Yahoo) and here [askdavetaylor.com] (Ask Dave Taylor).

Alas, Yet Another Yahoo bot/crawler/spider/whatever that doesn't ask for robots.txt.

Ban-worthy in my book.
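
For anyone who wants to check their own logs for the same behaviour, here is a minimal sketch, assuming combined-log-style lines like the one quoted above. It tallies hits per user agent and reports the agents that never requested /robots.txt. The function name, the regular expression and the sample lines (ExampleBot, PoliteBot, the example.net/example.org hosts) are all made up for illustration and are not taken from anyone's real logs.

```python
import re
from collections import defaultdict

# Assumed layout: host, two ignored fields, [timestamp], "request",
# then anything, ending with the quoted user agent.
LINE_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'.*"(?P<agent>[^"]*)"\s*$'
)

def agents_skipping_robots(lines):
    """Return {user_agent: hit_count} for agents that never fetched /robots.txt."""
    hits = defaultdict(int)
    asked = set()
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # ignore lines that do not look like access-log entries
        agent = m.group("agent")
        hits[agent] += 1
        # Drop any query string before comparing the path.
        if m.group("path").split("?", 1)[0] == "/robots.txt":
            asked.add(agent)
    return {agent: n for agent, n in hits.items() if agent not in asked}

# Hypothetical sample lines, for illustration only.
SAMPLE = [
    'crawler.example.net - - [06/Apr/2006:19:43:32 -0700] '
    '"GET /dir/file.html HTTP/1.1" 200 1234 "-" "ExampleBot/1.0"',
    'polite.example.org - - [06/Apr/2006:19:44:00 -0700] '
    '"GET /robots.txt HTTP/1.1" 200 68 "-" "PoliteBot/2.0"',
    'polite.example.org - - [06/Apr/2006:19:44:05 -0700] '
    '"GET /index.html HTTP/1.1" 200 4321 "-" "PoliteBot/2.0"',
]

print(agents_skipping_robots(SAMPLE))
```

Grouping by user agent alone is a simplification: a crawler that shares robots.txt between several agents or hosts would still show up here as "never asked".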

volatilegx

3:35 am on Apr 10, 2006 (gmt 0)

Seen coming from:

66.228.182.177
66.228.182.183
66.228.182.187
66.228.182.188
66.228.182.190
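
If you would rather block a range than list addresses one by one, a rough sketch like the following finds the smallest single network that covers the sightings. It is only arithmetic on the five addresses above and says nothing about what Yahoo actually has allocated; `covering_network` is a hypothetical helper name.

```python
import ipaddress

def covering_network(addresses):
    """Return the smallest single IPv4 network containing every address."""
    ips = [ipaddress.IPv4Address(a) for a in addresses]
    lowest, highest = min(ips), max(ips)
    # Start from the lowest address as a /32 and widen one bit at a time
    # until the highest address also falls inside the network.
    net = ipaddress.ip_network(f"{lowest}/32")
    while highest not in net:
        net = net.supernet()
    return net

# The addresses reported in this post.
SEEN = [
    "66.228.182.177",
    "66.228.182.183",
    "66.228.182.187",
    "66.228.182.188",
    "66.228.182.190",
]

print(covering_network(SEEN))  # 66.228.182.176/28
```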

volatilegx

3:48 am on Apr 10, 2006 (gmt 0)

AND...

Date: 04/2/2006, 19:52:46
IP: 66.228.182.185
Host: rlx-2-2-5.labs.corp.yahoo.com
UA: Mozilla/4.0

Suspicious. Wonder what's going on?
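
One way to at least rule out a spoofed hostname is a forward-confirmed reverse DNS check: reverse-resolve the IP, make sure the name sits under the expected domain, then resolve that name forward and see whether the original IP comes back. A minimal sketch, assuming the standard `socket` resolvers; `forward_confirmed` and its parameters are hypothetical names, and the demo uses stubbed lookups that simply echo the pairing reported in this post rather than live DNS.

```python
import socket

def forward_confirmed(ip, suffix,
                      reverse=socket.gethostbyaddr,
                      forward=socket.gethostbyname_ex):
    """True if ip reverse-resolves under suffix and that name resolves back to ip."""
    try:
        host = reverse(ip)[0]
    except OSError:  # covers socket.herror and socket.gaierror
        return False
    host = host.rstrip(".").lower()
    if not (host == suffix or host.endswith("." + suffix)):
        return False
    try:
        addresses = forward(host)[2]
    except OSError:
        return False
    return ip in addresses

# Stub resolvers so the example runs offline; they mirror the host/IP
# pair quoted in the post and do NOT represent a real DNS lookup.
def stub_reverse(ip):
    return ("rlx-2-2-5.labs.corp.yahoo.com", [], [ip])

def stub_forward(host):
    return (host, [], ["66.228.182.185"])

print(forward_confirmed("66.228.182.185", "yahoo.com",
                        reverse=stub_reverse, forward=stub_forward))
```

Called without the `reverse` and `forward` arguments it falls back to real lookups. A passing check only says the host name is genuine, not why that host is sending a bare "Mozilla/4.0".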

GaryK

3:33 pm on Apr 10, 2006 (gmt 0)

Same here.

Yahoo! Mindset
04/07/2006 05:33:10

No robots.txt. It went right to my tutorials section and stole everything.

Pfui

3:35 pm on Apr 10, 2006 (gmt 0)

More sightings -- still NO robots.txt:

q02.yrl.dcn.yahoo.com
Yahoo! Mindset
04/09 15:24:31 /
04/09 15:24:31 /

(Two hits in one second.)

Only hitting my largest, DMOZ'd site (fwiw).

fiestagirl

5:56 pm on Apr 14, 2006 (gmt 0)

Possibility, from ysearchblog.com (May 27, 2005):

"Often, we come across a web page that hasn't been classified yet. In those cases, Mindset tries to classify that web page in the background, so it'll be classified along with the rest of the results next time you do the same query."

volatilegx

10:32 pm on Apr 15, 2006 (gmt 0)

It may not be asking for robots.txt, but is it obeying it? Maybe it gets the robots.txt from Slurp requests.
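
The "is it obeying it?" part can be tested after the fact by running the paths it requested through your own robots.txt rules. A minimal sketch using Python's standard `urllib.robotparser`; the rules and the /private/ path below are hypothetical, and `disallowed_hits` is a made-up helper name. It shows whether the logged requests happened to respect the rules, not where the bot got them from.

```python
from urllib.robotparser import RobotFileParser

def disallowed_hits(robots_lines, user_agent, paths):
    """Return the requested paths that robots.txt disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_lines)  # parse rules from memory, no network fetch
    return [p for p in paths if not parser.can_fetch(user_agent, p)]

# Hypothetical robots.txt content.
RULES = [
    "User-agent: *",
    "Disallow: /private/",
]

# "/" and "/dir/file.html" appear in this thread; the last path is made up.
HITS = ["/", "/dir/file.html", "/private/notes.html"]

print(disallowed_hits(RULES, "Yahoo! Mindset", HITS))
```

An empty list across plenty of hits on a site with real Disallow rules would be weak evidence that it is obeying; a single disallowed path in the list settles it the other way.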

Pfui

11:07 pm on Apr 15, 2006 (gmt 0)

To me, if "Yahoo! Mindset" doesn't ask, "Yahoo! Mindset" doesn't obey.

And "Yahoo! Mindset" doesn't ask:

q02.yrl.dcn.yahoo.com - - [09/Apr/2006:15:24:31 -0700]
"GET / HTTP/1.1" [...] "-" "Yahoo! Mindset"

(Two separate, completely identical hits.)

Also using "Yahoo! Mindset" and also not asking:

rlx-2-2-10.labs.corp.yahoo.com (see my initial post, above)
rlx-2-2-2.labs.corp.yahoo.com

Besides 'regular' Slurp (as opposed to 'China' Slurp), Yahoo has waaaay too many UAs (plus inktomisearch, yahoo and who-knows-what-all domains) for me to even begin tracking which of the Y! UAs that do retrieve robots.txt might be sharing it with this new one.

Imho, Yahoo knows how to 'do' robots.txt. They're just of a Mindset not to.

: )
