Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

    
dp131.data.yahoo.com -- Mozilla/4.0
Yet Another Whatever from Yahoo that doesn't ask for robots.txt
Pfui




msg:407911
 8:38 am on Jun 7, 2006 (gmt 0)

UA: Mozilla/4.0
HOST: dp131.data.yahoo.com

LOG:

dp131.data.yahoo.com - - [06/Jun/2006:14:57:14 -0700] "GET /dir/file.html HTTP/1.1" [...] "-" "Mozilla/4.0"

NOTES:

No robots.txt

No mention of the exact IP/Host via the top engines. (A week ago, a different Yahoo bot, YRL_ODP_CRAWLER [webmasterworld.com] appeared, also w/o any prior mention in SERPs. Weird.)

From dnsstuff [dnsstuff.com]:

IP address: 66.228.173.150
Reverse DNS: dp131.data.yahoo.com.
Reverse DNS authenticity: [Verified]
ASN Name: YAHOO-US
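The "Reverse DNS authenticity: [Verified]" line above corresponds to what is usually called forward-confirmed reverse DNS: look up the PTR name for the IP, then resolve that name and check that the original IP is among the results. A minimal sketch using only the Python standard library follows; the function name and the injectable `reverse`/`forward` parameters are illustrative assumptions, not anything dnsstuff publishes.

```python
import socket


def forward_confirmed_rdns(ip, reverse=socket.gethostbyaddr,
                           forward=socket.gethostbyname_ex):
    """Return the PTR hostname for ip if it resolves back to ip, else None.

    reverse and forward default to the real resolver calls; they are
    parameters only so the check can be exercised without network access.
    """
    try:
        hostname = reverse(ip)[0]          # PTR lookup: IP -> hostname
        addresses = forward(hostname)[2]   # A lookup: hostname -> list of IPs
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError
        return None
    return hostname.rstrip(".") if ip in addresses else None
```

Calling `forward_confirmed_rdns("66.228.173.150")` would perform live lookups; whether that 2006 address still resolves today is not something this sketch assumes.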

 

incrediBILL




msg:407912
 11:33 pm on Jun 7, 2006 (gmt 0)

That IP belongs to a block assigned to Overture Services.

I've seen a lot of wacky stuff from Yahoo lately so it's hard to tell what's going on.

fiestagirl




msg:407913
 11:43 pm on Jun 7, 2006 (gmt 0)

[webmasterworld.com...]

wilderness




msg:407914
 11:53 pm on Jun 7, 2006 (gmt 0)

Anybody have a clue?

I have an email discussion list hosted by Yahoo Groups.
Mypage and MyPage2 were sent to the list.

I'm not sure if it's related to the activity of a list subscriber and possibly a toolbar or other Yahoo tool?
Or Yahoo itself indexing these pages?
Note the successive duplicates!

209.191.87.215 - - [07/Jun/2006:08:15:41 -0700] "GET /myfolder/mySubFolder/mypage.html HTTP/1.0" 200 51938 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; [help.yahoo.com...]
209.191.87.215 - - [07/Jun/2006:08:15:45 -0700] "GET /myfolder/mySubFolder/mypage.html HTTP/1.0" 200 51938 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; [help.yahoo.com...]

209.191.87.218 - - [07/Jun/2006:08:16:34 -0700] "GET /myfolder/mySubFolder/mypage2.html HTTP/1.0" 200 20161 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; [help.yahoo.com...]
209.191.87.218 - - [07/Jun/2006:08:16:37 -0700] "GET /myfolder/mySubFolder/mypage2.html HTTP/1.0" 200 20161 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; [help.yahoo.com...]

209.191.87.218 - - [07/Jun/2006:11:39:58 -0700] "GET / HTTP/1.0" 200 6617 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; [help.yahoo.com...]
209.191.87.218 - - [07/Jun/2006:11:40:02 -0700] "GET / HTTP/1.0" 200 6617 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; [help.yahoo.com...]
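For anyone wanting to spot this kind of back-to-back repeat automatically, here is a rough sketch in Python. It assumes Apache-style access log lines with one request per line, and it only reads the host, timestamp and request path, so the truncated user-agent field does not matter. The function name and the 10-second window are arbitrary choices for illustration.

```python
import re
from datetime import datetime, timedelta

# Matches the leading part of a common/combined log line:
# host, two ignored fields, [timestamp], "METHOD path protocol"
LINE_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*"'
)


def find_rapid_repeats(lines, window=timedelta(seconds=10)):
    """Return (host, path, gap_seconds) for each request that repeats the
    same host's previous request for the same path within window."""
    last_seen = {}
    repeats = []
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip lines that are not in the expected format
        when = datetime.strptime(m["time"], "%d/%b/%Y:%H:%M:%S %z")
        key = (m["host"], m["path"])
        prev = last_seen.get(key)
        if prev is not None and when - prev <= window:
            repeats.append((m["host"], m["path"], (when - prev).total_seconds()))
        last_seen[key] = when
    return repeats
```

Fed the six requests quoted above, one per line, it would flag three repeats with gaps of 4, 3 and 4 seconds. It says nothing about why the repeats happen, only that they are there.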

Pfui




msg:407915
 12:06 am on Jun 8, 2006 (gmt 0)

Thanks, BILL and fiestagirl. I even read your post earlier, too, girl, and remembered the Moz 4 info but totally spaced out the similarly subdomain'd Host. (slaps head)

Looks like Overture [overture.com]'s spawned a ton of who-knows-what... Their "Search Marketing products" link leads to "Yahoo! Search Marketing [searchmarketing.yahoo.com]."

They have too many crawlers, imho. Too many names [webmasterworld.com]. Too many that ignore robots.txt.
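One rough way to check the "never asks for robots.txt" claim against your own logs is to list every host that made requests but never fetched /robots.txt. The sketch below assumes Apache-style access log lines; the function name is made up for illustration. Treat the result as a hint only: a crawler may have fetched robots.txt before the log window started, or from a different IP or host than the one doing the crawling.

```python
import re
from collections import defaultdict

# Captures the client host and the requested path from a log line
REQUEST_RE = re.compile(r'^(?P<host>\S+) \S+ \S+ \[[^\]]+\] "\S+ (?P<path>\S+)')


def hosts_skipping_robots(lines):
    """Return the sorted hosts that appear in lines but never requested
    /robots.txt (query strings are ignored when comparing paths)."""
    paths_by_host = defaultdict(set)
    for line in lines:
        m = REQUEST_RE.match(line)
        if m:
            path = m["path"].split("?", 1)[0]
            paths_by_host[m["host"]].add(path)
    return sorted(
        host for host, paths in paths_by_host.items()
        if "/robots.txt" not in paths
    )
```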

Btw, the other day, someone in another forum, possibly Yahoo Search [webmasterworld.com], provided a list of sorts of various Yahoo Host prefixes (like mud and labs and corp, etc.) and what they meant. Wish I could remember where it is! I'll keep looking.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved