
Forum Moderators: martinibuster

Yahoo search is back

"Slurp" has been visiting...

     
10:41 pm on Feb 5, 2014 (gmt 0)

Junior Member

joined:Sept 11, 2013
posts: 48
votes: 0


Yahoo seems to be crawling again:

b100104.yse.yahoo.net - - [05/Feb/2014:20:26:20 +0000] "GET /robots.txt" 200 2551 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"


Confirmed that b100104.yse.yahoo.net = 68.180.224.228 and vice versa. ;)
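The two-way check described above (IP to host name, then host name back to the same IP) is usually called forward-confirmed reverse DNS. A minimal Python sketch follows, assuming that genuine Slurp hosts end in `.yse.yahoo.net` as in the log line above. The function name `is_verified_crawler` and the `TRUSTED_SUFFIXES` tuple are hypothetical, and the forward lookup used here handles IPv4 only.

```python
import socket

# Assumption: genuine Slurp hosts end in this suffix. The thread only shows
# the ".yse.yahoo.net" form; extend the tuple if other suffixes are known.
TRUSTED_SUFFIXES = (".yse.yahoo.net",)


def is_verified_crawler(ip, suffixes=TRUSTED_SUFFIXES,
                        reverse=socket.gethostbyaddr,
                        forward=socket.gethostbyname_ex):
    """Return True if ip passes a forward-confirmed reverse DNS check.

    reverse and forward default to the real resolver calls; they are
    parameters only so the logic can be exercised without network access.
    """
    try:
        host = reverse(ip)[0]            # PTR lookup: IP -> host name
    except OSError:                      # covers socket.herror / gaierror
        return False                     # no rDNS at all
    if not host.lower().rstrip(".").endswith(suffixes):
        return False                     # rDNS exists but is not a trusted host
    try:
        addresses = forward(host)[2]     # A lookup: host name -> list of IPv4s
    except OSError:
        return False
    return ip in addresses               # must resolve back to the same IP
```

Calling `is_verified_crawler("68.180.224.228")` performs live DNS lookups, so the result depends on what the resolver returns at the time.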
12:59 am on Feb 11, 2014 (gmt 0)

Moderator from GB 

WebmasterWorld Administrator mack is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:June 15, 2001
posts:7564
votes: 4


Slurp still crawls, but not for a general search engine. It crawls within specific niches for certain content areas within Yahoo! and its partners.

Mack.
8:36 pm on Feb 11, 2014 (gmt 0)

Senior Member from GB 

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:May 14, 2008
posts:3117
votes: 3


No proper rDNS, no access, bot UA or not.
9:26 pm on Feb 11, 2014 (gmt 0)

Junior Member

joined:Sept 11, 2013
posts: 48
votes: 0


mack:
Would you happen to know what those niches are?

dstiles:
Right - as with any other bot. :) Yahoo's host name format matches the one they used years ago, and - based on a sample of 1 ;) - I can say that rDNS appears to be working. If the past is any indication, Yahoo's spider has been way better behaved than Google's! Among rumours that Yahoo is considering returning to general crawling, we'll have to see how this pans out...
9:57 pm on Feb 12, 2014 (gmt 0)

Senior Member from GB 

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:May 14, 2008
posts:3117
votes: 3


My several checks of Yahoo bot IPs in DNS during the past couple of weeks show they are no longer valid bot IPs.
3:18 am on Feb 14, 2014 (gmt 0)

Moderator from GB 

WebmasterWorld Administrator mack is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:June 15, 2001
posts:7564
votes: 4


mack:
Would you happen to know what those niches are?


Yahoo provide a little bit of this information on their Slurp web page.

[help.yahoo.com...]

Mack.
11:45 am on Apr 20, 2014 (gmt 0)

Junior Member

joined:Sept 11, 2013
posts: 48
votes: 0


mack:
Yahoo provide a little bit of this information on their Slurp web page. help.yahoo.com...

OK, thanks. It says there: "Slurp collects content from partner sites"

Well, that makes searching my sites unnecessary, and Slurp can be blocked like every other unwanted robot. ;)
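For reference, a robots.txt block aimed at one crawler can be checked offline before deploying it. The sketch below uses Python's standard `urllib.robotparser`; the two-line rule set and the helper name `allowed` are illustrative assumptions, and the `Slurp` token is assumed to be the one the crawler matches on. A robots.txt rule only affects bots that choose to honour it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule set: shut out Slurp everywhere, leave other bots alone.
BLOCK_SLURP = [
    "User-agent: Slurp",
    "Disallow: /",
]


def allowed(robots_lines, agent, path):
    """Return True if the given robots.txt lines let agent fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_lines)           # parse rules without any HTTP request
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    print(allowed(BLOCK_SLURP, "Slurp", "/page.html"))
    print(allowed(BLOCK_SLURP, "Googlebot", "/page.html"))
```

Note that `can_fetch` expects the short robot token (here `Slurp`), not the full User-Agent header from the access log.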
2:35 pm on Apr 20, 2014 (gmt 0)

Moderator This Forum from US 

WebmasterWorld Administrator martinibuster is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Apr 13, 2002
posts:14158
votes: 196


Just a general note. It's probably not a good idea to block Slurp. Yahoo slices and dices Bing data. Slurp's crawl is part of their in-house quality control and editorial process.

https://help.yahoo.com/kb/search/slurp-crawling-page-sln22600.html [help.yahoo.com]

"It also accesses pages from sites across the Web to confirm accuracy and improve Yahoo's personalized content for our users."
5:52 pm on Apr 20, 2014 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13204
votes: 346


How does it currently behave? I blocked Slurp some years ago because it seemed to pay no attention to robots.txt. But if it has begun conducting itself properly, I'll let it back in, at least from selected IP ranges.
7:11 pm on Apr 20, 2014 (gmt 0)

Moderator from US 

WebmasterWorld Administrator not2easy is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Dec 27, 2006
posts:2748
votes: 61


I let it in on one site and have not seen it ignore any robots.txt disallows. Yet. I seldom see it request robots.txt, but it does. Keeping an eye on it.