


Way too many Yahoo! Slurp Spider crawling my community site

Delay or reduce crawling of Yahoo! Slurp spiders

     
4:12 pm on Jun 28, 2007 (gmt 0)

10+ Year Member



Hello,

I am running a community site (Vbulletin forum). I have too many Yahoo! Slurp spiders crawling my site frequently.

These spiders seem to be querying my forum database (MySQL) more than my actual users and visitors do, which is causing it to hit the max_questions limit and degrading the site's performance.

How do I delay or reduce the crawling of the Yahoo! Slurp spiders? I do want the Yahoo spider to crawl and index my site, but at a reasonable rate that doesn't affect site performance.

Should I add a crawl delay in the robots.txt file? How do I make it take effect immediately?

Is there any other solution?

Thanks in advance for any responses.

8:38 pm on Jun 29, 2007 (gmt 0)

5+ Year Member



I have the exact same problem. Yahoo simply sucks; it just sits there and crawls all day for, apparently, nothing, LOL. I think they literally all meet up on my forum for a beer or something. I am thinking of banning Yahoo in robots.txt altogether, as they are pretty much worthless these days anyway. I hate to be that drastic, but they have serious problems. Google usually has ONE spider on my forum all day, MSN usually has 3, and Yahoo has 100 or so... no doubt it's a waste.

When WILL Yahoo wake up and pull their heads from their posteriors? Who knows.

9:07 am on Jun 30, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I run several sites and I have the same problem.
In fact, if Y! were to send me 1 visitor for every 10 bot visits, I would need a dedicated server to handle the traffic :o)

9:16 am on Jul 5, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



On one of my sites, Slurp has now decided that robots.txt is for everyone else but not for them, and has started crawling pages in directories that are disallowed.

Slurp is now banned
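
For anyone wanting to do the same, the ban itself is only a couple of lines in robots.txt - this is just a sketch, assuming "Slurp" is the user-agent token your logs show for Yahoo's crawler:

User-agent: Slurp
Disallow: /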

10:58 am on Jul 5, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Add this directive to robots.txt (either under Yahoo's user-agent or under the catch-all *):

Crawl-Delay: 10

This tells the bot to wait 10 seconds between requests. Some other bots also support it.
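
In context, that looks something like the following - a minimal sketch, assuming Yahoo's crawler matches on the "Slurp" user-agent token (use the * block instead if you want every crawler slowed down):

User-agent: Slurp
Crawl-delay: 10

or, for all bots:

User-agent: *
Crawl-delay: 10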

Additionally, have a look at the URLs that get crawled; some of them can probably be disallowed in robots.txt - you are the one who knows which pages are of zero interest to the search engines. See the sketch below for the sort of thing I mean.
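
On a vBulletin forum the usual suspects are the scripts that generate endless near-duplicate URLs. As a rough sketch (these paths are only typical examples - check your own logs before copying them):

User-agent: Slurp
Crawl-delay: 10
Disallow: /printthread.php
Disallow: /newreply.php
Disallow: /search.php
Disallow: /memberlist.php
Disallow: /calendar.php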

 
