crawler.searchmarketing.yahoo.com too aggressive, brings server down.

FireBrigade

7:43 am on Mar 25, 2006 (gmt 0)


Dear All,

While reading through my logfiles (yes, I know, I have to get a life) I found MASSIVE crawling from the IP address 66.35.192.197. It does not identify itself as a search engine spider (user-agent: Mozilla/4.0+(compatible;+MSIE+5.01;+Windows+NT+5.0)+RPT-HTTPClient/0.3-3E) but is apparently some kind of advertising verifier from Yahoo Search Marketing (Overture).

A reverse lookup of the IP address returns "crawler.searchmarketing.yahoo.com", which makes me believe this is a verifier that checks whether the links in the adverts are still active.
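For anyone who wants to repeat that lookup, here is a minimal sketch of a forward-confirmed reverse DNS check in Python: resolve the IP to a hostname, then resolve that hostname back and see whether the original IP is among the results. The function names are my own, and the ".yahoo.com" suffix test is only an assumption about how such a host would be named, not something Yahoo documents here.

```python
import socket

def forward_confirmed_rdns(ip, reverse=socket.gethostbyaddr,
                           forward=socket.gethostbyname_ex):
    """Return the reverse-DNS hostname of `ip` if it resolves back to `ip`, else None.

    `reverse` and `forward` are injectable so the check can be tried offline.
    """
    try:
        hostname = reverse(ip)[0]          # PTR lookup: IP -> hostname
        addresses = forward(hostname)[2]   # A lookup: hostname -> list of IPs
    except OSError:                        # covers socket.herror / socket.gaierror
        return None
    return hostname if ip in addresses else None

def is_yahoo_host(hostname):
    """Hypothetical suffix check; assumes genuine hosts end in .yahoo.com."""
    if hostname is None:
        return False
    return hostname.lower().rstrip(".").endswith(".yahoo.com")

if __name__ == "__main__":
    host = forward_confirmed_rdns("66.35.192.197")
    print(host, is_yahoo_host(host))
```

A reverse lookup alone can be spoofed by whoever controls the reverse zone for the address, which is why the sketch also does the forward step.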

This IP address never requests robots.txt, so I guess it is not a regular crawler like Slurp.

The fact of the matter is that this crawler visits our site several times per day and fires between 50 and 100 requests in a single second. Network load balancing routes all requests from this IP to a single server, so that server becomes overloaded and stalls all traffic for at least 1 minute, until every request has been processed (200) or rejected (error fallback).
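To put a number on the bursts, a rough Python sketch like the one below can count requests per second for one client IP. It assumes the logs are in W3C extended format with a "#Fields:" header that includes date, time and c-ip; the function name and the sample lines are made up for illustration and are not taken from my actual logs.

```python
from collections import Counter

def peak_requests_per_second(lines, client_ip):
    """Return ((date, time), count) for the busiest second of `client_ip`, or None."""
    fields = []
    per_second = Counter()
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            fields = line.split()[1:]      # column names for the rows that follow
            continue
        if not line or line.startswith("#") or not fields:
            continue                       # skip other directives and blank lines
        row = dict(zip(fields, line.split()))
        if row.get("c-ip") == client_ip:
            per_second[(row.get("date"), row.get("time"))] += 1
    return per_second.most_common(1)[0] if per_second else None

if __name__ == "__main__":
    # Invented sample data, only to show the expected shape of the input.
    sample = [
        "#Fields: date time c-ip cs-method cs-uri-stem sc-status",
        "2006-03-25 07:40:01 66.35.192.197 GET /a 200",
        "2006-03-25 07:40:01 66.35.192.197 GET /b 200",
        "2006-03-25 07:40:02 66.35.192.197 GET /c 200",
    ]
    print(peak_requests_per_second(sample, "66.35.192.197"))
```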

(I'm running Win2003 servers.)

Does anyone have more information on this, or know a way to slow this aggressive crawler down? We do not want to block this IP, for the risk of losing all our Yahoo Search Marketing (Overture) campaigns, but I also do not want the site to be unavailable to regular visitors several times per day, or to be beeped out of bed because the monitoring system alerts me that the site is unavailable on this server.
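To make clear what I mean by "slow down" rather than "block": something like a per-IP token bucket, sketched below in Python. This is only an illustration of the idea, not something I have running on the Win2003 boxes. The class name and the rate and burst numbers are invented, and I do not know how the Yahoo verifier would react to throttled responses (for example a 503 with a Retry-After header).

```python
import time

class TokenBucket:
    """Allow up to `burst` requests at once, refilled at `rate` tokens per second."""

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = rate
        self.burst = burst
        self.clock = clock              # injectable so the logic can be tested
        self.tokens = float(burst)      # start with a full bucket
        self.updated = clock()

    def allow(self):
        """Return True if the request may proceed, False if it should be throttled."""
        now = self.clock()
        # Refill according to elapsed time, never above the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

if __name__ == "__main__":
    bucket = TokenBucket(rate=5, burst=10)   # hypothetical limits for one client IP
    served = sum(bucket.allow() for _ in range(100))
    print(f"{served} of 100 burst requests served, the rest throttled")
```

The point of the design is that a burst of 50 to 100 requests in one second would mostly be turned away cheaply instead of being queued, while the crawler's requests spread over a longer period would still get through.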

Any thoughts?