

Ask - Teoma Forum

    
Crazy traffic from ask.com crawlers
Three to four thousand page requests in a few minutes ...
thereplicant



 
Msg#: 4208134 posted 2:23 pm on Sep 28, 2010 (gmt 0)

Recently getting a lot of crazy traffic from Ask.com's crawlers moving at 20+ requests per second. Anyone else seeing this behaviour?

Our robots.txt contains the crawl-delay directive (set to 1), which according to the Ask.com FAQ pages is supported by their crawler. I can see that it requested robots.txt, but it's clear that it is ignoring the crawl delay.
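For reference, here is a minimal sketch of how a crawl-delay of 1 can be checked with Python's standard `urllib.robotparser`. The robots.txt content below is hypothetical (the actual file is not shown in this thread), and the `crawl_delay_for` helper is an illustrative name, not part of any library.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the real file from the thread is not shown.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 1
Disallow:
"""

def crawl_delay_for(robots_txt, user_agent):
    """Return the Crawl-delay (in seconds) that applies to user_agent, or None."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)

# A well-behaved crawler matching "User-agent: *" should wait this long between requests.
print(crawl_delay_for(ROBOTS_TXT, "Teoma"))
```

This only shows what a compliant parser reads from the file; it says nothing about what the crawler in question actually does with the directive.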

For example, one of the crawlers is as follows:
IP: 66.235.116.77
Domain: crawler9077.ask.com
User agent: Teoma/Nutch-1.0 (Question and Answer Search; bot@afarm.com)

On the 24th of September, there were over 20 instances where the crawler hit our servers at a rate of between 33 and 38 times per second. We had over 100 instances where this crawler hit us at over 20 times per second.
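Per-second figures like these can be pulled from an access log by counting requests per timestamp. The sketch below assumes an Apache/nginx "combined" style log where each line starts with the client IP and carries a `[dd/Mon/yyyy:hh:mm:ss +zzzz]` timestamp; the function names and the threshold are illustrative assumptions, not taken from the original post.

```python
import re
from collections import Counter

# Timestamp as it appears in a "combined" style access log (assumed format).
TIMESTAMP = re.compile(r"\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2}) [+-]\d{4}\]")

def hits_per_second(lines, ip):
    """Count requests from one client IP, keyed by the second they arrived in."""
    counts = Counter()
    for line in lines:
        # Only consider lines whose first field is exactly the given IP.
        if not line.startswith(ip + " "):
            continue
        match = TIMESTAMP.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

def seconds_over(counts, threshold):
    """Number of one-second buckets with more than `threshold` requests."""
    return sum(1 for n in counts.values() if n > threshold)
```

With a real log, something like `seconds_over(hits_per_second(open("access.log"), "66.235.116.77"), 20)` would give the number of seconds in which that IP exceeded 20 requests (the file name here is a placeholder).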

In total, this crawler hit us 4958 times between 03:49:51 and 03:53:41 on the 24th of September, which works out to an average of about 21.6 requests per second (4958 requests in 230 seconds).

This is not the only crawler from ask.com that has behaved in such a fashion.

On the 23rd of September, IP 66.235.116.75 (crawler9075.ask.com, same user agent as the other crawler) queried us 5010 times between 23:36:12 and 23:39:29, which works out to a rough average of 25 times per second (5010 requests in 197 seconds).

In fact, every single day for the last week or two, Ask.com crawlers have been coming in and spidering the site at insane speeds like this. One of the most recent was 66.235.116.73 (crawler9073.ask.com), which hit us 4833 times on the 28th of September between 00:18:33 and 00:22:54.
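The average rate for each burst follows directly from the request count and the time window. A small sketch of that arithmetic, using only the counts and times quoted in this post (the `average_rate` helper and the labels are illustrative, and each window is assumed to fall within a single day):

```python
from datetime import datetime

def average_rate(requests, start, end):
    """Average requests per second between two HH:MM:SS times on the same day."""
    fmt = "%H:%M:%S"
    window = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return requests / window.total_seconds()

# (request count, first hit, last hit) as quoted in the post.
bursts = {
    "crawler9077, 24 Sep": (4958, "03:49:51", "03:53:41"),
    "crawler9075, 23 Sep": (5010, "23:36:12", "23:39:29"),
    "crawler9073, 28 Sep": (4833, "00:18:33", "00:22:54"),
}

for name, (count, start, end) in bursts.items():
    print(f"{name}: {average_rate(count, start, end):.1f} requests/s")
```

The three windows are 230, 197 and 261 seconds long, so this prints about 21.6, 25.4 and 18.5 requests per second respectively.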

I've tried emailing bot@afarm.com (which is in the useragent), but got no response, and then I tried contacting ask.com using their on-line forms (also no response).

Anyone have any suggestions on what's happening here, or who I could contact?

 

Staffa

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 4208134 posted 5:45 pm on Sep 28, 2010 (gmt 0)

I would block that UA without a second thought.
It may come via Ask IPs, but the dot com in the UA is registered in Australia, and Ask does not use Nutch in its UA ;o)
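One way to block a user agent at the application level is a small WSGI wrapper that answers 403 before the request reaches the site. This is only a sketch under assumptions: the site is served by a Python WSGI app, and the substring `Teoma/Nutch` (taken from the UA quoted above) is the thing to match. The names `block_user_agents` and `BLOCKED_UA_SUBSTRINGS` are made up for illustration; most people would do the same thing in their web server configuration instead.

```python
# Substrings to refuse, matched case-insensitively against the User-Agent header.
BLOCKED_UA_SUBSTRINGS = ("Teoma/Nutch",)

def block_user_agents(app, blocked=BLOCKED_UA_SUBSTRINGS):
    """Wrap a WSGI app so that requests from blocked user agents get a 403."""
    lowered = tuple(token.lower() for token in blocked)

    def middleware(environ, start_response):
        user_agent = environ.get("HTTP_USER_AGENT", "").lower()
        if any(token in user_agent for token in lowered):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        # Anything else is passed through to the wrapped application untouched.
        return app(environ, start_response)

    return middleware
```

Usage would be along the lines of `application = block_user_agents(application)`. Note that this matches on the UA string alone, so it blocks anything sending that string regardless of which IP it comes from.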

thereplicant



 
Msg#: 4208134 posted 8:53 am on Sep 29, 2010 (gmt 0)

Thanks for the reply.

Where did you see it was registered in Australia? I only get info for the US for this IP ...

Staffa

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 4208134 posted 4:18 pm on Sep 29, 2010 (gmt 0)

The IP is Ask's all right, but the dot com in the UA (afarm.com) seemed suspicious to me. Doing a whois lookup on one of those specialized sites will tell you everything about the domain name and its owner.
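Checking that an IP really belongs to the crawler it claims to be is usually done with a forward-confirmed reverse DNS lookup: resolve the IP to a hostname, check the hostname's domain, then resolve the hostname back and confirm it returns the same IP. A minimal sketch with Python's standard `socket` module follows; the `is_verified_crawler` name, the `.ask.com` suffix default, and the injectable `reverse`/`forward` parameters (there so it can be tried without network access) are assumptions for illustration. It does not perform the whois lookup on the domain in the UA.

```python
import socket

def is_verified_crawler(ip, allowed_suffix=".ask.com",
                        reverse=socket.gethostbyaddr,
                        forward=socket.gethostbyname_ex):
    """Forward-confirmed reverse DNS check for a claimed crawler IP."""
    try:
        # Step 1: IP -> hostname (PTR record).
        hostname = reverse(ip)[0]
        if not hostname.lower().endswith(allowed_suffix):
            return False
        # Step 2: hostname -> list of IPs (A records).
        addresses = forward(hostname)[2]
    except OSError:
        # Covers socket.herror and socket.gaierror (lookup failures).
        return False
    # Step 3: the original IP must be among the forward-resolved addresses.
    return ip in addresses
```

Calling `is_verified_crawler("66.235.116.77")` would run the real DNS lookups; the result depends on the DNS records at the time it is run.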


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved