Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

msn bot has lost its mind
1000+ parallel connections

 3:39 pm on Sep 4, 2013 (gmt 0)

I keep getting alerts about msnbot - and it really is msnbot from a real Microsoft IP - it's opening 1000+ parallel connections to the server at the same time.

What in the heck do they think they are doing? This is not acceptable.

Note it's all from the same single IP, and I do not mean it is re-requesting the same 1000 files over time. I mean boom: same single second, 1000+ connections attempted.

First saw it from

131.253.38.xx (CA/Canada/msnbot-131-253-38-xx.search.msn.com)

and then from a few different US IPs

65.55.213.xx (US/United States/msnbot-65-55-213-xx.search.msn.com)

Never, ever had this problem with Google.

Anyone else notice this kind of activity?
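For anyone who wants to double-check that an IP claiming to be msnbot is genuine, one common approach is a forward-confirmed reverse DNS lookup: resolve the IP to a hostname, check that it ends in search.msn.com (the suffix seen in the hostnames above), then resolve that hostname back and confirm it returns the same IP. A minimal Python sketch follows. The function name and the injectable lookup parameters are made up for illustration, not anything Microsoft publishes.

```python
import socket


def is_verified_msnbot(ip, reverse_lookup=None, forward_lookup=None):
    """Forward-confirmed reverse DNS check for an IP claiming to be msnbot.

    reverse_lookup and forward_lookup are hypothetical hooks so the logic
    can be exercised without real DNS; by default the socket module is used.
    """
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda host: socket.gethostbyname_ex(host)[2])

    try:
        host = reverse_lookup(ip)  # step 1: IP -> hostname (PTR record)
    except OSError:
        return False

    # step 2: the hostname must sit under search.msn.com
    if not host.lower().rstrip(".").endswith(".search.msn.com"):
        return False

    try:
        addresses = forward_lookup(host)  # step 3: hostname -> IPs
    except OSError:
        return False

    # step 4: the forward lookup must include the original IP
    return ip in addresses
```

Calling is_verified_msnbot with just the IP string from your logs uses live DNS. A spoofed user agent from an unrelated network fails at step 2 or step 4.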



 11:52 pm on Sep 4, 2013 (gmt 0)

Maybe they're experimenting with rapid indexing to see if some sites can handle it and how fast they can take it.

Perhaps it's a bug nobody at MSN knows about.

I've sent them log files before when they behaved badly and they fixed the problem so perhaps you should consider that.

When bot owners ignore my polite requests, I blog and post the data for all to see, tweet about it, and get retweets. After being publicly embarrassed, they often fix the problem.


 1:02 am on Sep 5, 2013 (gmt 0)

Do they honor the Crawl-Delay directive? I know Google doesn't-- you have to set it in WMT-- but I'm ### if I can find the area in Bing WMT that analyzes your robots.txt.


 1:16 am on Sep 5, 2013 (gmt 0)

I have a server setup that trips if an IP establishes more than 11 simultaneous connections. MSNbot gets blocked all the time...
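The kind of per-IP trip described above can be sketched in a few lines of Python. This is only an illustration of the counting logic, assuming the threshold of 11 mentioned in the post. The class and method names are invented here, and a real setup would normally do this in the web server or firewall rather than in application code.

```python
import threading
from collections import defaultdict


class ConnectionLimiter:
    """Tracks open connections per IP and refuses any beyond a fixed limit."""

    def __init__(self, max_per_ip=11):
        self.max_per_ip = max_per_ip      # 11 simultaneous connections allowed
        self._open = defaultdict(int)     # ip -> currently open connections
        self._lock = threading.Lock()     # counters may be touched by many threads

    def try_open(self, ip):
        """Return True and count the connection, or False if the IP is at the limit."""
        with self._lock:
            if self._open[ip] >= self.max_per_ip:
                return False
            self._open[ip] += 1
            return True

    def close(self, ip):
        """Release one connection slot for the IP."""
        with self._lock:
            if self._open[ip] > 0:
                self._open[ip] -= 1
            if self._open[ip] == 0:
                del self._open[ip]        # keep the table from growing forever
```

With this logic, a burst of 1000+ simultaneous connections from one IP would see the first 11 accepted and the rest refused until some of them close.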


 2:05 am on Sep 5, 2013 (gmt 0)

Does BingBot honor the Crawl-delay directive?:
http://www.bing.com/blogs/site_blogs/b/webmaster/archive/2012/05/03/to-crawl-or-not-to-crawl-that-is-bingbot-s-question.aspx [bing.com]
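For reference, a Crawl-delay rule is just a line inside a User-agent group in robots.txt. The Python sketch below parses a sample file with the standard library's urllib.robotparser and reads the delay back, which is a quick way to confirm the file says what you think it says. The bingbot token and the value of 10 are example choices, not recommendations, and whether a given crawler obeys the rule is up to the crawler.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt content; the delay value is arbitrary.
ROBOTS_TXT = """\
User-agent: bingbot
Crawl-delay: 10

User-agent: *
Disallow:
"""


def crawl_delay_for(robots_txt, user_agent):
    """Return the Crawl-delay that applies to user_agent, or None if none is set."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


if __name__ == "__main__":
    print(crawl_delay_for(ROBOTS_TXT, "bingbot"))
    print(crawl_delay_for(ROBOTS_TXT, "otherbot"))
```

Here the bingbot group carries the delay, while any other agent falls through to the catch-all group, which sets none.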


 3:53 am on Sep 5, 2013 (gmt 0)

On every forum there is one person who always knows where to find things. On WebmasterWorld, that person is phranque :)

"Because it would cause a lot of unwanted traffic if BingBot tried to fetch your robots.txt file every single time it wanted to crawl a page on your website, it keeps your directives in memory for a few hours."

Someone remind me: Why don't these forums have a "roflmfao" emoticon?

"1000+ parallel connections"

You've got a sturdier server than mine :o I'm on shared hosting and I think the ceiling is 30. I've only ever seen it with malicious robots.

:: shuffling papers ::

Yup. Ghastly robot from {server farm} back in February 2012, slew of 503 responses with log message
access to {filename} failed for {IP}, reason: Client exceeded concurrent connection limit of 30, referer: {referer}
Even WebReaper doesn't do that. Certainly not what you'd expect of the bingbot.

Unless all 1000+ concurrent requests were for robots.txt. That I'd believe.
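If you want to see which IPs are tripping a limit like that, the log message quoted above is regular enough to tally with a short script. A rough Python sketch, assuming each error-log line contains the message in exactly the form shown (with the placeholders filled in); the function name is made up for illustration.

```python
import re
from collections import Counter

# Matches the message form quoted above; path and IP are captured.
LIMIT_RE = re.compile(
    r"access to (?P<path>.+?) failed for (?P<ip>[^,\s]+), "
    r"reason: Client exceeded concurrent connection limit of (?P<limit>\d+)"
)


def count_limit_hits(log_lines):
    """Count, per client IP, how many requests were refused by the connection limit."""
    hits = Counter()
    for line in log_lines:
        match = LIMIT_RE.search(line)
        if match:
            hits[match.group("ip")] += 1
    return hits
```

Feeding it the lines of an error log (for example, an open file object) and calling most_common() on the result lists the worst offenders first.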


 11:27 am on Sep 5, 2013 (gmt 0)

You can also set the crawl rate in Microsoft's webmaster tool. For example, you could tell it to crawl less aggressively during peak periods.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved