msn bot has lost its mind

1000+ parallel connections

     
3:39 pm on Sep 4, 2013 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Dec 16, 2002
posts: 2010
votes: 0


I keep getting alerts about msnbot - and it really is msnbot from a real Microsoft IP - it's opening 1000+ parallel connections to the server at the same time.

What in the heck do they think they are doing? This is not acceptable.

Note it's all from the same single IP. I do not mean it is re-requesting the same 1000 files over time; I mean boom, same single second, 1000+ connections attempted.

First saw it from

131.253.38.xx (CA/Canada/msnbot-131-253-38-xx.search.msn.com)

and then from a few different US IPs

65.55.213.xx (US/United States/msnbot-65-55-213-xx.search.msn.com)
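
For anyone who wants to double-check the "real Microsoft IP" part on their own logs, here is a rough Python sketch of a forward-confirmed reverse DNS check. The function name, the injectable resolver parameters, and the `.search.msn.com` suffix rule are assumptions based only on the hostnames quoted above; this is not an official verification procedure.

```python
import socket

# Assumption: genuine msnbot hosts reverse-resolve to names under this suffix,
# as the hostnames quoted in this post do.
TRUSTED_SUFFIX = ".search.msn.com"


def is_verified_msnbot(ip, reverse=None, forward=None):
    """Return True if `ip` passes a forward-confirmed reverse DNS check.

    `reverse` maps an IP to a hostname and `forward` maps a hostname to a
    list of IPs. Both default to real DNS lookups and can be swapped out
    for testing (hypothetical parameters, added for illustration).
    """
    reverse = reverse or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward = forward or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        host = reverse(ip)
        # Step 1: the PTR record must sit under the trusted suffix.
        if not host.lower().endswith(TRUSTED_SUFFIX):
            return False
        # Step 2: the hostname must resolve back to the same IP.
        return ip in forward(host)
    except OSError:
        # socket.herror and socket.gaierror are both OSError subclasses.
        return False
```

A spoofed user agent fails step 1, and a forged PTR record fails step 2, so only an IP that passes both is treated as the real bot.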

Never, ever had this problem with Google.

Anyone else notice this kind of activity?
11:52 pm on Sept 4, 2013 (gmt 0)

Administrator from US 

WebmasterWorld Administrator incredibill is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Jan 25, 2005
posts:14622
votes: 87


Maybe they're experimenting with rapid indexing to see if some sites can handle it and how fast they can take it.

Perhaps it's a bug nobody at MSN knows about.

I've sent them log files before when they behaved badly, and they fixed the problem, so perhaps you should consider that.

When bot owners ignore my polite requests, I blog and post the data for all to see, tweet about it, and get retweets. After being publicly embarrassed, they often fix the problem.
1:02 am on Sept 5, 2013 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month

joined:Apr 9, 2011
posts:12700
votes: 244


Do they honor the Crawl-Delay directive? I know Google doesn't-- you have to set it in WMT-- but I'm ### if I can find the area in Bing WMT that analyzes your robots.txt.
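
In case it helps to see what the directive looks like to a parser, here is a small Python sketch using the standard library's `urllib.robotparser`. The robots.txt content, the 10-second value, and the `crawl_delay_for` helper are made up for illustration. It only shows how the line is read, not whether msnbot actually obeys it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: ask msnbot to wait 10 seconds between requests,
# and leave every other crawler unrestricted.
ROBOTS_TXT = """\
User-agent: msnbot
Crawl-delay: 10
Disallow:

User-agent: *
Disallow:
"""


def crawl_delay_for(agent, robots_txt=ROBOTS_TXT):
    """Return the Crawl-delay (in seconds) that applies to `agent`, or None."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(agent)


print(crawl_delay_for("msnbot"))     # delay from the msnbot group
print(crawl_delay_for("Googlebot"))  # falls back to the "*" group, which sets none
```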
1:16 am on Sept 5, 2013 (gmt 0)

Senior Member

WebmasterWorld Senior Member billys is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:June 1, 2004
posts:3181
votes: 0


I have a server setup that trips if an IP establishes more than 11 simultaneous connections. MSNbot gets blocked all the time...
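
The idea behind that kind of trip wire can be sketched in a few lines of Python. This is only an illustration of per-IP connection counting, not the actual setup described above (which is not specified); the class name and methods are hypothetical, and the limit of 11 comes from the post.

```python
from collections import defaultdict


class PerIPConnectionLimiter:
    """Track open connections per IP and refuse any beyond a fixed ceiling."""

    def __init__(self, max_per_ip=11):
        self.max_per_ip = max_per_ip
        self._open = defaultdict(int)  # ip -> currently open connections

    def try_open(self, ip):
        """Register a new connection; return False if the IP is over the limit."""
        if self._open[ip] >= self.max_per_ip:
            return False  # this would be connection number max_per_ip + 1
        self._open[ip] += 1
        return True

    def close(self, ip):
        """Release one connection slot for the IP, if it holds any."""
        if self._open[ip] > 0:
            self._open[ip] -= 1


limiter = PerIPConnectionLimiter(max_per_ip=11)
results = [limiter.try_open("203.0.113.7") for _ in range(12)]
# The first 11 attempts are accepted; the 12th trips the limit.
```

A real server would do this in the firewall or web server module rather than in application code, but the bookkeeping is the same: count on open, decrement on close, reject above the ceiling.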
2:05 am on Sept 5, 2013 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:10542
votes: 8

3:53 am on Sept 5, 2013 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month

joined:Apr 9, 2011
posts:12700
votes: 244


On every forum there is one person who always knows where to find things. On WebmasterWorld, that person is phranque :)

Because it would cause a lot of unwanted traffic if BingBot tried to fetch your robots.txt file every single time it wanted to crawl a page on your website, it keeps your directives in memory for a few hours.

Someone remind me: Why don't these forums have a "roflmfao" emoticon?

1000+ parallel connections

You've got a sturdier server than mine :o I'm on shared hosting and I think the ceiling is 30. I've only ever seen it with malicious robots.

:: shuffling papers ::

Yup. Ghastly robot from {server farm} back in February 2012, slew of 503 responses with log message:
access to {filename} failed for {IP}, reason: Client exceeded concurrent connection limit of 30, referer: {referer}

Even WebReaper doesn't do that. Certainly not what you'd expect of the bingbot.

Unless all 1000+ concurrent requests were for robots.txt. That I'd believe.
11:27 am on Sept 5, 2013 (gmt 0)

Senior Member

WebmasterWorld Senior Member billys is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:June 1, 2004
posts:3181
votes: 0


You can also set the crawl rate in Microsoft's webmaster tool. For example, you could tell it to crawl less aggressively during peak periods.