
Search Engine Spider and User Agent Identification Forum

    
msn bot has gone crazy
bwnbwn

WebmasterWorld Senior Member bwnbwn us a WebmasterWorld Top Contributor of All Time 5+ Year Member



 
Msg#: 4534274 posted 1:28 pm on Jan 8, 2013 (gmt 0)

131.253.27.123
131.253.27.124
131.253.27.125

These are IPs from the msn bot. Over the weekend these 3 IPs were acting like a DoS attack, requesting 1000's of pages over and over. I finally had to block these IPs from the server. Has anybody else had the same problem from these IPs?

 

wilderness

WebmasterWorld Senior Member wilderness us a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



 
Msg#: 4534274 posted 5:54 pm on Jan 8, 2013 (gmt 0)

msnbot-media?

a recent thread [webmasterworld.com]

I replied that I'd had 40+ requests for robots.txt in a ten-hour period.
That number was exceeded a few weeks later, with 262 requests in a single 24-hour period on a single site.

bwnbwn




 
Msg#: 4534274 posted 6:02 pm on Jan 8, 2013 (gmt 0)

We had 1000's of repeated requests, so many that it was dragging the websites down. For example, one site in a specific niche might normally get 100 hits with 400 page views. This went from 400 to over 4k in 4 hours on Monday, so I looked at the weekend. It started on Friday afternoon and never let up until I blocked the IPs. The website has only about 100 pages; it was pulling the same content over and over and over.

wilderness




 
Msg#: 4534274 posted 6:43 pm on Jan 8, 2013 (gmt 0)

msnbot-media?

bwnbwn




 
Msg#: 4534274 posted 8:34 pm on Jan 8, 2013 (gmt 0)

msnbot-131-253-27-123.search.msn.com

wilderness




 
Msg#: 4534274 posted 10:01 pm on Jan 8, 2013 (gmt 0)

Just add them to your robots.txt. The requests for robots.txt itself will not stop, but they will comply with your request and leave your images alone.

You will of course need to take them off your denied-access list so they can read your robots.txt, unless you have an exception that allows denied visitors to read robots.txt.
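
As a rough illustration of the robots.txt suggestion above, here is a minimal Python sketch that checks a rule set with the standard library's robots.txt parser. The `msnbot-media` token comes from this thread; the blanket `Disallow: /` for that bot and the sample paths are assumptions for the example, not anything Bing documents here, and real crawlers may match user-agent tokens differently from Python's parser.

```python
from urllib.robotparser import RobotFileParser

# Assumed rule set: keep msnbot-media out entirely, leave other bots alone.
ROBOTS_TXT = """\
User-agent: msnbot-media
Disallow: /

User-agent: *
Disallow:
"""


def allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

Calling `allowed("msnbot-media", "/images/photo.jpg")` shows whether the media bot would be shut out, and `allowed("bingbot", "/index.html")` confirms the ordinary crawler is unaffected. This only tells you what a compliant bot should do; it does nothing about a bot that ignores the file.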

lucy24

WebmasterWorld Senior Member lucy24 us a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



 
Msg#: 4534274 posted 10:25 pm on Jan 8, 2013 (gmt 0)

I think he's asking what the UA was. msnbot-media, ordinary bingbot, or the dreaded plainclothes bingbot?

wilderness




 
Msg#: 4534274 posted 10:37 pm on Jan 8, 2013 (gmt 0)

I think he's asking what the UA was


"forget about it" ;)

If he'd just provided a few lines of raw logs it would have been much easier.

My html crawls from the 131.253.x.x have been few.

The majority have been msnbot-media for images.

not2easy

WebmasterWorld Administrator 5+ Year Member Top Contributors Of The Month



 
Msg#: 4534274 posted 5:00 am on Jan 9, 2013 (gmt 0)

Some apparent msnbots are being disavowed by Bing's verify tool, which is where you end up if you try the URL attached to their bots. I am checking a few that seem to be naughty, and Bing has disavowed 4 out of 5. I am adding the full info to the older thread mentioned above because that is where the rest of the info is.
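
For anyone who wants to run a similar check locally, a forward-confirmed reverse DNS lookup is one common approach: resolve the IP to a hostname, check the hostname's domain, then resolve the hostname back and confirm it returns the same IP. The sketch below is a minimal Python version. The `.search.msn.com` suffix is taken from the hostname posted earlier in this thread; treating it as the only valid suffix is an assumption, and the function name and the injectable lookup parameters are made up for the example.

```python
import socket


def is_verified_msn_host(ip, reverse_lookup=None, forward_lookup=None):
    """Forward-confirmed reverse DNS check for a claimed MSN/Bing crawler IP.

    reverse_lookup and forward_lookup are optional hooks (hypothetical, for
    testing); by default the system resolver is used.
    """
    if reverse_lookup is None:
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]

    try:
        host = reverse_lookup(ip)
    except OSError:  # covers socket.herror and socket.gaierror
        return False

    # Assumed suffix, based on the hostname quoted in this thread.
    if not host.lower().rstrip(".").endswith(".search.msn.com"):
        return False

    try:
        # The hostname must resolve back to the original IP.
        return ip in forward_lookup(host)
    except OSError:
        return False
```

In normal use you would just call `is_verified_msn_host("131.253.27.123")` and let it query DNS. An IP that fails this check may be something merely claiming to be the msn bot in its user agent.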

bwnbwn




 
Msg#: 4534274 posted 1:32 pm on Jan 9, 2013 (gmt 0)

Sorry guys, I was not asking what the UA was. I know it was the msn bot. What I am seeking is whether anyone has had the bot act in such an aggressive manner that it acted like a DoS attack on the server. I had all three IPs hitting at the same time, requesting 40-60 pages a sec. So in effect the bots were requesting 100 pages a sec, or just about the entire website, and only on this website. We have 100 other domains on the same server and none of them were hit.

Forget the robots.txt file; I blocked them at the firewall.
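
A request rate like the one described above can be confirmed from raw access logs. The following is a minimal Python sketch that finds the busiest one-second window for a set of IPs. It assumes the logs are in the common/combined log format (client IP first, timestamp in square brackets); the function name is made up for the example.

```python
import re
from collections import Counter

# Client IP, two ignored fields, then the bracketed timestamp.
LOG_RE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\]')


def peak_requests_per_second(lines, ips):
    """Return (timestamp, count) for the busiest second across the given IPs.

    Returns (None, 0) if none of the IPs appear in the log lines.
    """
    per_second = Counter()
    for line in lines:
        match = LOG_RE.match(line)
        if match and match.group(1) in ips:
            # Log timestamps have one-second resolution, so the raw
            # timestamp string works as the bucket key.
            per_second[match.group(2)] += 1
    if not per_second:
        return None, 0
    return per_second.most_common(1)[0]
```

For the three addresses in this thread you would pass something like `peak_requests_per_second(open("access.log"), {"131.253.27.123", "131.253.27.124", "131.253.27.125"})`, where `access.log` is a placeholder path. A few lines of output from this kind of check also make it much easier for others to compare notes.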

wilderness




 
Msg#: 4534274 posted 3:02 pm on Jan 9, 2013 (gmt 0)

What I am seeking is whether anyone has had the bot act in such an aggressive manner that it acted like a DoS attack on the server.


The Bing/MSN bots have been "acting in an aggressive manner" on one of my sites for months, however NOT from the 131.253.2x range.
FWIW, I'd much rather have bot requests all grouped together in what might be deemed an aggressive manner. They're certainly easier to analyze in that order.

In fact, MSN/Bing is still requesting pages from the same site that haven't been online for three years.

If the requests are taking your server down, possibly other issues exist which are causing the overload.

bwnbwn




 
Msg#: 4534274 posted 5:53 pm on Jan 9, 2013 (gmt 0)

Thanks, wilderness, for your info. The sheer number of requests from all three of the IPs was the issue. 3500 requests on a 100-page website is, in my eyes, an attack.

keyplyr

WebmasterWorld Senior Member keyplyr us a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



 
Msg#: 4534274 posted 8:47 pm on Jan 9, 2013 (gmt 0)


The Bing/MSN bots have been crawling every single page on my main site daily for over a year, sometimes twice. For some reason they also sometimes inject a non-existent directory into otherwise valid file paths, creating about a hundred daily 404s, day after day after day.

When I sent in logs showing them this, they just said it would eventually stop on its own. It hasn't.

wilderness




 
Msg#: 4534274 posted 9:23 pm on Jan 9, 2013 (gmt 0)

they just said it would eventually stop on its own. It hasn't.


keyplyr,
that's commonly referred to as "double talk".
In the old days, comprehension was best if the speaker was uttering the words from the side of their mouth.

lucy24




 
Msg#: 4534274 posted 10:52 pm on Jan 9, 2013 (gmt 0)

In fact, MSN/Bing is still requesting pages from the same site that haven't been online for three years.

Based on behavior on my site, Bing-- unlike That Other Search Engine-- doesn't seem to distinguish between 404 and 410. Requests for 410 that are more than a month old are at least 99% Bing. And most of the rest are those casual robots that only stop by every year or so.
