

Bing Search Engine News Forum

    
MSNBot killing my server, not this again!
Dave_Hybrid

5+ Year Member



 
Msg#: 4087251 posted 6:36 pm on Feb 25, 2010 (gmt 0)

It's back again and frankly this is annoying me. I don't like blocking the bot; even if it only sends me 1% of my daily traffic, I'd like to give it a fair shot. But when it comes to load, I'd rather have my normal load of 0.50 than 5, 10 or even 15 at times.

I have spoken to a few engineers over there. They promised to set a delay at their end, as the bot ignores robots.txt commands, but it has had little effect.

So Bing, I assume you read here: sort your bot out. It's a massive resource hog, and how the hell do you expect to win over webmasters and searchers alike if we are all blocking your bot? Right now I'm having to resort to adding a line in my htaccess.

</rant>

 

TheMadScientist

WebmasterWorld Senior Member, WebmasterWorld Top Contributor of All Time, 5+ Year Member



 
Msg#: 4087251 posted 11:14 pm on Feb 25, 2010 (gmt 0)

Could you elaborate on what type of site you're running (e.g. news, blog, informational, static or dynamic, etc.) and what else you're doing to try to slow the bot down besides blocking it? E.g. serving Last-Modified headers, ETag headers, Expires headers, etc.

I ask because I work on a couple of decent-sized sites which are both dynamic but behave as if they are static (they serve full headers, including different expiration times by file type, etc.), and I haven't ever had an issue with MSNBot at all. It usually requests everything twice in a row, but the issue you are talking about is definitely not an 'everyone issue', so knowing the differences in situations would be good, IMO, and might help figure out what's causing the issue for you and not everyone...
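
To make the "dynamic but behaves as if static" idea concrete, here is a minimal Python sketch of the header side of it. It is only an illustration under assumptions: the function names (`cache_headers`, `is_not_modified`) and the choice of a SHA-256 based ETag are made up for this example and are not taken from any particular site or framework. The point is that a repeat request carrying `If-None-Match` or `If-Modified-Since` can be answered with a cheap 304 instead of regenerating the page.

```python
import hashlib
import time
from datetime import timezone
from email.utils import formatdate, parsedate_to_datetime


def cache_headers(body, mtime, max_age, now=None):
    """Build validator and freshness headers for a response body.

    body is bytes, mtime is a Unix timestamp, max_age is in seconds.
    """
    if now is None:
        now = time.time()
    return {
        # Hypothetical ETag scheme: a short hash of the body, in quotes.
        "ETag": '"' + hashlib.sha256(body).hexdigest()[:16] + '"',
        "Last-Modified": formatdate(mtime, usegmt=True),
        "Expires": formatdate(now + max_age, usegmt=True),
        "Cache-Control": "max-age=%d" % max_age,
    }


def _strip_weak(tag):
    """Drop a leading W/ so weak and strong tags compare equal."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def is_not_modified(request_headers, etag, mtime):
    """Return True if the request's validators allow a 304 response."""
    inm = request_headers.get("If-None-Match")
    if inm is not None:
        # If-None-Match takes precedence over If-Modified-Since.
        if inm.strip() == "*":
            return True
        candidates = [_strip_weak(t) for t in inm.split(",")]
        return _strip_weak(etag) in candidates
    ims = request_headers.get("If-Modified-Since")
    if ims:
        try:
            since = parsedate_to_datetime(ims)
        except (TypeError, ValueError):
            return False  # unparseable date: serve the full response
        if since.tzinfo is None:
            since = since.replace(tzinfo=timezone.utc)
        # HTTP dates have one-second resolution, so truncate mtime.
        return int(mtime) <= since.timestamp()
    return False
```

In a request handler you would call `is_not_modified` before doing any expensive page generation and send a 304 with no body when it returns True. Whether that is enough to calm a particular crawler is a separate question.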

jdMorgan

WebmasterWorld Senior Member, WebmasterWorld Top Contributor of All Time, 10+ Year Member



 
Msg#: 4087251 posted 11:39 pm on Feb 25, 2010 (gmt 0)

If no other cause is found, and crawl-delay in robots.txt plus the above cache, expiry, and ETag header suggestions don't help and this problem persists, you have the option to serve a 503 Service Unavailable response accompanied by a Retry-After header. If you set the Retry-After time to 5 to 15 seconds, your problem should be alleviated.

The above is based on the HTTP protocol. I personally have no idea whether msnbot will handle it correctly.
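
As a rough illustration of the 503 plus Retry-After idea, here is a small Python sketch using only the standard library. Everything specific in it is an assumption for the sake of the example: the load threshold, the `msnbot` User-Agent substring, the handler and function names, and the port. It has not been tested against msnbot, and `os.getloadavg()` is only available on Unix-like systems.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

LOAD_THRESHOLD = 2.0        # assumed 1-minute load average that counts as "too busy"
RETRY_AFTER_SECONDS = 15    # upper end of the 5 to 15 second range
BOT_TOKENS = ("msnbot",)    # assumed User-Agent substrings to throttle


def should_defer(user_agent, load, threshold=LOAD_THRESHOLD):
    """True when the server is busy and the request looks like the bot."""
    ua = (user_agent or "").lower()
    return load > threshold and any(token in ua for token in BOT_TOKENS)


class ThrottlingHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: defers the bot under load, serves everyone else."""

    def do_GET(self):
        load = os.getloadavg()[0]  # 1-minute load average (Unix only)
        if should_defer(self.headers.get("User-Agent"), load):
            # Tell the client to come back later instead of doing the work now.
            self.send_response(503)
            self.send_header("Retry-After", str(RETRY_AFTER_SECONDS))
            self.send_header("Content-Length", "0")
            self.end_headers()
            return
        body = b"normal page\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def run(host="127.0.0.1", port=8000):
    """Start the demo server; blocks until interrupted."""
    HTTPServer((host, port), ThrottlingHandler).serve_forever()
```

Calling `run()` starts the demo server. On a real site the same decision would more likely live in the web server or application layer, but the response itself is the same: status 503 and a `Retry-After` value in seconds.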

Also, don't complain too loudly to them about msnbot's behavior. I did that several years ago, and the site I was complaining about is still "banned" at Bing, although none of the techs can see any problem in the tools available to them, and all report that the site is *not* blocked, despite the fact that it no longer shows even for its own domain, and there is a "Some results have been removed" message at the bottom of the screen. It's a non-profit, informational site, so I'm not losing any money (or sleep) over it.

Jim

Receptional

WebmasterWorld Administrator, WebmasterWorld Top Contributor of All Time, 10+ Year Member



 
Msg#: 4087251 posted 10:37 am on Feb 28, 2010 (gmt 0)

I would also check that the IP address of the bot does in fact belong to Bing. Pretending to be MSNBot is pretty easy to do.
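
One common way to do that check is a reverse DNS lookup on the IP followed by a forward lookup on the resulting hostname, to confirm it maps back to the same IP. Here is a minimal Python sketch of that. The function name and the trusted suffix list are assumptions for illustration; the sketch assumes genuine crawler hostnames end in `.search.msn.com`, so check the current Bing documentation before relying on it.

```python
import socket

# Assumed hostname suffixes for genuine Microsoft crawler IPs.
TRUSTED_SUFFIXES = (".search.msn.com",)


def is_genuine_msnbot(ip, reverse=socket.gethostbyaddr,
                      forward=socket.gethostbyname_ex,
                      suffixes=TRUSTED_SUFFIXES):
    """Forward-confirmed reverse DNS check for a claimed crawler IP.

    The reverse and forward resolvers are injectable so the logic can be
    exercised without touching the network.
    """
    try:
        hostname = reverse(ip)[0]          # PTR lookup: IP -> hostname
    except OSError:
        return False                       # no reverse record at all
    if not hostname.lower().rstrip(".").endswith(suffixes):
        return False                       # hostname is not in a trusted domain
    try:
        addresses = forward(hostname)[2]   # A lookup: hostname -> list of IPs
    except OSError:
        return False
    # Anyone can set a PTR record to any name, so the forward lookup
    # must return the original IP for the claim to be believable.
    return ip in addresses
```

You would run this against IPs from your access log that send an msnbot User-Agent, ideally caching the result per IP, since two DNS lookups on every request would add load of their own.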

Aside: that's an interesting phenomenon, JD. Did you (for a time) tell msnbot to noindex, nofollow? I wonder if, somewhere in the depths of Bing's database, you are still on a list of sites that msnbot has effectively banned itself from crawling.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved