

Bing Search Engine News Forum

MSN Crawling Hard
Deindexed site drama...

billys (WebmasterWorld Senior Member, WebmasterWorld Top Contributor of All Time, 10+ Year Member)

Msg#: 3792182 posted 8:46 pm on Nov 22, 2008 (gmt 0)

I was just looking through my logs and noticed that msnbot was crawling our site pretty hard, grabbing about 10% of the site in the last half hour or so.

I just checked the site: command on Live and we've only got about 100 pages in their index now, which is fewer than the number of pages msnbot just crawled.

Anyway, we keep thinking about blocking msn altogether and stopping them from wasting bandwidth. I know no one really cares about Live anymore, but I was wondering if anyone else noticed the same - especially if you think you're under some kind of penalty.



10+ Year Member

Msg#: 3792182 posted 2:29 am on Nov 30, 2008 (gmt 0)

Gidday BillyS

I use a couple of standard robots.txt instructions (I can't remember where I got them from, probably here at WebmasterWorld, and I then confirmed them against the SEs' robots pages).

For Google, Yahoo, MSN (presumably also Live Search), Teoma and another obscure SE:

User-agent: botname
Crawl-delay: 10

For MSN specifically:

User-agent: msnbot
Crawl-delay: 10

And a general catch-all for anyone else who decides to start being nice:

User-agent: *
Crawl-delay: 15

That's 10 seconds and 15 seconds. Obviously you can make it shorter or even longer; just double-check each SE's protocol.
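To put those delays in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes a single crawler that waits the full delay between consecutive fetches, which is only one possible reading of Crawl-delay (each SE interprets it in its own way), and the function name is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def max_fetches_per_day(crawl_delay_seconds):
    """Upper bound on daily fetches for ONE crawler that waits the full
    delay between requests (an assumption, not any SE's documented behaviour)."""
    if crawl_delay_seconds <= 0:
        raise ValueError("crawl delay must be a positive number of seconds")
    return SECONDS_PER_DAY // crawl_delay_seconds


if __name__ == "__main__":
    for delay in (10, 15):
        print(f"Crawl-delay {delay}s: at most {max_fetches_per_day(delay)} fetches/day")
```

Under that assumption a 10 second delay caps a bot at 8,640 fetches a day and a 15 second delay at 5,760, so the delay mostly smooths out bursts rather than limiting how much of a small site can be crawled.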

I don't know if this creates a conflict with the SEs that recognise the robots crawl delay; I would presume not.
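One way to get a feel for the conflict question is to feed the msnbot and catch-all records to Python's standard-library robots.txt parser and ask which delay it picks for each bot. This is only a sketch: it shows how `urllib.robotparser` resolves the records, not how msnbot or any other real crawler does, and the helper name and the "SomeOtherBot" agent are invented for illustration.

```python
from urllib.robotparser import RobotFileParser

# The msnbot-specific record and the catch-all record from the post above.
ROBOTS_TXT = """\
User-agent: msnbot
Crawl-delay: 10

User-agent: *
Crawl-delay: 15
"""


def crawl_delay_for(agent, robots_txt=ROBOTS_TXT):
    """Return the Crawl-delay this parser would apply to `agent`, or None."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(agent)


if __name__ == "__main__":
    # "SomeOtherBot" is a hypothetical agent with no record of its own.
    for agent in ("msnbot", "SomeOtherBot"):
        print(agent, "->", crawl_delay_for(agent), "seconds")
```

With this parser the named record wins for msnbot (10 seconds) and everything else falls back to the catch-all (15 seconds), so the two records do not clash. Whether a given SE resolves it the same way is something to confirm on its own robots page.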

Since I started using it, I notice that the SEs don't seem to be so "grabby" when they come through on a large sweep after an algo update, which is usually 20-30 pages at a time on my primary site now.

Previously I had seen up to 100 pages grabbed in a single pass, and the bandwidth spikes were ... large ...

Now I mainly get multiple daily visits of 1, 2, up to 10 pages at a time from the majors and their data centres, so it would appear that if you want a "steady drip" rather than a "sudden flood", it works.

Hope this is useful.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved