Forum Moderators: mack

Are you taking the MSNbot Gamble?

How much bandwidth are you allowing?

     
5:22 pm on May 24, 2004 (gmt 0)

Junior Member

joined:Jan 20, 2003
posts:105
votes: 0


The MSNbot has quite a voracious appetite for spidering websites. Some webmasters love it and try to feed it as much as possible. Other webmasters don't see any reason to use up bandwidth for a search engine that doesn't currently exist.

Personally, bandwidth is cheap for me and I'm willing to feed MSNbot as long as it doesn't impact performance for users.

So how much bandwidth are you allowing?

5:29 pm on June 10, 2004 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Oct 15, 2003
posts:1418
votes: 0


Because all of you people blocking it are going to be the same ones starting "MSN dumped me for no reason!" threads.

Heh, isn't that the truth. Funny how people realize they have the right to block whoever they want from their site, but when someone does it to THEM they get their panties in a twist about it.

5:37 pm on June 10, 2004 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Jan 15, 2004
posts:1300
votes: 0


I have to agree, blocking the first draft of a search engine that could very easily control 30% of the market within 12-18 months does not strike me as a very good idea.

Until somebody suggests something more coherent, my first suspicion is that the msnbot is repeatedly requesting pages that do not send a Last-Modified header, i.e. dynamic pages.

This would be simple to determine if everyone posting here checked it on their own sites now, and it might give the msndudes something concrete to work on to fix the problem. I have something like 40+ sites, all dynamic, all being aggressively spidered, and none return Last-Modified headers. Unfortunately, I no longer have any non-dynamic sites to run a counter-check on, and I doubt most people here do.
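
For anyone who wants to run that check, here is a minimal sketch in Python, assuming a plain HEAD request is enough to see the headers your server sends. The function names `has_last_modified` and `check_url` are made up for illustration, and any URL you pass in is your own.

```python
from urllib.request import Request, urlopen


def has_last_modified(headers):
    """Return True if a header mapping contains a non-empty Last-Modified value."""
    for name, value in headers.items():
        # Header names are case-insensitive, so compare in lowercase.
        if name.lower() == "last-modified" and value.strip():
            return True
    return False


def check_url(url, timeout=10):
    """Send a HEAD request and report whether the response carries Last-Modified.

    Hypothetical helper: it performs a real network request when called.
    """
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        return has_last_modified(dict(response.headers.items()))
```

Calling `check_url` on a handful of static and dynamic pages should show quickly whether the missing header lines up with the pages being re-fetched.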

5:48 pm on June 10, 2004 (gmt 0)

Junior Member

10+ Year Member

joined:Mar 5, 2004
posts:147
votes: 0


Heh, isn't that the truth. Funny how people realize they have the right to block whoever they want from their site, but when someone does it to THEM they get their panties in a twist about it.

During the last week, MSNBOT has downloaded one of my sites 4 (FOUR) times. The site has almost 4000 pages (all unique content) and in my logs there are 16757 page hits by msnbot.
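
If you want the same kind of count from your own logs, a rough sketch follows. It assumes the common "combined" access log format and simply looks for "msnbot" in the user-agent field; the regular expression and the name `tally_bot_statuses` are my own assumptions, not anything from the server or the bot.

```python
import re
from collections import Counter

# Matches the request, status, size, referrer and user-agent fields of a
# combined-format access log line (assumed format, adjust for your server).
LOG_LINE = re.compile(
    r'"(?:[A-Z]+) \S+ [^"]*" (?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def tally_bot_statuses(lines, bot_token="msnbot"):
    """Count responses per HTTP status for requests whose user agent names the bot."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and bot_token in match.group("agent").lower():
            counts[match.group("status")] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample lines, for illustration only.
    sample = [
        '192.0.2.1 - - [10/Jun/2004:17:40:00 +0000] "GET /a.html HTTP/1.0" 200 5120 "-" "msnbot/0.11"',
        '192.0.2.7 - - [10/Jun/2004:17:42:00 +0000] "GET /a.html HTTP/1.1" 304 - "-" "Mozilla/4.0"',
    ]
    print(tally_bot_statuses(sample))
```

A tally that is all 200s and no 304s for the bot is the pattern being described here.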

All pages are STATIC, 99% of those pages unmodified since 20-May and my server prints a "Last-Modified:" header, yet msnbot just downloads the whole site with HTTP 200 (rather than 304, as a well-behaved Web app would).
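
To spell out the 200 versus 304 point: a server can only answer 304 if the client sends an If-Modified-Since header with its request. Below is a simplified sketch of that decision, written in Python purely as an illustration; `conditional_status` is a hypothetical name and real servers handle more cases (ETags, for one).

```python
from email.utils import parsedate_to_datetime


def conditional_status(last_modified, if_modified_since=None):
    """Return 304 if the client's copy is still current, otherwise 200.

    Both arguments are HTTP date strings, e.g. "Thu, 20 May 2004 00:00:00 GMT".
    """
    if not if_modified_since:
        # No conditional header sent: the full page goes out with 200.
        return 200
    try:
        client_time = parsedate_to_datetime(if_modified_since)
        resource_time = parsedate_to_datetime(last_modified)
        unchanged = resource_time <= client_time
    except (TypeError, ValueError):
        # Unparseable or incomparable dates: fall back to a full response.
        return 200
    return 304 if unchanged else 200
```

So a crawler that never sends If-Modified-Since will get 200 and the full page every time, no matter what the server puts in Last-Modified.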

Plus, my (perhaps biased, I admit it) opinion is that MS Search will index the Web for free (i.e. "x billion pages listed in our index", where x is close to Google's) BUT THE MAJORITY OF THE TRAFFIC WILL GO TO THEIR PPC CLIENTS, because the sponsored listings will come first.

I've not blocked msnbot, but I understand it can be a problem for some, especially if people think there is no benefit.

Same way reports suggest that Y owns 20% of the SE market, yet most of my sites receive 3-9% of their free referrals via Yahoo.

5:51 pm on June 10, 2004 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Oct 15, 2003
posts:1418
votes: 0


Same way reports suggest that Y owns 20% of the SE market, yet most of my sites receive 3-9% of their free referrals via Yahoo.

Does ANYONE receive more than 3-9% of their traffic from Yahoo? The only time I got a lot of traffic from Yahoo, even being in their paid directory and everything, was with my paid Overture ads. Even then it wasn't a huge percentage.

5:58 pm on June 10, 2004 (gmt 0)

Junior Member

10+ Year Member

joined:Mar 8, 2004
posts:196
votes: 0


I get about 11% from Yahoo.

5:58 pm on June 10, 2004 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:July 22, 2002
posts:1807
votes: 1


Welcome to WebmasterWorld MSNDude! Nice to have some MS representation here. I'm sure you'll do a nice job.

6:04 pm on June 10, 2004 (gmt 0)

Preferred Member

joined:Apr 22, 2004
posts:528
votes: 0


I get 25% from Yahoo.

6:13 pm on June 10, 2004 (gmt 0)

Preferred Member

joined:Apr 22, 2004
posts:528
votes: 0


Side note though: I just discovered this site a few months ago and started learning about the SEO and marketing side of things. Now that I have started working on that, I suspect my numbers will be different in a few months.

6:41 pm on June 10, 2004 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Jan 15, 2004
posts:1300
votes: 0


All pages are STATIC, 99% of those pages unmodified since 20-May and my server prints a "Last-Modified:" header,

Thanks, I knew that theory could be confirmed or denied very quickly, and that denies it. I was giving the msnbot the benefit of the doubt, but I guess the problem is more fundamental. That would be in keeping with MS's pattern of having version 1 of a new product be essentially useless, its primary function being just to provide a door into new markets. But I don't see them dropping the search ball, Google is too easy a target right now... Of course, if MS insists on running the search engine server farm on Windows they are going to be a bit handicapped. Hotmail was never the same after they forced it to run on Windows and not Unix... : )

8:34 pm on June 10, 2004 (gmt 0)

Senior Member

WebmasterWorld Senior Member steveb is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:June 20, 2002
posts:4652
votes: 0


"So it's MSN's fault you don't know how to do your job?"

Exactly. Whether it is laziness or incompetence, complaining that it is someone else's fault that you aren't doing or don't know how to do your job is way beyond rude.

Ban the bot from specific directories if you want, but this aggressive crawling is just about the best news that webmasters could ask for from MSN at this point. Of course they could completely bungle the ranking of the data, but it sure would be a great thing to have every page that I want crawled to be crawled every day.
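
For banning the bot from specific directories, the usual route is a robots.txt rule, assuming the bot honors it. Here is a small sketch that uses Python's standard `urllib.robotparser` to sanity-check such a rule before you deploy it; the directory names `/archive/` and `/images/` and the helper `build_parser` are hypothetical examples only.

```python
from urllib.robotparser import RobotFileParser

# Example rules only: the directory names are made up for illustration.
ROBOTS_TXT = """\
User-agent: msnbot
Disallow: /archive/
Disallow: /images/
"""


def build_parser(robots_text):
    """Parse robots.txt text into a RobotFileParser without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


if __name__ == "__main__":
    parser = build_parser(ROBOTS_TXT)
    # Check which paths each crawler would be allowed to fetch under these rules.
    for agent, path in [
        ("msnbot", "/archive/2003.html"),
        ("msnbot", "/index.html"),
        ("Googlebot", "/archive/2003.html"),
    ]:
        print(agent, path, parser.can_fetch(agent, path))
```

This keeps the rest of the site open to the crawl while trimming the directories where the bandwidth hurts most.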
