


Msnbot/0.1

First time I've seen this one.

   
1:50 pm on Jun 17, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



UA "MSNBOT/0.1 (http://search.msn.com/msnbot.htm)"
IP 131.107.137.47

It left a referring URL. Submitting to LookSmart's paid submit will get you crawled by this bot. Obeys robots.txt. The index is not yet on search.msn.com, but they say they do intend to add it in the future.

5:28 pm on Jun 17, 2003 (gmt 0)

10+ Year Member



More info here: [webmasterworld.com]

8:13 pm on Jun 17, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks marcs... note the new User Agent. I saw 131.107.137.xxx referenced in some of the other threads... has anybody seen any IPs other than the one referenced above?
12:16 pm on Jun 18, 2003 (gmt 0)

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Doesn't act any differently than it did when they weren't identifying themselves.
Tripped my dime-store trap.
11:45 am on Jun 19, 2003 (gmt 0)

WebmasterWorld Administrator anallawalla is a WebmasterWorld Top Contributor of All Time 10+ Year Member



I had a few visits today but all from a different address:

131.107.163.47
MSNBOT/0.1 (http://search.msn.com/msnbot.htm)

I am not a LS advertiser.

- Ash

12:15 am on Jun 20, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



So far, just one hit:

131.107.163.49 - - [18/Jun/2003:23:53:06 -0600] "GET /robots.txt HTTP/1.1" 200 5282 "-" "MSNBOT/0.1 (http://search.msn.com/msnbot.htm)"

dave
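For anyone who wants to pull these hits out of a log automatically, here is a minimal Python sketch that parses a line like the one above. It assumes the server writes the common Apache "combined" format; the names `LOG_PATTERN`, `parse_line` and `is_msnbot` are made up for this example.

```python
import re

# Assumed layout (Apache "combined" format):
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None

def is_msnbot(entry):
    """Case-insensitive check for the bot's name in the user-agent field."""
    return "msnbot" in entry["agent"].lower()

# The log line quoted in the post above.
sample = ('131.107.163.49 - - [18/Jun/2003:23:53:06 -0600] '
          '"GET /robots.txt HTTP/1.1" 200 5282 "-" '
          '"MSNBOT/0.1 (http://search.msn.com/msnbot.htm)"')

entry = parse_line(sample)
if entry and is_msnbot(entry):
    print(entry["ip"], entry["request"], entry["status"])
```

Matching on the user-agent string alone only catches visits where the bot identifies itself, so it would miss the earlier unnamed visits mentioned in this thread.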

12:26 am on Jun 20, 2003 (gmt 0)

10+ Year Member



The earlier version (before it had a UA or name) came to one of our sites several weeks ago, but this latest version, fully identified as msnbot, visited from 131.107.163.57 and took quite a few of our pages today.

It will be interesting to see what they do with it...

LisaB

2:01 am on Jun 20, 2003 (gmt 0)

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Looks like they've got a range of IPs going here.

Grabbed robots.txt, then index.html, then robots.txt again, then went deep - all with the same IP. No robots.txt violations.

131.107.163.58 - - [19/Jun/2003:02:05:37 -0400] "GET /robots.txt HTTP/1.1" 200 2507 "-" "MSNBOT/0.1 (http://search.msn.com/msnbot.htm)"
131.107.163.58 - - [19/Jun/2003:02:05:38 -0400] "GET / HTTP/1.1" 200 32464 "-" "MSNBOT/0.1 (http://search.msn.com/msnbot.htm)"

Jim
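To get a rough feel for the range, here is a small Python sketch that computes the smallest single network covering the addresses reported in this thread so far. This only describes what has been observed here; it says nothing authoritative about the block the bot actually crawls from. The function name `common_network` is made up for this example.

```python
import ipaddress

def common_network(addresses):
    """Return the smallest IPv4 network that contains every address given."""
    ips = [int(ipaddress.IPv4Address(a)) for a in addresses]
    lo, hi = min(ips), max(ips)
    # Any bit that differs between the lowest and highest address is a host bit.
    host_bits = (lo ^ hi).bit_length()
    prefix = 32 - host_bits
    base = (lo >> host_bits) << host_bits
    return ipaddress.ip_network(f"{ipaddress.IPv4Address(base)}/{prefix}")

# Addresses quoted in the posts above.
seen = [
    "131.107.137.47",
    "131.107.163.47",
    "131.107.163.49",
    "131.107.163.57",
    "131.107.163.58",
]

network = common_network(seen)
print(network)  # 131.107.128.0/18
```

A covering network like this is usually wider than the real crawl range, so treat it as an upper bound for log filtering rather than something to block on.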

2:14 am on Jun 20, 2003 (gmt 0)

10+ Year Member



Seems well behaved vis-a-vis robots.txt, but they do grab files of types other than HTML. I note that of the 275 requests MSNBOT made to my office webserver today, 40 were for PDF documents, and there were scattered others for PostScript files and some binary datasets with odd filename extensions. No sign yet, though, that they will be grabbing GIFs, JPEGs, etc.

If you don't want MSNBOT grabbing images, PDFs, etc., then you'll need to modify your RewriteRules appropriately. See the discussion in [webmasterworld.com...] about how to do so.
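The linked discussion is not reproduced here, so the following is not the rule from that thread. It is just a Python sketch of the decision such a rule would encode: refuse the request when the user agent names the bot and the requested file has an unwanted extension. The extension list and the name `should_block` are assumptions for illustration.

```python
from posixpath import splitext
from urllib.parse import urlsplit

# Assumed list of file types to keep away from the bot; adjust to taste.
BLOCKED_EXTENSIONS = {".pdf", ".ps", ".gif", ".jpg", ".jpeg", ".png"}

def should_block(user_agent, request_target):
    """True if the bot is asking for a file type we do not want it to fetch."""
    if "msnbot" not in user_agent.lower():
        return False
    # Drop any query string before looking at the extension.
    path = urlsplit(request_target).path
    extension = splitext(path)[1].lower()
    return extension in BLOCKED_EXTENSIONS

ua = "MSNBOT/0.1 (http://search.msn.com/msnbot.htm)"
print(should_block(ua, "/docs/report.PDF?x=1"))   # True
print(should_block(ua, "/index.html"))            # False
print(should_block("SomeBrowser/1.0", "/a.pdf"))  # False
```

Since the bot has obeyed robots.txt in every report above, a Disallow entry may well be enough on its own; a server-side check like this is a fallback.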

2:32 am on Jun 20, 2003 (gmt 0)

10+ Year Member



Going deep over here, and it is well behaved. It seems they are going to spider widely, as I'm not in LookSmart or any other paid directory/program either.

3:27 pm on Jun 21, 2003 (gmt 0)

10+ Year Member



I've decided to ban MSNbot for the moment. It seems to generate a lot of rubbish like requests for

www.site.com/path/to/file/9c2
www.site.com/path/to/file/4a3
www.site.com/path/to/file/0c9

which are all 404s. I've sent them some site logs, but it's happened on various sites and I can't be bothered to be used as a guinea pig for their problems.
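Before banning, it can help to measure how much of the bot's traffic is this kind of junk. Below is a minimal Python sketch that counts 404s whose path ends in a short hex-looking segment like the examples above. The three-character pattern is a guess based only on those three examples, and the sample data is invented for illustration, not taken from a real log.

```python
import re
from collections import Counter

# Assumption: the bogus suffix is always three lowercase hex characters.
HEX_TAIL = re.compile(r"/[0-9a-f]{3}$")

def summarize(requests):
    """Tally (status, path) pairs: total, 404s, and 404s with a hex-like tail."""
    counts = Counter()
    for status, path in requests:
        counts["total"] += 1
        if status == 404:
            counts["not_found"] += 1
            if HEX_TAIL.search(path):
                counts["hex_tail_404"] += 1
    return counts

# Invented sample modelled on the paths quoted above.
sample = [
    (200, "/robots.txt"),
    (200, "/path/to/file/"),
    (404, "/path/to/file/9c2"),
    (404, "/path/to/file/4a3"),
    (404, "/path/to/file/0c9"),
    (404, "/missing.html"),
]

report = summarize(sample)
print(report["total"], report["not_found"], report["hex_tail_404"])  # 6 4 3
```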