| 5:28 pm on Jun 17, 2003 (gmt 0)|
More info here: [webmasterworld.com]
| 8:13 pm on Jun 17, 2003 (gmt 0)|
Thanks marcs... note the new User Agent. I saw 131.107.137.xxx referenced in some of the other threads... has anybody seen any IPs other than the one referenced above?
| 12:16 pm on Jun 18, 2003 (gmt 0)|
Doesn't act any different than it did when they weren't identifying themselves.
Tripped my dime-store trap.
| 11:45 am on Jun 19, 2003 (gmt 0)|
I had a few visits today but all from a different address:
I am not an LS advertiser.
| 12:15 am on Jun 20, 2003 (gmt 0)|
So far, just one hit:
220.127.116.11 - - [18/Jun/2003:23:53:06 -0600] "GET /robots.txt HTTP/1.1" 200 5282 "-" "MSNBOT/0.1 (http://search.msn.com/msnbot.htm)"
| 12:26 am on Jun 20, 2003 (gmt 0)|
The earlier version (before it had a UA or name) came to one of our sites several weeks ago, but this latest one, fully identified as msnbot, visited from 18.104.22.168 and took quite a few of our pages today.
It will be interesting to see what they do with it...
| 2:01 am on Jun 20, 2003 (gmt 0)|
Looks like they've got a range of IPs going here.
Grabbed robots.txt, then index.html, then robots.txt again, then went deep - all with the same IP. No robots.txt violations.
22.214.171.124 - - [19/Jun/2003:02:05:37 -0400] "GET /robots.txt HTTP/1.1" 200 2507 "-" "MSNBOT/0.1 (http://search.msn.com/msnbot.htm)"
126.96.36.199 - - [19/Jun/2003:02:05:38 -0400] "GET / HTTP/1.1" 200 32464 "-" "MSNBOT/0.1 (http://search.msn.com/msnbot.htm)"
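For anyone wanting to check their own logs for the same pattern (robots.txt first, then the index page, then deeper, and which addresses the bot comes from), here is a minimal Python sketch. It assumes the Apache "combined" log format shown in the lines above; the function name `paths_by_host` and the sample address (from the 192.0.2.0/24 documentation range) are hypothetical, not anything MSN publishes.

```python
import re
from collections import defaultdict

# Apache "combined" format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def paths_by_host(lines, agent_substring="msnbot"):
    """Map each client address to the ordered list of paths it requested.

    Only lines whose user-agent contains agent_substring
    (case-insensitive) are counted; unparseable lines are skipped.
    """
    hits = defaultdict(list)
    for line in lines:
        m = LOG_RE.match(line.strip())
        if m and agent_substring in m.group("agent").lower():
            hits[m.group("host")].append(m.group("path"))
    return dict(hits)


if __name__ == "__main__":
    # Hypothetical sample lines; replace with open("access_log") as needed.
    sample = [
        '192.0.2.10 - - [19/Jun/2003:02:05:37 -0400] "GET /robots.txt HTTP/1.1" '
        '200 2507 "-" "MSNBOT/0.1 (http://search.msn.com/msnbot.htm)"',
        '192.0.2.10 - - [19/Jun/2003:02:05:38 -0400] "GET / HTTP/1.1" '
        '200 32464 "-" "MSNBOT/0.1 (http://search.msn.com/msnbot.htm)"',
    ]
    for host, paths in paths_by_host(sample).items():
        print(host, paths)
```

The keys of the returned dictionary are the distinct addresses the bot used, and each list shows the order in which that address fetched pages, so a robots.txt-first crawl is easy to spot.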
| 2:14 am on Jun 20, 2003 (gmt 0)|
Seems well behaved vis-a-vis robots.txt, but they do grab files of types other than HTML. I note that of the 275 requests MSNBOT made to my office webserver today, 40 were for PDF documents, and there were scattered others for PostScript files and some binary datasets with odd filename extensions. No sign yet, though, that they will be grabbing GIFs, JPEGs, etc.
If you don't want MSNBOT grabbing images, PDFs, etc., then you'll need to modify your RewriteRules appropriately. See the discussion in [webmasterworld.com...] about how to do so.
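To get a per-file-type breakdown like the one above from your own logs before deciding what to block, a rough Python sketch follows. It assumes combined-format log lines and matches the bot by a case-insensitive "msnbot" substring in the user-agent; the function name `extension_counts` and the sample lines are hypothetical.

```python
import posixpath
import re
from collections import Counter
from urllib.parse import urlsplit

# Pull the request path and user-agent out of a combined-format log line.
REQUEST_RE = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" \d{3} \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def extension_counts(lines, agent_substring="msnbot"):
    """Count requests per filename extension for one user-agent."""
    counts = Counter()
    for line in lines:
        m = REQUEST_RE.search(line)
        if not m or agent_substring not in m.group("agent").lower():
            continue
        # Drop any query string, then take the lowercased extension.
        path = urlsplit(m.group("path")).path
        ext = posixpath.splitext(path)[1].lower()
        counts[ext or "(none)"] += 1
    return counts


if __name__ == "__main__":
    ua = '"MSNBOT/0.1 (http://search.msn.com/msnbot.htm)"'
    # Hypothetical sample lines; read your real access log instead.
    sample = [
        '192.0.2.10 - - [19/Jun/2003:02:05:37 -0400] '
        '"GET /papers/report.pdf HTTP/1.1" 200 2507 "-" ' + ua,
        '192.0.2.10 - - [19/Jun/2003:02:05:38 -0400] '
        '"GET / HTTP/1.1" 200 32464 "-" ' + ua,
    ]
    for ext, n in extension_counts(sample).most_common():
        print(ext, n)
```

Requests for directory URLs such as `/` have no extension and are grouped under `(none)`.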
| 2:32 am on Jun 20, 2003 (gmt 0)|
Going deep over here, and it is well behaved. Seems they are going to spider widely, as I'm not in LookSmart or any other paid directory/program either.
| 3:27 pm on Jun 21, 2003 (gmt 0)|
I've decided to ban MSNbot for the moment. It seems to generate a lot of rubbish like requests for
which are all 404s. I've sent them some site logs, but it's happened on various sites and I can't be bothered to be used as a guinea pig for their problems.