The new Microsoft spider - "MSRBOT/0.1"

         

balam

3:56 am on Dec 20, 2003 (gmt 0)

10+ Year Member



First time I've seen this...

204.4.XX.X - - [20/Dec/2003:00:55:52 +0000] "GET /robots.txt HTTP/1.1" 200 5276 "-" "MSRBOT/0.1 (http://research.microsoft.com/research/sv/msrbot/)"
204.4.XX.X - - [20/Dec/2003:00:55:53 +0000] "GET /somepage.shtml HTTP/1.1" 301 345 "-" "MSRBOT/0.1 (http://research.microsoft.com/research/sv/msrbot/)"
204.4.XX.X - - [20/Dec/2003:01:24:21 +0000] "GET /some-page.shtml HTTP/1.1" 200 17734 "http://www.mydomain.com:80/somepage.shtml" "MSRBOT/0.1 (http://research.microsoft.com/research/sv/msrbot/)"

It obviously grabbed robots.txt first, and the page it went on to fetch isn't disallowed by my robots.txt - but it's still early in the game. ;)
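For anyone who wants to check this kind of thing rather than eyeball it, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt rules and the `/private/notes.html` path below are made up for illustration (the real file isn't shown in this thread), and `is_allowed` is just a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; not the poster's actual robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, path: str) -> bool:
    """Return True if the given rules permit user_agent to fetch path."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


# The path the bot requested in the log, checked against the assumed rules.
print(is_allowed(ROBOTS_TXT, "MSRBOT/0.1", "/some-page.shtml"))
# A path that the assumed rules disallow.
print(is_allowed(ROBOTS_TXT, "MSRBOT/0.1", "/private/notes.html"))
```

Swap in your own robots.txt text and the paths from your logs to see whether a given request stayed within the rules.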

Interesting that it sent a referrer when it grabbed the correct page (the target of the 301); I don't see very many bots do that. Also interesting is the explicit port (:80) in the referrer URL...
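If you want to pull the referrer and its port out of log lines like the ones above, something along these lines might do. It is a rough sketch that assumes the Apache "combined" log format; the regex and the helper names (`parse_line`, `referrer_port`) are my own and not from any particular log-analysis tool.

```python
import re
from urllib.parse import urlsplit

# Assumed layout: Apache "combined" log format, as in the lines quoted above.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Split one log line into named fields, or return None if it doesn't match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    return entry


def referrer_port(entry):
    """Return the explicit port in the referrer URL, or None if there is none."""
    referrer = entry["referrer"]
    if referrer in ("", "-"):
        return None
    return urlsplit(referrer).port


# Third log line from the post, copied as-is (IP already masked by the poster).
SAMPLE = (
    '204.4.XX.X - - [20/Dec/2003:01:24:21 +0000] '
    '"GET /some-page.shtml HTTP/1.1" 200 17734 '
    '"http://www.mydomain.com:80/somepage.shtml" '
    '"MSRBOT/0.1 (http://research.microsoft.com/research/sv/msrbot/)"'
)

entry = parse_line(SAMPLE)
print(entry["agent"], entry["referrer"], referrer_port(entry))
```

Filtering on a non-empty referrer plus a bot-looking user agent is one quick way to spot crawlers that behave like this one.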

pendanticist

4:54 am on Dec 20, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Yeah, it's been mentioned here before [webmasterworld.com].

Then there was this somewhat related discussion [webmasterworld.com].

Pendanticist.

balam

6:53 am on Dec 20, 2003 (gmt 0)

10+ Year Member



How do you say? "My bad?"

I blame relying on the site search instead of Google...

<added>
From the thread you mentioned, quoting you...

> this new one only crawls one very remote, very obscure file

Funny... I don't know if you would call it "remote & obscure," since it is accessible from every page, but they hit my privacy page - the least visited page on my site, unfortunately.
</added>

pendanticist

7:04 am on Dec 20, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



;)

Pendanticist.

jdMorgan

1:28 am on Dec 22, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



<edit> posted to wrong thread; msnbot here: [webmasterworld.com...] </edit>