msnbot/.search.msn.com uses Wget

         

Pfui

2:38 am on Sep 30, 2009 (gmt 0)

Now this is just plain nasty. From a bit ago:

msnbot-65-55-37-220.search.msn.com
Wget/1.8.2

robots.txt? NO

URI: /?display=blog_rss
(The site has neither blog nor feed.)
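A reverse-DNS name like the one above can be spoofed by whoever controls the PTR record, so it may be worth confirming it in both directions before concluding the request really came from MSN. Below is a minimal Python sketch of a forward-confirmed reverse DNS check. The `.search.msn.com` suffix is taken from the hostname in this post; the function name and the two injectable lookup parameters are hypothetical and exist only so the logic can be tried without network access.

```python
import socket

# Assumption: genuine crawler hosts end in this suffix, as in the hostname above.
MSN_SUFFIX = ".search.msn.com"


def is_verified_msnbot(ip, reverse_lookup=None, forward_lookup=None):
    """Forward-confirmed reverse DNS check for an IP claiming to be msnbot.

    reverse_lookup and forward_lookup are hypothetical injection points;
    by default they use the standard library resolver.
    """
    if reverse_lookup is None:
        # gethostbyaddr returns (hostname, aliaslist, ipaddrlist)
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        # gethostbyname_ex returns (hostname, aliaslist, ipaddrlist)
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]
    try:
        host = reverse_lookup(ip)
        if not host.lower().endswith(MSN_SUFFIX):
            return False
        # The name must resolve back to the same IP to count as confirmed.
        return ip in forward_lookup(host)
    except OSError:
        # Covers socket.herror and socket.gaierror (lookup failures).
        return False
```

A hit that passes this check and still sends `Wget/1.8.2` as its user agent would suggest the request genuinely originates from MSN's network rather than from someone faking the hostname.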

dstiles

10:32 pm on Sep 30, 2009 (gmt 0)

msnbot is hitting me here with requests for things like /default.htm (which has never been a valid filename) and a .cgi extension that has never been on the domain being hit; in fact it has not been valid for ten years. It's also ignoring robots.txt on a sub-domain that's "banned" there.

Mind you, Google is no better at the moment, asking for things like /bsiigrrsobwpib.html. Again, there has never been a file of that name, and the domain has never had .html extensions: it's purely dynamic ASP. It's also tacking a double query ? onto the end of otherwise valid filenames.

And Yahoo is going hard at it with no valid headers whatsoever.

Pfui

2:22 pm on Oct 1, 2009 (gmt 0)

Nonexistent URIs often come from botched inbound links. (See Google's Webmaster Tools under Diagnostics / Crawl errors / Not found for your site(s); the URIs listed there are often wacky.)

But Wget's another thing entirely. It sucks, it scrapes, it mirrors, it slices, it dices...

Has anyone else seen instances of msnbot/.search.msn.com running Wget?

keyplyr

6:56 pm on Oct 1, 2009 (gmt 0)

I've been blocking Wget categorically for probably 8 years. I started seeing msnbot and its variants using Wget about a year ago. I return 403s, but it keeps coming every day.

Also of note from the same MSN range are requests using "WinHttp" and "Mozilla/4.0".
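For anyone wanting to try the same approach, here is a minimal Python sketch of a categorical Wget block by user agent. It is not the configuration used in this post, which isn't shown; the names `BLOCKED_UA_TOKENS` and `status_for_user_agent` are hypothetical, and matching on a case-insensitive "wget" substring is an assumption.

```python
# Assumption: block on a case-insensitive "wget" substring only.
# "WinHttp" and "Mozilla/4.0" are left alone here; add tokens if desired.
BLOCKED_UA_TOKENS = ("wget",)


def status_for_user_agent(user_agent):
    """Return 403 for a blocked User-Agent string, otherwise 200."""
    ua = (user_agent or "").lower()  # treat a missing header as empty
    if any(token in ua for token in BLOCKED_UA_TOKENS):
        return 403
    return 200
```

Called with `"Wget/1.8.2"` this returns 403, while `"Mozilla/4.0"` passes through. Matching on the user agent alone only stops clients that announce themselves honestly, so it complements an IP-range check rather than replacing one.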

caribguy

7:21 pm on Oct 7, 2009 (gmt 0)

Hmmm... It's only getting better. There's nothing available to the general public on this server:

management-access-only.example.com 65.55.108.72 - - [dd/Oct/2009:hh:mm:ss -0000] "\x16\x03\x01" 403 269 "-" "-"
management-access-only.example.com 65.55.108.72 - - [dd/Oct/2009:hh:mm:ss -0000] "\x16\x03\x01" 403 269 "-" "-"

That's something looking for an SSL exploit.
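The three bytes in those request lines match the start of an SSL/TLS record: 0x16 is the handshake content type, and 0x03 0x01 is the version number for TLS 1.0. In other words, the client opened an SSL/TLS handshake on a port that was answering plain HTTP. The bytes alone don't show whether that was an exploit probe or just an HTTPS client pointed at the wrong port. Here is a small Python sketch for decoding such a prefix when it turns up in logs; the function name and lookup tables are hypothetical helpers.

```python
# First byte of an SSL/TLS record: the content type.
CONTENT_TYPES = {
    20: "change_cipher_spec",
    21: "alert",
    22: "handshake",
    23: "application_data",
}

# Next two bytes: protocol version (major, minor).
VERSIONS = {
    (3, 0): "SSL 3.0",
    (3, 1): "TLS 1.0",
    (3, 2): "TLS 1.1",
    (3, 3): "TLS 1.2",
}


def describe_record_prefix(data):
    """Return (content_type, version) if data starts like a TLS record, else None."""
    if len(data) < 3:
        return None
    content_type = CONTENT_TYPES.get(data[0])
    version = VERSIONS.get((data[1], data[2]))
    if content_type is None or version is None:
        return None
    return content_type, version
```

Passing `b"\x16\x03\x01"` gives `("handshake", "TLS 1.0")`, while an ordinary request line such as `b"GET / HTTP/1.1"` gives `None`.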