Microsoft spider?

Site hit from multiple IPs, one of which is assigned to Microsoft

ecomagic

11:31 pm on Nov 11, 2004 (gmt 0)

10+ Year Member



One of my sites has been hit hard by the following user agent:

Mozilla/4.0 (compatible; MSIE 4.0; Windows NT; ....../1.0 )

The thing is that every page on the site was viewed by this agent from a list of "unrelated" IP addresses:

212.138.47.12
212.138.47.16
212.138.47.17
208.252.91.3 << Microsoft IP address
12.17.130.27
208.252.91.3
207.155.199.163
65.164.129.91
208.252.91.3

[dnsstuff.com...]

The 208. address is an IP address from Microsoft, but the rest seem to be unrelated, coming from all around the world. All these hits were definitely from the same bot and happened within a window of a few hours.

Is this some kind of Microsoft spider from their new engine, meant to find and bust cloaked sites, perhaps?

Has anybody else been hit by this same bot?

volatilegx

9:16 pm on Nov 13, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> Mozilla/4.0 (compatible; MSIE 4.0; Windows NT; ....../1.0 )

Is that the literal user agent, or did you add the "......" part?

Are you sure it's a bot?

bull

10:20 pm on Nov 13, 2004 (gmt 0)

10+ Year Member



[webmasterworld.com...]

Still unanswered by msndude. Perhaps some hijacked MS puter - and, if true, still not fixed after months ;)

ecomagic

8:57 am on Nov 16, 2004 (gmt 0)

10+ Year Member



Yeah that is the right user agent, I didn't add the "...." bit :)

So you were hit by this bot too, Bull. Interesting.

It is 100% a bot, but the weird thing is that whoever is running it is doing something "funny" by using different IP addresses all over the world along with the weird user agent. I suppose another possibility could be a bunch of zombied boxen trying to harvest email addresses?

mchlax

6:44 pm on Dec 16, 2004 (gmt 0)

10+ Year Member



I'm having a similar problem with this very strange bot. It is not abiding by robots.txt, and it has the exact same UA as the bot you described.

It hit my site a total of ~1460 times in a few hours, mostly requesting URLs of the form puchalapalli.com/folder_on_my_site/www.othersite.com/otherfolder, etc.

Here is a complete, processed log: [puchalapalli.com...]

Note that the file is > 1 MB, so please be patient while it loads, and take it easy on my cheap server :-P

The first green link right before the timestamp is the actual request made by the bot (e.g. b/www.writersblock.ca/spring1996/feature.htm+do+i+have+a+brain%3F)

Ignore the green link of the URL.

The hostmask provided by the bot was the same as its IP.

wilderness

8:00 pm on Dec 16, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



a solution?

SetEnvIf User-Agent "\.\.\./1\.0 ?\)$" keep_out
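
On its own, SetEnvIf only sets the keep_out variable; it needs an access rule next to it before anything is actually refused. A minimal sketch for an .htaccess file, assuming the Apache 1.3/2.0-style Order/Allow/Deny directives (on Apache 2.4 the equivalent would be "Require not env keep_out" inside a RequireAll block) and assuming the rest of the site is meant to stay open to everyone. The regex is quoted so it can contain the space that precedes the closing parenthesis in the reported UA, and that parenthesis is escaped:

```apache
# Flag any request whose User-Agent ends in ".../1.0 )" (the space is optional here).
SetEnvIf User-Agent "\.\.\./1\.0 ?\)$" keep_out

# Assumed access policy: allow everyone except flagged requests.
Order Allow,Deny
Allow from all
Deny from env=keep_out
```

Flagged requests should then get a 403 Forbidden, while normal visitors are unaffected.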

mchlax

9:01 pm on Dec 16, 2004 (gmt 0)

10+ Year Member



Good point, and I might implement that, but I'm still curious as to what the heck this thing is.

bull

11:57 pm on Dec 16, 2004 (gmt 0)

10+ Year Member



Don,
this is almost exactly the solution I have been using since the bot's first visit:
RewriteCond %{HTTP_USER_AGENT} \.\.\.\.\/1\.

Jan
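
A RewriteCond does nothing until a RewriteRule follows it. A minimal sketch of how the condition above might be completed in an .htaccess file, assuming mod_rewrite is enabled and that a plain 403 is the desired response (the rule and its [F] flag are an assumption, not taken from Jan's actual config):

```apache
RewriteEngine On
# Match the "..../1." fragment that appears in the bot's User-Agent.
RewriteCond %{HTTP_USER_AGENT} \.\.\.\.\/1\.
# Assumed action: answer every such request with 403 Forbidden.
RewriteRule .* - [F]
```

The pattern is unanchored, so it matches the fragment anywhere in the User-Agent string.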

wilderness

12:51 am on Dec 17, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



> but I'm still curious as to what the heck this thing is.

My curiosity ceases after a few googles and a couple of ARIN or other registrar searches. Sometimes not even that much of an effort.

If the bot doesn't provide a definite link in its UA, or even fails to provide a UA at all, I see no reason to remain curious about any advantage or disadvantage it may have for my sites.

Don

P.S. Jan, maybe I copied it from you ;) or else good minds just think alike ;)