Forum Moderators: mack

Someone at MS just got banned!

Was Bill Gates Surfing My site?

         

carfac

5:21 pm on Apr 11, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hi:

Just saw this guy fall into a spider trap:

131.107.137.47 - - [11/Apr/2003:01:31:08 -0600] "GET /a/deep/link.html HTTP/1.1" 200 12589 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705)"

No referer, came in on a deep link (as if from a search engine), and downloaded pages but no images. After about 5 hits, he tried to grab a trap page and got banned. He grabbed a page every 5 secs or so...
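For anyone who wants to look for the same pattern in their own logs, here is a minimal Python sketch. It is not the actual trap used here, just an illustration, and it assumes the Apache combined log format of the line quoted above. The names LOG_RE, IMAGE_EXTS, parse_line and looks_like_crawler are made up for this example.

```python
import re

# Assumed layout: Apache "combined" log format, as in the line quoted above.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Hypothetical list of image extensions; adjust for your own site.
IMAGE_EXTS = (".gif", ".jpg", ".jpeg", ".png")


def parse_line(line):
    """Return a dict of log fields, or None if the line does not match."""
    m = LOG_RE.match(line)
    return m.groupdict() if m else None


def looks_like_crawler(hits):
    """True if every hit lacks a referer and none of them fetched an image."""
    if not hits:
        return False
    no_referer = all(h["referer"] in ("", "-") for h in hits)
    no_images = not any(h["path"].lower().endswith(IMAGE_EXTS) for h in hits)
    return no_referer and no_images


line = ('131.107.137.47 - - [11/Apr/2003:01:31:08 -0600] '
        '"GET /a/deep/link.html HTTP/1.1" 200 12589 "-" '
        '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705)"')
hit = parse_line(line)
print(hit["ip"], hit["referer"], looks_like_crawler([hit]))
```

In practice you would group the parsed hits by IP before calling looks_like_crawler, and probably look at the time between requests as well. A single hit with no referer proves nothing on its own.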

IP resolves to Redmond.... did Bill just get himself banned?

dave

pendanticist

7:18 pm on Apr 24, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I've not heard any more from him at all, bobmark.

There were a couple of posts where someone alluded to 'the next big thing' (or similar) as though they perhaps knew something we don't.

Mr. Birney apparently did not see fit to post here, even though I sent him the thread and suggested he do so. We have communicated twice to date.

There are other mentions on the boards about MS going after Google, etc.

Logic dictates a certain amount of legitimacy, especially when one considers this: how could an employee of MS obtain that IP Number and run crawls, with the server draw that involves, without someone at MS tracking him down?

Then again, Mr. Birney not adding 'personal legitimacy' by posting here tends to sway me the other way.

Having said that, since my domain hasn't been 'pummeled' too badly, I'm going to wait and see using cautious optimism.

Pendanticist.

jim_w

8:55 pm on Apr 24, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



pendanticist

There is a possibility that he disguised his browser type and changed his IP. Like I said, they may be competing with me soon. Just my luck.

131.107.65.225 - - [19/Apr/2003:17:10:03 -0500] "GET /links.html HTTP/1.1" 200 33341 "-" "Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.2;+.NET+CLR+1.1.4322)"

Notice the ‘+’s in the user agent. So I now have deny from 131.107. in place.
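A rough way to check which logged addresses that rule would catch, as a Python sketch. It assumes that the partial-address form "deny from 131.107." covers the same addresses as the network 131.107.0.0/16. The names BLOCKED, is_blocked and normalize_agent are made up for this example. The second helper only turns the '+'s back into spaces so the two user agent strings in this thread can be compared side by side.

```python
import ipaddress

# Assumption: "deny from 131.107." is equivalent to this /16 network.
BLOCKED = ipaddress.ip_network("131.107.0.0/16")


def is_blocked(ip):
    """True if the dotted-quad address falls inside the denied range."""
    return ipaddress.ip_address(ip) in BLOCKED


def normalize_agent(agent):
    """Replace '+' with a space so '+'-encoded agents compare cleanly.

    Caveat: this also rewrites any legitimate '+' in the agent string,
    so use it for comparison only, not for storage.
    """
    return agent.replace("+", " ")


for ip in ("131.107.137.47", "131.107.65.225", "207.46.71.10"):
    print(ip, is_blocked(ip))
```

Note that a /16 ban takes out every address in that block, not just the crawler, so it is worth checking the logs for ordinary visitors from the same range before relying on it.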

wilderness

9:06 pm on Apr 24, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Just the "surf nazi" checking here ;)

I've had them denied since their first visit, and each time the IP range expands, I'll expand my deny range.

It is NOT logical for a legitimate company like MS to disguise and even misrepresent themselves in such a manner.
It's just NOT good business.

So the "surf nazi" suggests letting them "eat 403's"

pendanticist

9:38 pm on Apr 24, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Well then, he/she changed IP Numbers again?!?

Just ran 131.107.163.49 thru SpamCop and it renders this: postmaster@[131.107.163.49].

131.107.137.47 ditto postmaster@[131.107.137.47].
131.107.65.225 ditto postmaster@[131.107.65.225].

131.107.163.49 - - [23/Apr/2003:11:09:17 -0700] "GET /robots.txt HTTP/1.1" 200 220 "-" "MicrosoftPrototypeCrawler (How's my crawling? mailto:newbiecrawler@hotmail.com)"
131.107.163.49 - - [23/Apr/2003:11:09:17 -0700] "GET /blahblah.html HTTP/1.1" 200 8620 "-" "MicrosoftPrototypeCrawler (How's my crawling? mailto:newbiecrawler@hotmail.com)"
131.107.163.49 - - [23/Apr/2003:11:09:17 -0700] "GET /blahblah.html HTTP/1.1" 200 13642 "-" "MicrosoftPrototypeCrawler (How's my crawling? mailto:newbiecrawler@hotmail.com)"

It looks like this is the third IP Number and the second 'message'.

Hmmmmmm. Beginning to look more bogus all the way around.

Ok, then how do we go about shutting this thing down?

Fire off an abuse@msn/hotmail.com message?

If they're spoofing IP Numbers, (and I'm ignorant here) can't that be tracked down and reported? Or, are we simply looking at the .htaccess ban?

I've searched MS's MSDN and found nothing there. Googling "MicrosoftPrototypeCrawler" [google.com] shows only one more result this week than it did last week, and that's this thread.

Too bad no one from here works for MS. <Hint! Hint!>

Pendanticist.

wilderness

10:20 pm on Apr 24, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Pendanticist
My suggestion is to stop toying with the deceiver and deny 131.107.
If MS doesn't have the proper regard for protecting their subscribers, why should I?

Don

pendanticist

10:46 pm on Apr 24, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



If they're spoofing IP Numbers, (and I'm ignorant here) can't that be tracked down and reported? Or, are we simply looking at the .htaccess ban?

Even though I've already added the ban ('deny from 131.107.'), I'm still interested in learning more about tracking down spoofed IP Numbers.

Must be time for another thread....

Pendanticist.

bnc929

1:46 am on Apr 25, 2003 (gmt 0)

10+ Year Member



I make it a habit to ban any robot that does not put down a valid contact method in the user agent (unless I know who they are). I don't consider a hotmail account to be valid. It managed to crawl 1700 of my pages before I got it though.
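That rule of thumb can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a contact counts as "valid" if the user agent contains an http(s) URL or an email address outside a small free-mail list. The list FREE_MAIL, the two regular expressions and the function has_valid_contact are all made up for this example, and the check is only meant for agents already identified as robots.

```python
import re

# Hypothetical list of free-mail domains that do not count as a real contact.
FREE_MAIL = {"hotmail.com", "yahoo.com"}

# Simplified patterns; good enough for a sketch, not a full RFC parser.
EMAIL_RE = re.compile(r"[\w.+-]+@([\w-]+(?:\.[\w-]+)+)")
URL_RE = re.compile(r"https?://[^\s;)]+")


def has_valid_contact(agent):
    """True if the robot's user agent carries a URL or a non-free-mail address."""
    if URL_RE.search(agent):
        return True
    for domain in EMAIL_RE.findall(agent):
        if domain.lower() not in FREE_MAIL:
            return True
    return False


agent = ("MicrosoftPrototypeCrawler "
         "(How's my crawling? mailto:newbiecrawler@hotmail.com)")
print(has_valid_contact(agent))
```

The "unless I know who they are" exception would still need a hand-kept allow list on top of this.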

jim_w

5:14 am on Apr 25, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Have you guys seen this mention of newbiecrawler on microdoc news?

Maybe I’m cynical, and no doubt I’m paranoid (I grew up in the 70’s), and while it could be a new bot, that does not necessarily mean that it is a SE bot. It could be a spy bot just as well, or doing both: spying while acting as a SE bot, or vice versa.

Don’t get me wrong. I sell software written in a MS language, and have since 1990, and I always have been a pro-MS person, but it looks funny and unethical to me. And let’s face it, MS has been sued in the past over several questionable business practices.

I’m not even convinced that it is a bot all the time. Somewhere in my log the original IP that was posted came via Google and subscribed to my newsletter, just as many of my competitors have in the past. Now I publish all the graphics for my newsletter on the server where my competitors are banned, so all they get is the text until they get home. And they all have a REFERER of hotmail, yahoo, etc. So I know it goes on.

can't that be tracked down and reported.

It can be, just not very easily, not to mention not economically. You need a sniffer and you need to sit on it 24/7. Banning is the most economical way, I think.

Spoofing just doesn’t make sense. No reason to spoof to sign up for my newsletter. They could just do it via an ISP instead of going to all that trouble. Could it be a firewall thing adding to this confusing issue? I’ll bet it’s a new hire or something at MS, and they don’t realize that when they go out on the web with a MS IP, they are representing MS for better or for worse. i.e. a fresh-out or intern.

Chalupee

10:45 am on Apr 25, 2003 (gmt 0)

10+ Year Member



What, MS sued... when did that happen? I'm from the 60's and 70's, spaced out and paranoid.
From the microdoc message link above...
Assuming this new platform runs on Microsoft technology, there is going to be an interesting comparison between a Microsoft search engine and a Linux Search Engine (Google). Since we know Google has about 54,000 computers in what is a mammoth supercomputer made out of PC parts, it will be interesting to see how many NT Servers it takes to make a comparable search engine, or a better one than Google. ++

I think the name of the bot/crawler should be...
SwissCheese/madeinfrance...."Hack me, hack me.."

Chalupee
not from Gaudahlupee

wilderness

12:00 pm on Apr 25, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I was going through some old IP and other information which I had saved for reference concerning IP identification, and stumbled across the following (which I had from http ://www.clearwaterbeachcam.com/d--skinner/spiders.html; although the page is still there, the references below are not. My saved file is dated 03/25/02):

#MSN
#tide01.microsoft.com
#131.107.3.11
#tide02.microsoft.com
#131.107.3.12
#tide03.microsoft.com
#131.107.3.13
#tide04.microsoft.com
#131.107.3.14
#tide05.microsoft.com
#131.107.3.15
#tide06.microsoft.com
#131.107.3.16
#tide07.microsoft.com
#131.107.3.17
#tide08.microsoft.com
#131.107.3.18
#tide09.microsoft.com
#131.107.3.19
#tide10.microsoft.com
#131.107.3.20
#tide11.microsoft.com
#131.107.3.21
#tide12.microsoft.com
#131.107.3.22
#tide14.microsoft.com
#131.107.3.24
#tide15.microsoft.com
#131.107.3.25
#tide16.microsoft.com
#131.107.3.26
#tide17.microsoft.com
#131.107.3.27
#tide18.microsoft.com
#131.107.3.28
#tide19.microsoft.com
#131.107.3.29
#tide20.microsoft.com
#131.107.3.30
#tide21.microsoft.com
#131.107.3.31
#tide22.microsoft.com
#131.107.3.32
#tide23.microsoft.com
#131.107.3.33
#tide24.microsoft.com
#131.107.3.34
#tide25.microsoft.com
#131.107.3.35
#tide26.microsoft.com
#131.107.3.36
#tide27.microsoft.com
#131.107.3.37
#tide28.microsoft.com
#131.107.3.38
#tide29.microsoft.com
#131.107.3.39
#tide30.microsoft.com
#131.107.3.40
#tide33.microsoft.com
#131.107.39.12
#tide34.microsoft.com
#131.107.3.44
#tide35.microsoft.com
#131.107.3.45
#tide36.microsoft.com
#131.107.3.46
#tide70.microsoft.com
#131.107.3.70
#tide71.microsoft.com
#131.107.3.71
#tide72.microsoft.com
#131.107.3.72
#tide73.microsoft.com
#131.107.3.73
#tide74.microsoft.com
#131.107.3.74
#tide75.microsoft.com
#131.107.3.75
#tide76.microsoft.com
#131.107.3.76
#tide77.microsoft.com
#131.107.3.77
#tide78.microsoft.com
#131.107.3.78
#tide79.microsoft.com
#131.107.3.79
#tide82.microsoft.com
#131.107.3.82
#tide83.microsoft.com
#131.107.3.83
#tide84.microsoft.com
#131.107.3.84
#tide85.microsoft.com
#131.107.3.85
#tide86.microsoft.com
#131.107.3.86
#tide87.microsoft.com
#131.107.3.87
#tide93.microsoft.com
#131.107.3.93
#tide94.microsoft.com
#131.107.3.94
#tide110.microsoft.com
#63.64.43.138
#tide111.microsoft.com
#63.64.43.137
#tide112.microsoft.com
#208.249.151.138
#tide113.microsoft.com
#208.249.151.139
#tide114.microsoft.com
#192.237.67.205
#tide115.microsoft.com
#192.237.67.206
#tide116.microsoft.com
#207.46.104.80
#tide117.microsoft.com
#207.46.125.16
#tide118.microsoft.com
#208.147.66.138
#tide119.microsoft.com
#208.147.66.139
#tide120.microsoft.com
#207.46.71.10
#tide121.microsoft.com
#207.46.71.11
#tide122.microsoft.com
#203.127.3.12
#tide123.microsoft.com
#203.127.3.14
#tide124.microsoft.com
#203.41.151.8
#tide125.microsoft.com
#203.41.151.9
#tide130.microsoft.com
#207.46.36.9
#tide131.microsoft.com
#207.46.36.10
#tide132.microsoft.com
#207.46.36.11
#tide133.microsoft.com
#207.46.38.9
#tide134.microsoft.com
#207.46.38.10
#tide135.microsoft.com
#207.46.11.19
#tide136.microsoft.com
#207.46.11.20
#tide137.microsoft.com
#207.46.11.21
#tide138.microsoft.com
#207.46.44.9
#tide139.microsoft.com
#207.46.44.10
#tide140.microsoft.com
#207.46.46.9
#tide141.microsoft.com
#207.46.46.10
#tide142.microsoft.com
#207.46.40.9
#tide143.microsoft.com
#207.46.40.10
#tide144.microsoft.com
#207.46.48.9
#tide145.microsoft.com
#207.46.48.10
#tide146.microsoft.com
#207.46.42.9
#tide147.microsoft.com
#207.46.42.10
