Someone at MS just got banned!

Was Bill Gates Surfing My site?

         

carfac

5:21 pm on Apr 11, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hi:

Just saw this guy, fell into a spider trap:

131.107.137.47 - - [11/Apr/2003:01:31:08 -0600] "GET /a/deep/link.html HTTP/1.1" 200 12589 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705)"

No referer, came in on a deep link (like from a SE), and d/l pages but no images. After about 5 hits, he tried to grab a trap, and got banned. Grabbed a page every 5 secs or so...
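The signs described above (no referer, pages fetched but no images, a handful of hits in a row) can be checked against a log mechanically. Below is a rough Python sketch, assuming the log is in the Apache "combined" format like the line quoted above. The names `parse_line` and `looks_like_crawler` and the 5-hit threshold are made up for illustration; this is not the spider trap used on the site.

```python
import re
from datetime import datetime

# Apache "combined" format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Extensions treated as image requests (an assumption; adjust per site).
IMAGE_EXTS = (".gif", ".jpg", ".jpeg", ".png")


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    m = LOG_RE.match(line)
    if m is None:
        return None
    entry = m.groupdict()
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    return entry


def looks_like_crawler(entries, min_hits=5):
    """Hypothetical heuristic for one IP's entries:
    at least min_hits requests, none with a referer, none for an image."""
    if len(entries) < min_hits:
        return False
    no_referer = all(e["referer"] in ("-", "") for e in entries)
    no_images = not any(e["path"].lower().endswith(IMAGE_EXTS) for e in entries)
    return no_referer and no_images
```

To use it, group the parsed entries by `ip` and pass each group to `looks_like_crawler`. A heuristic like this will also flag ordinary visitors who browse with images turned off, so it is better suited to picking IPs to look at by hand than to banning automatically.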

IP resolves to Redmond.... did Bill just get himself banned?

dave

bunltd

3:57 am on Apr 28, 2003 (gmt 0)

10+ Year Member



Just came across this thread and checked my logs: FWIW starting around the 17th through the 26th

131.107.163.49 - MicrosoftPrototypeCrawler (please report obnoxious behavior to newbiecrawler@hotmail.com

and

131.107.65.225 - Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.2;+.NET+CLR+1.1.4322)

I am at a loss as to what exactly it is doing there; it doesn't seem to follow any particular pattern. It will be interesting to see what pendanticist learns.

LisaB

jim_w

6:12 am on Apr 28, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



<quote>I should hear something back from MS</quote>

Well, I hope so, but being the cynic I am, my experience is that the only time MS called me back was after I gave them a CC# and they charged me $175.00 to prove to them they had a bug in one of their compilers. (wondering how MS got so rich)

<quote>we'll all speculate our weekends</quote>

Actually, the biggest concern I have right now is whether any of the IPs in 131.107. are used by MSN. I already know I need to block their corporate HQ, but I don’t really want to block their SE bot or MSN users. Although that would probably be fine with AOL and Google. God knows I’ve shot myself in the foot enough for one lifetime.
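One way to block a whole range while leaving room for exceptions is to check an allow list before the block list. Here is a minimal Python sketch using the standard `ipaddress` module. It assumes 131.107.0.0/16 is the range to block, as the thread suggests. The allowed subnet is purely a placeholder: nobody in this thread has confirmed which addresses, if any, a Microsoft search bot or MSN users would come from.

```python
import ipaddress

# Range discussed in the thread (131.107.x.x); treating all of it as
# "corporate HQ" is an assumption.
BLOCKED = [ipaddress.ip_network("131.107.0.0/16")]

# HYPOTHETICAL exception, for illustration only. Replace it with ranges the
# search engine actually publishes, if and when it does.
ALLOWED = [ipaddress.ip_network("131.107.163.0/24")]


def is_blocked(ip_text):
    """Return True if the address is in a blocked range and not in an allowed one."""
    ip = ipaddress.ip_address(ip_text)
    if any(ip in net for net in ALLOWED):
        return False
    return any(ip in net for net in BLOCKED)
```

Checking the allow list first means a narrow exception always wins over a broad ban, which makes it harder to shoot yourself in the foot when a bot's range turns out to sit inside a blocked one.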

The alleged ‘MicrosoftPrototypeCrawler’ hasn’t been back to see us since 25/Apr/2003:19:25:59 -0500, and they read robots.txt. Two days before that they tried to snag our policy page and got 403’ed, but it may have already been in a cache somewhere, so it looks like they may be leaving us alone.

If Google traffic was their #1 goal, they would probably already have it. They have so much money to throw at their top goals (I don’t have a clue how they got so much, though) that it isn’t even funny. Look at what happened to Netscape. And the Apple lawsuit was so costly for Apple that, in hindsight, I’ll bet Apple wishes they had spent that money on R&D for their OS. They even figured out how to get the upper hand with IBM and WARP.

Someone may find this funny. Even MS uses google. (GRIN)
131.107.3.86 - - [08/Apr/2003:11:34:35 -0500] "GET / HTTP/1.0" 200 31279 "http://www.google.com/search?q=xxx+xxxx&hl=en&lr=&ie=UTF-8&oe=UTF-8&start=10&sa=N" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Q312461; .NET CLR 1.1.4322)"

If they call, see if you can find out what IPs their SE bot will use. Tell them it’s so that 90% of the professional webmasters won’t block their SE bot by mistake. (GRIN) I’ve had to contact several SEs to get the IPs they use because I also have an AXS log, and I like to filter the SE bots out so that I have stats on just the eyeballs that see the page, and of course on the ‘evil doers’ that I can spot right away and ban. I wish all SEs would publish what IPs their bots use.
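Filtering bots out of the stats can be approximated by user-agent string when the bot's IPs are not published. The Python sketch below assumes the log has already been reduced to (path, user-agent) pairs. The marker list and the function names are illustrative, not how AXS does it. Matching on user agent only catches bots that identify themselves; anything posing as MSIE, like the Redmond hits above, still needs an IP check.

```python
from collections import Counter

# Substrings that mark a self-identified crawler. "MicrosoftPrototypeCrawler"
# comes from the logs quoted in this thread; extend the tuple as needed.
BOT_AGENT_MARKERS = ("MicrosoftPrototypeCrawler", "Googlebot")


def is_bot(agent):
    """Case-insensitive substring match against the known markers."""
    agent = agent.lower()
    return any(marker.lower() in agent for marker in BOT_AGENT_MARKERS)


def human_page_views(records):
    """Count page views per path, skipping requests from known bots.

    records: iterable of (path, user_agent) pairs.
    """
    views = Counter()
    for path, agent in records:
        if not is_bot(agent):
            views[path] += 1
    return views
```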

And may your beer not be warmer or yellowier.

Lisa-

131.107.65.225 - Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.2;+.NET+CLR+1.1.4322)

Do you have the time and date this happened? I want to correlate it with mine to see if it was about the same time frame. This was obviously a coding error by someone who no doubt fixed it right away, but I am curious.

bunltd

4:55 pm on Apr 28, 2003 (gmt 0)

10+ Year Member



Do you have time and date this happened?

Jim, yes... here's the gist:

Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.2;+.NET+CLR+1.1.4322
showed requests around:
17/Apr/2003:14:09:55
18/Apr/2003:11:58:09
19/Apr/2003:19:58:44
22/Apr/2003:17:24:31
25/Apr/2003:01:03:13
26/Apr/2003:17:41:09

jim_w

5:11 pm on Apr 28, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



This was obviously a coding error by someone who no doubt fixed it right away

Or maybe they didn't?

AAnnAArchy

8:24 pm on Apr 28, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



131.107.163.50 MicrosoftPrototypeCrawler (How's my crawling? mailto:newbiecrawler@hotmail.com) 04/28/03 01:20 PM Viewing a user's profile

So, has anyone found out what the deal is yet? My site has nothing to do with MS - it's a fansite board that it's crawling right now.

pendanticist

9:16 pm on Apr 28, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



So, has anyone found out what the deal is yet?

Yes and no. Being somewhat impatient, I called them just a few minutes ago (5:00 EST). The receptionist 'did' remember me and stated that whoever is in charge (I assume publicity/legal) underwent some form of surgery Thursday.

As she told me today, when we spoke Friday she had no idea this particular individual was the one who 'clears' any forthcoming information, so she couldn't tell me he/she was out.

I believe 'in charge' refers to no one in particular, just that this individual must be in the loop with respect to any public discussions/admissions whatsoever. So, for the moment, let's call him/her Public Relations.

I did re-stress our concerns as unilaterally as I could, saying that "Webmasters from around the World are more than concerned as to the authenticity of the relationship w/MS based solely on the moniker 'MicrosoftPrototypeCrawler' as we've seen in our log files."

I wish I had more definitive information, but I do not.

Of course, if Brett wants to make that phone call too.....

Pendanticist.

pendanticist

10:03 pm on Apr 28, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Here's the scoop!

This is indeed a Microsoft sanctioned crawler!

...it is something Microsoft created and soon instead of having the newbiecrawler@hotmail.com contact for question, it will have a microsoft email address to avoid confusion.

There it is, folks.

Take it and ruuuuunnnnnnn.

Pendanticist.

pixel_juice

10:37 pm on Apr 28, 2003 (gmt 0)

10+ Year Member



Wow! Where'd you hear that pendanticist? I was just about to post to agree with "Webmasters from around the World are more than concerned..."

jdMorgan

10:40 pm on Apr 28, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



pendanticist,

Thanks for chasing this down...

You might suggest to them - if you haven't already - that putting up a web page with their crawler particulars on it, and including that URL in their UA string would be a good idea. That way, webmasters don't have to wait for an answer, and they won't have thousands of e-mails to answer every day.

Thanks again,
Jim

pendanticist

10:42 pm on Apr 28, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



As I said earlier in this thread, I called Microsoft Friday and again today.

See also It's Official!
MicrosoftPrototypeCrawler is legitimate!
[webmasterworld.com]

<added>
The response I got (after the phone call as noted above) was the snippet posted which came to me as e-mail. It is by no means comprehensive.

Jim, I'll answer your question thru the above thread, as best I can.

As far as I'm concerned this thread is dead.
</added>

Pendanticist.
