Just saw this guy, fell into a spider trap:
131.107.137.47 - - [11/Apr/2003:01:31:08 -0600] "GET /a/deep/link.html HTTP/1.1" 200 12589 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705)"
No referer, came in on a deep link (like from a SE), and d/l pages but no images. After about 5 hits, he tried to grab a trap, and got banned. Grabbed a page every 5 secs or so...
IP resolves to Redmond.... did Bill just get himself banned?
dave
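For anyone who wants to check their own logs for the same pattern (no referer, pages but no images, several hits from one IP), here is a minimal sketch. It assumes the Apache "combined" log format shown in the line above; the names `parse_line` and `looks_like_bot`, the image extension list, and the five-hit threshold are illustrative choices of mine, not anything the trap described here actually uses.

```python
import re
from datetime import datetime

# Apache "combined" log format, matching the entry quoted in the post above.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Assumed list of image extensions; adjust for your own site.
IMAGE_EXTS = (".gif", ".jpg", ".jpeg", ".png", ".ico")


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    m = LOG_RE.match(line)
    if m is None:
        return None
    hit = m.groupdict()
    hit["time"] = datetime.strptime(hit["time"], "%d/%b/%Y:%H:%M:%S %z")
    return hit


def looks_like_bot(hits, min_hits=5):
    """hits: parsed entries from a single IP.

    Flags the pattern described in the post: several requests, none with a
    referer, and no image requests at all.
    """
    if len(hits) < min_hits:
        return False
    no_referer = all(h["referer"] in ("-", "") for h in hits)
    no_images = not any(h["path"].lower().endswith(IMAGE_EXTS) for h in hits)
    return no_referer and no_images
```

This is only a heuristic. A text-mode browser or a proxy that strips referers would trip it too, so treat a match as a reason to look closer rather than proof of a crawler.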
There were a couple of posts where someone alluded to 'the next big thing' (or similar) as though they perhaps knew something we don't.
Mr. Birney apparently did not see fit to post here, even though I sent him the thread and suggested he do so. We have communicated twice to date.
There are other mentions on the boards about MS going after Google and so on.
Logic dictates a certain amount of legitimacy, especially when one considers how an employee of MS could obtain that IP Number and run crawls that draw on the servers without getting caught, that is, without someone at MS tracking him down.
Then again, the fact that Mr. Birney has not added 'personal legitimacy' by posting here tends to sway me the other way.
Having said that, since my domain hasn't been 'pummeled' too badly, I'm going to wait and see using cautious optimism.
Pendanticist.
There is a possibility that he disguised his browser type and changed his IP. Like I said, they may be competing with me soon. Just my luck.
131.107.65.225 - - [19/Apr/2003:17:10:03 -0500] "GET /links.html HTTP/1.1" 200 33341 "-" "Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.2;+.NET+CLR+1.1.4322)"
Notice the ‘+’s. So I have deny from 131.107.
I've had them denied since their first visit, and each time the IP range expands, I'll expand my deny range.
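To see in advance which addresses a ban like that would catch, here is a small sketch. It assumes that "deny from 131.107." behaves as a partial-IP match on the first two octets, which corresponds to the 131.107.0.0/16 network. The function names are hypothetical, and the '+' check simply encodes the observation above about the user agent.

```python
import ipaddress

# Assumption: "deny from 131.107." matches every address beginning with
# those two octets, i.e. the /16 network below.
DENIED = ipaddress.ip_network("131.107.0.0/16")


def is_denied(ip):
    """True if the address falls inside the denied range."""
    return ipaddress.ip_address(ip) in DENIED


def has_plus_encoded_agent(agent):
    """True when the User-Agent uses '+' where a browser would send spaces."""
    return "+" in agent and " " not in agent
```

For example, `is_denied("131.107.65.225")` is true while `is_denied("207.46.71.10")` is false, so a ban on 131.107. alone does nothing about visitors from other ranges.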
It is NOT logical for a legitimate company like MS to disguise and even misrepresent themselves in such a manner.
It's just NOT good business.
So the "surf nazi" suggests letting them "eat 403's"
Just ran 131.107.163.49 thru SpamCop and it renders this: postmaster@[131.107.163.49].
131.107.137.47 ditto postmaster@[131.107.137.47].
131.107.65.225 ditto postmaster@[131.107.65.225].
131.107.163.49 - - [23/Apr/2003:11:09:17 -0700] "GET /robots.txt HTTP/1.1" 200 220 "-" "MicrosoftPrototypeCrawler (How's my crawling? mailto:newbiecrawler@hotmail.com)"
131.107.163.49 - - [23/Apr/2003:11:09:17 -0700] "GET /blahblah.html HTTP/1.1" 200 8620 "-" "MicrosoftPrototypeCrawler (How's my crawling? mailto:newbiecrawler@hotmail.com)"
131.107.163.49 - - [23/Apr/2003:11:09:17 -0700] "GET /blahblah.html HTTP/1.1" 200 13642 "-" "MicrosoftPrototypeCrawler (How's my crawling? mailto:newbiecrawler@hotmail.com)"
It looks like this is the third IP Number and the second 'message'.
Hmmmmmm. Beginning to look more bogus all the way around.
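Counting IPs and user agents by hand gets old quickly, so here is a rough sketch that tallies them for one address prefix and notes which IPs asked for robots.txt. The `summarize` function and the (ip, path, agent) tuple layout are my own assumptions for illustration; you would need to extract those fields from your log first.

```python
from collections import defaultdict


def summarize(hits, prefix="131.107."):
    """hits: iterable of (ip, path, agent) tuples taken from the access log.

    Returns a dict mapping each matching IP to the set of user agents it
    sent, plus the set of IPs that requested /robots.txt.
    """
    agents_by_ip = defaultdict(set)
    fetched_robots = set()
    for ip, path, agent in hits:
        if not ip.startswith(prefix):
            continue
        agents_by_ip[ip].add(agent)
        if path == "/robots.txt":
            fetched_robots.add(ip)
    return dict(agents_by_ip), fetched_robots
```

An IP that shows up with a browser-style agent on one visit and a crawler-style agent on another, or one that never fetches robots.txt, is worth a second look.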
Ok, then how do we go about shutting this thing down?
Fire off an abuse@msn/hotmail.com message?
If they're spoofing IP Numbers, (and I'm ignorant here) can't that be tracked down and reported? Or, are we simply looking at the .htaccess ban?
I've searched MS's MSDN and found nothing there. Googling "MicrosoftPrototypeCrawler" [google.com] shows only one more result this week than it did last week, and that's this thread.
Too bad no one from here works for MS. <Hint! Hint!>
Pendanticist.
If they're spoofing IP Numbers, (and I'm ignorant here) can't that be tracked down and reported? Or, are we simply looking at the .htaccess ban?
Even though I've banned them with 'deny from 131.107.', I'm still interested in learning more about tracking down spoof'd IP Numbers.
Must be time for another thread....
Pendanticist.
Have you guys seen this mention of newbiecrawler on microdoc news?
Maybe I’m cynical, and no doubt I’m paranoid (I grew up in the 70’s), and while it could be a new bot, that does not necessarily mean that it is a SE bot. It could just as well be a spy bot, or doing both: spying while acting as a SE bot, or vice versa.
Don’t get me wrong. I sell software written in a MS language, and have since 1990, and I have always been a pro-MS person, but it looks funny and unethical to me. And let’s face it, MS has been sued in the past over several questionable business practices.
I’m not even convinced that it is a bot all the time. Somewhere in my log, the original IP that was posted came in via Google and subscribed to my newsletter, just as many of my competitors have in the past. Now I publish all the graphics for my newsletter on the server where my competitors are banned, so all they get is the text until they get home. And they all have a REFERER of hotmail, yahoo, etc. So I know it goes on.
can't that be tracked down and reported.
It can be, just not very easily, not to mention not economically. You need a sniffer and you need to sit on it 24/7. Banning is the most economical way, I think.
Spoofing just doesn’t make sense. There is no reason to spoof to sign up for my newsletter. They could just do it via an ISP instead of going to all that trouble. Could it be a firewall thing adding to this confusing issue? I’ll bet it’s a new hire or something at MS, i.e. a fresh-out or intern, and they don’t realize that when they go out on the web with an MS IP, they are representing MS for better or for worse.
I think the name of the bot/crawler should be...
SwissCheese/madeinfrance...."Hack me, hack me.."
Chalupee
not from Gaudahlupee
#MSN
#tide01.microsoft.com
#131.107.3.11
#tide02.microsoft.com
#131.107.3.12
#tide03.microsoft.com
#131.107.3.13
#tide04.microsoft.com
#131.107.3.14
#tide05.microsoft.com
#131.107.3.15
#tide06.microsoft.com
#131.107.3.16
#tide07.microsoft.com
#131.107.3.17
#tide08.microsoft.com
#131.107.3.18
#tide09.microsoft.com
#131.107.3.19
#tide10.microsoft.com
#131.107.3.20
#tide11.microsoft.com
#131.107.3.21
#tide12.microsoft.com
#131.107.3.22
#tide14.microsoft.com
#131.107.3.24
#tide15.microsoft.com
#131.107.3.25
#tide16.microsoft.com
#131.107.3.26
#tide17.microsoft.com
#131.107.3.27
#tide18.microsoft.com
#131.107.3.28
#tide19.microsoft.com
#131.107.3.29
#tide20.microsoft.com
#131.107.3.30
#tide21.microsoft.com
#131.107.3.31
#tide22.microsoft.com
#131.107.3.32
#tide23.microsoft.com
#131.107.3.33
#tide24.microsoft.com
#131.107.3.34
#tide25.microsoft.com
#131.107.3.35
#tide26.microsoft.com
#131.107.3.36
#tide27.microsoft.com
#131.107.3.37
#tide28.microsoft.com
#131.107.3.38
#tide29.microsoft.com
#131.107.3.39
#tide30.microsoft.com
#131.107.3.40
#tide33.microsoft.com
#131.107.39.12
#tide34.microsoft.com
#131.107.3.44
#tide35.microsoft.com
#131.107.3.45
#tide36.microsoft.com
#131.107.3.46
#tide70.microsoft.com
#131.107.3.70
#tide71.microsoft.com
#131.107.3.71
#tide72.microsoft.com
#131.107.3.72
#tide73.microsoft.com
#131.107.3.73
#tide74.microsoft.com
#131.107.3.74
#tide75.microsoft.com
#131.107.3.75
#tide76.microsoft.com
#131.107.3.76
#tide77.microsoft.com
#131.107.3.77
#tide78.microsoft.com
#131.107.3.78
#tide79.microsoft.com
#131.107.3.79
#tide82.microsoft.com
#131.107.3.82
#tide83.microsoft.com
#131.107.3.83
#tide84.microsoft.com
#131.107.3.84
#tide85.microsoft.com
#131.107.3.85
#tide86.microsoft.com
#131.107.3.86
#tide87.microsoft.com
#131.107.3.87
#tide93.microsoft.com
#131.107.3.93
#tide94.microsoft.com
#131.107.3.94
#tide110.microsoft.com
#63.64.43.138
#tide111.microsoft.com
#63.64.43.137
#tide112.microsoft.com
#208.249.151.138
#tide113.microsoft.com
#208.249.151.139
#tide114.microsoft.com
#192.237.67.205
#tide115.microsoft.com
#192.237.67.206
#tide116.microsoft.com
#207.46.104.80
#tide117.microsoft.com
#207.46.125.16
#tide118.microsoft.com
#208.147.66.138
#tide119.microsoft.com
#208.147.66.139
#tide120.microsoft.com
#207.46.71.10
#tide121.microsoft.com
#207.46.71.11
#tide122.microsoft.com
#203.127.3.12
#tide123.microsoft.com
#203.127.3.14
#tide124.microsoft.com
#203.41.151.8
#tide125.microsoft.com
#203.41.151.9
#tide130.microsoft.com
#207.46.36.9
#tide131.microsoft.com
#207.46.36.10
#tide132.microsoft.com
#207.46.36.11
#tide133.microsoft.com
#207.46.38.9
#tide134.microsoft.com
#207.46.38.10
#tide135.microsoft.com
#207.46.11.19
#tide136.microsoft.com
#207.46.11.20
#tide137.microsoft.com
#207.46.11.21
#tide138.microsoft.com
#207.46.44.9
#tide139.microsoft.com
#207.46.44.10
#tide140.microsoft.com
#207.46.46.9
#tide141.microsoft.com
#207.46.46.10
#tide142.microsoft.com
#207.46.40.9
#tide143.microsoft.com
#207.46.40.10
#tide144.microsoft.com
#207.46.48.9
#tide145.microsoft.com
#207.46.48.10
#tide146.microsoft.com
#207.46.42.9
#tide147.microsoft.com
#207.46.42.10
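If you want to work with a list like this rather than just read it, here is a minimal sketch that pairs each '#hostname' line with the '#address' line after it. The function name `parse_host_list` is hypothetical, and the sketch assumes the alternating hostname/address layout used above, with labels such as '#MSN' skipped because no address follows them directly.

```python
import ipaddress


def parse_host_list(text):
    """Pair each '#hostname' line with the '#address' line that follows it."""
    lines = [ln.lstrip("#").strip() for ln in text.splitlines() if ln.strip()]
    hosts = {}
    pending = None
    for entry in lines:
        try:
            addr = ipaddress.ip_address(entry)
        except ValueError:
            # Not an address: remember it as a hostname (or section label).
            pending = entry
            continue
        if pending is not None:
            hosts[pending] = str(addr)
            pending = None
    return hosts
```

Note that many of the addresses in the list (63.64.x, 207.46.x, 208.x, 203.x and so on) sit outside 131.107., so a single deny line for 131.107. would not cover them.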