Just saw this guy, fell into a spider trap:
131.107.137.47 - - [11/Apr/2003:01:31:08 -0600] "GET /a/deep/link.html HTTP/1.1" 200 12589 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705)"
No referer, came in on a deep link (like from a SE), and d/l pages but no images. After about 5 hits, he tried to grab a trap, and got banned. Grabbed a page every 5 secs or so...
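For anyone who wants to check their own logs for the same pattern, here is a rough sketch in Python. It assumes the Apache combined log format shown above, and the names (`parse_line`, `looks_like_bot`, `IMAGE_EXT`) and the 5-hit threshold are made up for illustration, not taken from any real trap script. It only checks two of the signals mentioned: no referer on any hit, and pages fetched without images.

```python
import re
from datetime import datetime

# Combined log format: ip, ident, user, [timestamp], "request", status, size, "referer", "agent"
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Hypothetical list of extensions treated as images.
IMAGE_EXT = (".gif", ".jpg", ".jpeg", ".png", ".ico")


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    m = LOG_RE.match(line)
    if m is None:
        return None
    hit = m.groupdict()
    hit["time"] = datetime.strptime(hit["ts"], "%d/%b/%Y:%H:%M:%S %z")
    return hit


def looks_like_bot(hits, min_hits=5):
    """Heuristic for one IP's hits: enough requests, never a referer, never an image."""
    if len(hits) < min_hits:
        return False
    no_referer = all(h["referer"] in ("-", "") for h in hits)
    no_images = not any(h["path"].lower().endswith(IMAGE_EXT) for h in hits)
    return no_referer and no_images
```

Group the parsed hits by `ip` first, then pass each group to `looks_like_bot`. A real browser with image loading turned off would also trip this, so treat it as a hint and not proof.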
IP resolves to Redmond.... did Bill just get himself banned?
dave
It's always there in my log files, all through the day.
Can somebody clarify what this is? Could it cause any problems?
Aravind
I found these forums whilst trying to find out more about the 'MicrosoftPrototypeCrawler'. We've had a slight grazing from this bot and I wanted to know whether it was legitimate or not - the lack of crawler-info URL was highly suspicious.
Much as I hate to diss a good conspiracy theory, our website sells archaeology/history books and has never mentioned MS. The pages it has requested are all listed in Google [never see an http_referer though] so maybe it's working off that.
I wait with bated breath to find out what's really goin' down with this thing. Haven't banned it yet as it's only requested 11 pages so far (since 18th April).
Keep up the good work, I've learned a heck of a lot already this afternoon here.
Best
Tom
I can buy that there is/will be a MS SE bot and that the bot’s IP will be 131.107.163.46 through 131.107.163.50 [webmasterworld.com], but I don’t think 131.107.137.47 necessarily has anything to do with the MS SE bot.
Only time will tell.
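A quick way to test an address from your logs against that reported range, sketched with Python's standard `ipaddress` module. The range endpoints are the ones quoted above; the function name `in_reported_bot_range` is just an illustrative choice, and nothing here confirms that the range really belongs to an MS SE bot.

```python
import ipaddress

# Range reported in the linked thread (unverified).
BOT_FIRST = ipaddress.IPv4Address("131.107.163.46")
BOT_LAST = ipaddress.IPv4Address("131.107.163.50")


def in_reported_bot_range(ip):
    """True if ip lies between BOT_FIRST and BOT_LAST, inclusive."""
    addr = ipaddress.IPv4Address(ip)
    return BOT_FIRST <= addr <= BOT_LAST
```

With this check, 131.107.137.47 falls outside the range, which is the point being made: same owner, but not one of the addresses reported for the bot.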
[edit]
I’m sorry. I didn’t mean to question your integrity. I said in another thread that I sometimes leave things out. I left the message and then had to step out. While I was out I thought, oh crap, that could be taken the wrong way. And I was going to change it as soon as I got back, which I didn’t get a chance to because you were quicker than I.
I hope you understand my meaning and I truly apologize for any misunderstanding. Please accept it.
[/edit]
Prerequisites to project Code Name: Tahoe
Tracking and Viewing Changes on the Web
Jan 1996
[research.microsoft.com...]
Information Retrieval & Analysis
[research.microsoft.com...]
Code Name: Tahoe
Tahoe Graduates to SharePoint Portal Server
March 12, 2001
Informationweek.com
[informationweek.com...]
One of the most recent is a product code-named Tahoe. It's a search engine that will debut later this year as part of Microsoft's SharePoint Portal Server.
And this is out now. It only does intranets, but it shows they have some experience in developing SEs, so we should not see them crashing systems from day 1.
Now, after thinking as long and as hard as I could about this, and comparing it with M$ history, I may be really off in left field here, but consider the following.
M$ did not necessarily want to kill NN; they just didn’t want it to be a predominant player in the browser world. This is because M$ has a vision of every computer sharing information easily. And you have to admit, I can view .doc, .ppt, .xls, etc. all within IE, which sells more Office products.
Now it could be that the major SEs hold enough power that, by changing their algorithms, or by one becoming more prevalent than another, they could make or break a .com. According to my research, about 50% of the people doing searches never go to page 2. So if a change to an SE’s algorithm, or one SE gaining more users than another, drops too many high-stakes players to page 2 or lower, it can crush a company. I think Mr. Gates feels that the SEs have too much power. But I don’t think he wants to get into purchasing over 100,000 PCs, plus the labor, etc. of setting them up, to become a public SE. That is too much investment in an already saturated market. Keep in mind that Mr. Bill lost something like $1b in the dot com crash.
It is known that .NET has the capability, and the examples, to build an SE. What it is missing is the filtering algorithms. There are already several 3rd party SEs based on .NET that one can buy.
What if they don’t want Google’s traffic, but just don’t want Google to have it either? By supplying companies and end users with the tools to create their own search engines based on just that person’s or company’s interests, and allowing that database to share its information with secure servers, it could become a distributed computing SE.
For example, Motorola in Chicago could set up their own SE bot to look for information/data that is important for their employees to get their jobs done more efficiently. Motorola in Florida does the same thing, and these 2 SE bots can share information between themselves. Then all the other Motorola facilities do the same. You now get a company with several internal SE bots gathering very specific information/data. Employees save time because unimportant information is already weeded out, and searches no longer lead them off to sites that are not work related. If you connect your suppliers into certain portions of the SE database, they become smarter and can give you better service and products. If all the Fortune 1000 companies did this, what kind of impact would that have on the major SEs?
It won’t make SEs obsolete, but it could cut down considerably on their traffic, making them less valuable and much less powerful. It also removes a bunch of sites that have better marketing than products from the traffic flow to major corporations.
Now, misinformation is a game that the CIA as well as big businesses play all the time. Nothing works better than a false rumor to get your competitors to build a business defense. It is the wrong defense, because it is based on rumors, and once they find out what is really going on, they cannot recover fast enough to fend off the real attempt. This has happened numerous times in the business world.
While this is just a concept, I don’t think it is that far-fetched, especially for anyone who has read Bill Gates’ book. And of course, M$ sells even more .NET and IIS in the process.
Any opinions? Or have I totally lost it?