Someone at MS just got banned!

Forum Moderators: mack

Was Bill Gates Surfing My site?

carfac

5:21 pm on Apr 11, 2003 (gmt 0)

Hi:

Just saw this guy, fell into a spider trap:

131.107.137.47 - - [11/Apr/2003:01:31:08 -0600] "GET /a/deep/link.html HTTP/1.1" 200 12589 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705)"

No referer, came in on a deep link (like from a SE), and d/l pages but no images. After about 5 hits, he tried to grab a trap, and got banned. Grabbed a page every 5 secs or so...
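The signals described above (no referer, no image requests, a page every few seconds) can be checked against the access log itself. What follows is only a minimal sketch, assuming the Apache "combined" log format of the quoted line. The names `parse_line` and `looks_like_bot` and the thresholds `min_hits` and `max_gap_seconds` are hypothetical, not anything the poster actually ran.

```python
import re
from datetime import datetime

# Pattern for one Apache "combined" format line, like the one quoted above.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Assumed list of image extensions; adjust for your own site.
IMAGE_EXT = (".gif", ".jpg", ".jpeg", ".png")


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    m = LOG_RE.match(line)
    if m is None:
        return None
    hit = m.groupdict()
    hit["time"] = datetime.strptime(hit["ts"], "%d/%b/%Y:%H:%M:%S %z")
    return hit


def looks_like_bot(hits, min_hits=5, max_gap_seconds=10):
    """Hypothetical heuristic over the hits from a single IP address:
    enough requests, never a referer, never an image, steady pacing."""
    if len(hits) < min_hits:
        return False
    no_referer = all(h["referer"] in ("-", "") for h in hits)
    no_images = not any(h["path"].lower().endswith(IMAGE_EXT) for h in hits)
    times = sorted(h["time"] for h in hits)
    gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
    steady = all(g <= max_gap_seconds for g in gaps)
    return no_referer and no_images and steady
```

A heuristic like this only flags a visitor for a closer look. The ban in this case came from the spider trap, not from log analysis.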

IP resolves to Redmond.... did Bill just get himself banned?

dave

AAnnAArchy

12:41 am on Apr 29, 2003 (gmt 0)

Well, we banned it anyway. It was slowing down our board...and there's nothing I dislike more than when my own sites don't load quickly.

aravindgp

9:53 am on Apr 29, 2003 (gmt 0)

>>>Mozilla/4.0+(compatible;+MSIE+5.01;+Windows+NT+5.0) [widget-site.com...]
2003-04-29 08:56:41 - - 64.106.154.165 80 GET /images/widget_1.gif - 200 470 270

It's always there in my log files, throughout the day.

Can somebody clarify what this is? Is there any potential problem with it?

Aravind

tomlitt

9:30 pm on May 1, 2003 (gmt 0)

Hi guys, I'm totally new here so bear with me.

I found these forums whilst trying to find out more about the 'MicrosoftPrototypeCrawler'. We've had a slight grazing from this bot and I wanted to know whether it was legitimate or not - the lack of crawler-info URL was highly suspicious.

Much as I hate to diss a good conspiracy theory, our website sells archaeology/history books and has never mentioned MS. The pages it has requested are all listed in Google [never see an http_referer though] so maybe it's working off that.

I wait with bated breath to find out what's really goin' down with this thing. Haven't banned it yet as it's only requested 11 pages so far (since 18th April).

Keep up the good work, I've learned a heck of a lot already this afternoon here.

Best
Tom

NFFC

9:43 pm on May 1, 2003 (gmt 0)

>I wait with bated breath to find out what's really goin' down with this thing

Me too tom [and welcome btw], I have a feeling that in a year or maybe two's time we will look back at this thread and think - we were here when it started.

panos

11:25 am on May 4, 2003 (gmt 0)

This stupid bot crashed my site yesterday.
I use phpBB and the sessions table contained 34678 rows!

I don't want my site to be used for Microsoft's experiments,
so this bot is banned.

jim_w

2:50 pm on May 4, 2003 (gmt 0)

Personally, I'm NOT 100% convinced that the alleged [webmasterworld.com] is 100% true.

I can buy that there is/will be a MS SE bot and that the bot's IP will be 131.107.163.46 through 131.107.163.50 [webmasterworld.com], but I don't think 131.107.137.47 necessarily has anything to do with the MS SE bot.
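Whether a given visitor falls inside that reported range is easy to check mechanically. This is a minimal sketch using Python's standard `ipaddress` module. The function name `in_reported_crawler_range` is hypothetical, and the range itself is only what was reported in the linked thread, not a confirmed list of crawler addresses.

```python
import ipaddress

# Range as reported in the thread; treat it as an assumption, not a verified fact.
CRAWLER_FIRST = ipaddress.ip_address("131.107.163.46")
CRAWLER_LAST = ipaddress.ip_address("131.107.163.50")


def in_reported_crawler_range(ip_text):
    """Return True if ip_text lies within the reported range, inclusive."""
    ip = ipaddress.ip_address(ip_text)
    return CRAWLER_FIRST <= ip <= CRAWLER_LAST
```

By this check, 131.107.137.47 (the address that hit the trap) falls outside the reported range, which is the distinction being drawn here.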

Only time will tell.

pendanticist

4:32 pm on May 4, 2003 (gmt 0)

You gotta lotta nerve calling my integrity into question.

Just for your information, I've forwarded the communication from MicroSoft to littleman's outside e-mail account.

I expect to see a full and complete apology from you in a brand new thread real soon.

Glenn E. Carper.
A.K.A. Pendanticist.

jim_w

4:58 pm on May 4, 2003 (gmt 0)

I think you said verbatim what they said to you. I'm just not convinced that what they told you was 100% the truth. Sorry for the misunderstanding.

[edit]
I'm sorry. I didn't mean to question your integrity. I said in another thread that I sometimes leave things out. I left the message and then had to step out. While I was out I thought, oh crap, that could be taken the wrong way. And I was going to change it as soon as I got back, which I didn't get a chance to because you were quicker than I.

I hope you understand my meaning and I truly apologize for any misunderstanding. Please accept it.
[/edit]

jim_w

12:34 pm on May 6, 2003 (gmt 0)

I started at
[research.microsoft.com...]

Prerequisites to project Code Name: Tahoe

Tracking and Viewing Changes on the Web
Jan 1996
[research.microsoft.com...]
Information Retrieval & Analysis
[research.microsoft.com...]
Code Name: Tahoe
Tahoe Graduates to SharePoint Portal Server
March 12, 2001
Informationweek.com
[informationweek.com...]
One of the most recent is a product code-named Tahoe. It's a search engine that will debut later this year as part of Microsoft's SharePoint Portal Server

And this is out. It only does intranets. But they have some experience in developing SE's. So we should not see them crashing systems from day 1.

Now after thinking as long and as hard as I could about this, and comparing M$ history, what about the following? I may be really off in left field here, but consider it.

M$ did not want to kill NN necessarily, they just didn't want them to be a predominant player in the browser world. This is because M$ has a vision of every computer sharing information easily. And you have to admit, with IE, I can view .doc, .ppt, .xls, etc. all with IE. Thus selling more office products.

Now it could be a concept that major SE's hold enough power that by changing their algorithms, or if one becomes more prevalent than another one, the SE's could make or break a .com. This is due to the fact that, according to my research, about 50% of the people doing searches never go to page 2. And if changing a SE's algorithm, or one SE getting more users than another, causes too many high stake players to fall to page 2 or lower, it can crush a company. I think Mr. Gates feels that the SE's have too much power. But I don't think he wants to get into purchasing over 100,000 PCs and the labor, etc. in setting them up to become a public SE. Too much investment into what is already a saturated market. Keep in mind that Mr. Bill lost something like $1b in the dot com crash.

It is known that .NET has the capability and examples to build a SE. It is missing the filtering algorithms. There are already several 3rd party SE's based on .NET one can buy.

What if they don't want google's traffic, they just don't want google to have it either. By supplying companies and end users with the tools to create their own search engines based on just that person's or company's interests, and allowing that database to share its information with secure servers, it could become a distributed computing SE.

For example, Motorola in Chicago could set up their own SE bot to look for information/data that is important for their employees to get their jobs done more efficiently. Motorola in Florida does the same thing, and these 2 SE bots can share information between themselves. Then all the other Motorola facilities do the same. You get a company now with several internal SE bots just getting very specific information/data, they now save time weeding out information that is not important to the work their company is trying to do, and in a search you keep employees from surfing off to sites not work related. If you connect your suppliers into certain portions of the SE database, now they become smarter and can give you better service and products. If all the Fortune 1000 companies did this, what kind of impact would that have on the major SE�s?

It won't make SE's obsolete, but it could cut down considerably on their amount of traffic, thus making them less valuable and much less powerful. It also takes out a bunch of sites that have better marketing than products and removes them from traffic flow to major corporations.

Now, misinformation is a game that the CIA as well as big businesses use all the time. Nothing better than a rumor that isn't true to get your competitors to create a business defense, but it is the wrong defense because of rumors, and once they find out what is really going on, they cannot recover fast enough to fend off any attempts. This has happened numerous times in the business world.

While this is just a concept, I don't think it is that far fetched. Especially for anyone who has read Bill Gates' book. And of course, M$ sells even more .NET and IIS in the process.

Any opinions? Or have I totally lost it?

carfac

4:15 pm on May 6, 2003 (gmt 0)

Interesting theory.... but I do not think MS gains enough DIRECTLY out of it. I think they want to see a direct, tangible result... not "maybe" fewer people doing google because they have an intranet SE...

dave

jim_w

4:27 pm on May 6, 2003 (gmt 0)

On the other hand, for them to try to play catch up with google now may not be cost effective. They have stockholders and stockbrokers to answer to.

How much would it cost them to just get the hardware and ppl in place to start, let alone to catch up and surpass google? Then add benefits for all those ppl. This way, they distribute the cost across many different companies and they would not have any investment. Only new sales.

Even if they threw as much money as is imaginable at it, how many years before they would start to see a return on that investment? I just don't think the stockholders would wait that long. And it could cause the price of the stock to go down. A lot about this entire MS SE deal just doesn't make much sense to me. As a matter of fact, I can't see anything that makes sense about it.

carfac
I think they expand the SE to look outside of the firewall at filtered sites or based on where people already have bookmarks.

But what the heck do I know?

[edited by: jim_w at 5:18 pm (utc) on May 6, 2003]

martinibuster

4:28 pm on May 6, 2003 (gmt 0)

That's a great opinion jim_w. How do you think that strategy ties in with longhorn?

A unified Search interface... Longhorn can instantly search from a variety of locales, including local files, contacts and the Internet. "Filter by" options can also be used to narrow down results
[betanews.com]

jim_w

4:47 pm on May 6, 2003 (gmt 0)

Since I don't know the specifics about Longhorn, I can't really say. But I do know that I am writing in VB5 and it plays with all Windows OS that they sell today. And I would not be surprised to see something in a future release of .NET that would make most everything backward compatible. Performance of backwards compatibility could be an issue. Who knows, Longhorn and a new .NET could make it even easier to achieve what I said.

From ZDNet

[zdnet.com.com...]
Of course, they're raising every single flag regarding Windows .NET Server and pushing everyone's attention that way, but for Longhorn (the codename of the next desktop version of Windows) there has been little announced or confirmed. Looking for confirmed facts about Longhorn, and possibly more importantly the versions of Windows AFTER Longhorn is like the proverbial search for the needle

Bill Gates likes to be in control. And if he can't be, then he does whatever it takes to take the control out of someone else's hands. At least that is how it appears to me. The sad part here is I'm actually a pro-MS person.

martinibuster
It says search not crawl. I don't know. Same thing?

martinibuster

5:59 pm on May 6, 2003 (gmt 0)

It says search not crawl.

Can't search what you haven't crawled. Can't crawl if you don't have a crawler.

One aim of Longhorn seems to be to integrate search into the desktop environment.

korkus2000

6:11 pm on May 6, 2003 (gmt 0)

I think this has to do with the integrated search on the next OS. I also think microsoft is savvy enough to take on a SE and see a return. MS has the history of building a program and using it everywhere forever. If they do build a crawler, which it looks like they have, I would suspect to see it in all programs and out on the web in every shape and form you can think of.

jim_w

9:15 pm on May 6, 2003 (gmt 0)

martinibuster

One aim of Longhorn seems to be to integrate search into the desktop environment.

Are you saying you see it as being an entire index and search facility on the desktop? I think of searching in and of itself on the desktop, and an indexing system, as being 2 different items. Maybe I'm splitting hairs, but I see "integrate search" as not necessarily indexing. But it could be filtering of indexed results. Or are you talking about filtering results from a MS SE? Because if that's the case, it's just a question of where those results are gathered from. All the components to do what I have said are in place. It's just a question of tying them together and making them run fluid and transparent. Why would MS make it so easy for anyone to build a search engine using .NET to compete with them? I don't think I would do that.
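To make the indexing-versus-searching distinction concrete, here is a toy sketch. It is purely illustrative and says nothing about how Longhorn or any MS crawler works: `build_index` and `search` are hypothetical names, and a real engine would add crawling, ranking and much more. The point is only that the index is built once from documents someone has already fetched, while a search merely looks things up in it.

```python
from collections import defaultdict


def build_index(documents):
    """Indexing step: map each lowercase word to the set of document
    names that contain it. Needs the full text of every document."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in text.lower().split():
            index[word].add(name)
    return index


def search(index, query):
    """Search step: return the names of documents containing every
    word of the query. Only reads the index, never the documents."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results
```

A desktop "search" box could call something like `search` against an index built elsewhere, which is why integrated search does not by itself imply that every desktop runs its own crawler.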

I hope not every Tom, Dick and Harry and of course Tina, Diana, and Hilary, (to be politically correct), will be able to create an indexing system on their desktop. It is already too hard to keep out email harvesters, etc. That would make it impossible. Not to mention, people selling bandwidth are going to get rich pretty darn quick and a lot of ppl that can't afford to pay for any more bandwidth could be out of business.

Microsoft, I said, has a product that could, if they expand on it, compete with me. It deals with a business culture that was invented by Motorola in the 80's called Six Sigma. Six Sigma has been adopted by such companies as GE and Honeywell, to name just a couple. Six Sigma, for it to work, requires a return on any investment in a very short time. This is based on the fact that a lot of companies spend too much on R&D and tooling up before they can get a return on the investment. Then the technology is obsolete and they never get any return. It also deals with measurements in the PPM (parts per million) ranges.

korkus2000

I also think microsoft is savvy enough to take on a SE and see a return.

Yes I agree, they are savvy enough to do it, but, to me and based on my Six Sigma training, it is a question of after tooling up for such a project, could they go head-to-head with not just google, but at least yahoo also, and make a profit in a reasonable amount of time.

Didn't I read here somewhere that google has over 100,000 or was it over 50,000 machines involved in the operation? (hell, I could have dreamed it, literally) The cost of the hardware, and the time to put it together, debug it, etc. would be years. Google didn't start with that many, I'm sure they grew into that many a few at a time. So MS would have to start with that many and would still have to play catch up. Everything would have to go just right for MS if that's the case. There is no margin for error, and that goes against the Six Sigma philosophy. Remember, what just six months ago, ad revenues were down, not only on the internet, but print ads as well? And if ad revenues could fall for no apparent reason, what else could? Traffic in general?

Look at MSN, it was supposed to be the end of AOL as I recall. Now, while I realize that AOL/Time Warner shot themselves in the foot somewhat, MSN has not ended them. I don't see Microsoft making the same kind of mistake again. I think Gates is too smart for that. But maybe I'm giving him more credit than he deserves.

If they do build a crawler, which it looks like they have

I've seen so much stuff published on the internet that was wrong, I don't believe anything anymore until I have lots-o-proof.

I would suspect to see it in all programs and out on the web in every shape and form you can think of.

Curses! Maybe I'm just wishing that only larger companies will be sending out bots. Maybe if we all wish together really, really hard, only large companies will have the resources to do it. Of course like my mother used to say, "wish in one hand and pee in the other, ..."

I'm probably wrong, but it is something to think about. I have a feeling that one way or another our jobs are going to get harder.

martinibuster

10:17 pm on May 6, 2003 (gmt 0)

Or are you talking about filtering results from a MS SE?

Speculating, yes.

This is something that is integrated into Longhorn, "a refined search interface that lets users dig through local files, contacts, and the Internet."

Longhorn will also feature a brand new file system dubbed WINFS (Windows Future Storage), that intends to give users greater access to their information.

Integrating the web into the desktop environment has been a longtime aim of Microsoft. My speculative point is, if they are going to give users access to internet search, doesn't it make sense to give them a Microsoft crawled and controlled internet database?

and make a profit in a reasonable amount of time.

One word: X-Box.

jim_w

4:03 pm on May 7, 2003 (gmt 0)

Speculating, yes.

Well, only time will tell. I think you will agree, that it would/will/is a major project. And I think that they will need some of their top R&D people on it. And I sure would not want to be the network administrator.

One word: X-Box.

But isn't that like comparing apples to oranges? Don't 3rd parties make games for the X-Box? I see the only way to compare the 2 as either MS exclusively making all the games for X-Box and having their own SE, or 3rd parties making games and MS using other search engines.

martinibuster

4:27 pm on May 7, 2003 (gmt 0)

But isn't that like comparing apples to oranges?

No.

My statement is a response to your questioning if MS would embark on a project if there were a question of their ability to "make a profit in a reasonable amount of time."

MS has a history of gritting their teeth and losing massive amounts of money for the sake of the long term goal. X-box is a perfect example of them knowingly losing money for the long term.

Red Herring article [redherring.com] (from last year)

Microsoft expects to lose $750 million in the current fiscal year ending June 30 and another $1.1 billion in the next fiscal year, according to a source familiar with the matter.

jim_w

4:40 pm on May 7, 2003 (gmt 0)

Fair enough

mil2k

8:29 am on May 16, 2003 (gmt 0)

Wow! What a turn this thread has taken. From Bots(of which i know nothing) to Microsoft(of which i know something!).

Windows 98 had an inbuilt browser (IE 4) as a part of its interface. The real reason was the browser war (with Netscape) and the aim of making their product dominant.

Windows XP has an inbuilt Compression utility that infuriated third party vendors like Winzip.

Longhorn has "a refined search interface that lets users dig through local files, contacts, and the Internet."

This just fits into the Microsoft Pattern and goes with their aim of competing with google.

Their desired result is Microsoft everywhere for Everything.
