
Google News Archive Forum

"Mr. Anti-Google"
Our own Everyman is on Salon!
GoogleGuy




msg:147007
 5:22 pm on Aug 29, 2002 (gmt 0)

Haven't seen anyone else mention it, so I thought I'd point out that our own Everyman has an article in Salon today. The story has also been mentioned on geeknews and Slashdot.

The user comments are pretty negative, so I'll try to pull the balance back the other way. I always appreciate hearing Everyman's perspective, even though we've got different views of some things, e.g. how Google ranks internal pages from a site; I think we do a good job of that. If you haven't read Everyman's "search engines and responsibility" thread and his google-watch.org site, I encourage you to. That said, I do disagree with statements like "Eventually, a FAST-type engine should be administered by a consortium of librarians who are protected civil servants of a world government." :)

Anybody have thoughts on the Salon article?

 

estjohn




msg:147127
 7:58 pm on Sep 1, 2002 (gmt 0)

Everyman, in your last post and previous posts you state a few things almost all people would agree with. Then you leap to several statements revealing a very bizarre, wrong-headed ideological root to your arguments.

Some of what you say is agreeable, such as the need for ethics in business, including at Google. I doubt anybody disagrees with that...it's a bit of a red herring. Also, I think most people would agree that businesses should be socially conscious; after all, that's good business: be considerate of the community you are doing business in (although I'm not sure that should be legislated).

Based on that, you then leap to this:

"Eventually, a FAST-type engine should be administered by a consortium of librarians who are protected civil servants of a world government. Or at least they ought to belong to the American Library Association, or something similar." April 13, Everyman, Webmasterworld

Given agreement on the assumptions of ethics in business, I don't see why this necessitates a search engine being under world government or a librarian association. Maybe you can clear that up.

I doubt your "social insensitivity of nerds" theory could be backed up with any data. However, I strongly disagree from an anecdotal standpoint. Most geeks (nerds as you call them) I know are actually very socially aware and conscious, more so than your average citizen. You are pepetuating a very ignorant, non-socially-conscious stereotype to say that geeks are not social and only care about their math or programming problems.

They are often more scientific and logic-minded in their approach but I don't think that precludes being ethical or socially conscious, do you?

john316




msg:147128
 8:23 pm on Sep 1, 2002 (gmt 0)

Would anyone use the library engine? If not, are you trying to hold SE's accountable to some international "deputy director of internet library systems"?

I'm not sure I get the point here; the theme seems to be pointing to "global village" ideology, which is just "one world government" wrapped up in warm fuzzies.

cminblues




msg:147129
 8:42 pm on Sep 1, 2002 (gmt 0)


OK, Everyman's posts leap from statements everyone can agree with to wrong ones [from my point of view], like:

Eventually, a FAST-type engine should be administered by a consortium of librarians who are protected civil servants of a world government. Or at least they ought to belong to the American Library Association, or something similar.

But there are at least some privacy issues [->Googlebar, cookies] that would have to be discussed in a 'logical' way.

It means taking seriously what it means to hold perhaps the biggest database in the world of Internet users' browsing.

While this database stays on the GoogleGuys' PCs, I'm quiet.

I'm not so quiet if I imagine some government/company putting its hands on it.

About the social insensitivity of nerds: I think the problem is not at all that the Google geeks are not socially conscious.
The problem is whether the GoogleGuys will delegate some part of their biz to the 'wrong' people.


Eric_Jarvis




msg:147130
 12:12 am on Sep 2, 2002 (gmt 0)

I'm afraid I've had to rather skim this debate due to time pressures...but something has been completely left out of the discussion, and it seems to be messing up a lot of people's perspectives

the goal of search engines does not have to be to produce the single perfect set of results for a search

for one thing that isn't possible...when I type "foo" into a search engine I am probably looking for information on a different aspect of "foo" than you are...whoever you are...in fact, when I go looking for information on "foo" in a few hours time I may want to get to a completely different type of site

not long ago I would use quite a number of search engines...Infoseek for definitive academic info...Northern Light for informed comment...Lycos for the popular stuff...and AltaVista if I needed something obscure...they all produced very different results from very different databases

somewhere along the line this has all been lost...almost every search engine has tried to deal with Google's success by copying it...so we now have any number of adequate generalist search engines, and pretty much no way of distinguishing between them

Google's use of PageRank works for Google...it gives a particular character to the SERPs...and coupled with a huge and frequently updated database it makes for an excellent search engine for a general search for moderately reliable data

the problem is that there is now nothing else out there...Direct Hit is vaguely useful for a search for the populist sites on a subject...and, if you can get access to it, Northern Light is still very useful...but there isn't the range of options any more

this puts Google in a position where we are expecting it to be all things to all men...that is impossible and unfair...Google is being asked to make up for the failure of other search engines to successfully position themselves in the marketplace

bird




msg:147131
 12:53 am on Sep 2, 2002 (gmt 0)

I was just thinking about the possibilities that Google would theoretically have when really storing all the searches every user with a Cookie has ever made. Then I thought a little more, and after that, I did a quick calculation.

We all know that they run in excess of 10,000 servers to store, process, and query their database of websites. This is necessary so they can satisfy more than 150 million searches per day, digging through the 2.5 billion pages they have crawled.

Now you might think that those 2.5 billion pages result in an enormous amount of data, right? If you think so, then just do the same calculation as I did, and you'll find that storing all the user search profiles for one year would result in more than an order of magnitude more data than that. And this is only for storing. If you add the overhead that would be necessary to perform any searches through this tracking data (without which storing it would be pointless), then you'll arrive at dimensions that are completely unmanageable even for a company like Google.

In other words, if you're worried that Google is tracking your every move, the answer is simple: They can't. The effort to maintain the infrastructure necessary to do that would be way beyond the resources that they can allocate for all their current business activities combined.

And even if they could: What motivation would they have to allocate 90% of their resources for user tracking, instead of for their actual business?

Everyman




msg:147132
 2:08 am on Sep 2, 2002 (gmt 0)

I disagree. If you store search terms, IP number, and time stamp under a single cookie ID, my calculations show that after compression, it would be about 40 bytes per search. 150 million searches per day means 6 gigs of data per day. This information is something that has potential commercial value, and it is certainly worth the cost of storing it, even if you can't effectively search it just yet.

And who says that Google is the one who's interested in the data? Maybe it's going out the back door to the National Security Agency. (Actually, there are laws that prevent the NSA from collecting data on U.S. citizens, but those laws have gotten a lot weaker in the last year. The NSA could "loan" the FBI the computer capacity to handle the information, and that might be sufficient to cover themselves legally.)

You don't think the spooks would pay for this sort of information, assuming that there was no risk of exposure? I think they'd pay $100,000 for each 6-gig disk. That would amount to $36 million per year, which probably covers Google's payroll and operating expenses.

The intelligence community has a budget of some $30 billion per year, so $36 million would be 0.12 percent of the U.S. intelligence budget. No one in Washington would even raise an eyebrow because the intelligence budget is secret.
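For reference, here is that arithmetic as a quick Perl sketch. All the inputs are Everyman's own estimates above (40 bytes per search, 150 million searches a day, $100,000 per disk, a ~$30 billion intelligence budget), not measured figures:

----BEGIN PERL SCRIPT----
# Back-of-the-envelope check of the figures above,
# using Everyman's estimates as inputs.
my $bytes_per_search = 40;           # compressed: terms + IP + timestamp + cookie id
my $searches_per_day = 150_000_000;

my $gb_per_day = $bytes_per_search * $searches_per_day / 1_000_000_000;
printf "Log volume: %.1f GB/day\n", $gb_per_day;                    # 6.0 GB/day

my $price_per_disk   = 100_000;      # hypothetical price per 6-gig disk
my $dollars_per_year = $price_per_disk * 365;
printf "Buyer's cost: \$%.1f million/year\n",
       $dollars_per_year / 1_000_000;                               # ~36.5

my $intel_budget = 30_000_000_000;   # ~$30 billion/year
printf "Share of budget: %.2f%%\n",
       100 * $dollars_per_year / $intel_budget;                     # ~0.12%
----ENDOF PERL SCRIPT----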

Google doesn't have to search it, they merely have to store it. Whoever ends up searching it could no doubt develop sophisticated programs to zero in on what they want. If you don't need the data in under one second, the task is not overwhelming.

The point is that Google admits that they store this information. What you're suggesting is that they don't really store it because they don't have the capacity, so they're lying to us when they tell us that they store it. This makes no sense whatsoever.

Key_Master




msg:147133
 2:16 am on Sep 2, 2002 (gmt 0)

I would be surprised if Google hasn't already been served with a warrant by the US Government for portions of their log files (in response to activities leading up to the 9/11 disaster).

Everyman




msg:147134
 2:27 am on Sep 2, 2002 (gmt 0)

Good point, Key_Master. And don't expect GoogleGuy to confirm or deny.

From my reading of the Patriot Act, a court order could be obtained for this information from a judge who would be required by law to issue the order without a showing of probable cause.

Secondly, it would be illegal for GoogleGuy to reveal that such an order was served.

So don't ask GoogleGuy, because he wouldn't be able to log into WebmasterWorld from jail.

Apart from all these paranoid (but quite within the realm of possibility nevertheless) scenarios, the bottom line is this: Why does Google persist in collecting this information?

cminblues




msg:147135
 4:04 am on Sep 2, 2002 (gmt 0)


Hehe, some facts about log storage.

We all know how servers speed up without the pain of disk I/O access time, which is the main source of hardware latency in web servers.

But if someone at Google decides to store logs..

Here are some numbers:

1]------------------
Let's examine a typical search, with only IP data [no UserAgent, 'ACCEPT: blahblah', or any other headers] and Cookie enabled:

----BEGIN REQUEST----
/search?hl=en&q=randkeyword1+randkeyword2+randkeyword3&btnG=Google+Search
Cookie: id=1234567890
----ENDOF REQUEST----

2]-------------------
Let's examine the same typical search, but now with ALL header data logged.

----BEGIN REQUEST----
/search?hl=en&q=randkeyword1+randkeyword2+randkeyword3&btnG=Google+Search
Accept: */*
Accept-Language: en-us
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0(compatible; MSIE 5.0; Windows 98; DigExt)
Host: www.google.com
Connection: Keep-Alive
Cookie: id=1234567890
----ENDOF REQUEST----

--------------
--BEGINTEST---
--------------
Note that, in my test, 'randkeyword-1-2-3' are all randomly generated text, different for every query.
Each is between a minimum of 4 and a maximum of 12 characters, with every character completely random.
This, I think, 'recalibrates' [from the compression software's point of view] for the rest of the data always being the same.

Result of test 1]:
1 million queries log file size [gzipped] -> 25,193,746 bytes
10 million queries log file size [gzipped] -> 251,914,576 bytes

Result of test 2]:
1 million queries log file size [gzipped] -> 25,472,646 bytes
10 million queries log file size [gzipped] -> 253,648,124 bytes
--------------
--ENDOFTEST---
--------------

As we see, the real data size [at least in this example ;)] is about 25 MB for 1 million queries,
and 250 MB for 10 million queries.

These amounts of data fit easily on many data storage devices ;)

[I've attached the brainless Perl script I wrote for this test]

----BEGIN PERL SCRIPT----
# Build a fake query log by repeating a template request $rep times,
# replacing each 'kw' token in the first line with a random keyword.
$minchar = 4;
$maxchar = 12;
$string = 'abcdefghijklmnopqrstuvwxyz0123456789.-+()';
$sizestring = length($string);
srand();

$rep  = $ARGV[0];    # number of repetitions
$file = $ARGV[1];    # template file: query line first, header lines after

if (@ARGV != 2) {
    print "Usage: perl $0 numrep filequery\n";
    exit 0;
}

open(R, $file) or die "Cannot open $file: $!";
$firstline = <R>;
while (<R>) {
    $querytot .= $_;    # header lines, repeated verbatim
}
close R;

if ($firstline !~ /\w/) {
    print "No data in $file\n";
    exit 0;
}

open(W, ">$file.OUT") or die "Cannot write $file.OUT: $!";
for (my $cc = 0; $cc < $rep; $cc++) {
    my $intline = $firstline;
    $intline =~ s/kw/&makerndkeyw()/eg;    # each 'kw' becomes a fresh random keyword
    print W "$intline$querytot";
}
close W;

print "Written file: $file.OUT\n";
exit 0;

sub makerndkeyw {
    # One random keyword, $minchar to $maxchar characters long.
    my $ret;
    my $len = $minchar + int(rand($maxchar - $minchar + 1));
    for (my $cc = 0; $cc < $len; $cc++) {
        $ret .= substr($string, int(rand($sizestring)), 1);
    }
    return $ret;
}
----ENDOF PERL SCRIPT----

P.S. Whoops, I forgot to include the client IP data in the log size calculation.. I'm an idiot.
But, given how the random keyword stuff in the script works, I don't think the real sizes would change much.
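For reference: assuming the script is saved as makelog.pl and the test-1 request block is in query.txt (both file names are made up here), a run along the lines of perl makelog.pl 1000000 query.txt followed by gzip query.txt.OUT should land in the neighborhood of the 25 MB figure above.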


Everyman




msg:147136
 5:15 am on Sep 2, 2002 (gmt 0)

So what you're saying is that it would take about 30 bytes (after adding the IP number) instead of my ballpark estimate of 40 bytes. The search terms can be compressed dramatically if you use a dictionary organized by word-frequency, so that common words require fewer bytes. Heck, let's say that 30 bytes is the upper limit.

So you need to store 4.5 gigs per day. That's about 8 CD-ROMs. Hire a high school kid to come in and burn 8 CDs every day at $5 per hour. Shucks, let's automate it -- we don't have 50 Ph.D.s on staff for nothing.

How much physical room do we need to store 8 CDs per day? A room about the size of a small bedroom, with lots of filing cabinets, would be perfect -- we'd have decades of storage capacity this way.
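Spelling that arithmetic out as a quick sketch (assuming ~650 MB per CD; the eight-a-day figure above just rounds up with some headroom):

----BEGIN PERL SCRIPT----
# Sketch of the physical-storage figures above: 30 bytes/search,
# 150 million searches/day, ~650 MB per CD (assumed capacity).
my $gb_per_day  = 30 * 150_000_000 / 1_000_000_000;   # 4.5 GB/day
my $cds_per_day = $gb_per_day / 0.65;                  # ~7 CDs/day
printf "%.1f GB/day, about %.0f CDs a day, ~%.0f CDs per decade\n",
       $gb_per_day, $cds_per_day, $cds_per_day * 365 * 10;
----ENDOF PERL SCRIPT----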

I have to insist that the storage of this information is the easiest thing in the world. I don't know what they're doing with this information, if anything, but I don't think anyone can seriously argue that storing it is beyond Google's capacity.

Key_Master




msg:147137
 5:26 am on Sep 2, 2002 (gmt 0)

I can't believe this is being argued. Of course they log visits.

[google.com...]

dantheman




msg:147138
 5:42 am on Sep 2, 2002 (gmt 0)

It's interesting to see a discussion that's not gushing about Google, because every other article or comment on the web does. (It gets a bit boring.) I agree that a 36-year cookie is absurd, and we have never been given an explanation from Google as to why they set it for so long.

I strongly disagree that google should be run by any form of gov't body. Could you imagine the committee meetings? "Well we sure are doing great everybody. Now how are we going to stay competitive in the years ahead?" "I know. Let's aim to have a steering committee setup within the next 12 months. Then the steering committee will select a panel (one member per continent) by the end of the following year who will prepare a list of items for discussion. We'll probably need to lobby for more funding by that point, but we should have the 5 year plan mapped out by 2006. All say "aye"..."

Ok, so I haven't got much time for gov't bodies. :) One last interesting point is that GoogleGuy raised this thread but hasn't really addressed any of the issues raised.

cminblues




msg:147139
 6:05 am on Sep 2, 2002 (gmt 0)


Yes, I'm saying that storing Google's access logs is not a problem at all, even logging Cookies etc.

Maybe a problem, in this scenario, is the performance of the distributed Google PCs driving the hard disks that write the logs.

But add the USA anti-terrorism-etc. laws..

On the other hand, I think that it's impossible for a major institution like G to remain completely free of government control.

I know it's a terrible hypothesis, but in that case I would prefer G over Altavista or some other company-based s.eng.

Apart from that, the 'artist-scientist/establishment' issue is a very ancient and sad one.


Brett_Tabke




msg:147140
 6:28 am on Sep 2, 2002 (gmt 0)

So what you're saying is that it would take about 30 bytes (after adding the IP number) instead of my ballpark estimate of 40 bytes.

IP: 4 bytes (possibly compressed to much less depending on log format; logs could be stored by a-b-c blocks).

Time: less than 1 byte (a running log that marks time, such as: 12:02am....all log entries... 12:03am).

Search term: I'd guess 15-20 characters per search would be average. After compression (tokenization), it would probably be stored in the 5-10 character range.

Total log line length: 10-15 bytes (max).

What else could/would be stored, like the cookie id, is unknown.

Worst case, times a few hundred million queries a day, and you're looking at less than 4-5 gigs a day in log files. Very doable.
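To make that concrete, here is one way such a compact record could be packed -- a minimal illustration of the scheme described above, with the field layout, token width, and dictionary all assumed, not anything Google has published:

----BEGIN PERL SCRIPT----
# Illustration only: pack one query into a compact binary log record.
# Assumed format: 4 bytes packed IP + 2-byte dictionary id per search word.
# Time is not stored per record; a marker would be emitted once per minute.
my %dict;           # word -> 2-byte token id
my $next_id = 1;

sub encode_record {
    my ($ip, @words) = @_;
    my $rec = pack 'C4', split /\./, $ip;      # IP in 4 bytes
    for my $w (@words) {
        $dict{$w} = $next_id++ unless exists $dict{$w};
        $rec .= pack 'n', $dict{$w};           # 2 bytes per word
    }
    return $rec;
}

# Hypothetical IP and query terms, purely for the size check:
my $rec = encode_record('216.239.51.100', qw(cheap widget reviews));
print length($rec), " bytes\n";    # 4 + 3*2 = 10 bytes, in the 10-15 byte range
----ENDOF PERL SCRIPT----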

>36 year cookie is absurd

It has to do with working around browser cookie bugs. Cookie programming is *ell. The safest course is to set a permanent cookie to the max allowable time range and let the browser figure it out.

Consider an ad broker's cookie (I'm not going to name companies, but we all know who the net's major banner servers/brokers are):

- you visit a search engine and search on something.
- you click a result that goes to a page that has graphic ads on it.
- your browser pulls the ad, and it is often redirected several times before the final image is found.
- during those redirects, your browser fesses up a referrer (often what you just searched on). That goes not only to the final banner server, but to all those redirects in between.
- plus, they also get the destination page you just visited, based on the page's affiliate id (the code that calls for the image in the first place).
- OK, you think it's done here? What happens when you repeat the process with another site that has the same banner broker!?

So instead of just giving up a search term to Google, you fess up every search term you make on every search engine and hand it to an ad server over years and years of time. Hello!? Now that - that's something to get worked up about!

A cookie at Google? In comparison to the ad banners, it's such a total and complete non-issue that treating it as one is naive and misleading. The cookie at Google is 100% their business. The cookie from the ad servers, who've been known to sell the data!? That's a whole different league.

Key_Master




msg:147141
 7:02 am on Sep 2, 2002 (gmt 0)

The cookie at Google is 100% their business.

I don't have an issue with Google using logged data in a manner consistent with their privacy policy (as it now stands). I think the concern lies with outside entities making this information their own business.

There is also the chance that Google's privacy policy will change, perhaps drastically (think Yahoo!), when they go public. Imagine having a lifetime's worth of a user's search history, multiplied by thousands of people...

dantheman




msg:147142
 7:15 am on Sep 2, 2002 (gmt 0)

I also agree that your average ad server has a woeful cookie management program. However, I don't think they're a great industry group to draw comparisons with, given their track record. I don't know enough about the tech side of configuring cookies, except that I'd regard any problems as easily fixable with 50 Ph.D.s on the team. We've seen Yahoo change their privacy policy a number of times as it suits them...will Google be any different? Time (and shareholder urging) will tell.


martin




msg:147143
 9:50 am on Sep 2, 2002 (gmt 0)

google-watch.org crashes Opera 6 on Linux

cminblues




msg:147144
 10:02 am on Sep 2, 2002 (gmt 0)


I have Opera 5.0b8 on Linux 2.2.16, with:

+ JavaScript
+ referrer logging
+ automatic redirection
+ accept all cookies

I browsed 'www.google-watch.org' and followed all the links.
All quiet.

Just sharing.


john316




msg:147145
 2:30 am on Sep 3, 2002 (gmt 0)

"I think they'd pay $100,000 for each 6- gig disk."

You could probably bribe a librarian for a lot less ;).

google-watch.org is also crashing opera 6.01 (linux).

Everyman




msg:147146
 4:07 am on Sep 3, 2002 (gmt 0)

Sorry about the Opera problems. The only thing that I can see that would crash any browser is the JavaScript that tries to steal a cookie from Explorer.

This line is executed only under this situation:

if ( navigator.appName == 'Microsoft Internet Explorer' )

If Opera 6.x under Linux returns true for this line, then it will try to execute something that makes no sense when you click for a cookie. It works fine on Opera 6.0 under Windows, which properly returns false for this line.

Or perhaps it's crashing on this line itself. If it crashes while loading the page, then that may be the problem.

Try disabling JavaScript. I'm open to suggestions as to how I can prevent Opera from executing it. Can I put it inside a condition such as,

if ( navigator.appName != 'Opera' ) or some such thing? Probably not.

I won't delete this exploit. It's too delicious for the 25 percent of Explorer users who get to see their very own Google cookie, including the date when they first got it, and the date when they last set a preference. It makes a more lasting impression on them, compared to a new sample cookie from Google.

savvy1




msg:147147
 6:47 am on Sep 3, 2002 (gmt 0)

Sorry to hear about your buggy browser, but this is not the place to report Opera bugs; it's probably better to submit it here:

[opera.com...]
https://bugs.opera.com/bugreport.cgi

FWIW, and to continue to be OT :) It didn't crash Opera 6.04 under Windoze XP for me.

Everyman, yes, you can browser-sniff and detect Opera.. though Opera commonly masquerades as IE, it's still possible to detect it. I saw a thread on it recently on one of the webmaster-type forums. I'm not a JavaScript expert, sorry. I believe it basically amounts to making the first condition you check the one for Opera...

Giacomo




msg:147148
 8:27 am on Sep 4, 2002 (gmt 0)

Just wanted to point out an excellent SEW article [searchenginewatch.com] addressing many of the issues being discussed here and in other recent WebmasterWorld threads.

Any comments?

Everyman




msg:147149
 1:49 pm on Sep 4, 2002 (gmt 0)

Only a Google fan club site would introduce a discussion of Google with an anecdote from The Brady Bunch TV show.

Why Sullivan has a PageRank of 9 is a more interesting question. PageRank skews toward the lowest common denominator.

Giacomo




msg:147150
 5:41 pm on Sep 4, 2002 (gmt 0)

Everyman, I don't see SEW as a Google "fan club", and I don't think the way PageRank is assigned is "skewed" or biased in the way you insinuate.

What I think is that you should start looking elsewhere than conspiracy theories if you want to find the reasons for your greyed-out PR bar.

Filipe




msg:147151
 7:22 pm on Sep 4, 2002 (gmt 0)

Everyman, I've wanted to keep this as fair a discussion as possible, but you keep throwing out one loaded statement after another. If you really want to make your point, make it objectively.

Search Engine Watch is "the lowest common denominator"? Search Engine Watch is a quality, content-rich site, the like of which we need more of on the Web. Danny Sullivan is knowledgeable and has proven himself time and time again to provide functional (not to mention interesting) information.

The lowest common denominator is those sites that have no content, very little quality content, or are not useful whatsoever. If you think otherwise, then you have a less idealistic view of how search engines and the Web should work than you originally portrayed yourself as having.

Furthermore,

Only a Google fan club site would introduce a discussion of Google with an anecdote from The Brady Bunch TV show.

Why do you think that? Google simply is the most popular search engine around, just as Marcia Brady was the most popular Brady girl. That's where the parallel lies. I don't know if you thought Mr. Sullivan was saying something highly opinionated - but it was just a little analogy describing what was already fact.

Everyman




msg:147152
 7:58 pm on Sep 4, 2002 (gmt 0)

My grayed-out toolbar is because google-watch is new. I've never implied otherwise.

Danny Sullivan's piece is a fawning pro-Google piece. I'll cite three examples of why.

1. Most of my site is about Google's pathetic privacy policies. Sullivan's only response is that other search engines do the same thing. In fact, most search engines were not using cookies that expired in 2038 until Google started doing it.

2. Sullivan mentions China's blocking of Google without making an obvious connection to Google's privacy policies. Let me say that if I were a Chinese official, I'd be just a little bit worried that the National Security Agency most likely has easy access to Google's search terms, IP numbers, and cookies from foreign countries, such as China, that are of national security interest. As a Chinese official, I'd be worried that the NSA is doing cluster and link analysis, both geographically and with the search-terms, to map potential pockets of pro-U.S. opinion within China. This would reveal potential recruitment opportunities for U.S. intelligence. It's a lot easier for the NSA to get this information from Google, than for the Chinese to filter it out of packets in mid-stream. Therefore, assuming that Google makes this information available to the NSA, it puts U.S. intelligence at an advantage. Reason enough to block Google, from the Chinese point of view.

3. Sullivan has easy access to key Google people, such as Matt Cutts, and merely lobs softball questions. Mr. Cutts is a former employee of the NSA with a top-secret clearance.

Let's face it, Mr. Sullivan writes pro-Google puff pieces.

Brett_Tabke




msg:147153
 8:34 pm on Sep 4, 2002 (gmt 0)

Let's leave the personal stuff for another time and place. It only ends up in a classic he-said she-said degradation.

Have you read ds's latest Google piece? (hmmm, I think it might be subscriber-only?) Very interesting - it comes near to using the M word (monopoly) in describing Google.

<added>
[searchenginewatch.com...]
</added>

[edited by: Brett_Tabke at 3:25 am (utc) on Sep. 5, 2002]

Filipe




msg:147154
 8:37 pm on Sep 4, 2002 (gmt 0)

Therefore, assuming that Google makes this information available to the NSA, it puts U.S. intelligence at an advantage. Reason enough to block Google, from the Chinese point of view.

That's pure theory, and one without much ground to support it. Everything surrounding China's banning of Google has to do with the Chinese government's fear of subversion. It's not just SEW that's reporting this - read about it anywhere, and you'll be one of the few people (if not the only person) who thinks they fear Google leaking information to U.S. intelligence.

Mr. Sullivan does not wholeheartedly defend Google. He says that the suggestion that before Google we couldn't find information on the Web is "a popular, growing myth." He also notes that several features people think are unique to Google (e.g., the "Google effect") are things other search engines have done for years. He doesn't claim that they're perfect. Sullivan: "Sites do get accidentally dropped [from Google's Index]."

He is also not simply touting Google. What is he to say? "Google isn't so hot.", "Don't focus on Google.", "Google isn't the biggest search engine out there"? There's little he says that isn't fact, and what comes off as opinion, he has grounds to support.

Like you suggest, PageRank in the webmaster's view has become too important. Google itself tries to point out that while PageRank is a fundamental aspect of its unique technology, it is only a small part of it. Sullivan expounds on this, saying "Like Marcia [Brady]'s injured nose, everyone pounding on about Google's link use is obscuring the other features of its face." He doesn't go on to say, as you might suggest he would, that PageRank is a godsend and the answer to all our search engine woes.

He easily debunks your theory that only high-PageRank sites get listed on Google. Sullivan: "That's not true, because PageRank -- the importance of a page -- is only one part of many factors that Google uses to rank pages." He then goes on to explain the different factors involved.

Herein lies the big problem: Google's dominant popularity is in many ways its greatest weakness. It's what caused you to target them. It's what caused China to target them. Furthermore, it's also why the Chinese people who use the Internet are upset. Google is the biggest because people want to use it. They don't think it's spammy, as you suggest in your articles. Sullivan brings up another good point:

Today, no one is worried that Yahoo needs to be regulated. It remains an important search engine, but clearly it does not control what people go to on the web. And if Yahoo doesn't, then neither does Google. Indeed, the fact that Google has competition is the key reason that the company currently does not face the anger and concern that many hold for Microsoft.

Furthermore, he brings up an excellent point, which doesn't apply to you because you're not in their index yet... but you will be:

I would say the vast majority of [webmasters] are appreciative of the traffic they receive -- and they do feel they are getting that traffic because Google's system works.

I agree, and almost all webmasters on this forum would also agree.

Finally, if you fear a monopoly:

"We have very poor lock in. Microsoft has very high lock in," said Google CEO Eric Schmidt, when we spoke at Google's offices last month. "The switchover cost for you to move to one of our competitors is none. As long as the switchover costs are so low, we run scared. Everyday I wonder if there are very smart people at Berkeley coming up with a new algorithm..."

He plugs your site a few times, too. Furthermore, he links to you and goes into great, great depth about your site. He debunks a lot of your theories (in ways most webmasters would agree with, and not just because we're "Sullivan's sheep" -- most of the experienced webmasters on this forum are wise and can make their own decisions about what is right and what is not), but he also agrees with you:

Brandt is absolutely correct in one of his closing statements: "Overall, linking patterns have changed significantly because of Google," he writes, suggesting these changes aren't for the best. He's right. Cross linking for purely promotional purposes has gone haywire, and as the obsession grows -- and the industrial attempts to build link popularity rise -- Google and all the crawlers will be under increasing pressure to add something new to their mix to keep search results useful.

He is being as objective as he can be. Do you think this isn't so just because he disagrees with you?

dannysullivan




msg:147155
 1:24 pm on Sep 5, 2002 (gmt 0)

Not to belabor the point, but since Everyman made some specific accusations against me, I'd like to post my response.

> 1. Most of my site is about Google's pathetic privacy policies. Sullivan's only response is that other search engines do the same thing. In fact, most search engines were not using cookies that expired in 2038 until Google started doing it.

This wasn't an article about search logging and privacy issues. It was an article about the significant challenges Google faces in the wake of its continuing popularity. The privacy issues deserve their own article, and as I said in my Google article, it's something I intend to come back to. It's also something I've covered before, back in 1998, actually:

GlobalBrain To Offer Profile Searching
[searchenginewatch.com...]

and at the end of last year:

Google May Get Personal
[searchenginewatch.com...]

I also don't see how this point makes my piece pro-Google. To me, the issue isn't whether Google's cookies or anyone else's expire in one day, one week, one year, or one century. The bigger issue is what's done with search logs, period.

Google's cookies help it identify a particular machine as being unique, as opposed to knowing it only by IP address. In some cases, they might be able to tie a cookie to someone's email address (if they registered for Google Groups) or an advertiser account (if they registered for AdWords). For the vast majority of users, however, that cookie doesn't tell Google anything about who they are.

In contrast, a cookie at Yahoo may very well help Yahoo understand who you are personally, assuming you registered with Yahoo and didn't lie about your personal information. In addition, this also means that Yahoo can tie your personal searches to your unique identity. And, as I explained to the Salon reporter, Yahoo intends to make what I'd consider search engine history by making use of this data to deliver targeted email ads. Search for "cars" on Yahoo, for example, and if you've registered with them, you might get an email ad from Ford.

As I said, this is all something I expect to explore in more depth in a future article and why the search industry *as a whole* may have to adopt some more specific privacy policies about search logging.

> 2. Sullivan mentions China's blocking of Google without making an obvious connection to Google's privacy policies... Reason enough to block Google, from the Chinese point of view.

My idea as to the "obvious" reason why Google would be blocked is the same as many who've posted on this board, that Google's "cached" feature makes it easy for those in China to see pages that are ordinarily blocked. For that same reason, the AltaVista translation service also appears blocked. I also fail to see why this point made my piece pro-Google.

> 3. Sullivan has easy access to key Google people, such as Matt Cutts, and merely lobs softball questions. Mr. Cutts is a former employee of the NSA with a top-secret clearance. Let's face it, Mr. Sullivan writes pro-Google puff pieces.

Well, WebmasterWorld.com has even easier access to Google, given that Google proactively comes over here and posts. I wouldn't consider that to mean that the WebmasterWorld.com community simply embraces everything that Google does.

Filipe already provided some cites from my article that dispute the accusation that I wrote a puff piece (thank you), so there's no need to go into depth here with my own detailed responses.

I think the issue is not with the questions I'm asking so much as with the answers that come back. You don't agree with what Google says. Fine. I wholeheartedly support you and anyone else having your own viewpoints, and I'll publicize those, just as I did for you and your site. However, Google has a viewpoint as well. I think people want to hear that viewpoint, too.

przero2




msg:147156
 4:19 am on Sep 6, 2002 (gmt 0)

everyman, dannysullivan - as Brett said, "Let's leave the personal stuff for another time and place. It only ends up in a classic he-said she-said degradation." Personally, I am not in favor of anyone using this forum to make accusations about others' web sites and their PR, or drawing conclusions like 'PageRank skews toward the lowest common denominator' from one or two examples.

That being said, let's get this thread back to Google issues that IMHO need an explanation:

1) The 38-year cookie policy
2) PageRank issues with sites offering large amounts of content. When I see quality regional content -- granted, 10 sub-levels deep in the directory, but highly relevant to the topic -- buried down at PR0 or PR1, it bothers me!
3) Last, but the most important to me, is Google's cache. Why don't they make caching opt-in (rather than opt-out)?!

deltakits




msg:147157
 11:00 am on Sep 6, 2002 (gmt 0)

1) The 38-year cookie policy

Maybe I'm just naive...why do we care if it is a 38-year cookie? In my opinion, there are two parts to this...

1) If Google is NOT divulging this information, why do we care whether it's a one-day or 100-year cookie? It doesn't make any difference.

2) If Google IS divulging this information, it doesn't make any difference either whether it's a 38-year cookie or a 38-day cookie! It's bad either way! What am I missing?

3) Last, but the most important to me, is Google's cache. Why don't they make caching opt-in (rather than opt-out)?!

My opinion is that Google feels this is an important addition to their search engine, and if it were opt-in, how many people would know enough about it to KNOW to opt in? Or know what exactly they are opting in TO?
