
Google News Archive Forum

This 156 message thread spans 6 pages; this is page 2.
Google cache raises copyright concerns
Clark

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 15143 posted 9:09 pm on Jul 9, 2003 (gmt 0)

Everyone loves to write about Google:
[news.com.com...]

 

the_nerd

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 15143 posted 8:48 am on Jul 10, 2003 (gmt 0)

looks like they will have to change to "opt-in" soon (after their IPO there will be a lot of cash around to go for). If they use a long enough transition period (half a year maybe) - most people who want their stuff "cached" will have time to add something to their robots.txt-file

Morgan

10+ Year Member



 
Msg#: 15143 posted 9:01 am on Jul 10, 2003 (gmt 0)

Good point Brett. Or points rather. I don't know if taking away their displays of cached pages will hurt them too badly, but it will be interesting to see how it all turns out.

gsx

10+ Year Member



 
Msg#: 15143 posted 9:14 am on Jul 10, 2003 (gmt 0)

Of course they need to make caches of the pages for indexing purposes. What do you think AltaVista, Inktomi and FAST do?

But there is a difference - caching the page for Google's own use is like taking a photocopy of a book, pulling out points from it and giving the book's author some free advertising out of it.

To publish the whole book without permission (i.e. make the cache public) would violate copyright; there is no other way it can be looked at.

Josk

10+ Year Member



 
Msg#: 15143 posted 9:28 am on Jul 10, 2003 (gmt 0)

I've just got in to work, and I see that the Internet has taken *me* Way-Back-When to 2001... Since when is any of this news? I tend to see this as a slow news day more than anything else. And incompetence on the part of NYT's admins.

Pages get cached every few seconds, Google just lets you see what they saw. At least you know what Google sees... But what about Altavista, Inktomi and AlltheWeb? The cache has often been valuable to me to see exactly what was retrieved by Google...

GlynMusica

10+ Year Member



 
Msg#: 15143 posted 9:39 am on Jul 10, 2003 (gmt 0)

Interesting post Brett.

kaled

WebmasterWorld Senior Member kaled is a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 15143 posted 9:41 am on Jul 10, 2003 (gmt 0)

There are lots of services where you don't Opt-in that share your information without asking. For instance, a phonebook will display your name, address and phone number unless you Opt-out.

Generally, copyright on this information does not belong to you, it belongs to the publisher. If you were to copy and republish a page from a phone book, you would not be breaching hundreds of individual copyrights, you'd be breaching the copyright of the phonebook's publisher.

As for opt-in VS opt-out, I don't know how the wind is blowing in other countries, but here in the UK, there have been many legal challenges to opt-out policies and the rulings usually favour an opt-in policy.

In practical terms, I think Google will keep the cache until someone launches a lawsuit against them. However, who is likely to do that? No large company is likely to do this.

Kaled.

PS
I had a quick look at the robots.txt definition and could see no reference to caching. How do I switch off caching using a single robots.txt file rather than adding a meta tag to all my pages?
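
On the PS above: as far as the original robots.txt convention goes, there is indeed no caching directive, so the per-page robots meta tag with "noarchive" that comes up later in this thread is the route left. For anyone who does not want to hand-edit every page, here is a minimal sketch of automating it. The function name `add_noarchive` is made up for illustration, and the regexes are deliberately simple; they assume ordinary markup with a `<head>` tag and a `name` attribute that comes before `content`.

```python
import re

NOARCHIVE_TAG = '<meta name="robots" content="noarchive">'

# Matches a robots meta tag that already carries "noarchive"
# (assumes the name attribute appears before the content attribute).
_HAS_NOARCHIVE = re.compile(
    r'<meta[^>]*name=["\']robots["\'][^>]*noarchive', re.IGNORECASE
)
# Matches the opening <head> tag, with or without attributes.
_HEAD_OPEN = re.compile(r'<head(\s[^>]*)?>', re.IGNORECASE)


def add_noarchive(html: str) -> str:
    """Return html with a robots noarchive meta tag placed right after <head>."""
    if _HAS_NOARCHIVE.search(html):
        return html  # already opted out; leave the page untouched
    match = _HEAD_OPEN.search(html)
    if match is None:
        return html  # no <head> found; do not guess where the tag belongs
    return html[:match.end()] + "\n" + NOARCHIVE_TAG + html[match.end():]
```

Run over every HTML file in a site, something like this would give the same effect as adding the tag by hand. It is still opt-out page by page, which is exactly the complaint raised in this thread.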

Clark

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 15143 posted 9:44 am on Jul 10, 2003 (gmt 0)

Looks like everyone is taking a side and repeating each other now [edit, wow a couple of posts while I was typing]. I was just waiting to hear what legal experts have to say and Brett answered that one.

Looks like 'G' had a nice run of luck and will milk it as long as they can. Fair enough. There are so many legitimate reasons to like the cache but none of them make it legal nor address the moral implications of copyright.

The ones using the Robots.txt as an argument assume a lot of knowledge on the part of people that they simply don't have. I'm pretty savvy technically compared to the average population. I'm decent at unix,perl,php,mysql and a bunch of other languages and I've never touched a robots.txt file nor do I know how to.

I've looked at the spec but it looks complicated for the average user who might use geocities or frontpage or pay a webmaster to display copyrighted content.

Why should G expect that person to even know about Google, let alone how to format a robots.txt to protect their content? Not having heard of Google is no reason for their copyrighted work to go unprotected.

P.S. Although Brett did not answer what happened to that "GG says" page if you do a search at WW you'll see that he killed it because of copyright violation.

[edited by: Clark at 9:48 am (utc) on July 10, 2003]

lasko

10+ Year Member



 
Msg#: 15143 posted 9:46 am on Jul 10, 2003 (gmt 0)

So will we see:

<Meta name='cache' content='Cache me'>

Or will it be

<meta name='cache' content='I hereby state that permission is granted to google to cache my web site'>

Either way

I think the newspapers don't like the Google news feature, which is surprising when I bet a lot of the newspapers' traffic comes from Google.

You just can't please everyone these days :(

Hester

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 15143 posted 10:36 am on Jul 10, 2003 (gmt 0)

The Google cache is useful, but I don't use it every time I search. Without it, I see no reason for Google to continue growing.

The point here seems to be that all SE's use a cache, but Google are allegedly breaking the law by publishing their cache. Google's only defence may be that the contents are "temporary".

Now surely the WayBack Machine is an even bigger culprit as it is archiving millions of sites on a permanent basis? How do they get away with that one then?

Chris_R

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 15143 posted 11:56 am on Jul 10, 2003 (gmt 0)

I think Brett brings up some good points, but this is what I would argue if I was Google's lawyer:

1) People use Microsoft Internet Explorer to find and access webpages.
2) It is a tool.
3) Microsoft Internet Explorer has "BRANDING" on the title bar for every web page it accesses.

AND

1) People use Google to find and access webpages.
2) It is a tool.
3) Google has "BRANDING" at the top of the cache for every web page it accesses.

The cache is fair use - it is no different than a translator or other thing that does something to a webpage - it allows users to find the word they are looking for on that page.

I'd be willing to bet money the google cache holds up in court.

I think EVERY major search engine caches pages - they just don't all let you see it (I could be wrong about this). If that is the case - if they copy it and no one sees it - is that legal if Google's is illegal?

> The Google cache is what built Google. No cached pages - no Google.

I know you strongly believe this, but I don't see it. Google had a search engine that actually crawled a large portion of the internet and indexed it relatively well and in a timely manner - during a time when no one else did. I think that is what built google.

Hester

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 15143 posted 12:15 pm on Jul 10, 2003 (gmt 0)

The difference between your two comparisons still falls short of the law. The reason is that in the first example, you are going directly to a user's website, hosted on their server, or a legally acceptable copy server used to mirror it.

But when a visitor views the page through Google, they are no longer seeing a page controlled by the site owner. It is now under Google's control.

If the site owner suddenly updates their page, people visiting the site will see the updated version. But in Google's cache, it is still the old version.

To widen the argument though - does Google indeed cache ALL of the site? Or just the top ranked pages? If the latter, then it's not copying sites en masse but merely providing a snapshot.

argots

10+ Year Member



 
Msg#: 15143 posted 12:19 pm on Jul 10, 2003 (gmt 0)

Brett, I simply don't understand your case that the cache built Google.

I think the vast majority of users who have used the cache (probably a minority of all users) have only clicked it when the link to the actual site is down.

To me, this is in the realm of "nice to have" but is not a make-or-break feature.

Can you explain the importance of the cache?

mfishy

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 15143 posted 12:28 pm on Jul 10, 2003 (gmt 0)

<<>What's wrong with meta "NOARCHIVE"?

Because it is Opt Out. You can't opt out on illegal matters. It's like saying if you don't have a sign on your front yard that says stealing is not ok, then any one can help themselves to your stuff. >>

Hehe, that pretty much sums it up.

Also, keep in mind that not everyone is a WW member and most probably have no idea how to prevent this or know about it at all.

zeus

WebmasterWorld Senior Member zeus is a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 15143 posted 12:28 pm on Jul 10, 2003 (gmt 0)

Once again a typical USA thing: get as much money as possible out of those companies that have success. I do understand why so many European companies are afraid of moving some of their business to the USA; it doesn't take long before you have some kind of lawsuit hanging over your head, or other complaints.

Now that everyone in the US knows Google is a success, they just want to cash in somehow.

I do LOVE the USA, but damn, you have some serious problems there.

zeus

aspdaddy

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 15143 posted 12:31 pm on Jul 10, 2003 (gmt 0)

IITitan said:

>a brief abstract of the article, asking one to pay money to read those articles

This is so annoying and it's happening more and more. I reckon companies are strategically doing this, putting premium content out for spiders and then charging for it once it's been listed.

If they are, they should accept that it can be cached and copied, not just by Google but by ISPs and users too.

Clark

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 15143 posted 12:35 pm on Jul 10, 2003 (gmt 0)

Although I don't agree that the cache MAKES google, it does help keep G less spammy. It defends against cloaking and mishmashing keywords en masse. Here's an example.

Say you're targeting AltaVista for the keywords butter and guns. You can enter hundreds of related keywords into a homegrown program. It will create a few hundred thousand pages with varying amounts of your target keywords, so you cover basically every keyword density, combination, keywords in title, etc. Then, when you detect that AltaVista is crawling, you show AV your fake content, but when their users click on the link, you deliver the same one page selling guns and butter.

Do that to Google, with its public cache, and you will be quickly caught and banned.

I don't know if that's what Brett was thinking of, and I don't have experience with such programs but did learn about it here. Maybe someone else w/ experience can tell us if G is still susceptible or if the cache stopped them? And if AV and the others are susceptible (thereby reducing their viability?)
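
The point about the cache exposing cloaking comes down to a comparison: the cached copy shows what the crawler was served, and anyone can set it against what a normal visitor gets. A rough sketch of that comparison follows. It is purely illustrative and not a description of how Google or any other engine actually detects cloaking. The function names and the 0.5 threshold are assumptions picked for the example.

```python
import re


def word_set(text: str) -> set:
    """Lower-case the text and return its set of distinct words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def similarity(crawler_copy: str, visitor_copy: str) -> float:
    """Jaccard similarity of the two word sets: 1.0 identical, 0.0 disjoint."""
    a, b = word_set(crawler_copy), word_set(visitor_copy)
    if not a and not b:
        return 1.0  # two empty pages count as identical
    return len(a & b) / len(a | b)


def looks_cloaked(crawler_copy: str, visitor_copy: str,
                  threshold: float = 0.5) -> bool:
    """Flag the page when the two copies share too little vocabulary.

    The threshold is an arbitrary illustrative value, not a known cutoff.
    """
    return similarity(crawler_copy, visitor_copy) < threshold


# Keyword-stuffed page shown to the crawler vs. the sales page shown to users.
print(looks_cloaked("cheap butter churns and dairy recipes",
                    "buy guns and butter here"))
```

Pages legitimately change between a crawl and a visit, so a real check would need to allow for that. The sketch only shows why having both copies in public view makes the trick harder to hide.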

Brett_Tabke

WebmasterWorld Administrator brett_tabke is a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 15143 posted 12:39 pm on Jul 10, 2003 (gmt 0)

>understand your case that the cache built Google.

It's a case study in the success of branding. It's the most powerful tool Google has in its arsenal. It is so hard for the general public to understand the power of branding that most won't get it.

Google has been almost too successful at their branding. They wanted to make themselves synonymous with net searching and now their name is being used in place of searching as a verb: to google something.

1) all net studies have shown that the more time a user spends on a site, the more likely that site is to be successful.
2) all marketing studies show that branding is paramount to success for the largest sites on the web.

The longer you keep someone on your site looking at your branding, with your logo and your url in the address bar, the more successful you are going to be. The google cached pages (page jacked pages) keep users at google, looking at the google branding ad and on the google site.

It was the Google branding cache that built google. No branding cache page - No Google. It's bigger than everything else google did. Google is a textbook testimony to the awesome power of branding.

heini

WebmasterWorld Senior Member heini is a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 15143 posted 12:40 pm on Jul 10, 2003 (gmt 0)

Zeus, I'm pretty sure Google would not stand a better chance to defend their case in most European countries.

Also I would not underestimate the usage of the cache feature. It's fast, it's safe (no popups etc), it has the highlighting - it's essentially a bit like what AOL, Compuserve etc were offering: the web without going into the web. Yes I agree: the cache was an important part of the googlification of the web.

Over time Google must have taken away an incredible amount of traffic from the web.

Clark

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 15143 posted 12:42 pm on Jul 10, 2003 (gmt 0)

Google's logo on Yahoo didn't hurt either :)

lasko

10+ Year Member



 
Msg#: 15143 posted 12:44 pm on Jul 10, 2003 (gmt 0)
I have just read the New York Times cached pages from http://web.archive.org - some very good headlines and stories etc.

I cannot see this reaching the courts; if it does then there will be big changes in the Internet world.

Is it not possible for google to request a meta tag for permission to cache pages?

I know it would take a long time to catch on and many web sites may forget to include it but a simple

Meta name='cache' content='all'

then google will display the cache link for only those sites that have it.
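
To make the opt-in idea concrete, here is a small sketch of how a crawler could check for such a tag before showing a cache link. The tag `name='cache' content='all'` is only the proposal from this post, not something any search engine is known to honour, and `may_show_cache_link` is a hypothetical helper name.

```python
from html.parser import HTMLParser


class CacheOptInParser(HTMLParser):
    """Records whether a page carries the hypothetical opt-in meta tag."""

    def __init__(self):
        super().__init__()
        self.opted_in = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names for us.
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if (attr_map.get("name", "").lower() == "cache"
                and attr_map.get("content", "").lower() == "all"):
            self.opted_in = True


def may_show_cache_link(html: str) -> bool:
    """Return True only if the page explicitly opts in to caching."""
    parser = CacheOptInParser()
    parser.feed(html)
    return parser.opted_in
```

The design choice that matters is the default: a page with no tag at all returns False, so silence means no cache link, which is the reverse of how "noarchive" works.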

I have looked at two google caches today because I needed support from a forum, which was great.

Hester

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 15143 posted 1:00 pm on Jul 10, 2003 (gmt 0)

Brett: It was the Google branding cache that built google. No branding cache page - No Google. It's bigger than everything else google did.

Now you clearly know a lot more than me about Google, but from a user's point of view, I don't see that this argument is true. If the cache were so important everyone would be talking about it (on other forums and topics). It would be the one thing you thought of when you thought of Google. But it is not. I dare to suggest that most users aren't even aware of it. They click on the main link and they're happy.

I've only used the cache when the main link was down, which is hardly ever.

To me, the brand name of Google refers to a) its speed, b) its large base of sites, c) the Images search facility and d) the logo. It's always fun to see it changed whenever there's a holiday or important event.

If the cache disappeared tomorrow, to be honest I don't think I would even notice! Nor, I guess, would the general populace.

I see what you mean about them using the cache to brand it with their name, but I simply don't see it as a major event. And certainly not one that "built" their reputation.

heini: Over time Google must have taken away an incredible amount of traffic from the web.

I strongly disagree. It has pointed people to new sites they never knew existed. I bet someone probably typed in "webmaster" and found this forum. Even if people were driven away from the main sites to look at the cache, I'm pretty sure they would want to then visit the main site later to see more of it.

Fearless

10+ Year Member



 
Msg#: 15143 posted 1:02 pm on Jul 10, 2003 (gmt 0)

So many web surfers don't get it when it comes to copyright rules. They basically feel "if it's on the web - it's mine and I can use it however I want." I'm sorry to say this, but most of the esteemed WW members are utterly missing the point with comments like this:

> I think Google's caching is invaluable for a number of purposes and I can completely understand why they do it.

Just because the cache feature is convenient doesn't make it legal. This is the argument of music pirates: it's easy, everybody does it, therefore- it's OK.

Google cache feature and image cache feature both are egregious copyright violations.

If I create a copyrighted photographic image and Google puts it on THEIR web site for commercial gain (The Plex IS a for profit enterprise!)

That's a copyright violation. Simple.

Brad

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 15143 posted 1:31 pm on Jul 10, 2003 (gmt 0)

Google is vulnerable. Someday, somebody with very deep pockets will launch a lawsuit and tie up Google in a mess of lawyers for _years_, should Google try to fight it.

I expect Google will phase out the cache either right before IPO or shortly after. Some of their direct competitors could use it against them.

In any event Google has had plenty of warning, they can only blame themselves if somebody calls them on it.

Sinner_G

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 15143 posted 1:36 pm on Jul 10, 2003 (gmt 0)

I agree that the Google cache is VERY shaky when it comes to legality. So the question I ask myself is why it has never come into court? And the answer is because most people find it very useful. To me it is a bit like prostitution (hope no one at Google will be upset because of that comparison). It is also illegal in most countries, but still exists and has never been seriously threatened because there are too many people that find it useful.

On Brett's case that no cache = no Google, I must disagree. First, I am one of the majority of users who never (or only very seldom) use the cache. Simply because on the result pages, the link to the original page is much more prominent than the cache link, so that's where most people go. And if you go there, you will notice that the page title is still the real one, not something like 'Google Cache - Whatever Site'. And looking at the page itself, the Google part of it is kept very straight and dull, nothing eye-catching => No branding there. The only coloured element (but for the links) is the Google logo (ok, some branding), but if you are looking at any regular web page you won't notice it.

Josk

10+ Year Member



 
Msg#: 15143 posted 1:41 pm on Jul 10, 2003 (gmt 0)

> Google cache feature and image cache feature both are
> egregious copyright violations.

> If I create a copyrighted photographic image and Google puts
> it on THEIR web site for commercial gain (The Plex IS a for
> profit enterprise!)

Um... Have you ever read the text that Google puts on the cached page...?

"Google is not affiliated with the authors of this page nor responsible for its content.". To me this indicates that Google is *not* the copyright holder of the page. Come on now... When you write content do you then distance yourself away from it? Google even put a link to the real owners of the website... How many copyright violators do that?

nipear

10+ Year Member



 
Msg#: 15143 posted 1:46 pm on Jul 10, 2003 (gmt 0)

What would everyone here say if I built myself a nice little bot and indexed a couple million pages on the web in a niche area, let's say sports.

So now I use this info for SportSearch.com and I spend a couple million on getting the word out. Now one great feature of my site is my cache feature. In fact you don't even have to click to the other site because I have it all right here. I brand the top like google, and I toss on 2 nice text links while I'm at it. And I'll go one step better than google on my cache pages and make them spiderable. Also I'll get even more branding by having the URL of my cache pages be something like SportSearch.com/file/42355643566.htm. This way you get a double dose of SportSearch.com branding.

Now when you search on SportSearch I show results just like google except the title tag link is to my "cache" version and you have to click on the smaller link below to actually go to the website. Also I don't feel the need to let you know the URL of the site in question so I don't show it in my search results.

What is really different between this scenario and Google? The placing of a link? The subtraction of a URL?

Clark

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 15143 posted 1:52 pm on Jul 10, 2003 (gmt 0)

> Um... Have you ever read the text that Google puts on the cached page...?

> "Google is not affiliated with the authors of this page nor responsible for its content.". To me this indicates that Google is *not* the copyright holder of the page. Come on now... When you write content do you then distance yourself away from it? Google even put a link to the real owners of the website... How many copyright violators do that?

Um, that's even worse. They get the benefit of using someone else's content on THEIR page, but responsibility belongs to the SITE owner. Screwed both ways.

Sinner_G

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 15143 posted 1:53 pm on Jul 10, 2003 (gmt 0)

> What is really different between this scenario and Google?

Legally, I would say not much of a difference. Ethically, a huge one, as you would really be trying to get people to visit your page first and the original only as a second solution.

As I said before, the legal case is clear.

Hester

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 15143 posted 2:01 pm on Jul 10, 2003 (gmt 0)

Surely Google though works by offering a link to the actual website first. Only in a smaller font underneath does it add a link to a cached version. So they could claim that it is there as a second option, not the main option. The case might then be thrown out of court.

Also I don't see Google doing it to discredit the owners of the original site. Nor to make vast amounts of money. All they're doing is providing another angle to their main service - turning searches into links to other websites.

Josk

10+ Year Member



 
Msg#: 15143 posted 2:06 pm on Jul 10, 2003 (gmt 0)

> What is really different between this scenario and Google?
> The placing of a link? The subtraction of a URL?

Yes... Google don't try and pass off the page as theirs. You would be. If you don't make it clear that the page is not intended to be yours, then people will assume that it is. And then you are violating the copyright of the owners by displaying it.

> As I said before, the legal case is clear.

If the legal case is so clear, how come no-one is suing, or has sued, Google? Lawyers (so the stereotype goes) are money-grabbing so-and-so's. I'd have thought that if there was a case against Google then there would have been one by now.

Clark

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 15143 posted 2:12 pm on Jul 10, 2003 (gmt 0)

Is anyone that has commented so far actually a lawyer? I tend to give the most credit on the legal issue to Brett since he actually spoke to lawyers about it and they seem unanimous.

On a practical level of webmasters and Google, let's face it, who wants to make waves with Google? The little bit they're "taking" with the cache is nothing compared to what they give (free traffic). Complain about the cache on your site and they will remove it. But they may remove your site altogether too. You would be a fool to make a stink.

As to why they haven't been sued I think it is a little more complicated than that. I think with a copyright you need to write a cease and desist letter first. Give them a "reasonable" amount of time to take it down. If you do, Google will certainly take it down quickly.

In order to sue them successfully, you would have to skip that step and say that they caused you damages by their cache. How do you prove that?

The above is my understanding of the law. However, I am not a lawyer and don't even play one on TV.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved