Forum Moderators: open
But there is a difference - to cache the page for Google's own use is like taking a photocopy of a book, pulling out points from it and giving the book author some free advertising out of it.
To publish the whole book without permission (i.e. make the cache public) would violate copyright; there is no other way it can be looked at.
Pages get cached every few seconds, Google just lets you see what they saw. At least you know what Google sees... But what about Altavista, Inktomi and AlltheWeb? The cache has often been valuable to me to see exactly what was retrieved by Google...
There are lots of services where you don't opt in that share your information without asking. For instance, a phonebook will display your name, address and phone number unless you opt out.
Generally, copyright on this information does not belong to you, it belongs to the publisher. If you were to copy and republish a page from a phone book, you would not be breaching hundreds of individual copyrights, you'd be breaching the copyright of the phonebook's publisher.
As for opt-in VS opt-out, I don't know how the wind is blowing in other countries, but here in the UK, there have been many legal challenges to opt-out policies and the rulings usually favour an opt-in policy.
In practical terms, I think Google will keep the cache until someone launches a lawsuit against them. However, who is likely to do that? No large company is likely to do this.
Kaled.
PS
I had a quick look at the robots.txt definition and could see no reference to caching. How do I switch off caching using a single robots.txt file rather than adding a meta tag to all my pages?
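As far as I know, the robots.txt definition has no directive for caching; the documented per-page opt-out from Google's cache is the tag <meta name="robots" content="noarchive">. If the obstacle is editing every page by hand, a small script could add the tag instead. A minimal sketch in Python, assuming static HTML pages that contain a <head> tag; the function name add_noarchive is made up for illustration:

```python
import re

# The robots meta tag that asks search engines not to offer a cached copy.
NOARCHIVE_TAG = '<meta name="robots" content="noarchive">'

def add_noarchive(html: str) -> str:
    """Return html with the noarchive tag placed right after <head>.

    Pages that already carry the tag, or that have no <head>, are
    returned unchanged.
    """
    already_tagged = re.search(
        r'<meta[^>]+name=["\']robots["\'][^>]*noarchive', html, re.IGNORECASE
    )
    if already_tagged:
        return html
    # Match <head> or <head ...attributes>, but not <header>.
    return re.sub(
        r'<head(\s[^>]*)?>',
        lambda m: m.group(0) + "\n" + NOARCHIVE_TAG,
        html,
        count=1,
        flags=re.IGNORECASE,
    )
```

You would still have to loop over your own files and write the results back; that part depends on how your site is stored, so it is left out here.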
Looks like 'G' had a nice run of luck and will milk it as long as they can. Fair enough. There are so many legitimate reasons to like the cache but none of them make it legal nor address the moral implications of copyright.
Those using robots.txt as an argument assume a lot of knowledge that people simply don't have. I'm pretty savvy technically compared to the average population. I'm decent at unix, perl, php, mysql and a bunch of other languages, and I've never touched a robots.txt file nor do I know how to.
I've looked at the spec but it looks complicated for the average user who might use geocities or frontpage or pay a webmaster to display copyrighted content.
Why should G expect that person to even know about Google, let alone how to format a robots.txt to protect their content? Not having heard of Google is no reason for their copyrighted work to go unprotected.
P.S. Although Brett did not answer what happened to that "GG says" page if you do a search at WW you'll see that he killed it because of copyright violation.
[edited by: Clark at 9:48 am (utc) on July 10, 2003]
<Meta name='cache' content='Cache me'>
Or will it be
<meta name='cache' content='I hereby state that permission is granted to google to cache my web site'>
Either way
I think the newspapers don't like the Google news feature,
which is surprising when I bet a lot of the newspapers' traffic comes from Google.
You just can't please everyone these days :(
The point here seems to be that all SE's use a cache, but Google are allegedly breaking the law by publishing their cache. Google's only defence may be that the contents are "temporary".
Now surely the WayBack Machine is an even bigger culprit as it is archiving millions of sites on a permanent basis? How do they get away with that one then?
1) People use Microsoft Internet Explorer to find and access webpages.
2) It is a tool.
3) Microsoft Internet Explorer has "BRANDING" on the title bar for every web page it accesses.
AND
1) People use Google to find and access webpages.
2) It is a tool.
3) Google has "BRANDING" at the top of the cache for every web page it accesses.
The cache is fair use - it is no different than a translator or other thing that does something to a webpage - it allows users to find the word they are looking for on that page.
I'd be willing to bet money the google cache holds up in court.
I think EVERY major search engine caches pages - they just don't all let you see it (I could be wrong about this). If that is the case - if they copy it and no one sees it - is that legal if Google's is illegal?
The Google cache is what built Google. No cached pages - no Google.
I know you strongly believe this, but I don't see it. Google had a search engine that actually crawled a large portion of the internet and indexed it relatively well and in a timely manner - during a time when no one else did. I think that is what built google.
But when the user goes to Google's cache, they are no longer seeing a page controlled by the site owner. It is now under Google's control.
If the site owner suddenly updates their page, people visiting the site will see the updated version. But in Google's cache, it is still the old version.
To widen the argument though - does Google indeed cache ALL of the site? Or just the top ranked pages? If the latter, then it's not copying sites en masse but merely providing a snapshot.
I think the vast majority of users who have used the cache (probably a minority of all users) have only clicked it when the link to the actual site is down.
To me, this is in the realm of "nice to have" but is not a make-or-break feature.
Can you explain the importance of the cache?
Because it is Opt Out. You can't opt out on illegal matters. It's like saying if you don't have a sign on your front yard that says stealing is not ok, then any one can help themselves to your stuff. >>
Hehe, that pretty much sums it up.
Also, keep in mind that not everyone is a WW member and most probably have no idea how to prevent this or know about it at all.
Now that everyone in the US knows Google is a success, they just want to cash in somehow.
I do LOVE the USA, but damn, you have some serious problems there.
zeus
>a brief abstract of the article, asking one to pay money to read those articles
This is so annoying and it's happening more and more. I reckon companies are strategically doing this, putting premium content out for spiders and then charging for it once it's been listed.
If they are, they should accept that it can be cached and copied, not just by Google but by ISPs and users too.
If you're targeting AltaVista for the keywords butter and guns, you can enter hundreds of related keywords into a homegrown program. It will create a few hundred thousand pages with varying amounts of your target keywords, so you cover basically every keyword density, combination, keywords in title, etc. Then, when you detect that AltaVista is crawling, you show AV your fake content, but when their users click on the link, you deliver the same one page selling guns and butter.
Do that to Google, with its cache, and you will be quickly caught and banned.
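To make the "quickly caught" part concrete: once the copy served to the crawler is visible, anyone can compare it with what a visitor gets. A rough sketch in Python of that comparison, assuming you already have both versions as plain text; the function name looks_cloaked and the 0.5 threshold are my own illustrative choices, not anything a search engine is known to use:

```python
from difflib import SequenceMatcher

def looks_cloaked(crawler_copy: str, visitor_copy: str,
                  threshold: float = 0.5) -> bool:
    """Flag a page when the crawler's copy and the visitor's copy differ a lot.

    Both texts are compared word by word; a similarity ratio below
    `threshold` (an arbitrary cut-off) counts as suspicious.
    """
    crawler_words = crawler_copy.lower().split()
    visitor_words = visitor_copy.lower().split()
    similarity = SequenceMatcher(None, crawler_words, visitor_words).ratio()
    return similarity < threshold
```

A page that changed slightly between the crawl and the visit would score high and pass; a keyword-stuffed page swapped for a single sales page would share almost no words and get flagged.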
I don't know if that's what Brett was thinking of, and I don't have experience with such programs but did learn about it here. Maybe someone else w/ experience can tell us if G is still susceptible or if the cache stopped them? And if AV and the others are susceptible (thereby reducing their viability?)
It's a case study in the success of branding. It's the most powerful tool Google has in its arsenal. It is so hard for the general public to understand the power of branding that most won't get it.
Google has been almost too successful at their branding. They wanted to make themselves synonymous with net searching and now their name is being used in place of searching as a verb: to google something.
1) all net studies have shown that the more time a user spends on a site, the more likely that site is to be successful.
2) all marketing studies show that branding is paramount to success for the largest sites on the web.
The longer you keep someone on your site looking at your logo, your branding, and your url in the address bar, the more successful you are going to be. The Google cached pages (page-jacked pages) keep users at Google, looking at the Google branding ad and on the Google site.
It was the Google branding cache that built Google. No branding cache page - No Google. It's bigger than everything else Google did. Google is a textbook testimony to the awesome power of branding.
Also I would not underestimate the usage of the cache feature. It's fast, it's safe (no popups etc), it has the highlighting - it's essentially a bit like what AOL, Compuserve etc were offering: the web without going into the web. Yes I agree: the cache was an important part of the googlification of the web.
Over time Google must have taken away an incredible amount of traffic from the web.
I cannot see this reaching the courts; if it does, then there will be big changes in the Internet world.
Is it not possible for Google to request a meta tag for permission to cache pages?
I know it would take a long time to catch on and many web sites may forget to include it but a simple
Meta name='cache' content='all'
would let Google display the cache link for only those sites that have it.
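For what it's worth, checking for such a tag would be cheap on the search engine's side. A minimal sketch in Python, assuming the tag looks exactly as suggested above; the 'cache' meta name is this proposal, not an existing standard, and the names CacheOptInParser and may_show_cache_link are invented for the example:

```python
from html.parser import HTMLParser

class CacheOptInParser(HTMLParser):
    """Looks for the proposed opt-in tag: <meta name='cache' content='all'>."""

    def __init__(self):
        super().__init__()
        self.opted_in = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute names arrive lowercased; values may be None if missing.
        attr = {name: (value or "") for name, value in attrs}
        if (attr.get("name", "").lower() == "cache"
                and attr.get("content", "").lower() == "all"):
            self.opted_in = True

def may_show_cache_link(html: str) -> bool:
    """Return True only when the page carries the hypothetical opt-in tag."""
    parser = CacheOptInParser()
    parser.feed(html)
    return parser.opted_in
```

The point of the design is that the default is "no cache link": a page with no tag, or with any other content value, is treated as not having given permission.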
I have looked at two Google caches today because I needed support from a forum, which was great.
Brett: It was the Google branding cache that built Google. No branding cache page - No Google. It's bigger than everything else Google did.
Now you clearly know a lot more than me about Google, but from a user's point of view, I don't see that this argument is true. If the cache were so important everyone would be talking about it (on other forums and topics). It would be the one thing you thought of when you thought of Google. But it is not. I dare to suggest that most users aren't even aware of it. They click on the main link and they're happy.
I've only used the cache when the main link was down, which is hardly ever.
To me, the brand name of Google refers to a) its speed, b) its large base of sites, c) the Images search facility and d) the logo. It's always fun to see it changed whenever there's a holiday or important event.
If the cache disappeared tomorrow, to be honest I don't think I would even notice! Nor, I guess, would the general populace.
I see what you mean about them using the cache to brand it with their name, but I simply don't see it as a major event. And certainly not one that "built" their reputation.
heini: Over time Google must have taken away an incredible amount of traffic from the web.
I strongly disagree. It has pointed people to new sites they never knew existed. I bet someone probably typed in "webmaster" and found this forum. Even if people were driven away from the main sites to look at the cache, I'm pretty sure they would want to then visit the main site later to see more of it.
I think Google's caching is invaluable for a number of purposes and I can completely understand why they do it.
Just because the cache feature is convenient doesn't make it legal. This is the argument of music pirates: it's easy, everybody does it, therefore it's OK.
Google's cache feature and image cache feature are both egregious copyright violations.
If I create a copyrighted photographic image and Google puts it on THEIR web site for commercial gain (The Plex IS a for-profit enterprise!),
that's a copyright violation. Simple.
I expect Google will phase out the cache either right before IPO or shortly after. Some of their direct competitors could use it against them.
In any event Google has had plenty of warning, they can only blame themselves if somebody calls them on it.
On Brett's case that no cache = no Google, I must disagree. First, I am one of the majority of users who never (or only very seldom) use the cache, simply because on the result pages the link to the original page is much more prominent than the cache link, so that's where most people go. And if you do go to the cache, you will notice that the page title is still the real one, not something like 'Google Cache - Whatever Site'. And looking at the page itself, the Google part of it is kept very plain and dull, nothing eye-catching => no branding there. The only coloured element (apart from the links) is the Google logo (ok, some branding), but if you are looking at any regular web page you won't notice it.
> If I create a copyrighted photographic image and Google puts it on THEIR web site for commercial gain (The Plex IS a for profit enterprise!)
Um... Have you ever read the text that Google puts on the cached page...?
"Google is not affiliated with the authors of this page nor responsible for its content." To me this indicates that Google is *not* the copyright holder of the page. Come on now... When you write content do you then distance yourself from it? Google even put a link to the real owners of the website... How many copyright violators do that?
So now I use this info for SportSearch.com and I spend a couple million on getting the word out. Now one great feature of my site is my cache feature. In fact you don't even have to click to the other site because I have it all right here. I brand the top like Google, and I toss on 2 nice text links while I'm at it. And I'll go one step better than Google on my cache pages and make them spiderable. Also I'll get even more branding by having the URL of my cache pages be something like SportSearch.com/file/42355643566.htm. This way you get a double dose of SportSearch.com branding.
Now when you search on SportSearch I show results just like Google, except the title tag link is to my "cache" version and you have to click on the smaller link below to actually go to the website. Also I don't feel the need to let you know the URL of the site in question, so I don't show it in my search results.
What is really different between this scenario and Google? The placing of a link? The subtraction of a URL?
> Um... Have you ever read the text that Google puts on the cached page...? "Google is not affiliated with the authors of this page nor responsible for its content." To me this indicates that Google is *not* the copyright holder of the page. Come on now... When you write content do you then distance yourself from it? Google even put a link to the real owners of the website... How many copyright violators do that?
Um, that's even worse. They get the benefit of using someone else's content on THEIR page, but responsibility belongs to the SITE owner. Screwed both ways.
Also I don't see Google doing it to discredit the owners of the original site. Nor to make vast amounts of money. All they're doing is providing another angle to their main service - turning searches into links to other websites.
Yes... Google don't try and pass off the page as theirs. You would be. If you don't make it clear that the page is not intended to be yours, then people will assume that it is. And then you are violating the copyright of the owners by displaying it.
> As I said before, the legal case is clear.
If the legal case is so clear, how come no-one is suing, or has sued, Google? Lawyers (so the stereotype goes) are money-grabbing so-and-so's. I'd have thought that if there was a case against Google then there would have been one by now.