
Google News Archive Forum

This 156 message thread spans 6 pages; this is page 4.
Google cache raises copyright concerns
Clark

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 15143 posted 9:09 pm on Jul 9, 2003 (gmt 0)

Everyone loves to write about Google:
[news.com.com...]

 

claus

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 15143 posted 2:53 am on Jul 11, 2003 (gmt 0)

questions:
- are frames and cache the same issue? wisenut does frames.
- how's the NYT policy towards news aggregators?
- quotation rights? common use? is there such a thing? It is the press after all.
- is the displayed copy legally different from indexing or are the two the same? (copy vs. display of it)

comments:
- laws differ worldwide, here you don't need (c) on a page to have copyright. I think it's 50 or 70 years.
- most anything raises copyright concerns these days it seems

g-talk:
- what made google, imho, was "I'm feeling lucky". Big hit with techies, and the techies told the less savvy.

some quote:
"There are also some people who do not know about the robots exclusion protocol, and think their page should be protected from indexing by a statement like, "This page is copyrighted and should not be indexed", which needless to say is difficult for web crawlers to understand." (...) "Since large complex systems such as crawlers will invariably cause problems, there needs to be significant resources devoted to reading the email and solving these problems as they come up" (from "The Anatomy of a Large-Scale Hypertextual Web Search Engine")

- relates to indexing, not caching, but imho, an index is also some kind of copy, it is just not displayed in its entirety.
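To make the robots exclusion point concrete, here is a minimal sketch in Python (standard library only) of why a crawler can act on robots.txt rules but not on a prose copyright notice. The bot name `ExampleBot`, the example.com URLs, and the rules themselves are made up for illustration; this is not a description of how any particular engine works.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: keep every crawler out of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

def allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

if __name__ == "__main__":
    for url in ("http://www.example.com/index.html",
                "http://www.example.com/private/page.html"):
        print(url, allowed(ROBOTS_TXT, "ExampleBot", url))
```

A sentence like "This page is copyrighted and should not be indexed" carries no rule in this format, so a parser of this kind has nothing to match against.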

/claus

Arnett

10+ Year Member



 
Msg#: 15143 posted 3:16 am on Jul 11, 2003 (gmt 0)

Google shouldn't be caching images from sites either. That's also a copyright violation. If you want to be super-picky about it, no search engine has a right to crawl through your site extracting anything at all. It is generally a condition of submitting your material to a search engine that allows them to use the material. Read the terms of submitting to N*tscape if you want to see an example. I'd cut and paste the statement into this post but it would be a copyright violation.

The major point is that most internet-types really believe that the internet should operate differently than the real world that we live in. It's ok for a sponsor to show the same ad on TV several times an hour or on the radio several times during a broadcast or even multiple times in newspaper classifieds or in magazines. Let someone try it on the internet. People steal content for their software, websites and other net works. They also link to other people's content with disclaimers like "If this content violates someone's copyright then I'm sorry", as if that makes it ok. The LAW states that if you didn't create it and it is not in the public domain it is NOT YOURS. If it's NOT YOURS then you have NO RIGHT to it. Pretty simple. No loopholes, no rationalization, no finger pointing, no excuses.

Kirby

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 15143 posted 4:51 am on Jul 11, 2003 (gmt 0)

Bill Gates holds the digital copyrights to numerous images that I'm sure Google has cached. Not to mention Google has cached MSN.com. However there is no cache of Microsoft.com...

JustTrying

10+ Year Member



 
Msg#: 15143 posted 4:58 am on Jul 11, 2003 (gmt 0)

Personally, I think that this entire thread is a "straw man" argument put forward by Brett out of a hope that some way, somehow, the Google Cache (and the problems associated with it for SPAMMERS) would go away -- not because of any copyright concerns that Brett has, but rather because of the annoyance it brings to this forum from all of the low-medium-level webmasters who, because they CAN easily uncover competitor "SPAM" using the Google Cache, decide to scream in here for GoogleGuy's attention as a sort of ad hoc "SPAM Police;" also, because with the Google Cache, SPAM techniques (cloaking in particular) have a much harder time thriving.

As far as the copyright issue itself is concerned, for any lawsuit to be successful there must be some type of damage done; and the question that I have is what damage has the Google Cache actually done to webmasters? Clearly nearly all webmasters work diligently to become associated with Google, and in particular Google's traffic that stems from that association (traffic that just so happens to generate large amounts of revenue). From my vantage point, the only damage that could possibly be seen as being done by the Google Cache is to those webmasters who choose to do things that are outside of Google's Terms of Service in order to obtain higher rankings, and then are ultimately discovered through use of the Google Cache -- and then are "turned-in."

From a SPAMMER'S point of view, getting rid of the Cache (not to mention GoogleGuy {to the extent that he is seen as a harvester of information from the forum}, and the "SPAM Report" form) would be the best thing that could happen to a webmaster who knows how to beat an engine's algorithm using the "black arts."

I would bet that many senior members here would love to have Google become a search engine just like Inktomi (but with the traffic of Google), where there are no "SPAM Report forms," cloaking is a sanctioned activity in the form of a Trusted Feed (cloaking scripts work fine as well), the use of the PFI program allows your site to be refreshed every 48 hours (making reverse engineering not so complicated, and dependence on FreshBot not so needed), and the WebmasterWorld member Inktomi [webmasterworld.com] is seen in the forums more rarely than Affiliate marketing secrets are discussed these days.

However, in my humble opinion Google is where they are today because they are not like Inktomi, and they care more about the quality of their index than they do about PFI and Trusted Feed revenue. The argument that Google is where they are today because of "Cache Branding" is so patently not true that it is not even worthy of debate. Google does have great branding, but that stems first from the quality and speed of their SERPS, and next from the quality of their commitment to be "engaged" with the webmaster community and the web itself. They seem to realize that in order to survive, you must always act like your position and technology are about to be undermined by a now unseen competitor.

Even though Overture and Microsoft have a huge mountain of money to create all of the branding and search technologies that they could ever want, the two things that neither of them has are "practical sincerity" and "creative innovation" -- both of which are unquantifiable in any board meeting. Google just "gets it."

The Google Cache is not their Achilles heel, but rather it is their Ace in the hole. The Cache (and GoogleGuy) keep the SPAMMERS on the move, and within the reach of a webmaster who knows nothing about programming, but knows how to click a cache button, then fill out a "SPAM report." Google could not afford to pay for all of the quality feedback that they get from webmasters every day about the integrity and "true performance" of their algorithm. Without the Cache, and the SPAM report, Google would have to hire their own staff to research how well their algorithm is actually working in the "real world," and now they get all of this feedback for free.

vincevincevince

WebmasterWorld Senior Member vincevincevince is a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 15143 posted 5:52 am on Jul 11, 2003 (gmt 0)

Google could not afford to pay for all of the quality feedback that they get from webmasters every day about the integrity and "true performance" of their algorithm.

Well said JustTrying

Beachboy

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 15143 posted 6:05 am on Jul 11, 2003 (gmt 0)

For what it's worth, I generally use the cache to locate specific color-highlighted keywords on a page. Very handy tool, shortens the time necessary to locate the info I'm seeking.

projectphp

10+ Year Member



 
Msg#: 15143 posted 6:11 am on Jul 11, 2003 (gmt 0)

If the cache's biggest use is for spam detection, what about a Google Proxy, complete with Googlebot User Agent and all? Google could even charge for the service. It would make many webmasters pretty happy :)

As for the Cache itself, who is going to challenge it, and why? If people really think the Cache is theft, why has no one challenged it? Why don't you yourself challenge it, Brett?

Wait, I got it, let's get Patty Bolger and the Lobby Group onto this one. Maybe, as a united force, no one individual organisation would suffer.

JustTrying

10+ Year Member



 
Msg#: 15143 posted 6:35 am on Jul 11, 2003 (gmt 0)

The Google cache's biggest use may not in fact be for spam detection (Google doesn't mention what the cache is for on any of their cached pages, but additional things that other people have mentioned are how nice the cache is for when a page is offline or 404; or, what a nice tool the cache is for highlighting keywords on a page -- certainly this handy feature doesn't hurt spam detection either) but, the point that I was trying to make with my previous post is that the most annoying thing to many SPAMMERS (and perhaps even Brett) IS the cache's use for spam detection by the "newbie webmaster masses."

For Brett, the copyright angle may just be the best angle for trying to do away with this SPAMMER annoyance -- that is why I called the thread a "straw man" argument.

aravindgp

10+ Year Member



 
Msg#: 15143 posted 8:15 am on Jul 11, 2003 (gmt 0)

Why is Google alone being questioned? Every search engine worth mentioning crawls your site. They take information either for SERPs, phone number information, etc.
Doesn't this mean every search engine violates copyright if Google does? Google at least is ethical, and there's a purpose for what they do: only search results.

I fail to understand this. Alltheweb in fact shows audio files and MPEG files. I don't know whether they store these too in their data storage; obviously nobody wants this to happen.
Aren't people taking care of this, i.e. keeping such files safely away from spiders?

My stand: all search engines are designed to crawl information. It's the way the internet will evolve. Today or tomorrow somebody might sue Google and maybe win a case, but the bottom line, that search engines need to crawl, remains undisputed.

Maybe we have not reached the stage, technology-wise, where a spider will give you what you want in a jiffy, crawling a trillion sites in seconds. It's a very long way away; perhaps in 2050 it may happen.

Aravind.

heini

WebmasterWorld Senior Member heini is a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 15143 posted 9:23 am on Jul 11, 2003 (gmt 0)

JustTrying - rudeness and calling out people is no substitute for arguments. As to Brett - how much more out in the sun can you possibly act as this board does, where every search engine under the sun is openly engaged?

Second - sorry, but if ever I have seen a strawman - let's call it rather an army of strawmen - then that is to introduce search quality problems into copyright issues.
That's a classic red herring if there ever was one.

To all the Google fans: This debate is not about being pro or anti Google. It's about copyright and Google most likely violating copyright.

JustTrying

10+ Year Member



 
Msg#: 15143 posted 9:47 am on Jul 11, 2003 (gmt 0)

Heini, certainly I do not want to be rude (and I apologize if I came across that way), and perhaps there is some Copyright issue somewhere out there with the Google cache, but the simple fact that nobody has to date brought a single lawsuit against Google seems to be an ominous fact that can not be overlooked.

As to the search quality issues being a strawman, if the Copyright issue is the "real" issue, then I must ask again for a single example where a single webmaster has been hurt or damaged monetarily by the Google cache in the realm of Copyright?

This is not the first time that the Google cache has come up on this board, and the "search quality issues" (as you call them) are not ones that I have invented for this thread. There are members here that have "real" issues with the Google cache that have nothing to do with Copyright, and to imply that this is not true is clearly not reality. The Google cache impacts "black hat" SEO negatively -- to imply that this is not a "real" issue with certain webmasters is not true.

[edited by: JustTrying at 9:49 am (utc) on July 11, 2003]

kaled

WebmasterWorld Senior Member kaled is a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 15143 posted 9:47 am on Jul 11, 2003 (gmt 0)

Arnett said
If you want to be super-picky about it, no search engine has a right to crawl through your site extracting anything at all. It is generally a condition of submitting your material to a search engine that allows them to use the material.

Bandwidth issues aside, if joe public is given the right to view your pages then so are spiders.

Search engines follow links. Google states explicitly that there is no need to submit pages or sites provided it can find them through links.

JustTrying said
As far as the copyright issue itself is concerned, for any lawsuit to be successful there must be some type of damage done.

Technically, this may be correct, but where a principle is involved rather than large sums of money, judges typically award damages of $1.00 (or £1.00 here in the UK). However, in such cases, the winner may have to pay their own costs. In the UK, the Government will occasionally pay costs where it considers an important point of law needs clarification.

If anyone (in Europe) really wants the cache wiped out, write to the European Commission. They love to stick their noses into things they don't understand and spend vast sums of money on silly laws, etc.

Kaled.

killroy

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 15143 posted 10:34 am on Jul 11, 2003 (gmt 0)

First, I'd like to draw attention to the title of the forum. Secondly I'd like to mention that the argument was if Google was violating copyright or not. NOT if it was good or bad, or enforceable or litigatable.

Damages only influence what you get, NOT if it's legal or not. That's what laws are for, and in this case the law is pretty damn clear.

I disagree with Brett on the Branding argument, as I believe the visibility of the cache is grossly overestimated by experienced web users.

But regarding instigations as to the reason for this thread I simply point as to the seniority of the posters and rest my case.

This started out as a factual argument, and I hope we will find our way back to it. The law is clear. That the cache has various uses is also clear. That there hasn't been a high profile suit against Google over IP violations is also clear. But like any other entity breaking the law and not having been caught yet, Google will have to watch its step and plan very carefully for its future, as more and more of its enemies are getting ready for battle.

I'd hate to see my favourite SE go under over some obscure legal claim against one of its lesser features. (lesser, compared to its actual search results)

And if I was an IPO investor, I'd be doubly concerned.

<ADDED>
I'd like to reinforce the point that just because nobody gets hurt doesn't void any laws. If you break the law, that's that, nothing else is necessary. And being able to view a page makes it about as legal as being able to steal a car makes it legal to steal it. Most pages have some form of explicit notice that the use of their contents is for personal use ONLY. This makes it distinctly illegal for any bot to extract the pages and republish them for gain.
</ADDED>

SN

merlin30

10+ Year Member



 
Msg#: 15143 posted 1:51 pm on Jul 11, 2003 (gmt 0)

As a little aside, if you do a search that returns Google in the results (say "Google") and open up the cache the header states clearly that

"Google is not affiliated with the authors of this page nor responsible for its content"

So there we have it, Google cannot be held responsible for its own actions!

Hester

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 15143 posted 1:57 pm on Jul 11, 2003 (gmt 0)

This makes it distinctly illegal for any bot to extract the pages and republish them for gain.

Google is not using the cache to gain! (Unless increased usage leads to greater advertisements.) It is part of a free service!

If you want to be super-picky about it, no search engine has a right to crawl through your site extracting anything at all.

Do you want people to find your site or not?

Clark

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 15143 posted 2:00 pm on Jul 11, 2003 (gmt 0)

Just Trying,

Even though this has been addressed, I'd like to point out why your post kind of bugged me. I'm saying this in a respectful way because I think you have your heart in the right place but perhaps jumped to a couple of conclusions and unfairly accused an honorable man, probably unintentionally.

Personally, I think that this entire thread is a "straw man" argument put forward by Brett out of a hope that some way, somehow, the Google Cache (and the problems associated with it for SPAMMERS) would go away -- not because of any copyright concerns that Brett has, but rather because of the annoyance it brings to this forum from all of the low-medium-level webmasters who, because they CAN easily uncover competitor "SPAM" using the Google Cache, decide to scream in here for GoogleGuy's attention as a sort of ad hoc "SPAM Police;" also, because with the Google Cache, SPAM techniques (cloaking in particular) have a much harder time thriving.

If you read this thread from the beginning, you'll see that I started it with a simple link to a CNet article. From there digitalhost brought out a point about the cache that I never thought of but had to agree with. He clearly thought he would be the only one feeling that way. At this point Brett wasn't even in the conversation so I don't even know where you got this strawman thing.

I don't know what DigitalHost does for a living, but I'm not an SEO, just a programmer and webmaster and am not an oldtimer here. So there are no hidden agendas, just an intense interest in Google, perhaps the best techie company on the net today, and one that delivers lots of visitors for free.

I think it's pretty obvious that one of the biggest factors of the crazy success of WW is because Google is the hottest property on the net and they have someone posting here OFFICIALLY. When Brett comes out and makes a stand saying the Google cache violates copyright laws, it's just not right to accuse him of having an agenda here. The guy has credibility and conviction and integrity. He is taking a stand here which Google may not like. That takes guts and it's not the first time. If I were in Brett's shoes, I wouldn't take the chance to say what he's said.

This makes me respect the man and what he's done here tremendously and I hate to see you taking him to task on having an ulterior motive on the very issue that in fact he is actually sticking his neck out to comment on.

However, in my humble opinion Google is where they are today because they are not like Inktomi, and they care more about the quality of their index than they do about PFI and Trusted Feed revenue.

Now you're talking. Yes, they are passionate about what they do. I do think profit is an important motive for them, however, they are going about it in a wonderful way and I wish on the entire Google team to be the most successful company on the net. I love Google. But this issue is about copyright. Period.

Personally I don't want to see the cache go away but it is clearly a violation of copyright law. Maybe I'm more sensitive about this issue than you because I've been a victim of copyright theft 5 times (that I know of) and it is not a fun battle to fight. You feel violated very much like someone came into your home and took things away from you. It is a feeling that cannot be conveyed properly until you have experienced it.

Google's cache does not personally bother me too much because they send me most of my traffic, how can I possibly complain? But I don't think this thread is a disservice to them. I think it helps them to see the points of view out there on this issue. I hope they aren't feeling under fire over this issue and rather view it as another source of information. It is certainly not intended to denigrate them.

I hope that they find a way to still offer the cache in a legal way. But in the meantime, what I think they need to do most is to allow you to turn off the cache and not penalize you with removing the fresh tags and/or affecting how often the Freshbot visits you. And the only way anyone would trust this is if they said explicitly that they won't penalize you for turning on NOCACHE.
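For anyone wanting to check whether a page already asks not to be cached: the directive Google documents for this is the robots meta value `noarchive`, which is presumably what NOCACHE refers to above. Below is a rough sketch in Python (standard library only) that looks for it in a page's HTML. The function and class names are made up for this example, and it says nothing about how Google treats such pages in ranking or crawl frequency.

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {k.lower(): (v or "") for k, v in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            # The content attribute is a comma-separated list of directives.
            for token in attr.get("content", "").split(","):
                self.directives.add(token.strip().lower())

def opts_out_of_cache(html):
    """Return True if the page carries a noarchive robots meta directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noarchive" in parser.directives

if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noarchive"></head></html>'
    print(opts_out_of_cache(page))
```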

claus

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 15143 posted 2:44 pm on Jul 11, 2003 (gmt 0)

killroy
>> The law is clear

- imho, sometimes it isn't. The fine print may be there but when it comes to application of these words to a case that has not really had any precedents, sometimes the result is not all that clear.

This, in itself, justifies this thread as a discussion ;)

The thing is: If this exact case has not been tried in court before, you have to find something that is similar, in order to make some kind of educated guess. Then, what is similar enough:

- off-line copying of a book?
- off-line library newspaper reading?
- online exchange of music?
- search results with text abstracts?
- a photo taken for commercial purposes?
- hotlinking news aggregators?
- showing real-time content in an iframe?
- quoting research results in a newspaper?
- the same online? with/without source?

Lawyers are experts at this game, i think. We might have some intuitive feeling that this is "just as bad as..." but it is the "just as" part that needs to be defined very exactly.

The general consensus in this thread is that this is useful in some way. Usefulness does not necessarily imply that it is also legal. But, on the other hand, similarity, or the fact that something might be "unfair" or "unreasonable", doesn't necessarily make it illegal either.

I know that in the US, something called the DMCA has recently been passed. Possibly it falls under this. But this is a new set of laws. That means that the courts do not yet have a solid record of past cases, decisions, and settlements under this law set to compare with. So, we have a new situation/issue, and perhaps also a new set of laws.

I'm not surprised to see lawyers recommending such a case. It's only natural that they should be standing in line for it, yelling "pick me, pick me". It's probably a principal case, and thus a career boost. A trial will have a nice long duration, and it will generate good cashflow no matter who wins. Plus, copyright being the hottest buzzword around, it will provide good media coverage.

Hope we're back on topic now ;)

/claus

hutcheson

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 15143 posted 3:47 pm on Jul 11, 2003 (gmt 0)

killroy, my understanding is that the "damages" you can ask for are severely limited unless you've registered the copyright, which requires paying a fee and sending a (partial) copy. That's not very practical for a newspaper or highly dynamic website.

grifter

10+ Year Member



 
Msg#: 15143 posted 4:08 pm on Jul 11, 2003 (gmt 0)

"No cache feature = no Google"-- I highly doubt it. The article mentions the cached page as a "little-known" feature in 2003, and I agree with this. I will agree that Google cache is just one ingredient combined with other features like relevancy, fast load times, clean non-portalish pages, Google Groups, etc. that propelled it, but isn't the make or break feature.

Typically, a visit to the cache should leave a log entry for stuff like images. Perhaps people can post their Google cache hit statistics...
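For anyone who wants to pull those numbers, here is a rough sketch in Python (standard library only). It assumes an access log in Combined Log Format and assumes that cache views show up as referrers containing `q=cache:`; both are assumptions to check against your own logs, and the function name is made up for this example.

```python
import re
from collections import Counter

# Combined Log Format:
# ip ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]*\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "(?P<referer>[^"]*)"'
)

def count_cache_hits(lines, marker="q=cache:"):
    """Count requests per path whose referrer contains the cache marker."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and marker in match.group("referer"):
            hits[match.group("path")] += 1
    return hits

if __name__ == "__main__":
    # The log file name is a placeholder; point it at your own access log.
    with open("access.log", encoding="utf-8", errors="replace") as log:
        for path, count in count_cache_hits(log).most_common(10):
            print(count, path)
```

Since the cached copy itself is served from Google, only resources the cached page still loads from your server (images, stylesheets) would appear, so the counts are at best a lower bound.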

The article establishes that for now, Google is protected by the safe harbor provisions of the DMCA for caching Web content. The more interesting parts of this thread IMHO are the ones asking the damages question. How do you establish damages when Google, in return for presenting your copyrighted material (and using it behind the scenes in the index), is driving unprecedented levels of traffic to your site? What are possible damage scenarios?

Kirby

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 15143 posted 4:37 pm on Jul 11, 2003 (gmt 0)

From the NY Times' point of view, damages could be from loss of readership. If you can read it on Google, then you don't need to go to NY Times, thus LESS traffic and less revenue.

JustTrying

10+ Year Member



 
Msg#: 15143 posted 5:08 pm on Jul 11, 2003 (gmt 0)

Clark posted:
If you read this thread from the beginning, you'll see that I started it with a simple link to a CNet article. From there digitalhost brought out a point about the cache that I never thought of but had to agree with. He clearly thought he would be the only one feeling that way. At this point Brett wasn't even in the conversation so I don't even know where you got this strawman thing...

...The guy (Brett)has credibility and conviction and integrity. He is taking a stand here which Google may not like.

First, I very much agree that Brett has incredible credibility and conviction and integrity. However, this does not mean that he does not have certain agendas. WebmasterWorld is not just a nice community for webmasters to chat, WebmasterWorld is a massive tool for effecting change within the structure and implementation of the Internet -- and Brett understands this more than anyone else here. Clearly, Brett understands that he is responsible for many amazing changes to the Web within the last couple of years, and what he says carries a good amount of impact on people who have power to make changes.

I have noticed that Brett generally only posts when there are topics that are quite important to him personally (the same is true for a lot of us); also, I have noticed that Brett is usually quite reserved when he posts. Therefore, when Brett posted about the Google Cache with such fervor, and then placed this thread at the top of the Front Page (I think with a different heading than is now there mentioning the Google Cache as Google's Achilles heel) it was to make a point. The reason that I started the idea that this "may" not just be about Copyright issues is because of previous discussions about the increase in the number of "SPAM Police," and Google's generous assistance with this movement.

Quite recently I have noticed a marked increase in the amount of disdain among many original members over Google's Cache, GoogleGuy, and the large increase in the use of the Google SPAM Report by newbie SEOs.

Also, I know that Brett, as well as many other senior members, are very opposed to Google's ban on cloaking, and GoogleGuy's strong advocating for the use of the SPAM Report. Many here do feel that the "SPAM (read cloaking) detection issues" at Google really should be gotten rid of --- but until this Copyright argument came up, there has never been a feasible method put forward that might actually accomplish changes within Google's SPAM {i.e. cloaking} detection mechanism (with Google's Cache being a primary tool).

Finally, I will concede that most likely Brett really does genuinely feel very strongly about the Google Cache violating Copyright, and the straw man post was maybe too harsh -- my apologies to Brett if I jumped to a wrong conclusion. However, I do honestly think that the "other issue" of many senior WebmasterWorld members having strong negative opinions about Google's Cache as a method for SPAM (cloaking) detection should be kept in mind while having this discussion. Many people here who cloak would love to see the Google Cache go away, thus making it much more difficult to reveal cloaking. With that reality in mind, I put forward the question: what is the possibility that the Google Cache Copyright issue might be utilized to effect an end to the annoying SPAM (cloaking) detecting Google Cache?

Brad

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 15143 posted 5:43 pm on Jul 11, 2003 (gmt 0)

How did the subject of this thread get changed from discussion of copyright liability on the part of Google to insinuations that Brett and senior members have some ulterior motives for discussing it?

Let's get back on topic.

digitalghost

WebmasterWorld Senior Member digitalghost is a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 15143 posted 5:58 pm on Jul 11, 2003 (gmt 0)

Brad, I was wondering the same thing. I was also wondering how cloaking is automatically assumed to be spam but that's a topic for another thread.

I'm still at a loss for how people are determining that what Google is doing on a large scale is okay, but an individual doing the same thing would result in numerous posts to the Content and Copywriting forum from people complaining, rightfully so, that someone jacked all their content.

>>but until this Copyright argument came up, there has never been a feasible method put forward that might actually accomplish changes within Google's SPAM {i.e. cloaking} detection mechanism

This argument didn't just "come up", it has been ongoing for years. I don't care what Google thinks about cloaking, I don't care if Google wants to display snippets in their SERPs, but I do feel that the cache feature is in direct violation of copyright law and for those people that keep parroting that "few people use the cache feature" then I have to ask why they feel Google is so reluctant to remove it?

nipear

10+ Year Member



 
Msg#: 15143 posted 6:05 pm on Jul 11, 2003 (gmt 0)

I did a little looking at the RIAA lawsuit against MP3.com back in 2000 for copyright violations, as it seemed similar to the Google cache issue.

Remember when MP3 tried to launch a service where you could stick a CD into your computer and they would "save" it to your online account? Then you could stream it to yourself at any time from anyplace.

Well, MP3.com copied 45,000 CDs to their database without anyone's permission. Well, the RIAA didn't like that at all: the unauthorized copying of copyrighted material.

RIAA won based on the illegal copying of the material, not based on damages. Just on the fact that they made illegal copies of copyright material. And from what I've seen MP3.com never launched that service so there couldn't have been any actual damages.

To me Google's database of copyright material doesn't seem so different. I'm no lawyer, and I haven't read up too much on the lawsuit and settlement.

But from an observer's perspective the fact that Google displays copyright material from their server is copyright infringement. You can sugar coat it all you want but it's there on Google's servers for people to see.

Also on the damages track. I've seen plenty of articles on the web that reference websites that had been taken down. Only Google has it in their cache for me to look at. Many authors reference the Google cache as where to find it. That scenario seems like there could be damages. For example, something gets published to the web by accident that is very embarrassing to a company, then removed, but is still visible in the Google cache for days or weeks. People reference the cache and thousands of people see some copyright material that only exists in the Google cache.

Well, my wife is here. Off to lunch. And GoogleGuy I love ya and your SE so don't get mad at me.... :)

Clark

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 15143 posted 6:18 pm on Jul 11, 2003 (gmt 0)

I don't want to extend the off-topic part of the discussion, so I won't respond to most of your post, JT, but just to set the record straight: the original post title was "CNET: Google cache raises copyright concerns". The word CNET was removed, and this is consistent with other thread names I've seen changed. I don't have a problem with the change.

The only other OT thing I'll add in order to defend Brett is as to ulterior motive...I still don't think he has one because he announced that he stopped doing SEO work and is full time on WW right now. And I tend to believe him because shortly before his announcement I stickied him that I might be able to get him a lucrative contract and he didn't bite. I was puzzled but after his announcement, I understood why...

Again, sorry for continuing the OT part but I felt it important to respond to it rather than leave question marks in the minds of newbies reading the thread.

dragonlady7

10+ Year Member



 
Msg#: 15143 posted 6:43 pm on Jul 11, 2003 (gmt 0)

>digitalghost said: what Google is doing on a large scale is okay, but an individual doing the same thing would result in numerous posts to the Content and Copywriting forum from people complaining, rightfully so, that someone jacked all their content<

(btw, i don't know why everyone calls you digital host.? Changes the meaning more than a bit, lol.)

I'd like to point out the difference as I see it.

What Google is doing is copying your pages so people can see them even if your server's down or the page has been moved. Google is a high-traffic source of information.

Anyone else copying your pages would not be copying them into the same context and for the same purpose. Nobody else is a universally-accepted source of information where people go to find new places to look at, in the same way as Google. Therefore, their motive would have to be different. Therefore, it's objectionable.

I don't see how there isn't a difference there.

I'm not saying I can't see how that does raise copyright issues, and I'm not saying there's no case. I'm just saying that it is, on the whole, simply a useful service that is not detrimental to those whose information is copied and is immensely handy for those who do use the service. I disagree that it's that important a feature of Google to most users. And I would argue that its use in the cases of "accidentally" posted pages, or cloaked pages, has more use in keeping the Internet honest (for want of a better phrase) than it has in causing harm to people who accidentally post things online.

But I suppose the fact that I can't understand all the furor after reading all the furor means that I must be missing some point, somewhere. It doesn't seem like that big a deal for me. Useful for many, with undeniable legal concerns but very little actual detriment to anyone. Doesn't seem like a recipe for such controversy to me.

Pete_Dizzle

10+ Year Member



 
Msg#: 15143 posted 9:41 pm on Jul 11, 2003 (gmt 0)

If you search google for IBM you get the following information for the first link. Without going to the cache.

IBM United States
The IBM corporate home page, entry point to information about IBM products
and services. IBM, Skip to main contentUnited States, Home ...
Description: The IBM corporate home page, entry point to information about IBM products and services.

Is that copyright infringement too? Technically, Google's cache is just a more detailed search result. They could include it inline, in a frame, to give users a better idea of where they wanted to visit.

I think what's important is not whether or not Google has violated US copyright law. Let's be real, this is the Internet; Google can open up shop in a third world country (even though US companies tend not to do this).

Copyright laws were written for the days of the printing press, not for today with Google, the Internet, MP3s, DVDs, etc.

We should ask how we want to apply the principles of copyright, allowing authors to profit from their works, to today's world with all these new forms of content.

I think it is understood that if you publish material on the web, then it will be copied temporarily to a user's screen/computer/cache. That's how the Internet works.

There really should be copyright laws created exclusively for the Internet and Computers.
We should ask interesting questions like:
If Google's "cache" is OK, is it also okay for Google to publish a monthly set of DVDs for, say, $100 US that includes the whole Internet as Google saw it?

In my mind, the currently acceptable way to use people's content on the Internet is read-only. That means I can visit IBM's website, download its images and text into my cache, and read them in the browser. But if I take that data from my cache folder and publish it to my own website (write), then I've done something ethically wrong. This is basically what Google is doing. By current morals it's okay for them to read sites on the net and produce an index, but for them to then write out all the sites they read onto their own servers is wrong. If we made it not okay to read, then the index data sitting on their servers at the Googleplex would also be a copyright violation, similar to storing MP3s you don't have a license to access.

Now that we've said that. It would also make Anonymizer.com wrong. Do any of you agree with this logic? Anonymizer.com reads in your site and prints it out on their own servers with their own branding.

I would say that WayBackMachine gets an exception from the writing rule because they are doing a public service by making a history of the Internet. It's completely different from what google and anonymizer are doing.

Google does make it clear that the contents of the cache do not belong to them. They claim not to be responsible for the content, but I find that a legally flawed argument.
Copyright holders contact ISPs, not webmasters, to remove content, because the ISP is legally (US) responsible for the content on their servers and is more likely to respond quickly. So imagine an infringing website, xyz.com: the ISP is contacted and removes the site from the Internet. Now the website owner and ISP are no longer liable for further infringement. Imagine also that Google is still caching the content; they could now be sued for infringing as well. Most likely they will get a letter first asking them to remove the content, but either way they are just as liable as the ISP hosting the site.

To all the robots.txt fans: I have never used a robots.txt file, and I've been making websites for many years. Can someone who suggested the robots.txt solution post a file that will make sure no search engine will cache my site's contents?
That includes all present and future search engines.
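
For what it's worth, here is a minimal sketch of what such a file might look like, checked with Python's standard urllib.robotparser. It rests on a few assumptions: the example.com URLs and the crawler name "SomeFutureBot" are made up for illustration, and robots.txt is only a voluntary convention, so a wildcard rule covers present and future crawlers only to the extent that they choose to honor it. Note also that a blanket Disallow keeps compliant crawlers from fetching the site at all (no index, not just no cache); the per-page noarchive robots meta tag is the usual way to stay indexed while asking an engine not to show a cached copy.

```python
from urllib.robotparser import RobotFileParser

# Blanket rule: every compliant crawler is asked to stay out of the whole site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""

# Per-page alternative: remain indexed, but ask engines not to offer a cached copy.
# This goes in the <head> of each page rather than in robots.txt.
NOARCHIVE_META = '<meta name="robots" content="noarchive">'


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a crawler that honors robots.txt may not fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical crawler names and URLs, used only to exercise the rules.
    for agent in ("Googlebot", "SomeFutureBot"):
        print(agent, is_blocked(ROBOTS_TXT, agent, "http://example.com/page.html"))
```

The wildcard User-agent line is what stands in for "all present and future search engines": any crawler without its own section falls back to it.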


Tapolyai,
Google used to pay webmasters to put a search box to google on their websites.


i think the whole idea of putting information free to view on the internet ... then later making you have to pay for it ... is somewhat flawed

either the information needs payment to view, or it doesn't


You are paying for the archiving expense. It costs them a lot of money to keep everything web accessible. If they didn't charge for old material, chances are it would not be available at all.

nipear,
RIAA won based on the illegal copying of the material, not based on damages. Just on the fact that they made illegal copies of copyright material. And from what I've seen MP3.com never launched that service so there couldn't have been any actual damages.

MP3.com did launch that service; they paid $150 million in damages.

xy123

10+ Year Member



 
Msg#: 15143 posted 10:51 pm on Jul 11, 2003 (gmt 0)

The small excerpts Google puts in the SERPs from the sites - as in your IBM example - would, in my view, probably be treated as 'fair use' in the legal copyright sense of the term. In many cases these are the meta descriptions put in the site specifically to feed the search engines.

Duplicating a site - as in copying every page from the site into the search engine's cache - I'm less sure that would pass as fair use.

But a complainant site would be rather stupid to complain. After all, they'll just get kicked out of Google, and why would they want that? Besides, what commercial loss could they demonstrate to a court?

europeforvisitors



 
Msg#: 15143 posted 11:34 pm on Jul 11, 2003 (gmt 0)

DragonLady7 wrote:

What Google is doing is copying your pages so people can see them even if your server's down or the page has been moved. Google is a high-traffic source of information.

Anyone else copying your pages would not be copying them into the same context and for the same purpose. Nobody else is a universally-accepted source of information where people go to find new places to look at, in the same way as Google. Therefore, their motive would have to be different. Therefore, it's objectionable.

I don't see how there isn't a difference there.

There isn't, in terms of copyright law, because copyright infringement is determined by actions, not motives. It doesn't matter if you're copying and distributing copyrighted material for pay, as an act of charity, or because the voice of God told you to do it. And the fact that Google may be trying to perform a public service by caching and distributing your pages doesn't make it legal.

BTW, the caching and serving (redistribution) of entire pages is a whole different kettle of fish from indexing a page or quoting snippets of text on a SERP. Indexing and limited excerpting for editorial purposes are "fair use." To use an analogy, if THE NEW YORK TIMES BOOK REVIEW is reviewing a book, it's free to quote a passage from the book. (That's "fair use.") But it can't simply reproduce the entire work. (That's copyright infringement.)

A court might rule that Google's caching and redistribution of entire pages is legal, but I doubt it. Still, as a practical matter, the issue is likely to be a tempest in a teacup because (a) damages would be hard to prove in most cases, and (b) the fact that Google honors the "noarchive" meta tag means that a technical remedy is available--and the technical remedy is obviously more practical for most people than a legal remedy.

rfgdxm1

WebmasterWorld Senior Member rfgdxm1 is a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 15143 posted 11:57 pm on Jul 11, 2003 (gmt 0)

If the Google cache is illegal, archive.org is much more so, and nobody has gone after that yet.

aravindgp

10+ Year Member



 
Msg#: 15143 posted 12:04 am on Jul 12, 2003 (gmt 0)

[news.com.com...]
CNET news July 7, 2003, 8:40 PM PT

Search engines' display of miniature images is fair use under copyright law, a federal appeals court ruled Monday, but the legality of presenting full-size renditions of visual works is yet to be determined.

EuropeForVisitors wrote:

>>BTW, the caching and serving (redistribution) of entire pages is a whole different kettle of fish from indexing a page or quoting snippets of text on a SERP.

When we look at the above article and what EuropeForVisitors wrote, it directly states that the cache is perfectly fine under federal law if it's just a snippet of the whole website. Is the cache a snippet of the whole website? I have been reading this forum over and over again to determine this.

I would love to hear arguments on whether cache is the whole website snapshot or it's a reproduction of webpage.

