| 3:51 pm on Jan 26, 2006 (gmt 0)|
So... if it is fair use for Google, it is fair use for just about anyone else to keep and/or display a cached copy of a third-party page?
Is this a final judgement or can there be appeals etc.? Does this create a legal precedent? Are my ripped, downloaded MP3s just "caches"? It will be interesting to see how this pans out.
[edited by: encyclo at 3:52 pm (utc) on Jan. 26, 2006]
| 3:58 pm on Jan 26, 2006 (gmt 0)|
It's bound to be appealed, IMO....
| 4:11 pm on Jan 26, 2006 (gmt 0)|
|What this does, is effectively neuters all copyright laws on the internet today. |
Brett, thanks for posting the news, but re your comment: isn't that overstating it a bit? For it to neuter all copyright law, the ruling would have to allow any copying for any purpose, and it doesn't do that.
encyclo, I believe this ruling technically only applies in the District in which it was made. Courts in other Federal Districts might give it some deference, but it wouldn't be impossible for them to make a directly conflicting decision. And if that happened, the Supreme Court may need to step in to decide for the country as a whole.
| 4:18 pm on Jan 26, 2006 (gmt 0)|
Just talked with Blake. He does plan on appealing.
We are attempting to find a copy of the oral arguments in the case. If you know of an online copy (not the findings of fact), please let me know.
> the ruling would have to allow any copying for any purpose
Sure - simply register as an ISP (which anyone can do - webmasterworld is a registered isp) and you too have safe harbor...
| 4:20 pm on Jan 26, 2006 (gmt 0)|
So does this mean I can start my own mp3 search engine and archive any tunes the crawler finds?
| 4:25 pm on Jan 26, 2006 (gmt 0)|
I think a more appropriate analogy is of being in a public space and there being a reasonable assumption that you don't have privacy. When inside your home you have a reasonable assumption of privacy.
Likewise, you have recourse to prevent caching and a reasonable expectation of protecting your content from caching when you use a no-cache tag. And if you don't use a no-cache then you aren't using the tools available to prevent a cache, similar to changing your pants on your front lawn.
So it looks like the court took into consideration that webmasters have a way to stop Google from caching the site by using a no cache tag, just as we can block the bot entirely from our site with a robots.txt.
It's a very narrow ruling that doesn't affect issues beyond caching of a site where there is a remedy against it, i.e. a no-cache tag. I don't see how that means it's open season on content.
Maybe I'm missing something here, but it sounds like a reasonable decision.
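For reference, the opt-out discussed above is normally expressed as a robots meta tag. The posts call it "no-cache"; the directive Google documents for suppressing the cached link is `noarchive`. Here is a minimal Python sketch of how a crawler might check for it, assuming the page HTML is already in hand. The names `RobotsMetaParser` and `allows_cached_copy` are made up for illustration and are not taken from any real crawler.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> / <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so normalise them.
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            for token in attr.get("content", "").split(","):
                self.directives.add(token.strip().lower())


def allows_cached_copy(html):
    """Return False if the page asks not to be archived (hypothetical helper)."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noarchive" not in parser.directives


page = '<html><head><meta name="robots" content="noarchive"></head></html>'
print(allows_cached_copy(page))
```

This only shows that the opt-out is machine-readable; whether a given crawler actually honours it is a separate question.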
| 4:27 pm on Jan 26, 2006 (gmt 0)|
Great point Key_master - we need an RIAA for text. Napster and MP3.com lost the "caching" issue for music.
With privacy - you have zero rights under the law in the US. There are zero provisions in the US Constitution for privacy.
Copyright, on the other hand, is specifically spelled out in dozens of laws covering everything from trademarks to writings and derivative works.
> I don't see how that means it's open season on content.
Simply start a search engine, register as an isp, and throw the word "cache" on a page and go go go - rip the internet - it is your oyster.
| 4:28 pm on Jan 26, 2006 (gmt 0)|
I've always thought that Google cache should be illegal.
They use the cache for useful purposes, yes - but they actually profit from showing our content. In any other area, that would be grounds for suing, and most often the plaintiff would win.
In copyright law, the use of copyrighted works without "expressed written permission" is illegal. This means that you don't have the right unless I explicitly tell you so. If I happen to fail to write to you explicitly and inform you that you do NOT have the right to use my copyrighted content (i.e. no robots.txt), that does not give you the right to use the content based on assumption.
Furthermore - robots.txt is not a legal standard - there is nothing that says that a spider has to conform to robots.txt, or any meta tags. So what happens if you have a robots.txt in place but my spider chooses not to read it - would such an example change the verdict?
[edited by: Chico_Loco at 4:35 pm (utc) on Jan. 26, 2006]
| 4:31 pm on Jan 26, 2006 (gmt 0)|
While the ruling may suggest that a website operator should make an effort to block caching, it also places requirements on search engines (and other crawlers) to show that they will abide by the robots.txt. That could suggest that some level of standardization for all crawlers may be coming.
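To make "abiding by the robots.txt" concrete, here is a rough Python sketch of the check a well-behaved crawler could run before fetching a page, using the standard library's `urllib.robotparser`. The robots.txt content, the bot names, and the helper `may_fetch` are assumptions for illustration only.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: one named bot is shut out entirely,
# everyone else is only kept out of /private/.
EXAMPLE_ROBOTS = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def may_fetch(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(may_fetch(EXAMPLE_ROBOTS, "ExampleBot", "http://example.com/page.html"))
print(may_fetch(EXAMPLE_ROBOTS, "OtherBot", "http://example.com/page.html"))
```

Nothing forces a spider to run a check like this, which is the point raised earlier in the thread: compliance is voluntary on the crawler's side.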
| 4:34 pm on Jan 26, 2006 (gmt 0)|
What scares me is a precedent quoted in the case saying
|To demonstrate copyright infringement, “the plaintiff must show ownership of the copyright and copying by the defendant.” Kelly v. Arriba Soft Corp., 336 F.3d 811, 817 (9th Cir. 2003); see also 17 U.S.C. § 501. A plaintiff must also show volitional conduct on the part of the defendant in order to support a finding of direct copyright infringement. See Religious Tech. Ctr v. Netcom On-Line Commc’n Servs., Inc., 907 F. Supp. 1361, 1369-70 (N.D. Cal. 1995) (direct infringement requires a volitional act by defendant; automated copying by machines occasioned by others not sufficient); CoStar Group, Inc. v. LoopNet, Inc., 373 F.3d 544, 555 (4th Cir. 2004) (“Agreeing with the analysis in Netcom, we hold that the automatic copying, storage, and transmission of copyrighted materials, when instigated by others, does not render an ISP strictly liable for copyright infringement under §§ 501 and 106 of the Copyright Act.”). |
So scraping can't be considered intellectual property theft as long as it's automated? Ridiculous.
[edited by: volatilegx at 4:35 pm (utc) on Jan. 26, 2006]
| 4:35 pm on Jan 26, 2006 (gmt 0)|
I think the point is that I can now create a 'cached version' of your very high ranking website and original content and it would be considered legal unless you marked your entire website with the 'no-cache' directive.
It should be the other way round, data is NOT to be copied unless marked with a 'cache' directive.
It seems strange that this ruling can override the Copyright inherent in all original works - as in, if I write a 3 page story from scratch, it is automatically protected under Copyright - yet if I don't put 'no-cache' on there then it is completely legal for Google to breach the Copyright of the work and copy it for display on their website.
It's not right to make us say 'please don't copy'; your permission should be expressed, not default.
How many webmasters do you actually think know what the 'no-cache' option is?
Now it doesn't even matter if they fix the 302 redirect hijacking - because you can just 'cache' (cut & paste) a copy of their page and hijack that way - and it's legal!
| 4:40 pm on Jan 26, 2006 (gmt 0)|
This legal ruling is absolutely correct. Remember, when you make your content public on the Web, it is called "to publish your site"! When you make your content publicly available and you don't define any restrictions on your robots.txt file, that means your content can be cached. I'm absolutely satisfied by that ruling.
| 4:42 pm on Jan 26, 2006 (gmt 0)|
|I'm absolutely satisfied by that ruling. |
Perhaps because you're absolutely misunderstanding the situation or ramifications?
| 4:43 pm on Jan 26, 2006 (gmt 0)|
I don't have a problem with Google caching the page, but if you give me your URL, I might just cache it on my PR 9 site - let's see how you like that.
I will just send my robot around to your website every day to keep it nice and updated, while you get penalised for duplicate content and I steal all your original content.
Of course, if you put no-cache in your meta tags then you could pursue me - after a few weeks....
You really don't see the problem with this ruling?
Tell me, if I were to do this to you, what legal argument would you have for me to remove it?
[edited by: otech at 4:45 pm (utc) on Jan. 26, 2006]
| 4:45 pm on Jan 26, 2006 (gmt 0)|
Every search engine has a copy of your page. You can either have a search engine that allows you to see what they have, or have a search engine that keeps it secret.
What would you rather?
| 4:48 pm on Jan 26, 2006 (gmt 0)|
Is anyone actually getting the point?
ITS NOT LIMITED TO SEARCH ENGINES PEOPLE-
Webmasterworld, for example, could right now send a spider to YOUR site and display it on THEIR homepage - and YOU would get banned as they have a very high PR, even though it's your copyrighted work.
Just as long as it's an automated process to update the 'cached' page.
Just look at the quote in volatilegx's post above.
| 4:51 pm on Jan 26, 2006 (gmt 0)|
Chico_Loco's post on the previous page also sums it up very well.
"In copyright law, the use of copyrighted works without "expressed written permission" is illegal. This means that you don't have the right unless I explicitly tell you so. If I happen to fail to write to you explicitly and inform you that you do NOT have the right to use my copyrighted content (i.e. no robots.txt), that does not give you the right to use the content based on assumption.
Furthermore - robots.txt is not a legal standard - there is nothing that says that a spider has to conform to robots.txt, or any meta tags. So what happens if you have a robots.txt in place but my spider chooses not to read it - would such an example change the verdict?"
| 4:53 pm on Jan 26, 2006 (gmt 0)|
Don't lose context - this caching is done for billions of web pages, only a handful of people complain. It is also clear from actual usage of the cache feature that it is NOT designed to show your content; it is only useful when a site is down. Considering the number of pages involved and the actual use of it, it seems to me that the court ruling is justifiable in this context - the primary use for cache is non-infringing.
| 4:56 pm on Jan 26, 2006 (gmt 0)|
wildbest, I have to disagree with you.
The "caching" requirement in robots.txt is a unique marking and not sanctioned by any laws.
If I start a new standard, say "dontdownload.txt", and your site does not follow, do I have the right to "cache" it?
There is one piece that is sanctioned - and it appears on every page of this, and most sites - Copyright (c). This is Internationally agreed upon through the Berne Convention for the Protection of Literary and Artistic Works of 1886, then reinforced by Universal Copyright Convention of 1952.
Lord Majestic, that makes the presumption that the copyright holder granted rights to reproduce the content of their site. If I do not care to have Google cache my site, why do I have to take action? This reminds me of the opt-in/opt-out with e-mail. Why do I have to make sure I am not included? I already notified them through the "Copyright (c)" that this material requires my positive consent. What you are suggesting is "if the publisher does not act then I forfeit my copyright".
| 5:02 pm on Jan 26, 2006 (gmt 0)|
|Digital Millennium Copyright Act, which protects databases, ISPs, and other online service providers that don't exert direct control over what content is posted against copyright liability. |
So, either they stop having quality teams and hand jobs or they become liable. A classic condition known by eg. everyone operating free homepages of the tripod variety. Either you leave it exactly as it is or you become liable.
| 5:04 pm on Jan 26, 2006 (gmt 0)|
>>Perhaps because you're absolutely misunderstanding the situation or ramifications?
Maybe, probably, perhaps, but not really!
If caching was illegal, the WWW should not have been invented and used! Your own computer is caching Webmasterworld content to allow you to view it. It is technically a must! Do you really want to ask the question "For how long can I view WebmasterWorld content on my screen, and is it legal at all to view it"?! Do you really understand the situation or ramifications?
| 5:05 pm on Jan 26, 2006 (gmt 0)|
|Brett, thanks for posting the news, but re your comment: isn't that overstating it a bit? For it to neuter all copyright law, the ruling would have to allow any copying for any purpose, and it doesn't do that. |
It's nice to hear a voice of reason amid the hysteria. :-)
| 5:06 pm on Jan 26, 2006 (gmt 0)|
Brett is right - this is one of most important rulings on the Internet to date. If it holds, it will be a benchmark for digital rights.
| 5:08 pm on Jan 26, 2006 (gmt 0)|
wildbest, although your argument "sounds" reasonable, they are two different things.
In case of technical caching, which you do for example with WebmasterWorld, you are caching something that was already published to you, and you are caching it not to republish the material, but to simply assist you in viewing it again later in a faster fashion. There is the prime difference.
Google's cache's sole purpose is to republish.
| 5:14 pm on Jan 26, 2006 (gmt 0)|
Tapolyai, you are missing the point!
Before WebmasterWorld content is cached on your computer, it was cached a hundred times on different routers and networks around the globe. So, does it mean it was re-re-re-re-republished a hundred times?
| 5:19 pm on Jan 26, 2006 (gmt 0)|
> Don't lose context - this caching is
> done for billions of web pages, only
> a handful of people complain
A better statement is that only a handful of people understand it and how important this issue is to their website. I fundamentally believe, that without the google copied pages, Google would be just another infoseek flavor of the month.
> I don't see how that means it's open season on content.
What about all the vertical search engines that are caching?
> it is called "to publish your site"!
Exactly. Just as you publish a magazine. However, you can not make copies of Time magazine and sell them without breaking the law. It is the entire foundation that copyright laws stand on.
> that don't exert direct control over what content is posted against copyright liability.
Exactly again - search engines do not meet the standard to be called "Caching". Search Engines are not caching your page - they are republishing it with their banner advertisement on the top.
> the ruling would have to allow any
> copying for any purpose, and it doesn't do that.
Just meet the criteria for an ISP. As webmasters - you should have already registered as an isp.
Take a look at all the vertical search engines. They are caching stuff left and right. One of the reasons they do that is to end up in search engine caches with their own cached page. So Yahoo/MSN/Google caches the cached serp! That is all now legal. Let's all go start vertical search engines. Simply download Aspseek - fire it up, and seed it with the ODP database, and blam - instant legal caching search engine.
| 5:21 pm on Jan 26, 2006 (gmt 0)|
The cached pages add to the revenue of google..
I think, as any organization grows ( as google did ) they come across legal issues they may have never seriously thought of.
If this judgement holds in higher courts too, then caching will slowly rule the copyright laws on the net.
On the other hand, if the judgement is done away with, google may simply ask for a (compulsory) permission when a site or a sitemap is submitted. With the standing that google has today, it may refuse websites that don't allow cache, or they may be positioned very low somewhere.
Whatever happened to 'Do No Evil'..
| 5:24 pm on Jan 26, 2006 (gmt 0)|
The ruling was correct, but the real problem has yet to be resolved. The problem is that crawlers cannot read Copyright (c) in a standard fashion. If you are a copyright purist, it is not enough to voice your concern; create a foundation and develop a standard for Copyright (c) on the Web, so that crawlers can programmatically read it and obey its restrictions. But what can we do until that time comes? Ban the Web? Because a handful of copyright purists refuse to put the following lines in a simple .txt file in their root folder:
Do we really have to label WWW illegal because they do not want to do that?!
| 5:24 pm on Jan 26, 2006 (gmt 0)|
|I fundamentally believe, that without the google copied pages, Google would be just another infoseek flavor of the month. |
I almost never use the "cached" page feature - only at times when the original site is down, which thankfully happens rarely enough - less than 1% of my searches. I do not have hard stats at hand on usage, but I believe the majority of people would click the URL rather than the cached page.
Infoseek was bad because its searches were irrelevant - if I had a choice between Infoseek with cached pages and Google without cached pages, then the choice would have been clear for me.