| 7:46 pm on Jan 26, 2006 (gmt 0)|
I think the tag you are looking for is noarchive.
| 7:53 pm on Jan 26, 2006 (gmt 0)|
>>This lawsuit was almost custom ordered by Google to ensure a win. Works with no commercial value, entrapment and bad lawyering. What a deal.
Wow, what a conspiracy? :)
| 7:59 pm on Jan 26, 2006 (gmt 0)|
No, not a conspiracy. I think they were just lucky when it came to who decided to sue them, and how he went about it.
Copyright cases in federal district court aren't played the same way as personal injury cases at the county courthouse.
| 8:02 pm on Jan 26, 2006 (gmt 0)|
Instead of getting in a tizzy over caching, it might be worth considering a much more real issue: Framing third-party content with ads. About.com has been doing this for years, having followed a precedent set by Ask Jeeves. (Indeed, About.com's former CEO once defended his company's framing by pointing out that Ask Jeeves did it and hadn't been sued.)
Unlike Google's caching, which may benefit Google indirectly, framing of third-party pages with ads offers direct revenues to the framer.
One could argue that framebreaking code is a quick-and-easy defense, but it isn't that simple, because framebreaking code can cause other problems and needs to be implemented carefully (unlike Google's benign "nocache" tag).
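For context, the framebreaking code being referred to is typically a short script like the following (a naive sketch, not any particular site's implementation; as noted, it can misfire, e.g. by breaking a site's own legitimate framesets or preview tools):

```
<script type="text/javascript">
// If this page has been loaded inside someone else's frameset,
// replace the top-level window with this page itself.
if (top.location != self.location) {
    top.location.replace(self.location.href);
}
</script>
```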
As far as I know, the only major case involving framing was the TotalNews lawsuit of 1997, which was brought by the NY Times, Washington Post, CNN, Reuters, and a few other companies after TotalNews.com framed their stories with its own ads and navigation scheme. Unfortunately, that case was settled out of court, so no precedent was set--and today, at least one major Web property--About.com, which ironically is now owned by the New York Times Company--is profiting from the content of other Web sites by running ads above their content.
| 8:19 pm on Jan 26, 2006 (gmt 0)|
> If he did in fact fail to properly argue it
He didn't argue it at all because the judgement was not based on that point.
> could have used no cache.
As a lawyer, he knew he could not put a proprietary tag on his site.
<meta name="googlebot" content="noarchive">
is implicitly copyrighted by Google and may be covered by G's patents.
| 8:24 pm on Jan 26, 2006 (gmt 0)|
I had a bit of a chance to read the documents on this, thanks to EFF. Although my initial worries are there, I have to say that it's my layman's opinion Blake Field set this up for the specific purpose of suing Google, and not to protect his publishing.
From the legal proceedings:
|With this knowledge, Field set out to get his copyrighted works included in Google’s index, and to have Google provide “Cached” links to Web pages containing those works. |
|Field created a robots.txt file for his site and set the permissions within this file to allow all robots to visit and index all of the pages on the site. |
|When Google learned that Field had filed (but not served) his complaint, Google promptly removed the “Cached” links to all of the pages of his site. See MacGillivray Decl. ¶2; see also Countercls. ¶22; Ans. to Countercls. ¶22. Google also wrote to Field explaining that Google had no desire to provide “Cached” links to Field’s pages if Field did not want them to appear. |
It is now clear. If you do not know of a standard, such as "NOFOLLOW, NOINDEX", that is too bad. The Web has just moved from teenager to adult. The uninitiated can no longer just throw up some material and presume everything is fine. Look for the next de facto standard to protect/erode your business.
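For anyone unfamiliar with the mechanics being discussed: the permissive robots.txt described in the ruling, and the meta tag that would have kept pages out of the cache, look roughly like this (a sketch of the well-known conventions, not Field's actual files):

```
# robots.txt -- an empty Disallow permits all robots to crawl everything
User-agent: *
Disallow:
```

```
<meta name="robots" content="noarchive">
```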
| 8:24 pm on Jan 26, 2006 (gmt 0)|
|As a lawyer, he knew he could not put a proprietary tag on his site. |
So what, now Netscape/AOL can just sue those sites that use the BLINK tag? Or maybe Microsoft can now sue people who use Crawl-Delay in robots.txt?
| 8:26 pm on Jan 26, 2006 (gmt 0)|
Does anyone have any numbers on how many people use Google's cache, besides SEOs? I'm curious how big a problem this is.
| 8:27 pm on Jan 26, 2006 (gmt 0)|
mykel79, I don't think scale or scope are the issue.
| 8:32 pm on Jan 26, 2006 (gmt 0)|
|If you're really, really worried about this, why not simply put a "nocache" tag on your pages? |
... because that is simply saying that what they are doing is okay, and that I will comply with "The Law of Google".
Google caches everything it comes in contact with, without regard to the knowledge level of the webmaster (and webmasters are under no obligation to know that Google will copy their work if they don't "opt out").
This includes artistic works, web-based games, text-based content... everything. I am not comfortable with this, never have been, and never will be - and I shouldn't have to risk economic internet suicide to prevent it.
| 8:37 pm on Jan 26, 2006 (gmt 0)|
A brief outline of the case itself: [out-law.com ]
| 8:44 pm on Jan 26, 2006 (gmt 0)|
|It is now clear. If you do not know of a standard, such as "NOFOLLOW, NOINDEX", that is too bad. |
Actually, that is the one thing that I think is very unclear. Field knew of the meta tags and robots.txt. If someone that does not have a knowledge of such things were to file the lawsuit I would be willing to bet that it would not have been decided at the summary judgement stage.
I really think that it was the estoppel and the implied license from the robots.txt that killed the case and will keep it dead.
If the ruling stands it will almost certainly hurt future cases by those that do not know about robots.txt and the meta tags, but with good lawyers they should at least be able to argue about the differences between this case and their own.
Like I said, this is Google's dream case on this sort of issue. The plaintiff behaves like the bad guy, so Google becomes the good guy by default.
| 8:51 pm on Jan 26, 2006 (gmt 0)|
In my opinion Blake Field is not struggling against Goliath.
It is sad, that this has happened this way. There is something to be said about Google profiting on non-tagged material.
I would have preferred a grandma with no technical knowledge discovering this on her poems or love letters. Then it might have ended differently. I believe Mr. Field, by admitting that he was fully aware of the tags, and by his steps of filing copyrights, revealed his intentions.
I stand corrected BigDave. You are right. That is unclear.
| 8:56 pm on Jan 26, 2006 (gmt 0)|
It seems Google is becoming less of a search engine and more of a found engine. Isn't the point of a search engine to drive people to results? To find things elsewhere?
I'd be more comfortable if users could only access the cached version if the original site is down. That would be a service to both users and webmasters.
| 9:01 pm on Jan 26, 2006 (gmt 0)|
|Then it might have ended differently. I believe Mr. Field, by admitting that he was fully aware of the tags, and his steps of filing copyrights, revealed his intentions. |
It does not seem to matter much what his 'intentions' were. In my mind, this is very cut and dried:
Google has a copy of someone else's work without their expressed permission.
That is copyright violation. Period.
Opt out is not expressed permission. Opt-in, however, is, and I believe Google should make inclusion 'opt-in'.
| 9:02 pm on Jan 26, 2006 (gmt 0)|
|I'd be more comfortable if users could only access the cached version if the original site is down. That would be a service to both users and webmasters. |
That's not a bad idea. One difficulty would be determining whether the original site is down. (Sometimes a site is invisible to some users but not to others.)
| 9:08 pm on Jan 26, 2006 (gmt 0)|
> cache like vertical search engines
You mean like the former moderator of this forum's <snip> search engine? Or is that a little too close to home, Brett?
[edited by: Brett_Tabke at 9:41 pm (utc) on Jan. 26, 2006]
[edit reason] please - no urls... [/edit]
| 9:22 pm on Jan 26, 2006 (gmt 0)|
Brett, you said that registering as an ISP would offer some 'safe harbour' protection...
What would be the use of registering my company as an ISP?
Where could I do that ;)
| 9:26 pm on Jan 26, 2006 (gmt 0)|
|According to the undisputed testimony of Google’s Internet expert, Dr. John Levine, Web site publishers typically communicate their permissions to Internet search engines (such as Google) using "meta-tags." A Web site publisher can instruct a search engine not to cache the publisher’s Web site by using a "no-archive" meta-tag. According to Dr. Levine, the "noarchive" meta-tag is a highly publicized and well-known industry standard. |
Since when did "noarchive" become an industry standard? Sounds to me like Google's expert misrepresented the facts.
| 9:36 pm on Jan 26, 2006 (gmt 0)|
Something does not have to be accepted by a "standards body" to become a "standard". In fact, the strongest standards around are de facto standards. The internet is built on RFCs that have never been formally accepted as standards.
Field even uses one of those de facto standards (robots.txt) to specifically allow Google to crawl his site with the knowledge that it will cache those pages.
All the legitimate search engines follow those "standards" and millions of sites use them. They qualify as a standard.
<added>The decision mentions "undisputed". That does not mean that no one can dispute it; it means that Field did not dispute it in his filings or at the hearing. Again, that is where you want to make sure you have a good lawyer and your own expert witnesses.
| 9:41 pm on Jan 26, 2006 (gmt 0)|
|This legal ruling is absolutely correct. Remember, when you make your content public on the Web, it is called "to publish your site"! When you make your content publicly available and you don't define any restrictions on your robots.txt file, that means your content can be cached. |
|If he wants Google not to cache his pages, there's a far simpler remedy than launching a federal lawsuit. Thus the focus of the suit is less on copyright infringement and more on the plaintiff's feeling that he shouldn't have to take any action to prevent cacheing. |
| 9:45 pm on Jan 26, 2006 (gmt 0)|
|Since when did "noarchive" become an industry standard? Sounds to me like Google's expert misrepresented the facts. |
I knew about it, and if I had asked Google, they would have told me about it.
|Google caches everything it comes in contact with without regards to the knowledge level of the webmaster (and it is not their requirement to know that Google will copy their work if they don't "opt out") |
I think I've said this before, and I think someone flamed me for it. But if you're publishing copyrighted material on the Web, you ought to know what you're doing.
[edited by: mcavic at 9:50 pm (utc) on Jan. 26, 2006]
| 9:46 pm on Jan 26, 2006 (gmt 0)|
MSNBOT doesn't obey ROBOTS NOARCHIVE. It requires MSNBOT NOARCHIVE.
There is no standard, de facto or otherwise. Any search engine can come up with its own proprietary form of NOARCHIVE, and it is incumbent upon webmasters to understand and know them all.
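Since each engine may demand its own variant of the tag, one defensive workaround is to emit a per-crawler tag alongside the generic one. A minimal sketch (the bot names here are illustrative, not a complete list):

```python
# Build noarchive meta tags: a generic "robots" tag plus per-crawler
# variants, since some bots (e.g. msnbot) have ignored the generic form.
def noarchive_tags(bots=("googlebot", "msnbot", "slurp")):
    tags = ['<meta name="robots" content="noarchive">']
    tags += ['<meta name="%s" content="noarchive">' % bot for bot in bots]
    return "\n".join(tags)

print(noarchive_tags())
```

Redundant, but it covers an engine that ignores the generic "robots" name without requiring the webmaster to track which engines honor which form.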
| 9:53 pm on Jan 26, 2006 (gmt 0)|
|There is no standard, de facto or otherwise |
The fact that there is no standard is bad. But that's not Google's fault.
| 9:54 pm on Jan 26, 2006 (gmt 0)|
|it is incumbent upon webmasters to understand and know them all. |
That is not what the ruling said. The ruling was largely based on the fact that Field KNEW how to keep it from being archived, yet did not do it.
By his actions and admissions he wanted google to cache it so he could sue them.
If he does a robots noarchive expecting it to work on MSN and it does not, even if it works on all others, then he very well might have a case against MSN using the argument of Google's expert in this case.
| 10:01 pm on Jan 26, 2006 (gmt 0)|
Oh well, can we create a standard here? How is it correct to put it down, caching or cacheing? :)
| 10:10 pm on Jan 26, 2006 (gmt 0)|
It's not typically easy to nail down the win at the summary judgment level. Yet it happened here, and it happened on multiple points.
The camp which holds the belief that G's cache violates copyright, period, over and out, done, may want to evaluate that belief anew rather than dig in its heels and stomp and snort.
I think G gets a well played and played well nod here.
I also think the disposition of an appeal, assuming an appeal is taken by the non-prevailing party, will prove an interesting read.
| 10:38 pm on Jan 26, 2006 (gmt 0)|
Wow! I remember starting this discussion with concerned friends back in the early 90s when browsers started caching. Back then, without Google and without any lawyers, we agreed that web pages get cached. Fish like water. People breathe air. Putting up a web site means accepting the fact that it will be cached and used in ways that are out of our control on a daily basis. If someone caches (saves) your IP and then claims it in its entirety (or even substantially) as their own ... that's a violation. If the cached copy is used in a way that benefits you, then ... wait for it ... it benefits you!
Re: volatilegx's post containing precedents:
|Agreeing with the analysis in Netcom, we hold that the automatic copying, storage, and transmission of copyrighted materials, when instigated by others, does not render an ISP strictly liable for copyright infringement |
As each reference noted, this group of precedents refers to whether an ISP can be held liable when one of its customers commits copyright violations.
Re: Brett_Tabke's comment:
|Search Engines are not caching your page - they are republishing it with their banner advertisement on the top |
Well ... they are, in fact, caching it. They are also, in fact, adding an additional brand (theirs) to the version of the page when it is viewed from within their cache. They didn't change the page, and hence the framing arguments. Fortunately, all of the promise held by the original page is still intact, including links to the current version and any other links to your other 'real' (not the cached) pages, and your 'buy stuff from us now' messages. If you're a plumber, the visitor viewing the cached page still has your phone number ... right?
Rather than Google selling/re-publishing/profiting from your complete previously published work (which is the whole site, not just one page, right?), they are basically sticking a copy of one of your pages (at a time) up on the front of their newsstand. You may choose to shop from their offerings while looking at the poster, but if the poster has any interest to you, you would be inclined to follow its promise independent of where you first found it. Google doesn't sell plumbing services, yet. Are they in competition with you? I don't see any ads except for those already embedded in the pages, so in reality they are only benefiting from their branding, not from anything that would cost you a sale.
Re: HeatherR's comment:
|Google has a copy of someone else's work without their expressed permission. That is copyright violation. Period. |
Nope. There's wiggle room, as many have stated.
This is a little 'cache 22', if I may. We like the search engines, they bring us customers, we just don't want to have to do anything to directly address the technology we all know they use ... and goodness knows we don't want them to make any kind of money!
Seriously, folks, the big picture is that this is a very narrow ruling that does not address scrapers or router caching or even browser caching. It only addresses the situation brought up by Mr. Field with regard to his relationship with Google. The question is nowhere near settled, but for Mr. Field, it is. (Are we certain he's not a Google contractor, getting some jurisprudence onto the books for their benefit?)
| 10:41 pm on Jan 26, 2006 (gmt 0)|
(the bold is mine)
|I can't just go around the internet copying images and expect a link to the original authors works to suffice |
What concerns me is that this seems to have become accepted. Look at all the scraper sites. They put in a link to me when they copy and publish a snippet from my site. Google links to me when it publishes the cached version of my page. Neither one resembles what I thought 'fair use' meant.
I don't have a problem with my site being cached, but publishing it must be a copyright violation. I could live with search engines doing it, but it seems to be opening it up to everyone else.
First search engines used a snippet to describe my site. Now thousands of scrapers do as well. They will just claim they are providing search information. So now hundreds if not thousands of pages on the internet are using info from my sites.
So maybe the whole thing needs to be revisited. I don't have any answers but I'd sure like to see a solution.
| 11:13 pm on Jan 26, 2006 (gmt 0)|
|The question is nowhere near settled, but for Mr. Field, it is. |
Exactly why the claim of "the most important decision" yada yada yada is overstating it a bit.
It may lead to that, but if you are hunting without a license, I suspect there will be a lawsuit in your future and this decision won't help you.
|(Are we certain he's not a Google contractor, getting some jurisprudence onto the books for their benefit?) |
Maybe google will show their appreciation by bestowing higher PR on Blake's site.
| 12:14 am on Jan 27, 2006 (gmt 0)|
I see this time and time again (and I only read up to page 3 so far)
>> By putting a simple 'no cache' tag....
How do you put that tag on an image, or on a PDF or Word document; or does this ruling apply to HTML content only?
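That is a fair question: images, PDFs, and Word documents have no place for a meta tag, so the directive has to travel some other way, such as an HTTP response header. A sketch using the X-Robots-Tag header (a mechanism Google introduced after this thread was written) in Apache configuration with mod_headers:

```
# Apache config sketch: send a noarchive directive as an HTTP header
# for file types that cannot carry an HTML meta tag.
<FilesMatch "\.(png|jpe?g|gif|pdf|docx?)$">
    Header set X-Robots-Tag "noarchive"
</FilesMatch>
```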