Any case history for suing a search engine?

Forum Moderators: not2easy



blaze

9:49 pm on Jul 20, 2004 (gmt 0)

Is there any case history for suing Google/Yahoo for copyright infringement?

BigDave

10:39 pm on Jul 20, 2004 (gmt 0)

Nothing for google or yahoo that I could find, but search engines have been sued for infringement when they have copied another engine's code.

If you do sue, you will be infamous for having helped create new case law, or even new legislation.

Labyrinth

5:45 pm on Jul 21, 2004 (gmt 0)

There is case history on an "image search engine" lawsuit for copyright infringement.

The site being sued (ditto.com, formerly arriba.com) maintained the right to display thumbnails, but not the full-sized images. From the judge's summary, some text that might be of interest to someone considering a suit against a search engine:

The Copyright Act was intended to promote creativity, thereby benefitting the artist and the public alike. To preserve the potential future use of artistic works for purposes of teaching, research, criticism, and news reporting, Congress created the fair use exception. Arriba’s use of Kelly’s images promotes the goals of the Copyright Act and the fair use exception. The thumbnails do not stifle artistic creativity because they are not used for illustrative or artistic purposes and therefore do not supplant the need for the originals. In addition, they benefit the public by enhancing information-gathering techniques on the internet.

emphasis mine

For details, do a search on "kelly vs. arriba".

john_k

5:57 pm on Jul 21, 2004 (gmt 0)

Is there any case history for suing Google/Yahoo for copyright infringement?

There is plenty of case history regarding companies and individuals being sued for copyright infringement. Exactly what manner of infringement are you concerned about? I'm not trying to be nit-picky, I just think you will get more help if you elaborate some.

People have complained about Google displaying content from their sites in the search results, showing images on their image searches, allowing advertisers to use ad copy that infringes, listing sites that have infringed, and probably others that I never heard of.

Brett_Tabke

6:16 pm on Jul 21, 2004 (gmt 0)

> what manner of infringement

Branding cache pages.

chadmg

6:29 pm on Jul 21, 2004 (gmt 0)

Any such lawsuit IMHO would be frivolous in nature and thrown out since you can quite easily stop them from caching or even indexing your website. I suppose the alternative would be to ask you before they indexed/cached your page, an automatic denial and a specified allow list. But I would think that any somewhat knowledgeable judge would understand that that is counterproductive to the purpose of the internet. The internet is for the harvesting and interpretation of information, via person or robot.
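As a rough illustration of the opt-out mechanism this post relies on, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `ExampleBot` crawler name, and the `may_fetch` helper are made up for illustration; whether a given crawler actually honors these rules is up to that crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an example site. "ExampleBot" and the
# paths are assumptions for illustration, not any real crawler's rules.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # ExampleBot is shut out of the whole site.
    print(may_fetch(ROBOTS_TXT, "ExampleBot", "http://example.com/page.html"))
    # Any other crawler may fetch everything except /private/.
    print(may_fetch(ROBOTS_TXT, "OtherBot", "http://example.com/page.html"))
    print(may_fetch(ROBOTS_TXT, "OtherBot", "http://example.com/private/x.html"))
```

Note that this is the default-allow model: a crawler with no matching rule is treated as permitted, which is exactly the point disputed later in the thread.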

HughMungus

6:37 pm on Jul 21, 2004 (gmt 0)

Any such lawsuit IMHO would be frivolous in nature and thrown out since you can quite easily stop them from caching or even indexing your website.

BigDave

6:45 pm on Jul 21, 2004 (gmt 0)

Branding cache pages.

Displaying cached pages is really the only thing where I even see there being any chance. But it certainly will not be an easy win, as there is a standard to inform robots that you do not want them caching your data.

That will be enough to lead to huge amounts of legal analysis. The judge will rule on some matters of law, but this case WILL go to a jury, and the jury will be made up of internet users, the judge will be a google user, and the legislators will rush in and try and fix the copyright law to help take into account the modern age and the realities of the internet.

Brett_Tabke

6:46 pm on Jul 21, 2004 (gmt 0)

>Any such lawsuit IMHO would be frivolous in
> nature and thrown out since you can quite easily stop them

So it is ok to take from your house if the doors are unlocked?

Using that logic is like saying stealing is ok unless you have an explicit sign up that says not to.

SE's run crawlers that visit sites whether they are invited or not. Most often any more - they are not invited.

>there is a standard to inform robots

Robots.txt was never approved by any known internet standards group. Neither is it strictly supported by the search engines themselves. Lastly, the robots.txt proposal can be interpreted many many different ways.

BigDave

6:52 pm on Jul 21, 2004 (gmt 0)

Copyright law allows me to use someone else's creation unless they specifically indicate otherwise?

Actually, in some cases, yes. Or more accurately, in some cases you are allowed to use another's creation without their permission, whether or not they say "no". But if they specifically say no, it would be considered polite to agree to their wishes.

By posting it on the internet, you are implicitly agreeing to certain forms of copying. The question that the court will have to answer is how much, and what sort of copying you are agreeing to.

Brett_Tabke

6:52 pm on Jul 21, 2004 (gmt 0)

I can't believe I am about to say this...but you know what we as publishers need?

We need an organization as strong as the RIAA to protect copyrighted materials.

zooloo

6:55 pm on Jul 21, 2004 (gmt 0)

Copyright law allows me to use someone else's creation unless they specifically indicate otherwise?

There is in the UK "fair use".

This ranges from making a copy of a CD for my car to reproducing sections of a work for, say, review.

zoo

Webwork

6:55 pm on Jul 21, 2004 (gmt 0)

I think the jury award you get for such a lawsuit is having all your websites permanently removed from all search engines.

Sort of a pyrrhic victory to me, ya know, the part about burning all your ships to win the battle?

Labyrinth

7:06 pm on Jul 21, 2004 (gmt 0)

The "robots.txt/no cache" argument is a red herring -- copyright ownership alone is enough to protect from infringement (at least in the eyes of the law). Copyright holders are not required to take additional measures dependent on publishing medium.

Additionally, nothing you put in robots.txt would prevent other sites from making "fair use" of your content.

HughMungus

7:09 pm on Jul 21, 2004 (gmt 0)

There is in the UK "fair use".

Referring to non-fair use uses.

Labyrinth

7:10 pm on Jul 21, 2004 (gmt 0)

By posting it on the internet, you are implicitly agreening to certain forms of copying.

That statement is either hogwash or so broad as to be meaningless. Publication on the internet does not convey "implicit" agreement to certain forms of copying any more than publication in a magazine.

HughMungus

7:12 pm on Jul 21, 2004 (gmt 0)

Additionally, nothing you put in robots.txt would prevent other sites from making "fair use" of your content.

I looked into this in relation to the question about site scraper directories and it turns out that there are very specific fair use uses that are allowed. I doubt framing and scraping are allowed. I'm surprised about.com hasn't been sued for framing yet...but maybe the traffic they bring is worth being framed.

edit: or maybe they get permission (not trying to disparage about.com here)

figment88

7:58 pm on Jul 21, 2004 (gmt 0)

>there is a standard to inform robots
Robots.txt was never approved by any known internet standards group. Neither is it strictly supported by the search engines themselves. Lastly, the robots.txt proposal can be interpreted many many different ways.

In addition to Brett's reply, one needs to remember that Google and other search engines do not fully implement the robots protocols. One of the directives is allow. Seems to me that a case can be made that the existence of robots protocols strengthens the argument that search engines should not spider any site that does not give them permission.

Also, people need to remember that copyright rules vary greatly by country.

1) No country comes close to the US's fair use guidelines for allowing unlicensed use of copyright materials.

2) Many countries give copyright holders moral rights. Interestingly, I think the same argument used to allow thumbnails in the US might deny their use in countries that grant moral rights.

Brett_Tabke

8:05 pm on Jul 21, 2004 (gmt 0)

I agree that it would be a fruitless and futile move for one small individual to take action against any large SE that is using a branding cache page.

BigDave

8:20 pm on Jul 21, 2004 (gmt 0)

Robots.txt was never approved by any known internet standards group. Neither is it strictly supported by the search engines themselves. Lastly, the robots.txt proposal can be interpreted many many different ways.

"Standards groups" do not create standards. Usage creates standards.

Robots.txt and robots meta tags are the "standard" that the big search engines use. That will be brought up in court, and you will have to beat it, because commonly accepted usage does count in copyright cases.
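For the robots meta tag side of this, a minimal sketch of how a crawler might read page-level directives such as `noarchive` (the value Google documents for asking that no cached copy be shown). The `RobotsMetaParser` class, the `robots_directives` helper, and the sample page are hypothetical; this only shows the mechanics, not how any search engine actually implements it.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        for token in attr.get("content", "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def robots_directives(html):
    """Return the set of robots meta directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page that wants to be indexed but not shown from a cache.
PAGE = (
    '<html><head><meta name="robots" content="index, follow, noarchive">'
    "</head><body>Hello</body></html>"
)

if __name__ == "__main__":
    directives = robots_directives(PAGE)
    print("may index:", "noindex" not in directives)
    print("may show cached copy:", "noarchive" not in directives)
```

A page tagged this way stays eligible for indexing while opting out of the cached copy, assuming the crawler chooses to respect the tag.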

The "robots.txt/no cache" argument is a red herring -- copyright ownership alone is enough to protect from infringement (at least in the eyes of the law). Copyright holders are not required to take additional measures dependent on publishing medium.

But what if the court rules that it is not infringement? Use or even copying of your protected work is not by definition infringement.

That statement is either hogwash or so broad as to be meaningless. Publication on the internet does not convey "implicit" agreement to certain forms of copying any more than publication in a magazine.

Oh, how wrong you are. It is not possible to even view a website without making a local copy. It is how the internet and computers work.

I doubt framing and scraping are allowed.

There have been cases on framing decided in both directions. I haven't looked into it much, but it appears to be related to how much you make it look like your own content.

-----

While I don't doubt that a case could be brought successfully against Google, I don't think that the results of such an action are as limited as what you might suspect.

Webwork hit the first stage just right. You will be poison after that.

But the big result is that the law will be changed, either by the court or the legislature. The court will not order the crippling of the web.

Labyrinth

12:18 am on Jul 22, 2004 (gmt 0)

It is not possible to even view a website without making a local copy.

Yeah, I believe that falls into the "so broad as to be meaningless" category.

Any other examples that fall into your "you are implicitly agreeing to certain forms of copying"?

hunderdown

4:43 pm on Jul 23, 2004 (gmt 0)

I've been thinking over this issue and it seems to me that any case in this area revolves around how much content a search engine could reasonably "use" under fair use rules.

In the US, "fair use" is tricky because (to the best of my publishing industry layman's knowledge) there isn't a lot of case history.

But if you focus on the glossary issue and ask, "Is it fair use for a search engine to quote an entire definition from a glossary?" you FIRST have to settle the question of whether it's reasonable to claim that each definition is a work in and of itself, or if they are merely parts of a larger work, which is the glossary.

Why does this matter? Because fair use is heavily dependent on context. It's generally accepted as OK to quote, say, three sentences from a 300-page novel, without needing to seek permission or pay a usage fee. But if you quote the same amount of material from a poem, you may be using a meaningful chunk of it. And that's why ASCAP can be so aggressive in charging for the use of song lyrics--they're short, so even a small piece of them is arguably worth something.

In the case of the Google glossary, they don't copy and then display an entire glossary from a site. They just show one item at a time, though they do show the entire item. And I think they could argue that the "work" in this case is the glossary, not the definition, since some at least of the items in just about any subject-area glossary will be worded similarly if not identically to items in someone else's glossary. In other words, what's unique (and therefore valuable under copyright law) is not a particular definition, but someone's complete glossary.

You could argue that they DO end up using a substantial part of a given glossary, over a number of searches. I'm not sure how that would change things. And I have noticed, in the case of the glossary on my site, that it comes up as a source when I search on some of the words that appear on it, but not when I search on others. Perhaps they put a limit on the number of items it pulls from a given glossary.

I end up feeling that Google would have a pretty strong defense if anyone actually went so far as to sue over the use of such content. Probably no one will, but the situation is worth keeping in mind when ANY web developer uses content from any other web site....

BigDave

4:49 pm on Jul 23, 2004 (gmt 0)

I end up feeling that Google would have a pretty strong defense if anyone actually went so far as to sue over the use of such content. Probably no one will,

Not me, that's for sure. I'm adding a glossary to the site that I am working on right now in the hope of picking up some of that google definitions traffic.

I know that I have found some very useful sites from following those links.

chadmg

10:41 pm on Jul 25, 2004 (gmt 0)

So it is ok to take from your house if the doors are unlocked?
Using that logic is like saying stealing is ok unless you have an explicit sign up that says not to.
SE's run crawlers that visit sites whether they are invited or not. Most often any more - they are not invited.

Brett, are you saying that anyone who has not been specifically invited to WebmasterWorld should not visit? By creating a website, you are putting up a sign that says come on in. Google isn't stealing your work. They are crediting you for it. It's fair use if I've ever seen it. I hope you're just playing devil's advocate.

It's these types of lawsuits that make a mockery of the judicial system. It's frivolous and you know it. Time could be better spent going after real copyright thieves. If you don't like it, password protect your website.

blaze

2:28 pm on Jul 26, 2004 (gmt 0)

So would you say the same thing if I quoted from your website and ended up higher in the SERPs than you because of it?

All I did was auto-scrape your content and put the results on a web page.

And of course, I'd put AdSense on the web pages so I could profit from your content.

This is pretty much all Google is doing..

john_k

3:10 pm on Jul 26, 2004 (gmt 0)

I think comments in this thread need to be attached to the area of concern. Are your comments related to Google's search results listings, or are they directed at the cached full page content?

Search Engine/Results
Some people may have issue with the search results listings. I think that these concerns are without merit. Any search engine that simply lists results with brief, automated snippets from those sites listed is making fair use of published material while providing a valuable and necessary service to internet users.

I think it would be extremely difficult, if not impossible, for this aspect of Google's (and other search engines') business to be deemed as infringing on copyrights. Website publishers that don't want to be found in search engines such as Google can utilize robots.txt.

Page Caching
The cached pages are an entirely different matter. With this service, full content is copied and reproduced without consent of the copyright owner. The presentation is often mangled thereby reflecting a poor image of the publisher. Allowing users to see content of sites that might be down is a neat idea. Maybe Google could sell this service to publishers. In any event, the current implementation is well beyond any definition of fair use; it is not critical to the overall usefulness of the internet; and could be implemented on an opt-in basis.

It is widely recognized that being listed in major search engines is a critical ingredient in a successful online publication (any type of website). This fact warrants a liberal approach to the fair use standard when applied to search results provided by a search engine service. However, this same fact also points out that publishers are not able to simply opt out of the full-page caching by using the robots.txt file. Doing so will prevent them from being found in search engines, which is critical. The necessity to give access for search engine indexing cannot be seen as permission to use and repackage the full content.

blaze

3:50 pm on Jul 26, 2004 (gmt 0)

I think Google should ask people to put something in their robots.txt which gives them permission to scrape.

This would solve a lot of copyright problems.
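To make the opt-in idea concrete, here is a rough sketch of a default-deny check: the crawler fetches a URL only if robots.txt contains an explicit `Allow` rule covering it. This is a hypothetical policy, not how any search engine is known to behave, and `explicitly_allowed`, `ExampleBot`, and the sample rules are all invented for illustration. The parsing is deliberately simplified (prefix matching, first matching rule wins).

```python
from urllib.parse import urlparse


def explicitly_allowed(robots_txt, user_agent, url):
    """Opt-in check: True only if an applicable group explicitly allows url.

    Hypothetical default-deny policy: an empty or silent robots.txt
    means "no permission", the reverse of the usual default-allow reading.
    """
    path = urlparse(url).path or "/"
    agent = user_agent.lower()
    group_applies = False   # does the current group name this crawler?
    in_agent_lines = False  # still reading consecutive User-agent lines?
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if not in_agent_lines:
                group_applies = False  # a new group starts here
            in_agent_lines = True
            if value == "*" or value.lower() in agent:
                group_applies = True
        else:
            in_agent_lines = False
            # First matching rule in an applicable group decides.
            if group_applies and value and path.startswith(value):
                if field == "allow":
                    return True
                if field == "disallow":
                    return False
    return False  # no explicit permission found


# Hypothetical robots.txt granting one crawler access to one directory.
ROBOTS_OPT_IN = """\
User-agent: ExampleBot
Allow: /articles/

User-agent: *
Disallow: /
"""

if __name__ == "__main__":
    print(explicitly_allowed(ROBOTS_OPT_IN, "ExampleBot", "http://example.com/articles/a.html"))
    print(explicitly_allowed(ROBOTS_OPT_IN, "ExampleBot", "http://example.com/other.html"))
    print(explicitly_allowed("", "ExampleBot", "http://example.com/articles/a.html"))
```

The trade-off is the one raised elsewhere in the thread: under default-deny, any site that never writes a robots.txt simply disappears from the index.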

figment88

3:56 pm on Jul 26, 2004 (gmt 0)

I agree that the case for search engine results is not as strong as that for cached pages, but I do not think it is without merit.

1) Why do people keep saying all snippets are or are not infringing of copyright? Why can't some snippets be fair use and others not? Snippets from the description meta-tag, or DMOZ seem to be fairly clear fair use, but others seem more dodgy.

2) No country allows for as wide a use of unlicensed copyright materials as the US does under fair use - why apply a standard from a single country to a global enterprise?

3) Many countries grant copyright holders "moral rights." Mangling copyright material into snippets and not always presenting clear patrimony can be seen as a violation of moral rights.

4) In order to produce the SERP, Google spiders websites. Unless someone has a robots.txt file or meta tags that specifically grant this permission, it can be seen as an unlicensed use of copyright material.

5) In order to produce the SERP, Google makes an index of the page. An index can be argued to be an unlicensed derivative product which is specifically denied under copyright law.

While people may disagree with these points, I think it is a far stretch to say they are all without merit. Besides, I'm sure a smart lawyer could think of better arguments than me.

BigDave

4:59 pm on Jul 26, 2004 (gmt 0)

1) Why do people keep saying all snippets are or are not infringing of copyright? Why can't some snippets be fair use and others not? Snippets from the description meta-tag, or DMOZ seem to be fairly clear fair use, but others seem more dodgy.

There is certainly a case to be made that not all snippets are fair use. In many cases, they probably are not fair use, but by suing google for their snippets, you will have trouble ever getting any site licensed to you listed in an SE again.

You will have to pay a lawyer to sue them. In addition to actually winning the case, do you realize what must take place for you to actually win attorney's fees? Was your copyright registered at the time of first infringement? There are several other ways that they can get out of paying your legal fees.

2) No country allows for as wide a use of unlicensed copyright materials as the US does under fair use - why apply a standard from a single country to a global enterprise?

Uh, because the US is where the infringing action took place, therefore it is the jurisdiction that applies?

3) Many countries grant copyright holders "moral rights." Mangling copyright material into snippets and not always presenting clear patrimony can be seen as a violation of moral rights.

Yup, you should sue on this.

4) In order to produce the SERP, Google spiders websites. Unless someone has a robots.txt file or meta tags that specifically grant this permission, it can be seen as an unlicensed use of copyright material.

Or it *could* be seen as the way the web works: when you publish to the web, you are making it available to EVERYONE to look at. Spidering is the equivalent of looking. The last time I checked, looking was not one of the things controlled by copyright.

You might be able to win on an argument like this, but then again, you might have to pay everyone's legal fees out of your own pocket if the court decides that the other way that it "could be seen" is the correct version.

5) In order to produce the SERP, Google makes an index of the page. An index can be argued to be an unlicensed derivative product which is specifically denied under copyright law.

And how is this different than the browser cache that is on your computer right now?

Go ahead and sue them for the SERPs. I would love to see what happens. I would even like to see you win, because hopefully the legislators will then see what is wrong with the direction they have been going with copyright laws lately.

Labyrinth

7:07 pm on Jul 26, 2004 (gmt 0)

And how is this different than the browser cache that is on your computer right now?

Oh I don't know... perhaps the difference is that Google isn't making a "personal use" copy? That Google publishes the page and makes it accessible to the general public -- supplanting the need for the original?

Is that the difference you meant?

Or are we excluding "cache" from this part of the discussion?
