hyperkik - 12:39 pm on Jun 8, 2005 (gmt 0)
Beyond that, the question is whether the use by the search engine or scraper is "fair use". See, e.g., Kelly v. Arriba, 336 F.3d 811, 822 (9th Cir. 2003) (Discussing whether, under "Fair Use doctrine", a photograph search engine may present thumbnails of images owned by others). Compare EF Cultural Travel BV v. Explorica, Inc., 274 F.3d 577 (1st Cir. 2001) (Suggesting that a "scraper" designed to collect specific pricing information from a target website for the purpose of creating a competing price structure was unlawful).

A scraper in the sense under discussion here has a very weak argument for fair use. Under the four elements of fair use, as discussed in Kelly, those factors which weigh in favor of the search engine weigh against the scraper. All four factors must be applied to any infringing use claiming to be "fair use". Here's a preliminary analysis, limited in scope by my available time:

1. Purpose and character of the use. In Kelly, it was noted that the use of the copyrighted material was incidental, and thus weighed only slightly against fair use. ("Arriba was neither using Kelly's images to directly promote its web site nor trying to profit by selling Kelly's images. Instead, Kelly's images were among thousands of images in Arriba's search engine database.") The same does not hold true of scraper sites, which use the excerpts gleaned from other sites in order to promote their own sites in bona fide search engines. While scrapers don't seek to profit through the sale of the copyrighted material, they do seek to profit indirectly through its use, by diverting the Internet user to an ad or affiliate link instead of to the copyright holder. The question posed in Kelly of whether or not the infringing use is transformative depends upon the scraper site, the manner in which the copyrighted work is reproduced, and the amount reproduced.
However, as the scraper seeks to supersede the copyright owner's use by diverting traffic to the scraper site, with the result "that people could use both types of transmissions for the same purpose", and given that the scraper is most certainly not about "improving access to information on the internet" by leading surfers to the original content, the scraper's case for transformative use is also very weak.

2. Nature of the copyrighted work. The fact that the copyrighted material at issue is already published on the Internet will weigh "slightly in favor" of a fair use argument. (Materials not yet published are given a bit more protection under "fair use doctrine", as publication of excerpts can substantially affect their future market value.)

3. Amount and substantiality of portion used. The implications of this factor will vary depending upon the nature of the original work and the purpose of the reproduction. In Kelly, a thumbnail of the entire original work was deemed proper, because a search engine of pictures has little value if its users cannot identify linked content from the thumbnails. The amount of material varies depending upon the scraper site, but the harder test for scrapers to pass is substantiality. The fact that scrapers attempt to glean those portions of a page which are of the greatest value, whether in terms of attracting Internet users or generating advertising revenue, as opposed to those passages most conducive to directing Internet traffic to the copyright holder's site, would weigh against them. Consider, e.g., Harper & Row, Publishers, Inc. v. Nation Enterprises, 471 U.S. 539 (1985) (Holding that the publisher's use of the most valuable portions of a work weighed against its claim of "fair use".)

4. Effect of the use upon the potential market for or value of the copyrighted work.
As the Kelly decision explains, this "factor requires courts to consider 'not only the extent of market harm caused by the particular actions of the alleged infringer, but also 'whether unrestricted and wide-spread conduct of the sort engaged in by the defendant . . . would result in a substantially adverse impact on the potential market for the original.''" The Kelly court found that this factor weighed in favor of fair use, because the image search engine would ultimately guide Internet users to the original work, and the infringing use would not substitute for the original. It also noted that the search engine was not in financial competition with the copyright owner, for example, by selling licenses to the original work.

This factor seems to weigh heavily against scraper sites. The scraper seeks to divert traffic from the original copyright holder to the scraper's own site. Widespread use of scraper sites will significantly impair the market for the original in a variety of ways, including making it more difficult for potential users to find the original, and possibly by triggering "duplicate content" penalties in search engines. The scraper is often in competition with the copyright holder, and seeks to divert Internet users to its own advertisers instead of any products or advertisements which might be offered by the copyright holder. Scrapers do not wish their users to find the original copyright holder, and many design their content, omit key information, and set up link structures in ways which make it difficult for surfers to actually get to the original material.
Search engines and scraper sites both get their data the same way: they "scrape" content from other sites. That's the technical side of this argument, and it's undeniable.
There's that lame, dishonest parallel between search engines and scrapers again. Perhaps you think if you repeat the same tired nonsense over and over again, or express abject nonsense with a sense of conviction, somebody will be fooled into believing it? First, a typical scraper operates in a very different manner than a search engine. That is, the scraper typically produces static pages which are served to users, whereas a search engine produces pages from a database in response to a specific user inquiry.