Forum Moderators: Robert Charlton & goodroi
Today, we've added new links to "Quick View" PDFs in your browser with the formatting intact. The new links are based on the same technology that's available in Google Docs and Gmail, as well as to webmasters through the Google Docs viewer. We've been rolling this technology out to the search results page since July, and as of today we've added "Quick View" links to more than 50% of the PDFs in our index. The new links appear at the end of the second line of the result, right underneath the title.
So it looks like one more way that Google Search can distribute a site's content without requiring a direct visit to the site itself - and in this case, it's an entire document, not just a snippet. And the intention is to roll this out for other file format types, too.
This isn't just a "PDF viewer"; it is part of Google's pattern of appropriating, hosting, and redistributing the intellectual property of others for their own gain, without the permission of the copyright owner.
First the cache, then Google books, now this. What's next?
Eventually Google will be able to more or less answer any question, since every other website is embedded in the Google domain.
They're definitely playing it smart, but IMHO it's about time to put a hold on this nonsense.
You don't want to be recognised in Street View when you're doing something embarrassing? How could we know? Just file a complaint. You don't want your website indexed? Just modify your robots.txt. You don't want us to steal your PDF? ... How could we know you didn't want that?!
They're pushing it more and more and I don't like it.
Should Google have the right to host and distribute your intellectual property without your explicit permission?
It was old even before the first time someone said anything like this.
If you do NOT exclude something on your site via robots.txt then you have NO RIGHT to complain. It's no different from driving around not knowing what speed limits are and then complaining about getting a ticket for speeding. If you don't know how to operate a website then you don't have any justification for complaining about how others operate theirs.
- John
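For anyone who wants to check what a robots.txt exclusion like the one described above would actually block, here is a minimal sketch using Python's standard `urllib.robotparser`. The `/pdf/` directory, the example.com URLs, and the helper name `is_allowed` are made up for illustration, and the thread doesn't confirm whether the Quick View links themselves follow robots.txt, so treat this only as a check of the crawl rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep all crawlers out of a /pdf/ directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /pdf/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file content as an iterable of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Example usage with made-up URLs.
pdf_ok = is_allowed(ROBOTS_TXT, "Googlebot", "http://example.com/pdf/report.pdf")
page_ok = is_allowed(ROBOTS_TXT, "Googlebot", "http://example.com/index.html")
```

With these rules the PDF URL is disallowed and the ordinary page is still allowed, which is the "opt-out" the post is talking about.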
"If you do NOT exclude something on your site via robots.txt then you have NO RIGHT to complain."
That is not how the copyright laws have been constructed. The copyright laws have been constructed as "opt-in", i.e. by default no one (not even Google) has the right to use material without acquiring the rights to it.
However, over the past few years Google has made clear that they don't care about this and have made their search service "opt-out", i.e. they WILL take what material they can get (whether they have the rights or not) and will only stop if you as a rights holder stop them. But just because they do, it does NOT mean they have the legal right to do so.
(For search they could get away with it because they claim to use just snippets and thumbnails, which are covered by "fair use". I don't think they can as easily claim "fair use" when displaying entire PDF files.)
Now off to creating PDF files...!
Ripping documents like PDFs off sites and presenting them in a Google-branded viewer, however, is something quite different, and obviously blatant copyright infringement.
I think this is just a fancy viewer. The problem is the webmaster will be seeing stats credited to the Google server, but that's the only real issue. I'm waiting to be proven wrong.
The question I have is whether there is an easy way to tell Google (and Bing) not to cache a PDF document or show it in the viewer. Yahoo will follow a noarchive HTTP header; do the others?
"an easy way to tell Google (and Bing) not to cache a PDF document or show it in the viewer"
As ted mentioned, the X-Robots-Tag header [googleblog.blogspot.com] is probably the way to go.
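Since a PDF can't carry a robots meta tag, the directive has to go out as an HTTP response header. Below is a rough sketch of that, using only Python's standard library: a small handler that adds `X-Robots-Tag: noarchive` to any response whose path ends in `.pdf`. The names `robots_header_for`, `PdfNoArchiveHandler`, and `serve` are invented for this example, and whether `noarchive` also suppresses the Quick View link, or is honored by Bing, is an assumption that nobody in this thread has confirmed.

```python
from http.server import HTTPServer, SimpleHTTPRequestHandler


def robots_header_for(path: str) -> dict:
    """Return extra response headers for a requested path.

    Assumption: only PDFs should carry the noarchive directive.
    """
    # Ignore any query string before checking the file extension.
    bare_path = path.split("?", 1)[0]
    if bare_path.lower().endswith(".pdf"):
        return {"X-Robots-Tag": "noarchive"}
    return {}


class PdfNoArchiveHandler(SimpleHTTPRequestHandler):
    """Static file handler that tags PDF responses with X-Robots-Tag."""

    def end_headers(self):
        # Add our header(s) just before the header section is closed.
        for name, value in robots_header_for(self.path).items():
            self.send_header(name, value)
        super().end_headers()


def serve(port: int = 8000) -> None:
    """Serve the current directory locally (call this manually to try it)."""
    HTTPServer(("127.0.0.1", port), PdfNoArchiveHandler).serve_forever()
```

On a production site the same header would more likely be set in the web server configuration; the Python handler is only there to show which responses get the header and what it looks like.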
Initially they will use snippets and then expand on those snippets until the viewer does not need to go to the target site at all.