I just checked another of my sites that denies G IPs access to image dirs, etc., and enforces same via .htaccess. The preview shows up in search results, without the images.
But not for lack of trying --
Mozilla/5.0 (en-us) AppleWebKit/525.13 (KHTML, like Gecko; Google Web Preview) Version/3.1 Safari/525.13
11/18 18:32:38 /dir/filename.gif
Mozilla/5.0 (en-us) AppleWebKit/525.13 (KHTML, like Gecko; Google Web Preview) Version/3.1 Safari/525.13
11/18 18:32:39 /dir/filename.gif
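For anyone wanting to enforce the same restriction, a minimal .htaccess sketch (the image extensions and the UA match are assumptions based on the log lines above; Apache 2.2 syntax, since the bot's IP ranges vary and can't be relied upon):

```apache
# Flag requests whose User-Agent contains "Google Web Preview"
SetEnvIfNoCase User-Agent "Google Web Preview" gwp_bot

# Deny those requests access to image files in this directory
<FilesMatch "\.(gif|jpe?g|png)$">
    Order Allow,Deny
    Allow from all
    Deny from env=gwp_bot
</FilesMatch>
```

Blocking by UA string is the only practical handle here - there is no reverse DNS check available for this bot.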
I don't mind the bot that much. It makes a nice preview of my sites. However, the one thing that I don't like is that it does not display non-Flash fallback content. The previews in Google just display big empty spots where I specifically created nice Flash-alternative placeholders. The placeholders communicate basically the same thing as the Flash widget but are designed for users who cannot view Flash. I guess I'll have to use more jQuery, because Google's preview displays that just fine.
Another thread [webmasterworld.com...] addressed how webmasters might benefit from the new Instant Preview. In my view this missed the point - I see no potential benefits, only potential pitfalls.
There is a lot of confusion as to how previews work, so here is an attempt to dispel it:
Google says the vast majority of "previews" are actually made by Googlebot, and that the Google Web Preview bot is only used "occasionally".
Google Web Preview is a prefetcher bot that is apparently only used when some resources are denied to Googlebot by robots.txt instructions - it only exists to get around those restrictions.
Google Web Preview appearing in your logs does not necessarily mean that somebody actually previewed your site - if someone invokes a preview on any given SERPs page, previews for all the other pages listed are fetched, either from Google's cache or by sending the prefetcher bot.
Google Web Preview bot uses a variety of IP ranges (addressed earlier in this thread). There is apparently no way to verify that it is the genuine article.
Google insists that you must not serve Google Web Preview different content to Googlebot (on pain of severe penalty). Webmasters have no control over their previews.
Google advises that "In order to block crawlable images from being indexed, you can use the "noindex" x-robots-tag HTTP header element." In doing so you will be agreeing to Google downloading and caching your images (bandwidth costs to be borne by you).
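For reference, a sketch of that x-robots-tag approach in .htaccess (the extensions are examples; assumes mod_headers is loaded):

```apache
# Send "X-Robots-Tag: noindex" with image responses so Google may
# fetch them for previews but should not index them in Image Search
<IfModule mod_headers.c>
    <FilesMatch "\.(gif|jpe?g|png)$">
        Header set X-Robots-Tag "noindex"
    </FilesMatch>
</IfModule>
```

As the post says, the images are still downloaded - this affects indexing only, not your bandwidth bill.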
Google implemented this new feature without Flash support, and reportedly does not display any alternate content you may provide. Any Flash content on your pages will show an uninviting blank area. Silverlight and Java content is treated similarly.
Google's prescribed method of opting out of previews if you do not want them (the nosnippet tag) penalises you by having your text snippet removed.
Google's previews are very large - far too large to be described as "thumbnails". Google apparently believes it is "fair use" to screenshot every page in the SERPs - and also to modify those screenshots as they see fit.
Google appears not to have listed which browsers support previews, though it seems apparent that not all do. Google suggests testing in Safari, though I have yet to see the feature in my version.
Google launched Instant Previews on 9 November and did not warn webmasters in advance. There are many reports of previews with missing images and misrepresented layouts, as well as skewed analytics caused by the prefetcher bot.
Google's Instant Preview FAQ [sites.google.com...] appeared some time after launch.
The opening post in this thread perceptively stated:
|The obvious question raised by this, is the effect it will have on click-through rates. |
The obvious answer is that any webmaster whose previews currently look bad - through no fault of their own - is likely being adversely affected right now.
Fair to say that Bing treats webmasters (aka content owners) no better.
Google's system is using markup as an indicator, one of many no doubt. You'll notice that on WordPress blogs with comments, the preview always ends immediately before the comments begin.
I have always had a NOARCHIVE header and now Google has all of my pages in preview.
So, despite me requesting that Google not ARCHIVE my website, it has done that with every page in my domain.
So, Google Preview is essentially violating webmaster trust by doing something against a long standing set of spidering rules.
Further, now that Google Preview does not respect robots.txt, what option is left?
Good point, I have NOARCHIVE set as well.
This starts the writing campaign...
|I have always had a NOARCHIVE header and now Google has all of my pages in preview |
The NOARCHIVE header should still keep your content out of the publicly viewable cache.
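For anyone double-checking their own setup: NOARCHIVE can be set either as a robots meta tag in the page head, or site-wide as an HTTP header - a sketch, assuming mod_headers:

```apache
# HTTP header equivalent of <meta name="robots" content="noarchive">
<IfModule mod_headers.c>
    Header set X-Robots-Tag "noarchive"
</IfModule>
```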
The point about the new Google Web Preview bot - which is a prefetcher, and not a spider - is that it only exists in order to circumvent your robots.txt restrictions.
Webmasters should have been given the means to control and provide their own preview image, but while this is very easy to do (by cloaking to the Google Web Preview bot) the technique has been specifically outlawed by Mountain View in the recent FAQ:
|You must show Googlebot and the Google Web Preview the same content that users from that region would see (see our Help Center article on cloaking). |
So you must treat the Google Web Preview bot the same as Googlebot, even though it is specifically designed to bypass the access restrictions you place on Googlebot.
Google wants to use your images, and you are expected to give them up.
As someone mentioned earlier in the thread "This looks like over-a-barrel time!".
This is a prime example of Google not honoring all of their meta directives.
Not only that, but the Google Web Preview spider doesn't support the same reverse DNS round trip used to identify Googlebot, leaving webmasters who try to stop site abuse twisting in the wind.
Google needs to step up their game so webmasters can either opt in or opt out - but with specific tools, not by binding multiple crawlers to the same robots.txt entries. It's all getting nutty.
|The NOARCHIVE header should still keep your content out of the publicly viewable cache. |
It does not. The public can view a cached snapshot of all our web pages, served from Google's own servers.
Google's official definition of NOARCHIVE:
|Add the NOARCHIVE tag to a web page and Google won't cache a copy of that page in search results |
Via Google Web Preview, Google is keeping a cached visual copy of your content to show anyone.
|Google implemented this new feature without Flash support, and reportedly does not display any alternate content you may provide. Any Flash content on your pages will show an uninviting blank area. |
So far I don't like the "snippet" preview - it seems like a breach of intellectual property rights to make a page of mine look different than it actually appears.
Just realized also that the web preview does not even get the CSS right. My pages work in Chrome, Firefox, IE, Safari (Windows) and so on just fine, but however they are parsing my pages, it is with a POS browser that gets the header position out of whack big time.
Maybe the clue is in the word "embedded"
I hadn't tested this myself (hence "reportedly") but you appear to be correct - though presumably the alternate content should not contain images blocked by the robots.txt file.
I have seen disgruntled reports of alternate content not showing, and this may explain it.
|seems like a breach of intellectual property rights to make a page of mine look different than it actually appears |
Have to agree.
|Via Google Web Preview, Google is keeping a cached visual copy of your content to show anyone |
Fair point. The "Cached" link in Google's SERPs is now only half the story.
The WebmasterWorld home page link to this thread mentions Google Web Preview "and how to block it". So for those who - after considering all the implications - really want to force "No Preview Available", the answer seems to be to restrict access to images by robots.txt and 403 the Google Web Preview bot.
That way you get to keep your text snippet (for now at least).
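To spell that method out - a sketch with illustrative paths, where robots.txt keeps Googlebot away from the images and an .htaccess placed in the image directory 403s the prefetcher when it comes for them on the fly:

```apache
# In robots.txt (keeps Googlebot from crawling the images):
#   User-agent: Googlebot
#   Disallow: /images/

# In /images/.htaccess (403s the Google Web Preview prefetcher):
SetEnvIfNoCase User-Agent "Google Web Preview" gwp_bot
Order Allow,Deny
Allow from all
Deny from env=gwp_bot
```

The result, as described above, is "Preview Not Available" while the text snippet remains intact - for now at least.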
A blind intermediary would not modify others' content or display it on their own website.
The Google Web Preview bot only exists to circumvent owners' robots.txt instructions.
|Q: How can I block previews from being shown? |
A: You can block previews using the "nosnippet" robots meta tag or x-robots-tag HTTP header. Keep in mind that blocking previews also blocks normal snippets. There is currently no way to block preview images while allowing normal snippets.
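For completeness, a sketch of the two nosnippet forms Google mentions there (the header form assumes mod_headers):

```apache
# Header form - blocks the preview AND the text snippet:
<IfModule mod_headers.c>
    Header set X-Robots-Tag "nosnippet"
</IfModule>

# Meta tag equivalent, placed in the page <head>:
#   <meta name="robots" content="nosnippet">
```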
I feel like a David against G(oliath). I gotta get me a slingshot.
Hmm. Maybe it's the phrase: Google Web Preview (with and without quotes). A quick search shows zero Preview icons for those SERPs...
This might be just me, then again...
GWP has been here... so far I have not stopped it. But, and this is why I'm asking, I've noticed that once a page has been "previewed" I don't see it previewed again. Even on popular pages (100-200 pageviews a day) the GWP does not return. Is Google caching these "previews" and thus defeating all our attempts to contain the voracious beast?
At least 4 of my "previewed" pages have exhibited this behavior. I have hits before the preview and hits after the preview, but no previews after the first preview... Makes you wonder a bit.
|Is Google cacheing these "previews" |
Yes they are.
However, various different IP ranges are being used by the Google Web Preview bot and it is possible (though untested) that more than one cached version is made.
|There is currently no way to block preview images while allowing normal snippets |
Phranque is quoting Google there, but the site I am testing on consistently shows "Preview Not Available" while retaining the text snippet (method described above).
From Google's John Mueller, 16 November:
|As we use normal crawling to create these previews (on-the-fly accessing is only used for cases where we don't have recent, complete data from crawling), over time the accesses will be mostly limited to normal crawling activity. |
This seems to suggest that Google expects almost all webmasters to remove their robots.txt restrictions on image crawling. I suspect they will probably succeed in this, and if anyone chooses to call it "coercion" I will not argue.
The Google Web Preview bot only exists to circumvent webmasters' robots.txt instructions.
If you have not been blocking the (disguised as a Safari browser) Google Web Preview bot from the start then you may now be too late - the bot will already have cached a screenshot and if it cannot be updated Google will likely use that version in perpetuity.
There is probably no way to remove it.
As I mentioned earlier on another thread (http://www.webmasterworld.com/google/4228491-8-10.htm) and now reported today at SearchEngineLand [searchengineland.com...] this issue may also be playing havoc with your visit/visitor/page view metrics.
On one site that I work for, we are seeing visitors inflated by 25% by Google Instant Preview (about 250k visitors in one week - it's a very large site with millions of URLs in the index).
Check your browser stats: if that version of Safari has shown a big increase you might be suffering the same thing. Clearly this will play havoc with any dependent conversion metrics.
It's not just GA that is affected by this of course, we are seeing the same issues with Omniture SiteCatalyst. Adobe promised to look into it when informed of the situation by a member of my analytics team.
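One server-side mitigation for log-based analytics, assuming Apache (the log path is illustrative): tag preview-bot hits and keep them out of the access log. Note this won't fix JavaScript-based packages like GA or SiteCatalyst, which need filtering on their own side.

```apache
# Exclude Google Web Preview hits from the access log so log-based
# analytics are not inflated by prefetcher traffic
SetEnvIfNoCase User-Agent "Google Web Preview" gwp_bot
CustomLog /var/log/apache2/access.log combined env=!gwp_bot
```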
I wonder if this is why my sales have recently exploded - I focus very intensely on pictures - and big ones - and they look fabulous in preview as opposed to my competitors leeeetle teeeeny thumbnails. Hmm.
Congrats on your boom! Regardless of causation, that's nice news!
Previews just premiered this month, so if your increases are in the last two to three weeks I'd be more likely to attribute them to factors beyond G's control -- thus far. Factors like the mid-term elections not frightening anyone, and thus the stock market holding its own.
Another factor could be major retailers using TV-radio spots are heavily trumpeting online holiday buying -- and tie-ins via their standalone sites and their Facebook pages -- and have since well before Halloween. They've seemingly extended this week's (counterintuitively-named) Black Friday into a months-long retail campaign.
For example, eBay started a Christmas countdown atop its pages eight weeks out. Fifty-something days till Christmas anyone? Oy.
Last but not least...
A marketing site's prelim tests (the URL of which I neglected to save & can't re-find now, sorry) showed most people either didn't notice the magnifying glass icon, had no clue what it did, or simply didn't use it.
For me, the 'Preview Effective?' jury's still out. If only referers indicated they came that way.
I think most people find it the way I did, by accident. I know my sales have spiked due to SEO primarily, plus some very intense AdWords tweaking and very targeted advertising and promo - but I wonder how many of the wigglers, as I call them, hesitating over which link to click, found my site accidentally. Once you discover the preview, it's very fun to jump from preview to preview, and in that, my site rocks.
I wish there was some way to include an alt message saying "visit the site" or "click here to see the images" that would get the visitor to the actual site.
I don't expect that function to be available anytime soon.
if you click on the preview, you go straight to the site...or are you meaning something else?
I'd like to see referers from people actually clicking on/through the Preview. For example, oh:
That's an actual G referer [keywords-obfuscated], plus my simplistic addition in bold, formatted according to the other parameters.
|I'd like to see referers from people actually clicking on/through the Preview |
In my test just now the referrer was included, same as usual, when clicking through.
Note, however, that a "Preview Not Available" image is not clickable.
If you are talking about logged hits from the Google Web Preview bot that is different - seeing the bot in your logs does not necessarily mean that your site was previewed, just that it appeared on the same SERPs page as one that was (and that you haven't removed your robots.txt restrictions).
|I wish there was some way to include an alt message |
It is technically very easy for webmasters to control what is displayed in the preview.
Unfortunately, doing so is likely to get your site banned.
Apologies, I misunderstood - you want to know how many people actually use the feature.
One problem with that is that people may look at the preview then click on the text SERP.
@Samizdata : If imminent visitors click on the text, no prob. That's the 'usual' method and shows in referers (browser-willing). If G added a referer param -- for example, &click=preview -- that would indicate a click on the preview. Then again, if wishes were horses... :)
Another look at this issue:
A Google employee quoted in The Register:
So much for testing the beast before unleashing it on the world.
It would also have been sensible - not to mention polite - to warn webmasters in advance that they are expected to remove their robots.txt restrictions so that their images can be used.
As for offering a "nopreview" tag, they wouldn't want to copy Bing, would they?