
Forum Moderators: Ocean10000 & incrediBILL


Google Web Preview hits show stock referrer

http://www.google.com/search

     
11:58 pm on Jan 6, 2012 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



This just in: Google Web Preview (GWP) hits include a referrer -- sort of:

http://www.google.com/search
(no trailing slash)

On my largest site:

74.125.78.89
Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/534.51 (KHTML, like Gecko; Google Web Preview) Chrome/12.0.742 Safari/534.51

17:54:03 /dir/filename.html

robots.txt? NO (never)
REF? http://www.google.com/search (never before)

Yesterday, I saw it for the first time. Today, I tested it: every time I requested a SERP preview, the hit to the page showed it. Image hits do not; they show site pages as referrers.

(Aside: Another GWP change occurred along the way; I forget when. In SERPs, G changed the little magnifying glass icon to a big rectangular hoverbutton showing only >>. As with the old icon, click on the >> and the Previews appear, if available. Click again and they're gone.)

Okay. Back to the new /search referrer:

- It doesn't include ANY parameters, but clicks-thru from Previews do.
- ALL GWP hits to .html files show it -- so are they really in real time?

Anyone else seeing "http://www.google.com/search" in their logs? If not, SERP your domain, >> a few Previews, and see if GWP shows up and shows it.
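
For anyone who wants to check their own logs, here is a rough sketch in Python. It assumes the Apache/nginx "combined" log format, which is my assumption and not necessarily what anyone here uses, and the names LOG_PATTERN and find_gwp_hits are made up for illustration. It pulls out hits whose UA contains "Google Web Preview" and flags whether the referrer is exactly the stock one.

```python
import re

# Assumption: Apache/nginx "combined" log format. Adjust the pattern
# if your server writes a different layout.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<ua>[^"]*)"'
)

GWP_TOKEN = "Google Web Preview"
STOCK_REFERER = "http://www.google.com/search"


def find_gwp_hits(lines):
    """Return (ip, path, referer, has_stock_referer) for each GWP hit."""
    hits = []
    for line in lines:
        match = LOG_PATTERN.match(line)
        # Skip lines that do not parse or that are not Web Preview hits.
        if not match or GWP_TOKEN not in match.group("ua"):
            continue
        referer = match.group("referer")
        hits.append(
            (match.group("ip"), match.group("path"), referer,
             referer == STOCK_REFERER)
        )
    return hits
```

Feed it the lines of an access log, e.g. find_gwp_hits(open("access.log")), where the file name is only a placeholder. Hits with a False in the last position are GWP requests carrying some other referrer, such as the image hits mentioned above.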
12:54 am on Jan 7, 2012 (gmt 0)

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



GWP did an actual (small) crawl last week. Took 50+ HTML pages, no images/scripts/css/etc.

Sorry, don't remember if it used that referrer and I stopped saving old logs so I can't look.

Also, I *think* G stopped using the magnifying icon for the previews when they moved that icon up to the search box. Seeing that, I did it on my site too :)
2:54 am on Feb 1, 2012 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



Another tweak in the format of GWP's now-standard fake referrer. Note the /m/ directory:

http://www.google.com/m/search

Note, too, the Android Mobile UA:

74.125.64.95 [projecthoneypot.org...]
Mozilla/5.0 (Linux; U; Android 2.3.4; generic) AppleWebKit/534.51 (KHTML, like Gecko; Google Web Preview) Version/4.0 Mobile Safari/534.51
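
To keep the two variants apart when tallying, something like the following might do. It is only a sketch: classify_gwp is a name I made up, and the rules are based solely on the two UA strings and two referrers reported in this thread, so other variants may well exist.

```python
DESKTOP_REFERER = "http://www.google.com/search"
MOBILE_REFERER = "http://www.google.com/m/search"


def classify_gwp(user_agent, referer):
    """Return (variant, referer_matches) for a hit.

    variant is "mobile", "desktop", or None when the UA is not
    Google Web Preview. referer_matches says whether the referrer is
    the stock value seen in this thread for that variant.
    """
    if "Google Web Preview" not in user_agent:
        return None, False
    # Assumption: the mobile variant always announces Android or
    # Mobile Safari, as in the one sample posted here.
    if "Android" in user_agent or "Mobile Safari" in user_agent:
        return "mobile", referer == MOBILE_REFERER
    return "desktop", referer == DESKTOP_REFERER
```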

Too bad we can't get paid for being Google's guinea pigs.
3:16 am on Feb 1, 2012 (gmt 0)

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Too bad we can't get paid for being Google's guinea pigs.

You're not getting paid?
10:06 am on Feb 1, 2012 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



Pfffffft :)
9:31 pm on Feb 1, 2012 (gmt 0)

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member



I've been seeing the "referer" for a few days now. I wonder if they are trying to lull us into accepting it if it's got a referer. :(
11:58 pm on Feb 1, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I saw three today, each with http://www.google.com/search as referrer and the full UA, but identifying themselves as:

ee-in-f155.1e100.net
ee-in-f159.1e100.net
ee-in-f144.1e100.net

At first I thought it was someone faking, but the IP numbers are genuine: 74.125.16.155 / 159 / 144

Another way of "make believe"?
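
For what it's worth, the usual way to tell a genuine Google IP from a faker is a forward-confirmed reverse DNS lookup: resolve the IP to a hostname, check the domain, then resolve that hostname back and see whether the original IP comes out. A minimal Python sketch follows. The function name and the list of trusted suffixes are my own assumptions; 1e100.net is included only because it turns up in this thread.

```python
import socket

# Assumption: which suffixes to trust is a judgment call. 1e100.net is
# the domain seen in this thread; googlebot.com and google.com are the
# ones Google documents for its crawlers.
TRUSTED_SUFFIXES = (".1e100.net", ".googlebot.com", ".google.com")


def is_genuine_google_ip(ip, reverse=socket.gethostbyaddr,
                         forward=socket.gethostbyname_ex):
    """Forward-confirmed reverse DNS check.

    The reverse and forward arguments default to the real resolvers and
    are only parameters so the logic can be tried without a network.
    """
    try:
        hostname = reverse(ip)[0]
    except OSError:
        return False  # no PTR record, or the lookup failed
    if not hostname.lower().endswith(TRUSTED_SUFFIXES):
        return False
    try:
        addresses = forward(hostname)[2]
    except OSError:
        return False
    # The hostname must resolve back to the IP we started from.
    return ip in addresses
```

The second lookup matters because anyone who controls the reverse zone for their own IP can make it claim any hostname they like, but they cannot make Google's forward zone point back at them.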
12:41 am on Feb 2, 2012 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



Google's been GWP'ing from bare IPs and lots of its (known) Hosts for some time now. [webmasterworld.com...] (And using 1e100.net for even longer.) [webmasterworld.com...]

Unfortunately, G runs things we might want through the same numbers it also runs junk through. [webmasterworld.com...]
2:47 am on Feb 2, 2012 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



My logs suggest that the number of people actually using Google Web Preview is tiny.

And the number who care that it gets a 403 on my sites appears to be even tinier.

I don't have a PhD though.

...
3:06 am on Feb 2, 2012 (gmt 0)

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



My logs suggest that the number of people actually using Google Web Preview is tiny.

I have the opposite opinion.
9:23 am on Feb 2, 2012 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



My logs give me no proof either way that anyone's using it: not back when the referrer was faked blank for all hits, and not now that it's faked blank for graphics and faked /search for .html files.
11:25 am on Feb 2, 2012 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



I needed to test something about a particular Preview of one of my own pages. The Preview came with a blurb saying something about showing a cached version. This kinda implies that there also exist non-cached Previews.

Possibly the Previewbot figured out that I was rewriting it from a 700K page to a 35K mockup which looks exactly the same. I deliberately searched for a unique phrase that occurs late in the (complete) file. Odd thing is, the logs show a filesize that's even smaller-- though still much bigger than any detour or error document. But it comes through as 200 so they must have got something.

And yes, the referer line is the "search" version. I tend to take that at face value. The stylesheet had a normal referer, same as when a human views the page.
9:28 pm on Feb 2, 2012 (gmt 0)

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member



From what I recall, Web Preview only visits a site if it is missing something it needs from a normal bot scan. If you block images, CSS, or whatever in robots.txt, then Preview comes looking for the missing bits (and in my case gets a 403 or 405 - can't recall which offhand).
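
If that recollection is right, the files Preview would have to fetch for itself are the ones robots.txt hides from the regular bot. A quick way to list them, as a sketch only: blocked_resources is a made-up name, the example.com URLs are placeholders, and it uses Python's standard urllib.robotparser rather than anything Google-specific, so its reading of a robots.txt may differ from Google's own.

```python
from urllib.robotparser import RobotFileParser


def blocked_resources(robots_txt, urls, agent="Googlebot"):
    """Return the URLs that robots_txt disallows for the given agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(agent, url)]


# Hypothetical example: images are roboted out, the page itself is not.
rules = "User-agent: *\nDisallow: /images/\n"
page_assets = [
    "http://example.com/page.html",
    "http://example.com/images/a.png",
]
missing_for_bot = blocked_resources(rules, page_assets)
```

Comparing that list against the files GWP actually requests in the logs would show whether it only collects the missing pieces or takes everything.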
12:05 am on Feb 3, 2012 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



Erm, missing piece-gathering is not my experience, and I'm not sure GWP's actions are that clear-cut. (Would be nice tho'.)
3:21 am on Feb 3, 2012 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



Hm. I thought right away of a page where I could test that, because half of its images are in a roboted-out directory. (People were using it mainly as hotlink fodder.) Got that same "cached" blurb-- but I can't imagine what was in the cache, since my logs, downloaded immediately afterward, show that the html, css and every single image on the page were downloaded fresh.

Further investigation tells me the regular googlebot last saw the page 5 days ago, recording a 304. It picked up the stylesheet a day earlier in with-referer mode (different page in same directory). Dates on the images range from 2004 to 2007-- this is not an actively maintained page-- but it picked up all of them for good measure, including the ones it has seen within the past month.

No piwik, so it didn't get a chance to have the door slammed in its face there.

Wonder what "cached" means?

Edit:
Apparently it means "If it wasn't cached before, it is now." I tried half an hour later with a different browser and different search phrase-- mildly gratifying to find myself at #2 for this one-- and the logs were silent. If I remember, I will try it again after 6 or 12 hours. If Preview works like Translate, they keep the cache sitting around for a few hours before tossing it.
4:38 pm on Feb 3, 2012 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



[Googlebot] picked up the stylesheet a day earlier in with-referer mode

Was that a legit Googlebot? Because with almost all, if not all, of Google's bots, showing referrers is rare as hen's teeth (...or was, until GWP started up its fake REF repertoire).
8:34 pm on Feb 3, 2012 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



There's a thread about googlebot and referers. You can get a series of googlebot hits from the identical IP, identical UA, and in the middle there's one or more giving a referer.

It's a fairly recent development, calculated to make people suspect they're up to no good ;) Maybe it's another way to test if humans and robots see the same page: normally a robot would arrive without referer for images and stylesheets, so it would be the easiest thing in the world to rewrite them to a different form.
11:23 pm on Feb 3, 2012 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



I located your report of one session in Google SEO News and Discussion here [webmasterworld.com]. It gives me something to watch for, thanks, because Googlebot et al. sending referrers is still rare as hen's teeth.
12:29 am on Feb 4, 2012 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



The Preview came with a blurb saying something about showing a cached version. This kinda implies that there also exist non-cached Previews.

What I don't get is that sometimes the Preview is quite stale, perhaps days or more old, with a much newer version of the same page viewable in the normal Google cache, while other times the Preview is very new but the cache copy is days or more old.
12:55 am on Feb 4, 2012 (gmt 0)

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



What I don't get is that sometimes the Preview is quite stale, perhaps days or more old, with a much newer version of the same page viewable in the normal Google cache, while other times the Preview is very new but the cache copy is days or more old.

A definite enigma.
2:57 am on Feb 4, 2012 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



Oh, that reminded me. Preview request #3, using a third browser and different search string-- ###! If I'd thought of it I would have tried from the library which has a different IP and, of course, entirely different computers. Must still be cached, because there's nothing new in logs.

I did find an unrelated adding-insult-to-injury entry though. In fact, two of them back to back. Two different people inexplicably asked for Previews of a page that included hotlinks. Grr. Nothing to do about it except what I'm already doing, which is to show the garish No Hotlinks image. The offending site comes through as referer, with Google's IP, and Preview in the UA.

What I don't get is that sometimes the Preview is quite stale, perhaps days or more old, with a much newer version of the same page viewable in the normal Google cache, while other times the Preview is very new but the cache copy is days or more old.

I think that reinforces the idea that the Googlebot and the I Am Not A Robot don't talk to each other.
1:56 am on Feb 17, 2012 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Web Preview is a bit flaky to say the least and the tools in WMT are quite unreliable.

In WMT the "Pre-render Desktop Search Instant Preview" image is completely broken for every page on a site I am currently looking at. There's a message below the images to say that there were many errors fetching resources from the site. However, looking in the server logs for those resources shows that Google requested all of them and all were served in full and with "200 OK" status. Requests came from multiple IPs to grab the page, images, scripts, etc. in a very short time.

For those files tagged as "Fetch failure" all I can assume is that Google's rendering script didn't wait long enough for the fetch part of the process to finish fetching them all.
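
To put numbers on that kind of check, a small Python sketch along these lines could summarise one burst of requests. The function name and the tuple layout are hypothetical, and the timestamp format is assumed to be the one Apache writes by default. It reports how many distinct IPs took part, which paths did not get a 200, and how many seconds the whole burst lasted.

```python
from datetime import datetime

# Assumption: timestamps look like Apache's default,
# e.g. "17/Feb/2012:01:00:00 +0000".
TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def summarize_fetches(entries):
    """Summarise (ip, timestamp, path, status) tuples from one burst."""
    times = []
    ips = set()
    failures = []
    for ip, timestamp, path, status in entries:
        times.append(datetime.strptime(timestamp, TIME_FORMAT))
        ips.add(ip)
        if status != 200:
            failures.append((path, status))
    span = (max(times) - min(times)).total_seconds() if times else 0.0
    return {"ips": sorted(ips), "failures": failures, "span_seconds": span}
```

An empty failures list alongside "Fetch failure" tags in WMT would point at the rendering side rather than the server. It would not prove the timeout theory, only rule out the server as the culprit.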
 
