
Forum Moderators: Ocean10000 & incrediBILL & keyplyr


Google Web Preview hits show stock referrer

http://www.google.com/search

     
11:58 pm on Jan 6, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 5, 2005
posts: 2038
votes: 1


This just in: Google Web Preview (GWP) hits include a referrer -- sort of:

http://www.google.com/search
(no trailing slash)

On my largest site:

74.125.78.89
Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/534.51 (KHTML, like Gecko; Google Web Preview) Chrome/12.0.742 Safari/534.51

17:54:03 /dir/filename.html

robots.txt? NO (never)
REF? http://www.google.com/search (never before)

Yesterday, I saw it. Today, I tested it and every time I requested a SERP preview, the hit to the page always showed it. Image hits do not; they show site pages as referrers.

(Aside: Another GWP change occurred along the way; I forget when. In SERPs, G changed the little magnifying glass icon to a big rectangular hoverbutton showing only >>. As with the old icon, click on the >> and the Previews appear, if available. Click again and they're gone.)

Okay. Back to the new /search referrer:

- It doesn't include ANY parameters, but clicks-thru from Previews do.
- ALL GWP hits to .html files show it -- so are they really in real time?

Anyone else seeing "http://www.google.com/search" in their logs? If not, SERP your domain, >> a few Previews, and see if GWP shows up and shows it.
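
For anyone who would rather grep than eyeball, here is a minimal Python sketch of the check described above: flag hits whose UA carries the "Google Web Preview" token and whose referrer is the bare /search URL with no parameters. It assumes Apache's "combined" log format, and the function names and the sample line's date, status and size are made up for illustration; only the IP, path, referrer and UA come from the post.

```python
import re
from urllib.parse import urlsplit

# Assumption: Apache "combined" log format. Adjust the pattern for other layouts.
COMBINED = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<ua>[^"]*)"$'
)

def parse_line(line):
    """Return the log fields as a dict, or None if the line does not match."""
    match = COMBINED.match(line.strip())
    return match.groupdict() if match else None

def is_stock_search_referer(referer):
    """True for a Google /search referrer that carries no parameters at all."""
    parts = urlsplit(referer)
    return (
        parts.netloc == "www.google.com"
        and parts.path == "/search"   # no trailing slash
        and not parts.query
        and not parts.fragment
    )

def is_gwp_stock_hit(line):
    """True when a hit has the GWP token in its UA and the bare /search referrer."""
    hit = parse_line(line)
    if hit is None:
        return False
    return "Google Web Preview" in hit["ua"] and is_stock_search_referer(hit["referer"])

# Hypothetical log line: date, status and size are invented for the example.
sample = (
    '74.125.78.89 - - [06/Jan/2012:17:54:03 +0000] '
    '"GET /dir/filename.html HTTP/1.1" 200 12345 '
    '"http://www.google.com/search" '
    '"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/534.51 '
    '(KHTML, like Gecko; Google Web Preview) Chrome/12.0.742 Safari/534.51"'
)
print(is_gwp_stock_hit(sample))
```

A click-thru from a Preview would carry a query string in the referrer, so `is_stock_search_referer` rejects it; that is the distinction drawn in the first bullet above.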
12:54 am on Jan 7, 2012 (gmt 0)

Moderator This Forum from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:10108
votes: 550


GWP did an actual (small) crawl last week. Took 50+ HTML pages, no images/scripts/css/etc.

Sorry, don't remember if it used that referrer and I stopped saving old logs so I can't look.

Also, I *think* G stopped using the magnifying icon for the previews when they moved that icon up to the search box. Seeing that, I did it on my site too :)
2:54 am on Feb 1, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 5, 2005
posts: 2038
votes: 1


Another tweak in the format of GWP's now-standard fake referrer. Note the /m/ directory:

http://www.google.com/m/search

Note, too, the Android Mobile UA:

74.125.64.95 [projecthoneypot.org...]
Mozilla/5.0 (Linux; U; Android 2.3.4; generic) AppleWebKit/534.51 (KHTML, like Gecko; Google Web Preview) Version/4.0 Mobile Safari/534.51

Too bad we can't get paid for being Google's guinea pigs.
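
A rough Python sketch for telling the two GWP variants apart and checking that the referrer matches the variant. The pairing (desktop UA with /search, Android UA with /m/search) is only what has been reported in this thread so far, so treat it as an assumption; the function names are made up for illustration.

```python
# Referrers reported in this thread for each GWP variant (assumed pairing).
EXPECTED_REFERER = {
    "desktop": "http://www.google.com/search",
    "mobile": "http://www.google.com/m/search",
}

def classify_gwp(user_agent):
    """Return 'mobile', 'desktop', or None if the UA is not Google Web Preview."""
    if "Google Web Preview" not in user_agent:
        return None
    # Assumption: the Android token marks the mobile variant, as in the UA above.
    return "mobile" if "Android" in user_agent else "desktop"

def referer_matches_variant(user_agent, referer):
    """True when a GWP hit carries the stock referrer reported for its variant."""
    variant = classify_gwp(user_agent)
    return variant is not None and EXPECTED_REFERER[variant] == referer
```

A GWP hit for which `referer_matches_variant` returns False would be worth a closer look, since it breaks the pattern seen so far.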
3:16 am on Feb 1, 2012 (gmt 0)

Moderator This Forum from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:10108
votes: 550


Too bad we can't get paid for being Google's guinea pigs.

You're not getting paid?
10:06 am on Feb 1, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 5, 2005
posts: 2038
votes: 1


Pfffffft :)
9:31 pm on Feb 1, 2012 (gmt 0)

Senior Member from GB 

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:May 14, 2008
posts:3165
votes: 8


I've been seeing the "referer" for a few days now. I wonder if they are trying to lull us into accepting it if it's got a referer. :(
11:58 pm on Feb 1, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:May 24, 2002
posts:894
votes: 0


I saw three today, each with http://www.google.com/search as referrer and the full UA, but identifying themselves as:

ee-in-f155.1e100.net
ee-in-f159.1e100.net
ee-in-f144.1e100.net

At first I thought it was someone faking, but the IP numbers are genuine: 74.125.16.155 / 159 / 144

Another way of "make believe"?
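
One way to check "genuine" without doing it by hand is forward-confirmed reverse DNS: resolve the IP to a hostname, check the hostname's suffix, then resolve the hostname back and make sure the original IP is among the results. A minimal Python sketch follows. The suffix list is an assumption (only 1e100.net comes from this thread), and the lookup functions are passed in as parameters so the logic can be tried without touching the network.

```python
import socket

# Assumption: hostname suffixes treated as Google's. Only 1e100.net comes from
# this thread; verify the others against current documentation before relying on them.
GOOGLE_SUFFIXES = (".1e100.net", ".google.com", ".googlebot.com")

def _reverse(ip):
    """Reverse lookup: IP address -> primary hostname."""
    return socket.gethostbyaddr(ip)[0]

def _forward(host):
    """Forward lookup: hostname -> list of IPv4 addresses."""
    return socket.gethostbyname_ex(host)[2]

def is_google_host(ip, reverse=_reverse, forward=_forward):
    """Forward-confirmed reverse DNS check for a single IPv4 address."""
    try:
        host = reverse(ip)
    except OSError:  # covers socket.herror and socket.gaierror
        return False
    if not host.lower().endswith(GOOGLE_SUFFIXES):
        return False
    try:
        return ip in forward(host)
    except OSError:
        return False
```

Passing this check only says the request came from an address Google controls. It says nothing about which Google service sent it.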
12:41 am on Feb 2, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 5, 2005
posts: 2038
votes: 1


Google's been GWP'ing from bare IPs and lots of its (known) Hosts for some time now. [webmasterworld.com...] (And using 1e100.net for even longer.) [webmasterworld.com...]

Unfortunately, G runs things we might want through the same numbers it also runs junk through. [webmasterworld.com...]
2:47 am on Feb 2, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Aug 29, 2006
posts:1312
votes: 0


My logs suggest that the number of people actually using Google Web Preview is tiny.

And the number who care that it gets a 403 on my sites appears to be even tinier.

I don't have a PhD though.

...
3:06 am on Feb 2, 2012 (gmt 0)

Moderator This Forum from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:10108
votes: 550


My logs suggest that the number of people actually using Google Web Preview is tiny.

I have the opposite opinion.
9:23 am on Feb 2, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 5, 2005
posts: 2038
votes: 1


I have no proof from my logs that anyone's using it or not using it, either back when the referrer was faked blank for all hits or now that it's faked blank for graphics and faked /search for .html files.
11:25 am on Feb 2, 2012 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:14256
votes: 551


I needed to test something about a particular Preview of one of my own pages. The Preview came with a blurb saying something about showing a cached version. This kinda implies that there also exist non-cached Previews.

Possibly the Previewbot figured out that I was rewriting it from a 700K page to a 35K mockup which looks exactly the same. I deliberately searched for a unique phrase that occurs late in the (complete) file. Odd thing is, the logs show a filesize that's even smaller-- though still much bigger than any detour or error document. But it comes through as 200 so they must have got something.

And yes, the referer line is the "search" version. I tend to take that at face value. The stylesheet had a normal referer, same as when a human views the page.
9:28 pm on Feb 2, 2012 (gmt 0)

Senior Member from GB 

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:May 14, 2008
posts:3165
votes: 8


From what I recall, web preview only goes to a site if it does not have something it needs through a normal bot scan. If you block images, css, whatever in robots.txt then preview comes looking for the missing bits (and in my case gets a 403 or 405 - can't recall which off-hand).
12:05 am on Feb 3, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 5, 2005
posts: 2038
votes: 1


Erm, missing piece-gathering is not my experience, and I'm not sure GWP's actions are that clear-cut. (Would be nice tho'.)
3:21 am on Feb 3, 2012 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:14256
votes: 551


Hm. I thought right away of a page where I could test that, because half of its images are in a roboted-out directory. (People were using it mainly as hotlink fodder.) Got that same "cached" blurb-- but I can't imagine what was in the cache, since my logs, downloaded immediately afterward, show that the html, css and every single image on the page were downloaded fresh.

Further investigation tells me the regular googlebot last saw the page 5 days ago, recording a 304. It picked up the stylesheet a day earlier in with-referer mode (different page in same directory). Dates on the images range from 2004 to 2007-- this is not an actively maintained page-- but it picked up all of them for good measure, including the ones it has seen within the past month.

No piwik, so it didn't get a chance to have the door slammed in its face there.

Wonder what "cached" means?

Edit:
Apparently it means "If it wasn't cached before, it is now." I tried half an hour later with a different browser and different search phrase-- mildly gratifying to find myself at #2 for this one-- and the logs were silent. If I remember, I will try it again after 6 or 12 hours. If Preview works like Translate, they keep the cache sitting around for a few hours before tossing it.
4:38 pm on Feb 3, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 5, 2005
posts: 2038
votes: 1


[Googlebot] picked up the stylesheet a day earlier in with-referer mode

Was that a legit Googlebot? Because with almost all, if not all, of Google's bots, showing referrers is rare as hen's teeth (...or was, until GWP started up its fake REF repertoire).
8:34 pm on Feb 3, 2012 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:14256
votes: 551


There's a thread about googlebot and referers. You can get a series of googlebot hits from the identical IP, identical UA, and in the middle there's one or more giving a referer.

It's a fairly recent development, calculated to make people suspect they're up to no good ;) Maybe it's another way to test if humans and robots see the same page: normally a robot would arrive without referer for images and stylesheets, so it would be the easiest thing in the world to rewrite them to a different form.
11:23 pm on Feb 3, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 5, 2005
posts: 2038
votes: 1


I located your report of one session in Google SEO News and Discussion here [webmasterworld.com]. It gives me something to watch for, thanks, because Googlebot et al sending referrers is still rare as hen's teeth.
12:29 am on Feb 4, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2002
posts:18903
votes: 0


The Preview came with a blurb saying something about showing a cached version. This kinda implies that there also exist non-cached Previews.

What I don't get is that sometimes the Preview is quite stale, perhaps days or more old, with a much newer version of the same page viewable in the normal Google cache, while other times the Preview is very new but the cache copy is days or more old.
12:55 am on Feb 4, 2012 (gmt 0)

Moderator This Forum from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:10108
votes: 550


What I don't get is that sometimes the Preview is quite stale, perhaps days or more old, with a much newer version of the same page viewable in the normal Google cache, while other times the Preview is very new but the cache copy is days or more old.

A definite enigma.
2:57 am on Feb 4, 2012 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:14256
votes: 551


Oh, that reminded me. Preview request #3, using a third browser and different search string-- ###! If I'd thought of it I would have tried from the library which has a different IP and, of course, entirely different computers. Must still be cached, because there's nothing new in logs.

I did find an unrelated adding-insult-to-injury entry though. In fact, two of them back to back. Two different people inexplicably asked for Previews of a page that included hotlinks. Grr. Nothing to do about it except what I'm already doing, which is to show the garish No Hotlinks image. The offending site comes through as referer, with Google's IP, and Preview in the UA.

What I don't get is that sometimes the Preview is quite stale, perhaps days or more old, with a much newer version of the same page viewable in the normal Google cache, while other times the Preview is very new but the cache copy is days or more old.

I think that reinforces the idea that the Googlebot and the I Am Not A Robot don't talk to each other.
1:56 am on Feb 17, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2002
posts:18903
votes: 0


Web Preview is a bit flaky to say the least and the tools in WMT are quite unreliable.

In WMT the "Pre-render Desktop Search Instant Preview" image is completely broken for every page on a site I am currently looking at. There's a message below the images to say that there were many errors fetching resources from the site. However, looking in the server logs for those resources shows that Google requested all of them and all were served in full and with "200 OK" status. Requests came from multiple IPs to grab the page, images, scripts, etc in a very short time.

For those files tagged as "Fetch failure" all I can assume is that Google's rendering script didn't wait long enough for the fetch part of the process to finish fetching them all.
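
To put numbers on that kind of log check, here is a small Python sketch that summarizes one burst of Preview fetches: how many of each status code were served, how many distinct IPs took part, and how many seconds the burst spanned. It assumes the fetches carry the "Google Web Preview" UA token and that timestamps are in Apache's default form; the function name and the tuple layout are made up for illustration.

```python
from collections import Counter
from datetime import datetime

def summarize_preview_burst(hits):
    """Summarize GWP fetches: status counts, distinct IPs, and elapsed seconds.

    `hits` is an iterable of (ip, timestamp, status, user_agent) tuples, with
    timestamps in Apache's default form, e.g. "17/Feb/2012:01:40:02 +0000".
    """
    statuses = Counter()
    ips = set()
    times = []
    for ip, timestamp, status, user_agent in hits:
        # Assumption: the fetches identify themselves with the GWP token.
        if "Google Web Preview" not in user_agent:
            continue
        statuses[status] += 1
        ips.add(ip)
        times.append(datetime.strptime(timestamp, "%d/%b/%Y:%H:%M:%S %z"))
    span = (max(times) - min(times)).total_seconds() if times else 0.0
    return {"statuses": dict(statuses), "ips": len(ips), "span_seconds": span}
```

If every status is 200 and the whole burst fits in a few seconds, yet the tool still reports fetch failures, that would at least be consistent with a timeout on the rendering side rather than a problem on the server.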
 
