Private prefetch proxy in Chrome

These are the hits from fetch.tunnel.googlezip.net?

         

SumGuy

1:41 pm on Oct 16, 2024 (gmt 0)

5+ Year Member Top Contributors Of The Month



Based on the explanation on this page:

[developer.chrome.com...]

"Chrome will sometimes prefetch links on the Google Search results page, and other participating websites, before the user clicks on them. This feature relies on a CONNECT proxy which hides the user's IP address from the website that needs to be prefetched."

I've come across some comments expressing the belief that this was going to be phased out, but the examples I see include user-agent Chrome versions up to 129. These are requests for specific HTML files only, not the full set of files (page plus images, CSS and so on) I would see during actual human browsing. I usually see 1 or 2 examples of this per day.

Starting in early 2021 I began to see hits from 192.186.4.0/24 and 72.14.201.0/24. These come back as (IP).v4.fetch.tunnel.googlezip.net. I think there might be a few other /24 ranges but these two are examples I saw yesterday.
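A minimal Python sketch of how these hits could be flagged by reverse DNS, assuming the hostname suffix quoted above is stable. The function names and the sample hostname in the usage note are hypothetical, and a PTR record alone can be spoofed, so a careful check would also confirm that the hostname resolves back to the same IP.

```python
import socket
from typing import Optional

# Suffix taken from the hostnames quoted in this thread (assumed stable).
PREFETCH_SUFFIX = ".fetch.tunnel.googlezip.net"

def looks_like_prefetch_proxy(hostname: str) -> bool:
    """True if a reverse-DNS name ends with the googlezip tunnel suffix."""
    return hostname.rstrip(".").lower().endswith(PREFETCH_SUFFIX)

def reverse_lookup(ip: str) -> Optional[str]:
    """Return the PTR hostname for ip, or None if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:  # covers socket.herror and socket.gaierror
        return None
```

Usage would be something like `looks_like_prefetch_proxy(reverse_lookup(ip) or "")` for each IP pulled from the log; a made-up name such as `example.v4.fetch.tunnel.googlezip.net` matches, while `fetch.tunnel.googlezip.net.evil.example` does not.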

Here's the thing - I don't see any associated actual human browsing happening in conjunction with these hits at the time. Has anyone seen this?

Question 1)

When I see a solitary hit from a googlezip IP, but no actual web browsing (presumably from the actual user IP) shortly after the googlezip hit, should I assume that the user performed a google search, my page showed up in the search results, but the user did not click on the link to my page?

Or would google actually serve the file (and associated graphics or accessory files to render the page) from its cache if the user clicked on my page from the search results?

Question 2)

These hits from googlezip.net - do they preserve the (human) user's User-Agent and other request-header fields (e.g. browser language)?

Question 3)

This Chrome prefetch function - does the user have to activate this or install an add-on for this to happen, or is this the default behavior for all Chrome browsers (since early 2021)?

not2easy

3:49 pm on Oct 16, 2024 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



It would be a lot easier to find examples if you had included the UA here. I'm not wading through one line at a time to find them but I know I have seen such prefetched instances in my logs.

I can't tell you what interaction is or isn't required by the user for Google to use the Chrome prefetch function, but I can recall seeing them.

BTW - Whois shows that the 72.14. range is larger:
72.14.192.0 - 72.14.255.255
72.14.192.0/18
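For anyone who wants to check this locally, here is a short sketch using Python's standard `ipaddress` module. It only confirms that the /24 seen in the logs sits inside the larger whois allocation; the variable names are just for illustration.

```python
import ipaddress

whois_block = ipaddress.ip_network("72.14.192.0/18")  # range reported by whois
seen_block = ipaddress.ip_network("72.14.201.0/24")   # range seen in the logs

# Is the /24 from the logs contained in the whois allocation?
print(seen_block.subnet_of(whois_block))  # True

# First and last address of the /18, matching the whois range above.
print(whois_block[0], whois_block[-1])    # 72.14.192.0 72.14.255.255
```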

SumGuy

2:44 am on Oct 17, 2024 (gmt 0)

5+ Year Member Top Contributors Of The Month



99% of the time the UA is one of these (chrome versions will vary):

Mozilla/5.0 (Linux; Android 10; K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.0.0 Mobile Safari/537.36

Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36

Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36

The referrer is always Google; about 5% of the time it's a country-specific Google domain rather than google.com.

The browser language is maybe 75% en-US, other times it's various (en-GB, es-ES, fr-FR, ko-KR, a few others).

Sometimes the page being requested is my landing page, most times it's an interior page. Robots.txt is never requested.

lucy24

6:27 am on Oct 17, 2024 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



"Robots.txt is never requested."

Well, G does have that longstanding notion that if there is any human involvement in the page request--no matter how tangential--then robots.txt isn't required.

SumGuy

11:53 pm on Oct 17, 2024 (gmt 0)

5+ Year Member Top Contributors Of The Month



I believe google hits from 66.102.6.0/24 play a role with these googlezip hits.

I am seeing requests for /.well-known/traffic-advice from that CIDR. These requests started in May 2023 but were very rare until August 28 this year, when they became several times per week, sometimes per day.

A request for /.well-known/traffic-advice seems to immediately precede a hit from the googlezip CIDRs. They have this user-agent:

Chrome Privacy Preserving Prefetch Proxy

And pretty much nothing else in the request headers and no referrer. This 66.102.6 CIDR returns hostnames along the lines of google-proxy-(ip-address).google.com

I do not have /.well-known/traffic-advice so google gets 404 from me. That file is supposed to contain information about whether visiting agents can or should (or should not?) prefetch resources. I don't know what the default behavior is if the file does not exist.
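As I understand the Chrome documentation linked earlier in the thread, the file is a small JSON array of per-agent rules served with its own MIME type. The sketch below builds such a file in Python; the field names (`user_agent`, `disallow`, `fraction`), the `prefetch-proxy` agent string and the MIME type are written from memory of that page and should be treated as assumptions to verify there, and `build_traffic_advice` is a hypothetical helper name.

```python
import json

# Assumed MIME type for the response; verify against the Chrome docs.
TRAFFIC_ADVICE_MIME = "application/trafficadvice+json"

def build_traffic_advice(disallow: bool, fraction: float = 1.0) -> str:
    """Return the JSON body for /.well-known/traffic-advice (assumed format)."""
    rule = {"user_agent": "prefetch-proxy"}  # assumed agent name for the proxy
    if disallow:
        rule["disallow"] = True       # ask the proxy not to prefetch at all
    else:
        rule["fraction"] = fraction   # share of prefetch traffic to accept
    return json.dumps([rule], indent=2)

print(build_traffic_advice(disallow=True))
```

The printed text would be saved as the file body, and the server would need to be configured to send it with the MIME type above.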

This seems to be google's answer to Apple's private proxy (iCloud Private Relay) service, except that google knows the site or resource you are requesting, whereas Apple anonymizes your request through a third party proxy.

It remains unclear to me whether these instances represent organic human visits to my site that I am not able to log because the content is being served from google's cache and not my server. I would think that I'm not alone in that concern.

vivalasvegas

3:22 pm on Nov 18, 2024 (gmt 0)

10+ Year Member



Hello Sum,

I've been researching this too since noticing numerous hits from the two Google IPs you mentioned earlier in the thread.

Do this simple test: search Google for one of your most highly ranked phrases using the Chrome browser (your URL has to be on the first page) while watching your live log entries with the Linux "tail" command (sudo tail -f /path-to/your-log-file). This assumes you are on a Linux server and have root access. You'll see that simply appearing on the first page of the SERPs will log a visit from one of these Google IPs in your log file. Actually clicking the link will log a visit from your own IP.

My conclusion: hits from these IPs are merely prefetching of your pages, whatever that means, but not real traffic for sure, while clicking the link will open the live page.
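To separate these hits from real visits when going through a saved log instead of watching it live, a rough Python sketch follows. It assumes the client IP is the first whitespace-separated field on each line (as in the common and combined log formats) and uses only the ranges reported in this thread, which are probably not the complete set. The names `classify`, `PREFETCH_NETS` and `ADVICE_NETS` are hypothetical.

```python
import ipaddress

# Ranges reported in this thread; Google may well use others.
PREFETCH_NETS = [ipaddress.ip_network(n)
                 for n in ("192.186.4.0/24", "72.14.201.0/24")]
ADVICE_NETS = [ipaddress.ip_network("66.102.6.0/24")]

def classify(log_line: str) -> str:
    """Label a log line by its first field, assumed to be the client IP."""
    fields = log_line.split(maxsplit=1)
    if not fields:
        return "unparsed"
    try:
        ip = ipaddress.ip_address(fields[0])
    except ValueError:
        return "unparsed"
    if any(ip in net for net in PREFETCH_NETS):
        return "prefetch"
    if any(ip in net for net in ADVICE_NETS):
        return "traffic-advice"
    return "other"
```

Counting the labels over a day's log would show how many requests are prefetches and how many come from other addresses, including the click-through visits described above.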