Forum Moderators: Robert Charlton & goodroi
Mozilla browsers support a feature called link prefetching, which allows a web page to tell the browser to prefetch a url if it is idle. Google has been using this technique in their search results, telling Mozilla to start loading the first result. I also noticed that MXNA 2.0 is including 3 prefetch tags.
How do you tell the browser to prefetch a url?
By using the following code:
<link rel="prefetch" href="http://url.to.prefetch/" />
How can I detect prefetching on my web site?
When Mozilla does a prefetch, it sends the header X-moz: prefetch with the request. You can detect that header on the server and block based on it. Google recommends sending a 404 back to block the prefetch.
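To see how often this happens before deciding to block anything, the flagged requests can be written to their own log. This is a minimal sketch, assuming Apache with mod_setenvif and mod_log_config loaded and a "combined" log format already defined. The variable name is_prefetch and the log path are placeholders, not anything Mozilla or Google prescribe.

```apache
# Flag requests whose X-moz header contains "prefetch" (case-insensitive).
# is_prefetch is a hypothetical variable name chosen for this sketch.
SetEnvIfNoCase X-moz prefetch is_prefetch

# Write only the flagged requests to a separate log file.
# Assumes the "combined" LogFormat nickname is defined elsewhere.
CustomLog logs/prefetch_log combined env=is_prefetch
```

Note that CustomLog is only allowed in the server or virtual host configuration, not in .htaccess.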
How can I block prefetching?
Using mod_rewrite to send a 404:
RewriteEngine On
RewriteCond %{HTTP:X-moz} ^prefetch
RewriteRule .* /prefetch-attempt [L]
This rewrites every prefetch attempt to /prefetch-attempt. As long as that file doesn't exist, the client will get a 404.
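If you would rather not rely on a path that must never exist, mod_rewrite can also return the status code directly. This is a minimal sketch, assuming a reasonably recent Apache: the 2.4 documentation describes a non-3xx code in the R flag as dropping the substitution and stopping rewriting.

```apache
RewriteEngine On
# Match requests that carry the Mozilla prefetch header.
RewriteCond %{HTTP:X-moz} ^prefetch
# "-" means no substitution; R=404 makes Apache answer 404 Not Found.
RewriteRule ^ - [R=404,L]
```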
You could also block with a 403 Forbidden response:
RewriteEngine On
RewriteCond %{HTTP:X-moz} ^prefetch
RewriteRule .* - [F]
17. How can I opt my website out of Chrome Instant URL loading?
If a Google Chrome user has enabled the "Chrome Instant" feature, most webpages will load as soon as the URL has been typed into the address bar, before the user hits Enter.
If you are a website administrator, you can prevent Google Chrome from exhibiting this behavior on your website:
- When Google Chrome makes the request to your website server, it will send the following header:
X-Purpose: instant
- Detect this, and return an HTTP 403 ("Forbidden") status code.
- When Google Chrome receives this status code, it will add your website to a blacklist maintained on the client. This blacklist will last the duration of that user’s browsing session.
Angonasec wrote:
Could the Apache Bods kindly confirm that:
RewriteCond %{HTTP:X-moz} ^prefetch
RewriteRule .* - [F,L]
Will therefore block this behaviour in Chrome... if so we will continue to use it without qualms.
RewriteCond %{HTTP:X-Purpose} ^instant$
RewriteRule .* - [F]

RewriteCond %{HTTP:X-Moz} ^prefetch$ [OR]
RewriteCond %{HTTP:X-Purpose} ^instant$
RewriteRule .* - [F]
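To check which requests actually carry either header, the headers themselves can be added to the access log. This is a rough sketch, assuming mod_log_config. The format name prefetch_debug and the log path are made up for illustration.

```apache
# Hypothetical format "prefetch_debug": the usual combined fields plus
# the contents of the X-moz and X-Purpose request headers.
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" \"%{X-moz}i\" \"%{X-Purpose}i\"" prefetch_debug

# Server or virtual host context only; the log path is a placeholder.
CustomLog logs/prefetch_debug_log prefetch_debug
```

A request that sends neither header shows "-" in the last two fields.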
Does anyone know if the G Toolbar, in this case GTB6.6 on MSIE 7.0, also prefetches? ... out of the blue the home page is called without a click from the visitor
nnn.nn.nn.nnn - - [05/Aug/2011:21:04:26 -0700] "GET /paintings/myrats/dreams.html HTTP/1.1" 200 1071 "http://www.example.com/paintings/myrats.html" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.107 Safari/535.1"
nnn.nn.nn.nnn - - [05/Aug/2011:21:04:27 -0700] "GET /paintings/myrats/blowups/largedreams.jpg HTTP/1.1" 200 51821 "http://www.example.com/paintings/myrats/dreams.html" "{Chrome}"
nnn.nn.nn.nnn - - [05/Aug/2011:21:05:41 -0700] "GET /paintings/myrats/blowups/largecatfight.jpg HTTP/1.1" 200 2770 "http://webcache.googleusercontent.com/search?q=cache:http://www.example.com/paintings/myrats/catfight.html" "{Chrome}"
nnn.nn.nn.nnn - - [05/Aug/2011:21:05:57 -0700] "GET /paintings/myrats/catfight.html HTTP/1.1" 200 1098 "http://www.example.com/paintings/myrats.html" "{Chrome}"
nnn.nn.nn.nnn - - [05/Aug/2011:21:06:14 -0700] "GET /paintings/myrats/mychip.html HTTP/1.1" 200 957 "http://www.example.com/paintings/myrats.html" "{Chrome}"
nnn.nn.nn.nnn - - [05/Aug/2011:21:06:15 -0700] "GET /paintings/myrats/blowups/largemychip.jpg HTTP/1.1" 200 53467 "http://www.example.com/paintings/myrats/mychip.html" "{Chrome}"
dstiles wrote:
Where does it say in there that it's the prefetch only that is blacklisted? Does anyone have proof that I'm reading this wrongly?
Paranoid I may be, but I'm sure I'm not deranged. :)
Another question: who actually prefetches and caches the page - the searcher or google?
If the searcher, then Google is in for some heavy criticism, especially if they prefetch for on-the-fly typing. I can see how very easy it would be to "accidentally" cache some undesirable - even trojan - pages.
Zk178 wrote:
It seems that the X-Purpose header is sent for "Google Instant" requests, not for "Google Instant Pages" requests. There is even an unresolved bug on this submitted in the Chromium project - [webmasterworld.com...]
OK, now I'm confused. What is it about Google Instant that needs to make a request to a website? I was under the impression that it was just an "instant search results" feature.
But I think you're right; that information I linked to above is for Google Instant, not Google Instant Pages (curse Google and their naming nonsense).
From the issue you linked to...
At this point we're still gathering developer feedback about prerendering (including this need) and haven't made a decision.
That was nearly 2 months ago now. They must have decided on something if they've made the feature available and on-by-default in a stable release.
--
Ryan
If your site includes a third-party script for analytics or advertising, in many cases you shouldn't have to make any modifications to your site—the third party will simply modify the script they provide slightly to make use of the Page Visibility API. You should contact the third party directly to see if their scripts are prerender-aware.
Situations in which prerendering is aborted
In some cases while prerendering a site Chrome may run into a situation that could potentially lead to user-visible behavior that is incorrect. In those cases, the prerender will be silently aborted. Some of these cases include:
Note: This is not an exhaustive list. Last updated 6/13/11.
- The URL initiates a download
- HTMLAudio or Video in the page
- POST, PUT, and DELETE XMLHTTPRequests
- HTTP Authentication
- HTTPS pages
- Pages that trigger the malware warning
- Popup/window creation
- Detection of high resource utilization
Plugins such as Flash will have their initialization deferred until the user actually visits the prerendered page.