Forum Moderators: DixonJones
-gps
Okay, I think these will work - I haven't tested them, but I did look things up in the Apache specs before writing them, so they should be all right. Sort of. I hope. (It depends on your server settings of course, e.g. not everyone has "SetEnvIf" enabled.) ;)
This one should ban them:
---------------------------------
SetEnvIf X-moz prefetch HAVE_X-moz
Deny from env=HAVE_X-moz
---------------------------------
...and this one should 404 them:
---------------------------------
SetEnvIf X-moz prefetch HAVE_X-moz
RewriteCond %{ENV:HAVE_X-moz} !^$
RewriteRule .* some-filename-that-does-not-exist.htm [L]
---------------------------------
..or, with another syntax:
---------------------------------
SetEnvIf X-moz prefetch HAVE_X-moz="prefetch"
RewriteCond %{ENV:HAVE_X-moz} prefetch
RewriteRule .* some-filename-that-does-not-exist.htm [L]
---------------------------------
In some server configs you don't need the "ENV:" part, but the Apache docs say you can't do without it... it's a strange world.
...add the [OR] flag as needed.
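For what it's worth, on reasonably recent Apache 2.x mod_rewrite you can also test the request header directly with %{HTTP:...} and skip the SetEnvIf step entirely - again untested, so treat it as a sketch:
---------------------------------
# match the X-moz request header directly (no SetEnvIf needed)
RewriteCond %{HTTP:X-moz} =prefetch
# "-" means no substitution; R=404 returns a real 404 response
RewriteRule .* - [R=404,L]
---------------------------------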
References:
1) [httpd.apache.org...]
2) [httpd.apache.org...]
3) [httpd.apache.org...]
...and this one should append "_PREFETCH" to the user-agent (via an environment variable, so it can show up in your logs):
---------------------------------
SetEnvIf X-moz prefetch HAVE_X-moz="prefetch"
RewriteCond %{ENV:HAVE_X-moz} prefetch
RewriteCond %{HTTP_USER_AGENT} ^(.*)$
RewriteRule (.*) [example.com...] [E=HTTP_USER_AGENT:%1_PREFETCH,L]
---------------------------------
...and this one should set a cookie with name "visit" and value "prefetch", valid for three minutes:
---------------------------------
SetEnvIf X-moz prefetch HAVE_X-moz="prefetch"
RewriteCond %{ENV:HAVE_X-moz} prefetch
RewriteRule (.*) [example.com...] [CO=visit:prefetch:.example.com:3,L]
---------------------------------
- feel free to combine all the above ones.
---
Ref: [httpd.apache.org...]
If a user clicks a link to a prefetched document while the prefetch is still in progress, the document will be requested again. That screws your logs even more.
I keep checking to make sure I'm in the "tracking and logging" forum, as I can't believe some of these posts.
If your current log analysis cannot interpret double-fetches reasonably, then you were screwed a long time ago. What do you think happens when someone gets tired of waiting for your 125KB home page to download, hits the stop button, and then refreshes?
If you RTFA, you will see that the additional double-fetches induced by this Google tweak are unlikely to match the number of normal double-fetches you get (especially since they would be caused by exactly the same phenomenon -- your page doesn't load fast enough).
I'll wager that 95% of all WebmasterWorld websites will not experience this phenomenon a single time in the next year. I'll further wager that the Internet traffic spent on discussing the issue will exceed that actually caused by the issue itself.
Of course, in the Mozilla Link Prefetching FAQ [mozilla.org] that was mentioned in an earlier post they state that they won't prefetch links that have a query string, which will cut down on some of the pages prefetched. Also, I don't think many prefetch links are being put in by Google at this time.
Personally, I don't see the advantage in adding the prefetch links - it will either be on such a small percentage that it won't make any difference, or if it becomes more prevalent then more websites will start blocking it.
As it's a PHP page, to see the prefetch in the logs I just added:
<?php
// set additional logging info: record the X-moz request header if present
if (isset($_SERVER['HTTP_X_MOZ'])) {
    $additionalloginfo = 'X-moz:' . $_SERVER['HTTP_X_MOZ'];
    apache_note('p1', $additionalloginfo);
}
?>
LogFormat "%.......... \"%{p1}n\"" combined
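If you'd rather not involve PHP at all, Apache's mod_log_config can log a request header directly with the %{...}i directive - a minimal sketch, with the format name being my own choice:
---------------------------------
# log the raw X-moz request header; "-" appears when it's absent
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" \"%{X-moz}i\"" combined_xmoz
CustomLog logs/access_log combined_xmoz
---------------------------------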
Regards,
R.
Looking at the HTTP headers my server sends, there is no cache-relevant info inside that could force the user's browser to fetch a fresh copy of the page. So I am baffled by this behavior, which leads to the conclusion:
my server gets more traffic, but the visitor gets no benefit, because he apparently reloads a fresh page without using the prefetch cache.
Please correct me if you see other things in your logs.
Regards,
R.
The traffic distortion is bad enough, but if the page at the end of an Adword link is prefetched, does that result in Google revenue inflation at the advertisers' expense? Google could I guess argue that their involvement is delivering the search results, but I'd like to know that our Adword click throughs are only from user choice.
From a webmaster viewpoint I use a good log analyzer so I don't care what's in my stats. Not only that but I will probably create a custom filter to detect firefox prefetch requests because it will tell me which pages I have on Google that are not only the first listing, but a page and a search listing so good that Google's algo believes the Get request is practically automatic. Do you think I am going to mess with that page on my site?
Server loads? Maybe that's something to think about on a huge, popular site that's always on top of the search listings - but if that is the case, you probably have a server capable of handling the load with ease.
Download speed on cable modems? I don't care if they are using a light pipe I want the page in their face as fast as possible before they get up and go to the bathroom.
Cookies? Personally, I have a bunch of tools swatting cookies all the time. They are like bugs in the house - no matter how many times you have the Orkin man come over you still find bugs in the corner of the bathroom.
Privacy on the web is almost a joke anyway. Smart users who don't want the pre-fetch feature will turn it off. The people who don't know how to do anything more than browse the web will be no worse off for the wear and tear.
If I could hit a button and nuke something, it would be the Firefox favicon.ico downloads. Favicons used to be useful to me for grading PPC campaigns or page popularity. Now (unless you want to spend the time needed to create the filters to learn which favicon requests were probably "bookmarks" versus worthless Firefox favicons) they are useless to me.
Lastly, all of the paranoid schizophrenic posts about Google drive me nuts. As the mob guys say "nobody's gonna tell me how to runna my business!". Google is a business and does whatever they think is in their best interest - just like me. Thank goodness their business interests and mine are the same - making money on the web.
And inflate the Firefox stats
In some lightly travelled domains I had already seen this before prefetch reared its ugly head. If you look at the logs from a lightly used site, you can often see Mozilla derivatives stumbling around like a drunken sailor, with multiple requests for each single file across a number of files. It looks like the browser is almost stuttering. I have no way of knowing why it does this, but it does. Browser spam?
These are domains that sometimes only get a single visit a day so it is very easy to observe in isolation. A 3k logfile is pretty easy to read carefully :)
++
[edited by: plumsauce at 4:57 am (utc) on April 4, 2005]
Cookies? Personally, I have a bunch of tools swatting cookies all the time. They are like bugs in the house - no matter how many times you have the Orkin man come over you still find bugs in the corner of the bathroom.
Not knowing who the Orkin man is, I fatally misread that sentence and was wondering if there was some aspect to this cookie-squashing business which had previously eluded me ;-).
Research on a wide variety of hypertext systems has shown that users need response times of less than one second ... studies done at IBM in the 1970s and 1980s found that mainframe users were more productive when the time between hitting a function key and getting the requested screen was less than a second.
If I manually look thru my access_log files, can I see that a prefetch has happened? - Larry
If you want to know for sure: no, unless you implement a mechanism similar to those described in msg #62 or #69 to grab that information from the browser's request header and write it to the log.
Regards,
R.
If I manually look thru my access_log files, can I see that a prefetch has happened?
You can check whether a Mozilla/Firefox user coming from Google also requested images or other non-page files (e.g. CSS). Real users' browsers usually request images in order to render pages, so if only page files are requested, it's probably a spider or a prefetch request.
This method won't work if your pages contain no images or external CSS, JS, etc., but such pages are quite unusual now. There may also be problems with users who disable images, but the number of such users should also be quite low now.
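The check above can be roughly automated. Here's a shell sketch assuming the standard combined log format (client IP in field 1, request path in field 7); the function name and the extension list are my own, so adjust them to taste:

```shell
# sketch: list client IPs that requested page files but never any
# images/css/js in a combined-format access log - likely spiders or prefetchers
prefetch_suspects() {
  awk '{
    split($7, parts, "?"); url = parts[1]               # strip query string
    if (url ~ /\.(gif|jpe?g|png|ico|css|js)$/) asset[$1] = 1
    else page[$1] = 1
  }
  END { for (ip in page) if (!(ip in asset)) print ip }' "$@"
}

# usage: prefetch_suspects access.log
```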
Actually no, it is a Google issue as well as a Mozilla issue. That statement is akin to saying that, because companies manufacture weapons, the results of the use of those weapons are the fault of the manufacturer, not the wielder.
One could argue the merits of the feature in Mozilla, but it is the use of the feature that causes the problem. Google should let the user decide which site to access, and not pre-fetch sites just in case the user wants to go there regardless of the probability.
Craig_F is correct. What's good for the Goose is good for the Google.
Onya
Woz
All I'm asking is for people to give it a try for a few days before they decide whether to block it or not.
Sorry GG, but the "value" of prefetch does not outweigh my privacy. I resent the fact that Google just goes ahead and does this sort of thing without allowing the user to opt out before it is implemented. In fact, it should be an opt in feature rather than the other way around.