How about some lines in the robots.txt file?
When Googlebot checks the robots.txt file, it would be nice if it could note that we would rather they didn't put in link tags to prefetch any of our pages.
Having Mozilla check the robots.txt before requesting the page would still result in a hit, but at least it's only a single hit.
Any thoughts or suggestions on what Google could implement?
Again, let's not worry about the side effect (Google using prefetch) and instead worry about the disease: that the Mozilla group put the feature in there to start with.
Why would Google want to include "prefetch" in a bot crawl?
They are already storing enough data about each page they index...
I would think that the bot would always want to find the latest version of a page: not anticipate that page, but actually see it and calculate the latest version...
The page might be pre-fetched without the searcher actually clicking on your result = wasted bandwidth
It'll be a tiny fraction of your total bandwidth. By far the majority of people will click on the link.
Have you seen the kind of search terms the prefetch is used for? It's things like "webmaster world". How many people are likely to type in webmaster world but not click on the link?
- uses bandwidth, with little indication that a user would ever visit.
- if the site is dynamic and using cache-busting headers, the visitor will have to redownload the page anyway; thus, you get speed-penalized twice.
- falsifies referral information in your referral log.
- falsifies agent information in your agent log.
- causes system load.
- is basically an unstoppable spider.
Lastly, from those wacky, fun-loving affiliate guys, comes this little tidbit:
Mozilla's prefetch offers the perfect opportunity to blindly cloak away for Moz users with little risk of ever being caught - think about it.
I just disliked their suggestion of returning a 404 error to the x-moz agent...
Mainly 'cos if someone looks at your page after it's been prefetched, how would you know they've read it?
Basically, my customers could claim that I haven't provided as much traffic because pages were prefetched and not actually read, and I'd have no way of proving them wrong.
I thought that if Google were aware that I didn't want any of my site prefetched, they could be kind enough not to tell Mozilla to prefetch any of it.
End of my rant...
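For reference, the 404 suggestion mentioned above relies on Mozilla marking prefetch requests with an "X-moz: prefetch" request header. Here is a rough sketch of what that could look like, using Python's built-in http.server purely for illustration. The names is_prefetch, PrefetchAwareHandler and serve are made up for this example, and the port and page body are placeholders. It does nothing about the tracking problem described in this post; it only shows how the prefetch request can be told apart from a real click.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def is_prefetch(headers):
    """Return True if the request carries Mozilla's prefetch marker header."""
    # Mozilla sends "X-moz: prefetch" on prefetch requests.
    return (headers.get("X-moz") or "").strip().lower() == "prefetch"


class PrefetchAwareHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: refuse prefetches, serve real visits."""

    def do_GET(self):
        if is_prefetch(self.headers):
            # Refuse the prefetch; a real click arrives later without the header.
            self.send_response(404)
            self.end_headers()
            return
        body = b"<html><body>Real page</body></html>"  # placeholder content
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Start the sketch server on localhost (port is an arbitrary choice)."""
    HTTPServer(("127.0.0.1", port), PrefetchAwareHandler).serve_forever()
```

Calling serve() starts it locally. With the 404 in place, a visitor who really clicks has to make a fresh request, so at least that hit shows up in the log as a normal one.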
I saw that the other day when a site asked to set a cookie, and I was still looking at the SERP - not clicked on anything yet.
How many rogue sites are setting a prefetch to rogue content scam sites? I disabled it immediately.
There is an option to disable this in Mozilla 1.3+ which I had never noticed. It is in:
Preferences --> Advanced --> Cache
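If you want to see which URLs a page is asking the browser to prefetch before deciding whether to switch the feature off, the hint is just a link tag with rel="prefetch" in the HTML. A small sketch using Python's standard html.parser follows; the class and function names are made up for this example, and it only looks at link tags in the markup, not at HTTP Link: headers.

```python
from html.parser import HTMLParser


class PrefetchLinkFinder(HTMLParser):
    """Collect href values from <link rel="prefetch" ...> tags."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel can hold several space-separated values.
        rels = (attr.get("rel") or "").lower().split()
        if "prefetch" in rels and attr.get("href"):
            self.urls.append(attr["href"])


def find_prefetch_links(html):
    """Return the list of URLs the given HTML asks the browser to prefetch."""
    finder = PrefetchLinkFinder()
    finder.feed(html)
    return finder.urls
```

Feed it the source of a results page and it returns whatever that page wants fetched in the background.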