Forum Moderators: Robert Charlton & goodroi
System Requirements
Operating System: Win XP or Win 2000 SP3+
Browser: IE 5.5+ or Firefox 1.0+
Availability: For users in North America and Europe (during beta testing phase)
Press Release:
Google Web Accelerator significantly reduces the time that it takes broadband users to download and view web pages. The Google Web Accelerator appears as a small speedometer in the browser chrome with a cumulative "Time saved" indicator.
Here's how it works. A user downloads and installs the client and begins browsing the web as she normally would. In the background, the Google Web Accelerator employs a number of techniques to speed up the delivery of content to users.
Looks like some of the Mozilla hires are paying dividends.
Requests currently made to my servers using the accelerator are causing some problems I don't want.
Users are losing their sessions, and other users get content they are not supposed to see because they are shown pages that were prefetched for someone else.
So, in short, I want to prevent the accelerator from `accelerating` my pages.
What response should I send to the Google servers that will tell them not to accelerate that specific page and let the browser itself fetch the page directly from the server?
Any specific http-response-code that will allow this?
Any specific header that will prevent Google from caching the page?
Will it listen to an entry in the robots.txt file?
I welcome useful software that does what it is supposed to do, but when it fails to do that, it must be possible to do something about it...
Am I crazy, or do webmasters NOT want their sites accessed faster, thereby improving user experience?
Not to sound callous, but did you read this thread?
It literally breaks sites that rely on cookies for shopping, login and other info.
It also skews all results of your visitor logs/tracking system as you have no idea if the visitor actually viewed a page or hovered over your navigation which triggered the prefetch of that page.
What does this mean? If I go to a site, let's say dmoz.org, and use my mouse as a pointer as I'm reading through 100 listings in a category, those sites' pages will be prefetched, showing that I visited their sites when I really haven't. Same goes for an individual site. Say I go to your site and hover over the navigation consisting of 15 areas of your site: I download those pages via prefetch triggering, although I never left your homepage. So as a site owner you think I went through your whole site since I "downloaded" those pages, but I really only looked at your homepage and left.
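To make the stats problem concrete, here is a minimal sketch of separating prefetch hits from real page views in a request log. It ASSUMES prefetch requests are marked with an `X-moz: prefetch` header, which is how Firefox's own link prefetching labels its requests; whether the accelerator marks its requests the same way is not established in this thread. The log structure and the function name are made up for illustration.

```python
def split_prefetch_hits(requests):
    """Split logged requests into (real_views, prefetch_hits).

    Each request is a dict with a "path" and a "headers" dict.
    ASSUMPTION: prefetch requests carry an "X-moz: prefetch" header.
    """
    real, prefetched = [], []
    for req in requests:
        # Header names are case-insensitive, so normalise before comparing.
        headers = {k.lower(): v.lower() for k, v in req["headers"].items()}
        if headers.get("x-moz") == "prefetch":
            prefetched.append(req["path"])
        else:
            real.append(req["path"])
    return real, prefetched


# Hypothetical log: one real view of the homepage, two hover-triggered prefetches.
log = [
    {"path": "/", "headers": {"User-Agent": "Mozilla/5.0"}},
    {"path": "/products", "headers": {"X-moz": "prefetch"}},
    {"path": "/contact", "headers": {"X-Moz": "Prefetch"}},
]
real, prefetched = split_prefetch_hits(log)
print(real)
print(prefetched)
```

If the prefetching client sends no such marker, nothing in the request distinguishes a hover from a visit, and this kind of filtering is not possible.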
The webaccelerator does not speed up anything in "most" cases; it only appears that way since it is downloading other pages in the background. If it really worked, you could use it on a dial-up connection and notice a speed improvement. As it is, it would probably kill your dial-up connection, bringing it to a crawl.
It also provides a very fast proxy for site downloaders/rippers. I mentioned that and also saw it firsthand in a client's logs, where over 1200 pages were downloaded with 0 seconds between pageviews. How was this done? Simple: go download FireFox and the plugin for httrack. Jeez, you could set the UA string to that of Googlebot, and when the unsuspecting site owner did a whois on the IP they would think it was Google crawling their site if they didn't know any better.
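A rough sketch of how a burst like that could be spotted in a log: flag any IP that makes several requests in a row with almost no time between them. The thresholds, the function name and the sample data are all assumptions for illustration, not taken from the logs mentioned above.

```python
from collections import defaultdict


def find_rapid_clients(hits, max_gap_seconds=1.0, min_burst=5):
    """Return the set of IPs with at least `min_burst` consecutive requests
    spaced no more than `max_gap_seconds` apart.

    `hits` is an iterable of (ip, unix_timestamp) pairs.
    """
    by_ip = defaultdict(list)
    for ip, ts in hits:
        by_ip[ip].append(ts)

    flagged = set()
    for ip, stamps in by_ip.items():
        stamps.sort()
        run = 1  # length of the current run of closely spaced requests
        for prev, cur in zip(stamps, stamps[1:]):
            run = run + 1 if cur - prev <= max_gap_seconds else 1
            if run >= min_burst:
                flagged.add(ip)
                break
    return flagged


# Hypothetical data: one client fires six requests in the same second,
# another requests a page every 30 seconds.
ripper = [("10.0.0.1", 1000.0)] * 6
human = [("10.0.0.2", 1000.0 + 30 * i) for i in range(6)]
print(find_rapid_clients(ripper + human))
```

Keep in mind that when requests arrive through a proxy, many different users can share one IP, so grouping by IP alone will lump them together.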
I'll ask your other questions as well, but I'd ask people to give it a little time before jumping to a conclusion that the accelerator is bad for your sites--it's been less than two days since the Labs demo was put up.
I'd ask people to give it a little time before jumping to a conclusion that the accelerator is bad for your sites-- it's been less than two days since the Labs demo was put up.
Hey, you know that I don't mean anything personal by my posts ;)
I think the problems that have been mentioned by others and myself should have been thought of and/or addressed long before the release IMHO. C'mon, you have some of the brightest minds in the business.... I give you more credit than this...
I'm wondering if Google will come up with a fix for Urchin so they will have the only stats program out there whose results aren't skewed by the Google webaccelerator :)
If you go back a year to Gmail's debut, there were also people who wanted to block Gmail right after it was introduced. Fast forward a year, and many many people like Gmail now that they've seen the direction that we've gone with it: 2 gigs of storage and growing, a solid UI, free POP access, and free email forwarding as well. The accelerator had been out for less than a day and a half before people were forming conclusions about whether it was helpful for their site or not. I'm saying give it a little time (hey, maybe the weekend :) before deciding. And in the meantime, I'm happy to find an engineer who worked on this and ask them questions about what people are asking here.
- Users losing session:
This is a very simple thing and I'm able to reproduce this.
User does something on page 1 that causes some preference to be stored in the session. This information is used on subsequent pages. At some point it is clear from the user's point of view that this preference is no longer stored in the session, but they don't know why.
In some cases reloading the page works, and it turns out that an old, cached version of the page was simply being viewed. In other cases reloading doesn't help.
When I take a look at the sessions on the server at that moment, the user appears to have 2 (or more!) sessions, i.e. they `lost` their session and a new one was created.
Although users do sometimes lose their session without using the Google accelerator, using the accelerator causes this to happen more often.
- some system information
Sessions are passed along via cookies; if no cookie has been set (yet), a parameter in the URL is used.
The URLs don't seem to be dynamic, as they don't contain ANY extension (.php/.html or whatever) and there is NO ? in them at all. URLs just look like a path to some folder on the server.
Problem is that almost ALL my pages are dynamic to some extent, because of the use of preferences that can be set by visitors.
Using a proxy or a system like Google's accelerator doesn't really work for a site constructed like this.
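To illustrate why an intermediate cache and per-visitor pages don't mix, here is a toy sketch. It is NOT a description of how the accelerator works internally (nothing in this thread documents that); it only shows the failure mode: a shared cache that keys pages on the URL alone will hand one visitor's page to the next. All class and function names are made up.

```python
class UrlOnlyCache:
    """A deliberately naive shared cache that keys pages on the URL alone."""

    def __init__(self, render):
        self._render = render
        self._pages = {}

    def get(self, url, session_id):
        if url not in self._pages:
            self._pages[url] = self._render(url, session_id)
        return self._pages[url]


class SessionAwareCache(UrlOnlyCache):
    """Same cache, but the session is part of the cache key."""

    def get(self, url, session_id):
        key = (url, session_id)
        if key not in self._pages:
            self._pages[key] = self._render(url, session_id)
        return self._pages[key]


def render(url, session_id):
    # Hypothetical page whose content depends on per-visitor preferences.
    return f"{url} rendered for {session_id}"


naive = UrlOnlyCache(render)
naive.get("/pagename/", "alice")
print(naive.get("/pagename/", "bob"))   # /pagename/ rendered for alice

aware = SessionAwareCache(render)
aware.get("/pagename/", "alice")
print(aware.get("/pagename/", "bob"))   # /pagename/ rendered for bob
```

Since the session here lives in a cookie (or a URL parameter as fallback) and the URL itself looks static, a cache in the middle has no way to know from the URL that the two requests deserve different pages.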
[edited by: engine at 3:34 pm (utc) on May 6, 2005]
[edit reason] See TOS [webmasterworld.com] [/edit]
You say in that analysis: "Imagine a million people downloading the Google Web Accelerator and all of a sudden, you have an infrastructure that finds out about a lot of pages very quickly."
Sorry to be obtuse, but what do you think the Google Toolbar does? Surely it doesn't collect all that information about every single page you go to just for fun.
what issues are people seeing with cookies
Yes, they are not set on the user's machine. I would venture to say that PowDork's problems stem from the images on every page being re-cached instead of being cached/stored like they normally are in a browser's cache.
I stopped playing with the program as I can't use it with a clear conscience ;)
I have done some digging in the windows registry and have seen where it assigns a unique ID along with my Company name and my own name – I never entered that info, so it grabbed it from elsewhere.
What is the ultimate goal of webaccelerator? Ok, so I know you couldn't answer that even if you knew, but I had to ask.
I think you'd actually do better by releasing your own browser to gather users info/habits and leave this prefetch crap alone. I'm sure that's coming, just don't know when..
That way you get the info you want without screwing up people's sites ;)
I give you credit/thanks for lots of useful tools, products, and services released to endusers, but this is one that I hope never gains a wide audience…hehe
I'd ask people to give it a little time before jumping to a conclusion that the accelerator is bad for your sites
Google WA has ZERO upside for my site.
It interferes with my page impression tracking, which is a HUGE issue. We track total page impressions to compute how much advertising inventory we can sell ($$,$$$) per month, and if you're caching the pages those numbers are skewed. We also track total accesses per page, since we sell sponsored links and charge customers a flat fee based on the total impressions of the sponsored page, so with pre-fetch those numbers may be inflated relative to actual impressions. Not to mention ALL pages are dynamic, so none of them should ever be cached, as the content changes in real time.
Example: My traffic is 1.5M page views/month, pre-fetch could boost it to 3M (or more) with just 1 pre-fetch per current page view.
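The arithmetic behind that example, as a small sketch. The function name is made up, and it assumes every pre-fetch is logged exactly like a real page view.

```python
def logged_page_views(real_views, prefetches_per_view):
    """Page views the server would log if each real view also triggers
    `prefetches_per_view` pre-fetched pages (assumed to be logged as views)."""
    return real_views * (1 + prefetches_per_view)


real = 1_500_000
logged = logged_page_views(real, 1)
print(logged)          # 3000000
print(logged / real)   # 2.0, i.e. impressions are overstated by a factor of two
```

With more than one pre-fetch per real view the overstatement grows accordingly, which is the "(or more)" case.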
So explain how Google WA caching and pre-fetch helps me in any way whatsoever other than directly interfering with my business and livelihood?
If you honor no-cache requests at least the worst thing Google WA will do is slow down access to the site which it appears to do already.
You need to let us easily OPT-OUT the entire site, it should be domain-wide opt-out, not no-cache per page or any other silly little busy make work nonsense that results in thousands of page edits.
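For what it's worth, sending no-cache headers does not have to mean editing every page if the site runs behind a single entry point. Below is a minimal sketch, assuming a Python WSGI application; `demo_app` is a hypothetical stand-in for the real site. The headers are the standard HTTP ones for telling caches not to store a response. Whether the accelerator honors them is not confirmed anywhere in this thread, so this is not the domain-wide opt-out being asked for.

```python
def no_cache_middleware(app):
    """Wrap a WSGI app so every response carries no-cache headers."""
    extra = [
        ("Cache-Control", "no-store, no-cache, must-revalidate, private"),
        ("Pragma", "no-cache"),   # for HTTP/1.0 caches
        ("Expires", "0"),         # treated as already expired
    ]
    names = {name.lower() for name, _ in extra}

    def wrapped(environ, start_response):
        def patched_start(status, headers, exc_info=None):
            # Drop any caching headers the app set itself, then add ours.
            kept = [(k, v) for k, v in headers if k.lower() not in names]
            return start_response(status, kept + extra, exc_info)
        return app(environ, patched_start)

    return wrapped


def demo_app(environ, start_response):
    # Hypothetical stand-in for the real site.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


app = no_cache_middleware(demo_app)
```

The same idea applies to a web server config that adds headers for a whole virtual host; the point is only that the header can be set in one place.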
[edited by: incrediBILL at 3:38 pm (utc) on May 6, 2005]
The URL will always look something like this:
mydomain.tld/pagename/parameter.value/
with some minor variations. But there will never be a ? in the URL, nor an extension or filename.
When only looking at the URL, an automated system will probably label it as `static`, although some humans may recognise there is some dynamic stuff happening on the server.
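A sketch of the kind of URL-only heuristic I mean. This is purely an assumption about how an automated system might guess; nothing here is known about what the accelerator actually checks, and the extension list is made up.

```python
import posixpath
from urllib.parse import urlsplit

# Hypothetical list of extensions a naive classifier might treat as dynamic.
DYNAMIC_EXTENSIONS = {".php", ".asp", ".aspx", ".jsp", ".cgi", ".pl"}


def looks_dynamic(url):
    """Guess whether a URL is dynamic from its shape alone:
    a query string, or a script-like file extension."""
    parts = urlsplit(url)
    ext = posixpath.splitext(parts.path)[1].lower()
    return bool(parts.query) or ext in DYNAMIC_EXTENSIONS


print(looks_dynamic("http://mydomain.tld/pagename/parameter.value/"))  # False
print(looks_dynamic("http://mydomain.tld/page.php?id=3"))              # True
```

The first URL is fully dynamic on the server, yet it passes as static because the guess never sees anything but the URL.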
* All pages dynamic.
* Statistics are incorrect because:
- some pages are requested more often because of prefetching, without any actual visit
- some pages are requested less often because of caching.
result: stats are quite useless.
* `clickpaths` of users are no longer accurate. (stats)
* users losing their sessions.
So, same for me as IncrediBill here. No upsides, only bad things for my site.
A good idea would be an OPT-OUT that prevents this from happening; using a header for all pages will cause a (dramatic?) increase in server load.
So a manual opt-out for an entire domain can be useful.
An automatic `opt-out` for sites that issue A LOT of no-cache headers may also be a good idea. (a site may also be sending `no cache` for every request!)
An automatic `opt-out` for sites that issue A LOT of no-cache headers may also be a good idea. (a site may also be sending `no cache` for every request!)
I'd go further and say that WA should treat any link from a page served with a no-cache header as possibly being dynamic and therefore not to be prefetched. I'd be happy with that :). In fact I really quite like the effect it has on my "outside" (public) site. I have a "screenshots" page that contains thumbnails that you click to view a full screen version - the pre-fetch on mouse over really makes the whole thing look very slick.
Your web application pages should be sending no-cache anyway so there's no additional load on your app.
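The rule I'm proposing could be sketched like this. It is a suggestion for how a prefetching client might behave, not a description of what WA does today; the function name is made up, and treating `no-store` the same as `no-cache` is my own assumption.

```python
def may_prefetch_links(page_headers):
    """Return False when the page that contains the links was itself served
    with a no-cache (or no-store) Cache-Control directive."""
    cache_control = ""
    for name, value in page_headers.items():
        if name.lower() == "cache-control":   # header names are case-insensitive
            cache_control = value.lower()
    # Split "no-cache, max-age=0" into bare directive names.
    directives = {d.strip().split("=")[0] for d in cache_control.split(",")}
    return not ({"no-cache", "no-store"} & directives)


print(may_prefetch_links({"Cache-Control": "no-cache, must-revalidate"}))  # False
print(may_prefetch_links({"Cache-Control": "max-age=3600"}))               # True
print(may_prefetch_links({}))                                              # True
```

That way a public page with cacheable thumbnails still gets the slick mouse-over effect, while links inside the application are left alone.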
After reading the WA privacy statement (below), I wish to ask you:
Does WA respect the content of publishers who exclude Googlebot (through robots.txt) and wish neither to be indexed in Google nor to share their sites or their visitors' information with Google?
Thanks.
<5. How does using Google Web Accelerator affect my privacy?
Google Web Accelerator receives much of the same kind of information you currently send to your ISP when you surf the Web:
* Google will receive your requests for unencrypted pages (those with "HTTP:", not "HTTPS:", at the beginning of the URL), along with information such as the date and time of the request, your IP address, and computer and connection information
* If you enter personally identifiable information (such as an email address) onto a form on an unencrypted web page, some sites may send this information through Google. Whenever your computer sends cookies with browsing or prefetching page requests for unencrypted sites, we temporarily cache these cookies in order to improve performance
* In order to speed up the display of pages generally, Google Web Accelerator may store copies of web pages, including prefetched pages that you did not visit, in the Google Web Accelerator cache on your machine. This is separate from your browser's cache, which only identifies pages that you actually visited. You can empty your Google Web Accelerator cache at any time by following these instructions.
The policies for the Google Web Accelerator, like those for Google.com, uphold the highest level of integrity and respect for our users' information.>