I assume if we block Google from caching our pages our sites are immune from this?
Email, blog, guestbook and formmail spammers are going to absolutely love it.
Immediately blocked from all the sites on my servers. No need, no way.
Google you are getting carried away with yourselves.
btw, AOL has been using a proxy cache for years and it's the first thing users want to bypass.
Why would any broadband user want a 1%-5% speed improvement?
Q. Why has Google made the Google Web Accelerator?
A. It will make the Google Browser more appealing when it is released!
trimmer80, from the people that I've talked to, I've gotten the impression that there's less emphasis on prefetching than on smart proxy caching including incremental changes to pages. Another focus that I know of is on compressing data that goes over the connection.
Warning Off Topic
It impresses me to no end that a company representative would actually take the time to not only read what we have to say, but to offer clarification, suggestions and advice.
End Off Topic
"Another focus that I know of is on compressing data that goes over the connection."
Now, this is the way to go.
"Prefetching" pages is just plain dumb waste of bandwidth and creates more problems than what it is worth.
A competing broadband accelerator that I use uses Content Sensitive Compression (CSC) to individually compress each element of a web page.
The speed increases from using the competing accelerator on DSL broadband have been averaging 40% (and I'm not even using the maximum compression settings, as they decrease the quality of images too much).
Over GPRS I've achieved 200%-300% speed increase.
|What is their ultimate goal with pre-fetch? |
not nice thought:
"behaviour targeted ads" -? - i had a conversation with a web marketing exec just yesterday and that side of the industry is saying that looking past "contextual" is "behavioural" - i.e. tracking surfing patterns etc... something others have been doing for awhile...
so targeting ads based on where someone has been and where they might go would give more useful targeting options for advertisers than just showing ads related to what page the surfer is currently on etc...
linking this in to AdWords somehow would produce this sort of option...
seriously doubt this "behaviour tracking" would be employed by Google... and WebAccelerator is probably just another genuine app to offer users like Picasa...
|Craven de Kere|
RobbieD wrote: "There is some good help for webmasters at:
It addresses issues like Page fetching, Advertising, usage, stats etc."
There's nothing in there about an easy ability to prevent the automated requests to our sites. Just a well worded attempt to explain it.
Would Google be satisfied if I "addressed" automated queries to them that they found problematic merely by explaining to them what I was doing?
I can see it now....
"I was sending you automated requests. I was doing so to your search pages but did not touch your PPC ads. This can have the effect of increasing load to your servers without the ability for you to monetize the traffic. You can identify my traffic by the header 'X-moz: you are acting discourteously google' but do note that this is a non-standard header that may change*".
*That's no joke. Read the link on their FAQ.
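For anyone who wants to act on that header, here is a rough Python sketch of refusing requests that carry it. It assumes the prefetch requests are marked with `X-moz: prefetch`, and as the FAQ itself warns, the header is non-standard and may change. The function names are made up for illustration and are not tied to any particular web framework.

```python
def is_prefetch_request(headers):
    """Return True if the request carries the prefetch marker header.

    `headers` is any mapping of header names to values. HTTP header names
    are case-insensitive, so names are compared in lower case.
    """
    for name, value in headers.items():
        # Assumption: prefetches are marked with "X-moz: prefetch".
        if name.lower() == "x-moz" and value.strip().lower() == "prefetch":
            return True
    return False


def status_for(headers):
    """Pick a response status: refuse prefetches, serve everything else."""
    return "403 Forbidden" if is_prefetch_request(headers) else "200 OK"
```

Note this only turns away the speculative fetches. A page the user actually clicks on arrives without the header and is served as usual.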
I think that one of G's main intentions behind this product is to gather data about real browsing behaviour. They will know what pages are visited, WHICH LINKS ARE CLICKED, etc. I think that data can help them to improve their search algo, thus making life harder for search spammers.
Have installed and un-installed
"As a User"
I have to say I was very impressed, all pages were loading faster and smoothly.
My first visit to WW after installation required me to log in, and my surfing IP address was Google's IP address.
Although I was highly impressed with the results I had to un-install simply because of 'Privacy'.
The idea of sending unencrypted data, cookies and form submissions through Google won't help my 7 hours of sleep at night.
This is a great tool but saving a few seconds for the price of privacy is just not worth it.
It's not spyware due to the option of uninstalling; however I feel uneasy about potential attacks google may come under, plus the serving of ads based on the data I send to Google.
I'd rather have my ISP save 3 months of my surfing habits than have Google cache every bit of data for an unknown period.
"As a webmaster"
I'll monitor the effects this may have but at this moment in time I'm not taking any action until we see more clear evidence of bandwidth abuse.
It will help Google find our sites and index the content, that is for sure, but if this tool grows to be popular we can't just block the IP, because the site would then be unavailable to those users.
So in the end we may well be forced to allow the users to surf our pages with the tool.
Another great tool from the labs and I'm very impressed, however there is a BUT! A user's privacy is the price; that's the price of free software, and I won't use it for this reason.
|Craven de Kere|
Google could easily address concerns of the automated requests by simply allowing us to block the accelerator in robots.txt
|I've gotten the impression that there's less emphasis on prefetching than on smart proxy caching including incremental changes to pages. |
Ah that's alright then. If it's only a small part of it, just disable the pre-fetching and everybody's happy!
[edited by: mrMister at 8:45 am (utc) on May 5, 2005]
How does this thing handle web applications where prefetching could essentially fire off an action within the application that was not desired?
|It's not spyware due to the option of uninstalling |
By this logic, Spam wouldn't be spam if it has an unsubscribe option.
|How does this thing handle web applications where prefetching could essentially fire off an action within the application that was not desired? |
You mean something like an admin system for your web site, where at the press of a button you can add, edit and delete pages.
Yes, Google WA was more than capable of "prefetching" these delete events for you, thus deleting pages on your web site for you!
From what I can tell though, Google seem to have taken steps to prevent this from happening now. The prefetch doesn't seem to be working in the same way as it did when I first analysed it.
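One defence that does not depend on what the accelerator does: never let a plain link change anything. Below is a minimal Python sketch, assuming a hypothetical admin handler; `handle`, the `delete:<name>` action format and the `pages` dict are all invented for illustration. The idea is that a prefetcher following links issues GET requests, so an action that only runs on POST should be out of its reach.

```python
def handle(method, action, pages):
    """Apply `action` to the `pages` dict and return an HTTP-style status.

    A "delete:<name>" action changes state, so it is refused unless it
    arrives as a POST. Anything else is treated as a harmless read.
    """
    if action.startswith("delete:"):
        if method.upper() != "POST":
            # A followed or prefetched link lands here and deletes nothing.
            return "405 Method Not Allowed"
        pages.pop(action.split(":", 1)[1], None)
        return "200 OK"
    return "200 OK"
```

In practice that means the delete control in the admin system is a form button rather than a link.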
Has anyone tried <link rel="noprefetch" href="http://url/to/get/">?
|<link rel="noprefetch" href="http://url/to/get/">? |
I wish it were that simple
This is really bad news
I can't believe Google is going the way of the "gator"-like businesses
"Download our Free, Appealing but useless little App" and we will be "In Control", we'll know exactly what you do on the web and when, thus better targeting you with our ads down the line....
Google : You don't need to do this kind of Stuff, better continue to concentrate on Pure, Plain Web Search.
First, this is a product which fulfils a made up consumer need (broadband acceleration... duh!)
Second, it is both a Scraper, and Spyware, and a Proxy.
Third, it does not honor robots.txt and does not even specify a User-Agent string for that file.
Fourth, it generates extra useless traffic and wastes your bandwidth.
I have only two things to say about this
- I will block this thing from access, and
- I will strongly encourage everybody to block it
No, this is not negotiable. If Google wants to have access, they will have to have a different view on users than Gator / Claria. There is absolutely nothing even remotely positive to say about this product.
[edited by: claus at 11:19 am (utc) on May 5, 2005]
Has anyone been able to identify it via User Agent yet?
>Has anyone been able to identify it via User Agent yet?
any quality proxy must pass the ua unchanged. Thus, it will be the same ua as your browser.
about 100 pages surfed: total saving 9 seconds. It still isn't as fast as raw Opera...
I completely agree with claus on this. You know how I mentioned site ripping through a proxy? I've already seen it on a client's site, as I'm sure they didn't prefetch over 1200 pages (at 0 seconds between pageviews/pages)... So this morning I am banning it from all sites I have control over instead of just the couple I was concerned with.
|Has anyone been able to identify it via User Agent yet |
As Brett stated, it uses your browser's UA, so it can't be blocked via UA string.
I think the only way to block it is through the IP range of 18.104.22.168 - 22.214.171.124 via .htaccess. I send them to a custom 403 on another domain (the .net, .info etc. of the same domain name), in which I added that one of the reasons they may have been blocked is that they are using Google's Web Accelerator.
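If you would rather do the range check in application code than in .htaccess, it could look roughly like this in Python. This is only a sketch: the network below is a placeholder taken from the block reserved for documentation, not the accelerator's actual range, and `is_blocked` is a made-up name. Swap in whatever range you have verified from your own logs.

```python
import ipaddress

# Placeholder only: 192.0.2.0/24 is reserved for documentation examples.
# Replace it with the proxy's address range once you have confirmed it.
BLOCKED_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def is_blocked(remote_addr):
    """Return True if `remote_addr` (a string) falls in a blocked network."""
    addr = ipaddress.ip_address(remote_addr)
    return any(addr in net for net in BLOCKED_NETWORKS)
```

A request for which `is_blocked` returns True would then get the custom 403 instead of the page. Keep in mind this blocks the human behind the proxy as well as the prefetches.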
As a Google employee was quoted as saying on SEW, it does use a Google User Agent.
|As Brett stated it uses your browsers UA |
See, this is part of what has me so concerned.
WebAccelerator could well be a fine product with benefits to users and webmasters alike.
But the constant stream of contradictory information makes me wonder why Google seems to be going out of its way to obfuscate these issues.
If someone gives me the impression they're hiding something I always assume the worst possible scenario. Am I alone in this?
With apologies to GoogleGuy, what has he really told us about WebAccelerator?
|about 100 pages surfed: total saving 9 seconds. It still isn't as fast as raw Opera... |
Yep, I still use the old Opera 6.06 and have yet to find a faster browser for a Windows based PC ;)
Only users it will benefit are porn surfers. Who else needs faster broadband?
You need to show this thread to your colleagues as well:
Seems like Google Accelerator becomes a referral spam application when the prefetched page contains some sort of referral URI.
I have had 6 referrals from Webmasterworld to a website of mine since I posted the first time to this thread. I have never disclosed the website URI to Webmasterworld.
"Yep, I still use the old Opera 6.06 and have yet to find a faster browser for a Windows based PC"
News Flash: Google hires lead Opera programmer. Ex-Mozilla programmers, re-join the Mozilla team :).
We might not be their target market though. AOL still has some 20 million users, not to mention netzero, earthlink etc., who still use the modem.
If Claus is blocking it, so am I.
Claus: Could you please post the complete (and most compact) required code to block this thing so mod_rewrite novices like me don't wipe out their traffic.
Could you also post the relevant text of your custom 403 so we can help Google users get the message.
|does not honor robots.txt |
And rightfully so. Robots.txt is designed for spiders with automatic methods to determine which link to crawl next. With Google's Web Accelerator, however, each request is directly triggered by the behaviour of a human user.
There's a lot of hype and misinformation here. For those complaining loudly that Google is distributing spyware, is collecting personal information, is messing with your web stats, blah blah; have you only just noticed? Google has been doing this for years, and the accelerator is just another step down that road. Will I use it? Hell, no, not on your life. Will I block it? No. You're on to a losing battle with that approach.
So, it's a proxy. Google aren't the first in this field: my ISP has a transparent proxy, yours does too. AOL have been doing something like this since... well, forever.
Rather than worry about the proxy accessing your site, I think it would be more useful to look at the wider consequences of the proxy in terms of Google's operations, be it advertizing or search. Google is above all else a data-collection and data-mining company. So simply put, what does this new data bring?
Overall, with the proxy Google are getting a taste of the web from the user's perspective, rather than a bot's perspective. The user data is giving eyes to the bot. It could bring their search to a whole new level. It is the cloakers who are screaming loudest today, mostly about the (undoubtedly important) privacy concerns, but their real fear is coming from the massive breach in their technology that the accelerator is bringing.
> So, it's a proxy.
More accurately, it is a compressing proxy.
Dialups do not benefit because modern 56k modems all compress data to/from their isp anyway. The rest of the web is still stuck in the uncompressed dark ages of the 70's. A compressing proxy does indeed stand to speed up the web. One thing we know for absolute certainty is that bandwidth requirements and page sizes are going to increase. I believe average page sizes will increase 3 fold in the next 5 years.
I agree in principle with claus's feelings on the product, but I disagree on some of the specifics:
> a made up consumer need
I do think there is a small justifiable (aka: excuse) need to speed up the web. In theory, the web should have been 100% compressed data years ago, but we are still living out that 1960-70's uncompressed legacy. Compressing data - makes sense for all the major isps, and Google is just muscling in on that territory itself.
> Second, it is both a Scraper, and Spyware, and a Proxy.
How is it a scraper? I see it as a human still at the kb.
Spyware? Agreed, but that is not new - it is just *more* of the same thing they have with the toolbar and all their sources of data now.
Proxy? So what?
> it does not honor robots.txt
It is a proxy on behalf of a human, it isn't a bot.
> User-Agent string for that file.
It can't claus, it *has* to pass the UA unfettered.
> wastes your bandwidth.
Agreed. How much it uses is open for debate.
I think the take away here is that if everyone would just install GZip on their websites, we would have the same effect in ALL browsers and not just in IE/Moz.
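To get a feel for what gzip does to ordinary markup, here is a quick Python sketch. The sample page is made up and far more repetitive than a real one, so treat the number it prints as an illustration rather than a benchmark; `compression_ratio` is just a helper name chosen for this example.

```python
import gzip


def compression_ratio(body):
    """Return compressed size divided by original size for `body` (bytes)."""
    return len(gzip.compress(body)) / len(body)


# Made-up sample page: repetitive markup, as HTML tends to be.
sample = (
    b"<html><body>"
    + b"<p>Hello, WebmasterWorld readers.</p>\n" * 200
    + b"</body></html>"
)
ratio = compression_ratio(sample)
print(f"compressed to {ratio:.1%} of original size")
```

Run it against a saved copy of one of your own pages to see what you would actually gain before touching the server config.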
"Agreed. how much it uses is open for debate."
For a site like WebmasterWorld it's probably a lot. I have a 1000GB limit on mine and don't even get close to reaching it, so it doesn't bother me, but I could see other sites being upset.
as far as spyware: Sadly, I think we lost the war. Between the bar, this or other future products, they'll know your every move.