Google Windows Web Accelerator
Brett_Tabke




msg:736302
 8:09 pm on May 4, 2005 (gmt 0)

[webaccelerator.google.com...]


System Requirements
Operating System: Win XP or Win 2000 SP3+
Browser: IE 5.5+ or Firefox 1.0+
Availability: For users in North America and Europe (during beta testing phase)

Press Release:

Google Web Accelerator significantly reduces the time that it takes broadband users to download and view web pages. The Google Web Accelerator appears as a small speedometer in the browser chrome with a cumulative "Time saved" indicator.

Here's how it works. A user downloads and installs the client and begins browsing the web as she normally would. In the background, the Google Web Accelerator employs a number of techniques to speed up the delivery of content to users.

Looks like some of the Mozilla hires are paying dividends.

 

dmorison




msg:736512
 6:56 am on May 6, 2005 (gmt 0)

- We do add an "X-moz: prefetch" header to all prefetch requests. That way, webmasters can choose to just ignore prefetch requests if they so choose.

What's the best server response to a request containing the X-moz: prefetch header if you choose to "just ignore" it?

I just checked the mod_rewrite docs and couldn't see a rule flag that means "drop silently"; so you have to return _something_. What's the best _something_ to return in order to avoid additional side effects?

The particular side effect I'm worried about is the WA assuming that my attempt to ignore the request is the actual response, and then returning that to the end user...

jd01




msg:736513
 7:23 am on May 6, 2005 (gmt 0)

You should be able to return anything you like, as long as you are only denying 'prefetch' requests: if the user actually clicks the link, a 'regular' header will be sent, so the condition will fail and the link will be followed as usual.

Justin

Added: Of course this may *not* help for outbound links.

GoogleGuy




msg:736514
 7:34 am on May 6, 2005 (gmt 0)

That's strange, Powdork. It goes slower when you try it out?

Powdork




msg:736515
 7:48 am on May 6, 2005 (gmt 0)

Yes. Are you familiar with the galleries generated by Photoshop? If so, you will recognize the next, previous, and index buttons that are added to each page. I do a site: search that brings up lots of image pages. Then I start visiting pages with the accelerator on and then off. When I visit with the accelerator, I can visually watch the buttons load as well as the main image. Without the accelerator, the buttons snap up and I have to wait a little for the image. I am making sure that each time I open a page it is in a different directory, to ensure it is calling a different file for next.gif, etc. The next, previous, and index buttons are placed first in the code.

Causes
1. My DSL is too slow for this to work and it actually has a negative impact. When I signed up for DSL it was very fast; I was near the source. Then I moved to a location further away from the source but still had time on the contract, so I couldn't switch to cable, which is much faster at my location. I am probably only about 50% faster than a 56k connection.
2. Since the main links on the page are to other image-laden pages, the browser is fetching other images while trying to download and render the ones on the current page. However, I am noticing a general 'slowness' on other pages I visit, but there are no easy measures of speed like the buttons in my galleries.

mrMister




msg:736516
 7:56 am on May 6, 2005 (gmt 0)

That's strange, Powdork. It goes slower when you try it out?

That's hardly strange.

If the web server is closer than Google's proxy in terms of network topology, and the server already compresses its output with HTTP compression, then the response through Google will likely be slower than the original web server's response would have been.

Google's proxy isn't like an ISP's proxy that is guaranteed to be "closer". If it's farther away and it can't compress the data considerably, then Google's response will be slower.

I know these Google programmers think they're smart, but if they think they can break the laws of physics then they really need to cut down on the amount of caffeine they're consuming! ;-)

dmorison




msg:736517
 7:59 am on May 6, 2005 (gmt 0)

You should be able to return anything you like

... but then how does the WA know that that is not the expected response, rather than serving it up in response to the "real" click a few moments later?

I must be missing something here. Nothing I can send in the response that a computer program could interpret would indicate that the page is any different because of the prefetch header included in the request.

It would need a response header that implies "Prefetch-Denied"; otherwise you have to figure out how to drop the connection and break the protocol...
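
Purely as an illustration of the kind of signal being described here (and assuming Apache 2.x with mod_setenvif and mod_headers), a server could attach a marker header only to responses for prefetch requests. The "X-Prefetch-Denied" name is hypothetical: no proxy, the WA included, is documented to recognize it, so this is a sketch of the missing protocol piece rather than a working fix.

# Flag requests that carry "X-moz: prefetch"
SetEnvIf X-moz prefetch HAVE_PREFETCH
# Hypothetical marker header; no known client or proxy acts on it
Header set X-Prefetch-Denied "true" env=HAVE_PREFETCH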

claus




msg:736518
 9:24 am on May 6, 2005 (gmt 0)

403 Access Denied

---
And, it's quite easy to make a simple counter. Usually it's one of the first things you learn in any programming language.

"Wow, i am visitor number 1,000,000 to this web site and won a prize"

dmorison




msg:736519
 9:33 am on May 6, 2005 (gmt 0)

403 Access Denied

....and then that gets returned to the user in response to their subsequent genuine click!

Surely you're just being hopeful that the developer of an intermediate proxy that inserts prefetch headers (the WA in this case) has asserted in their design that if they receive a 403 Access Denied, they should re-request the URL later when the genuine click-through comes through.

I am not happy making that assumption, because I did not design the WA and I don't know the person who did.

Regardless, the only response that could safely be relied upon not to break functionality for a request containing a prefetch header is one that specifically references the prefetch request, such as a new response code or a prefetch-denied header.

jd01




msg:736520
 9:52 am on May 6, 2005 (gmt 0)

Without going too far off topic:

What you see in a browser:
www.yoursite.com

What your server sees in *every* request that is made:
GET [yoursite.com...] HTTP/1.1 (or some other method, e.g. POST, HEAD, etc., and possibly version HTTP/1.0)

So, since G sends the 'X-moz: prefetch' header only on requests it is prefetching and *not* on links actually clicked by the user, you can effectively block prefetch requests at the server level with this or something similar:

RewriteEngine On
# Set an environment variable when the request carries "X-moz: prefetch"
SetEnvIf X-moz prefetch HAS_X_MOZ
# Return 403 Forbidden for those requests only
RewriteCond %{ENV:HAS_X_MOZ} ^1$
RewriteRule .* - [F,L]

This will *only* affect a request that contains the X-moz: prefetch header. It will *not* affect a normal GET or POST request without that header, e.g. a user clicking on a link.
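
A minimal variant (assuming Apache with mod_rewrite) tests the header directly instead of going through an environment variable:

RewriteEngine On
# Deny only requests whose X-moz header is "prefetch"
RewriteCond %{HTTP:X-moz} ^prefetch$ [NC]
RewriteRule .* - [F,L]

A real click sends no X-moz header, so the condition fails and the request is served normally.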

Hope this helps.

Justin

Added: Regarding ">> and then that gets returned to the user in response to their subsequent genuine click!": this is incorrect.

Edited for clarity.

philaweb




msg:736521
 10:39 am on May 6, 2005 (gmt 0)

>So, since G sends the 'X-moz: prefetch' header only on requests it is prefetching and *not* on links actually clicked by the user, you can effectively block prefetch requests at the server level with this or something similar:<

I have checked and rechecked my Analog stats for yesterday, and I must admit that I do not see any prefetch requests from the Google Accelerator IP anywhere.

Perhaps Google is going to implement that option later (since the WA is only in beta). Currently, only the IP tells me when the Google WA is at work.

It is interesting to see that the WA also prefetches files within .htpasswd-protected directories. The WA prefetches anything your browser (IE or FF) has access to.

Another interesting pattern: the WA downloads all files requested by a visitor, which means double bandwidth usage. It is difficult to see whether the downloaded prefetched files are kept within the WA buffer cache on the visitor's PC or kept on a Google server - or both.

Anyway, I'm done fiddling with the WA and have decided to delete it from my PC.

What I find confusing is the behaviour after I blocked the WA in the server's .htaccess file while still using the WA for browsing. Some of the pages turn up anyway, even though I never clicked them using the WA. Whether a page comes from the ordinary browser cache or the Google server's prefetch cache is impossible for me to decide.

Some clicks did return a 403 error page, but the URI remained intact in the address bar. When refreshed, the page turned up. The only .htaccess code that definitely shut down the WA prefetches was this:

<Limit GET POST>
order allow,deny
allow from all
deny from 72.14.192.6
</Limit>

dmorison




msg:736522
 11:00 am on May 6, 2005 (gmt 0)

This will *only* affect a request that contains the X-moz: prefetch header. It will *not* affect a normal GET or POST request without that header, e.g. a user clicking on a link.

This statement doesn't seem to appreciate what an accelerating proxy server actually does. The whole point of the prefetch is that the user's "real" click never makes it to your server.

Anyway, I agree that we shouldn't bog this thread down with the techie details, so I've tried to clarify my concerns in a separate thread...
[webmasterworld.com...]

jd01




msg:736523
 11:31 am on May 6, 2005 (gmt 0)

Last comment and then I'll go back to the threads I normally post in...

My understanding is this: the prefetching caches a 'live' page from your server on the fly (i.e. it loads the page into a cache at Google). If you deny the prefetch request, the page will not be cached. If the page is not cached, all links continue to work normally (in other words, the links are not broken, and the pages do not stop functioning, just because you denied the pre-caching; they behave like any other uncached link on the page). So, by denying the proxy's pre-caching request, you are effectively turning off the pre-caching engine and forcing the requests to be processed by your server as normal.

Justin

BTW, I believe that since the 'prefetch' header is part of a proxy function, it will not show in your logs the way a normal GET/POST request would... It should only show that the page was loaded by the G proxy IP address.

Anyway, hope this helps someone...

oneguy




msg:736524
 12:49 pm on May 6, 2005 (gmt 0)

from Scarecrow...

With WA, and the abiding faith that any WA user would necessarily have in Google as God, the mess-ups from Google will get blamed on us webmasters.

True, and Google cares not.

from mrMister...

I really don't understand why Google are so adamant that webmasters should have no control over the app.

I remember a few years ago, when content building webmasters would sing in unison about their partnership with Google. It has to be getting harder and harder to pretend.

(and that's not directed at anyone personally... I just jumped off your quote.)

Angonasec




msg:736525
 1:09 pm on May 6, 2005 (gmt 0)

Ta Claus: G WA is now blocked in my account root htaccess using the code you gave.

I asked before, but you missed it...

I'm on a shared server, so will it stop prefetching for my account? (I don't control the server-level .htaccess.)

Bill asked:

"Why block WA?

Redirect it to a page that describes why it's bad technology burning bandwidth needlessly and should be uninstalled - educate those masses in their hypnotic trance that drool and chant "Goooooogle" all day on the net."

Okay Bill, please post the relevant compact htaccess code to redirect safely, together with your sample 'Google WA tutorial for the duped', and we will use it, until Google give us back our liberty.
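
For what it's worth, a rough sketch of what Bill describes might look like the following (assuming Apache with mod_rewrite; /wa-info.html is a hypothetical page you would create to explain the objection):

RewriteEngine On
# Send prefetch requests to the explanation page instead of the real content
RewriteCond %{HTTP:X-moz} ^prefetch$ [NC]
RewriteCond %{REQUEST_URI} !^/wa-info\.html$
RewriteRule .* /wa-info.html [R=302,L]

One caveat: if the accelerator caches the redirect target and shows it on the later real click, visitors will see the explanation page instead of your content, which is why the plain 403 approach discussed above may be the safer choice.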

reseller




msg:736526
 1:53 pm on May 6, 2005 (gmt 0)

Angonasec

<Okay Bill, please post the relevant compact htaccess code to redirect safely, together with your sample 'Google WA tutorial for the duped', and we will use it, until Google give us back our liberty.>

Liberty isn't something somebody gives you; it's something you fight for and earn! That is also true in Google's case. It's publishers who should fight back for their liberty.

History repeats itself. The present situation with Google's WA reminds me of 2001, when the great legend Jim Wilson gathered hundreds of decent webmasters and led the fight against scumware.
[scumware.com...]

Who knows who will be the next Jim to lead us all as publishers in our fight for privacy and control of our own content.

Chndru




msg:736527
 1:59 pm on May 6, 2005 (gmt 0)

Am I crazy, or do webmasters NOT want their sites accessed faster, thereby improving the user experience?

It is like saying: if I make the site slower, why should the user try to make it any faster!

DoppyNL




msg:736528
 2:14 pm on May 6, 2005 (gmt 0)

@GoogleGuy:

Requests currently made to my servers using the accelerator are causing some problems I don't want.
Users are losing their sessions, and some users get content they are not supposed to see because they are served pages prefetched for other users.

So, in short, I want to prevent the accelerator from 'accelerating' my pages.

What response should I send to the Google servers to tell them not to accelerate a specific page and to let the browser fetch the page directly from the server?

Is there a specific HTTP response code that will allow this?
Is there a specific header that will prevent Google from caching the page?
Will it listen to an entry in the robots.txt file?

I welcome useful software that does what it is supposed to do, but when it fails to do that, it must be possible to do something about it...
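
One commonly suggested mitigation for session-dependent pages (assuming Apache with mod_headers; whether the accelerator actually honors it is unverified) is to declare the responses off-limits to shared caches:

# Sketch only: mark responses as private and not to be stored by proxies
Header set Cache-Control "private, no-store, max-age=0"
Header set Pragma "no-cache"

That only addresses the caching side; for refusing the prefetch requests themselves, the X-moz rules posted earlier in the thread apply.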

The Contractor




msg:736529
 2:23 pm on May 6, 2005 (gmt 0)

Am I crazy, or do webmasters NOT want their sites accessed faster, thereby improving the user experience?

Not to sound callous, but did you read this thread?

It literally breaks sites that rely on cookies for shopping, login and other info.

It also skews all the results of your visitor logs/tracking system, as you have no idea whether the visitor actually viewed a page or merely hovered over your navigation, which triggered a prefetch of that page.

What does this mean? If I go to a site, let's say dmoz.org, and use my mouse as a pointer as I'm reading through 100 listings in a category – those sites' pages will be prefetched, showing that I visited their sites when I really haven't. The same goes for an individual site. Say I go to your site and hover over the navigation covering 15 areas of your site: I download those pages via triggered prefetches, although I never left your homepage. So as a site owner you think I went through your whole site, since I "downloaded" those pages, but I really only looked at your homepage and left.

The web accelerator does not speed up anything in "most" cases; it only appears that way since it is downloading other pages in the background. If it really worked, you could use it on a dial-up connection and notice a speed improvement. As it is, it would probably kill your dial-up connection, bringing it to a crawl.

It also provides a very fast proxy for site downloaders/rippers. I mentioned that and also saw it firsthand in a client's logs, where over 1200 pages were downloaded with 0 seconds between pageviews. How was this done? Simple: download Firefox and the HTTrack plugin. Jeez, you could set the UA string to that of Googlebot, and when the unsuspecting site owner did a whois on the IP they would think it was Google crawling their site if they didn't know any better.
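
If you have access to the main server configuration (not the case on shared hosting where you only control .htaccess), one way to at least see which hits were prefetches, assuming Apache's mod_log_config and assuming the accelerator forwards the X-moz header at all (reports in this thread differ on that), is to log the header alongside the usual fields:

# Combined log format plus the X-moz request header, if present
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" xmoz=%{X-moz}i" combined_xmoz
CustomLog logs/access_log combined_xmoz

Hits logged with xmoz=prefetch could then be discounted when counting real pageviews.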

claus




msg:736530
 2:23 pm on May 6, 2005 (gmt 0)

>> shared server

Yes, on a shared server it will only work for your account and your folders, not for the whole server.

GoogleGuy




msg:736531
 2:27 pm on May 6, 2005 (gmt 0)

DoppyNL, can you tell me a bit more about users losing sessions? How is prefetching causing users to see pages for other users?

I'll ask your other questions as well, but I'd ask people to give it a little time before jumping to a conclusion that the accelerator is bad for your sites--it's been less than two days since the Labs demo was put up.

bloke in a box




msg:736532
 2:32 pm on May 6, 2005 (gmt 0)

Should it not have stayed in testing, though, until you knew what kind of effects it would have on the rest of the web?

The Contractor




msg:736533
 2:38 pm on May 6, 2005 (gmt 0)

I'd ask people to give it a little time before jumping to a conclusion that the accelerator is bad for your sites-- it's been less than two days since the Labs demo was put up.

Hey, you know that I don't mean anything personal by my posts ;)
I think the problems that have been mentioned by others and by me should have been thought of and/or addressed long before the release, IMHO. C'mon, you have some of the brightest minds in the business.... I give you more credit than this...

I'm wondering if Google will come up with a fix for Urchin, so they will have the only stats program out there whose results aren't skewed by the Google web accelerator :)

GoogleGuy




msg:736534
 2:41 pm on May 6, 2005 (gmt 0)

bloke in a box, I'm just not as familiar with the accelerator personally. I'm planning to find an engineer and find out more about it, but I wanted to have good questions first. For example, Powdork said that it slowed down their connection--so it's helpful to know that the slowdown was for a single photo-heavy site created by PhotoShop. That helps me when I go and ask about the accelerator.

If you go back a year to Gmail's debut, there were also people who wanted to block Gmail right after it was introduced. Fast forward a year, and many many people like Gmail now that they've seen the direction that we've gone with it: 2 gigs of storage and growing, a solid UI, free POP access, and free email forwarding as well. The accelerator had been out for less than a day and a half before people were forming conclusions about whether it was helpful for their site or not. I'm saying give it a little time (hey, maybe the weekend :) before deciding. And in the meantime, I'm happy to find an engineer who worked on this and ask them questions about what people are asking here.

GoogleGuy




msg:736535
 2:47 pm on May 6, 2005 (gmt 0)

No worries, The Contractor. My main goal is to figure out what concerns people have (e.g. what issues are people seeing with cookies? Are they not passed on?) and then take that to the accelerator folks and ask what's up. Sometimes even a new index would take a few days to shake out as you transition between data centers.. :)

Powdork




msg:736536
 2:53 pm on May 6, 2005 (gmt 0)

It has nothing to do with the pages being generated by Photoshop. I only mentioned Photoshop because the little 270-byte buttons were downloading so slowly I could watch them. Going through more of the pages, I see that this is only happening on the pages where I have affiliate ads as well.
GG, let me know if having a sample of the galleries where this is happening would help, although I guess you could just look up my accelerator history. ;)

GoogleGuy




msg:736537
 2:56 pm on May 6, 2005 (gmt 0)

I wouldn't have access rights, Powdork. :) I would be curious to know a few of the sites where you saw that. It certainly could be that you're 1 hop away from the sites you're hitting, but I'd love to have some samples to show to an engineer. I'm sure that they're all asleep right now, but eventually I'll find one. :)

DoppyNL




msg:736538
 2:59 pm on May 6, 2005 (gmt 0)

- Users seeing pages they are not supposed to see:
Two users are both using the Google accelerator. One of them is logged in with his personal account.
The other is seeing a page containing the string "Logged on as user-1-username".
Which I guess is caused by Google's accelerator prefetching that page for user 1 and then also sending that page to user 2 shortly afterwards.
Another example of this is that preferences set by one visitor are suddenly used for another user (because a cached version was returned).
I haven't been able to reproduce this at this point, but the user who reported it isn't stupid and I believe what he reported to me is true.

- Users losing sessions:
This is a very simple thing and I'm able to reproduce it.
A user does something on page 1 that causes some preference to be stored in the session. This information is used on subsequent pages. At some point it is clear, from the user's point of view, that this preference is no longer stored in the session, but they don't know why.
In some cases reloading the page works, and it turns out an old cached version of the page was being viewed. In other cases reloading doesn't help.
When I take a look at the sessions on the server at that moment, the user appears to have 2 (or more!) sessions, i.e. they 'lost' their session and a new one was created.

Although users do sometimes lose their session without using the Google accelerator, using the accelerator causes this to happen more often.

- Some system information:
Sessions are passed along via cookies; if no cookie has been set (yet), a parameter in the URL is used.
The URLs don't look dynamic, as they don't contain ANY extension (.php/.html or whatever) and there is no '?' in them at all. URLs just look like a path to some folder on the server.

The problem is that almost ALL my pages are dynamic to some extent, because of preferences that can be set by visitors.
Using a proxy or a system like Google's accelerator doesn't really work for a site constructed like this.
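
If it helps anyone with a similar setup: since the personalization here rides on a cookie, a hedged sketch (again assuming Apache with mod_headers, and with no guarantee that the accelerator's cache respects it) is to declare that responses vary by cookie and are private, so a well-behaved shared cache will not hand one user's page to another:

# Responses differ per Cookie header and belong to a single user
Header append Vary Cookie
Header set Cache-Control "private"

This complements, rather than replaces, blocking the prefetch requests themselves.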

GoogleGuy




msg:736539
 3:06 pm on May 6, 2005 (gmt 0)

So if it's the first visit to the site, you get a parameter in the URL, but not a '?' in the URL. Are dynamic URLs (URLs with a '?') ever used, or have you structured the site so that the URLs always look static?

tnlnyc




msg:736540
 3:06 pm on May 6, 2005 (gmt 0)

It may have to do with Search after all... <snip>

[edited by: engine at 3:34 pm (utc) on May 6, 2005]
[edit reason] See TOS [webmasterworld.com] [/edit]

Cheeser




msg:736541
 3:13 pm on May 6, 2005 (gmt 0)

To tnlnyc:

You say in that analysis: "Imagine a million people downloading the Google Web Accelerator and all of a sudden, you have an infrastructure that finds out about a lot of pages very quickly."

Sorry to be obtuse, but what do you think the Google Toolbar does? Surely it doesn't collect all that information about every single page you go to just for fun.

The Contractor




msg:736542
 3:26 pm on May 6, 2005 (gmt 0)

what issues are people seeing with cookies

Yes, they are not set on the user's machine. I would venture to say that Powdork's problems stem from the images on every page being re-cached instead of being cached/stored like they normally are in a browser's cache.

I stopped playing with the program as I can't use it with a clear conscience ;)

I have done some digging in the Windows registry and have seen where it assigns a unique ID along with my company name and my own name – I never entered that info, so it grabbed it from elsewhere.

What is the ultimate goal of the web accelerator? OK, so I know you couldn't answer that even if you knew, but I had to ask.

I think you'd actually do better by releasing your own browser to gather users' info/habits and leave this prefetch crap alone. I'm sure that's coming, just don't know when..
That way you get the info you want without screwing up people's sites ;)

I give you credit/thanks for lots of useful tools, products, and services released to end users, but this is one that I hope never gains a wide audience… hehe
