As search is a problem at the moment here, I thought I would give a couple of links to threads that discuss it - I'm sure dayo has more ;)
something is "wrong" (maybe by design?) with Google. They only have 1/3 of their pages on main G. I searched for "a" and got 25 bill on one, and 8.7Bill on the Google.com IP. My site has no dupe issues yet, many pages have been dropped by the DCs that matter.
All I can think is that PR4...
All modern web browsers send an optional Accept-Encoding: gzip header in their requests to webservers. MOST webservers do not provide GZIP compressed content! WHY? So they make more money on bandwidth! (At least that's one reason.)
So if your server never provides compressed content, there isn't much point in Google using the bot that requests it. This may be one factor in which bot you see.
Your webserver logs should show a very different size for the page between the two bots if your server is GZIP compressing. Of course, your website would be faster if your webserver did serve GZIP compressed content.
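If you want to test a page directly rather than waiting on the bots, here's a rough sketch in Python (my own throwaway code; example.com is just a placeholder). It fetches the same URL with and without the Accept-Encoding: gzip header and compares the byte counts:

    import urllib.request

    def fetch(url, ask_gzip):
        # Request the page with or without the Accept-Encoding header
        headers = {"Accept-Encoding": "gzip"} if ask_gzip else {}
        req = urllib.request.Request(url, headers=headers)
        with urllib.request.urlopen(req) as resp:
            # urllib does not decompress, so len(body) is the on-the-wire size
            body = resp.read()
            return resp.headers.get("Content-Encoding"), len(body)

    url = "http://example.com/"  # placeholder - point it at your own page
    _, plain = fetch(url, ask_gzip=False)
    enc, small = fetch(url, ask_gzip=True)
    print("without gzip:", plain, "bytes")
    print("with gzip:   ", small, "bytes, Content-Encoding:", enc)

If the second request comes back much smaller with Content-Encoding: gzip, your host is compressing.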
He he - things have changed since then though - it does look like a new index is being built using Mozilla Googlebot data on the test DC.
Have you seen any of the pages crawled by the Mozilla Googlebot in the test DC index?
MC indicated on his blog that the test DC has a different crawl and indexing infrastructure - so perhaps Mozilla Googlebot is now going to kick in and is the future :) - I have tried to start a new thread. Hope it gets approved soon.
g1smd
Unfortunately, by default Apache compresses at level 6, which is CPU intensive. If it simply compressed at level 1, most of the compression efficiency would be gained with much less CPU usage. One can also compress using PHP; I'm not sure at what level.
With the correct level of compression, total CPU usage should actually decline, since far less data has to be pushed through the network stack, and obviously bandwidth usage is drastically reduced (DMA usage as well, which is very important). A quick comparison of levels is sketched below.
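To put some rough numbers behind the level 1 vs level 6 claim, here's a quick sketch using Python's zlib (the same DEFLATE algorithm Apache's mod_deflate uses; the sample data is made up, so treat the exact figures as illustrative only):

    import time
    import zlib

    # Made-up repetitive page content - real HTML compresses similarly well
    data = b"<html>" + b"some repetitive page content " * 4000 + b"</html>"

    for level in (1, 6, 9):
        start = time.perf_counter()
        out = zlib.compress(data, level)
        elapsed = (time.perf_counter() - start) * 1000
        print(f"level {level}: {len(out):6d} bytes "
              f"({len(out) / len(data):.1%} of original) in {elapsed:.2f} ms")

Level 1 typically gets you most of the size reduction for a fraction of the CPU time, which is the whole argument for turning it down.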
Regardless, for my customers it's way faster! And if Google succeeds in using compression, as many other bots could, the whole internet would be faster, and way faster for dialup users! I know Apache caused Brett problems, but if GZIP compression were successfully implemented, Brett probably would not have had to pull WW from the search engines (at least today anyway!)
Now we have ISPs compressing our content in their caches, so we (you, the webmaster) don't see all the hits you should in your logs. Your pages are being served by ISPs' caches, not by you! Not good for tracking your website's usage.
One way to tell if your web host is serving GZIP compressed content is by reviewing your website logs. Find a web page that both the Mozilla 1.1 Googlebot and the "classic" 1.0 Googlebot have crawled. If you see a very large difference in bytes transferred, then you know your web host is "GZIPing" your web page's content. Typically the bytes transferred will be 3 to 4 times less for the Mozilla 1.1 bot, if your host supports GZIP. A rough script for automating the comparison follows.
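If you'd rather not eyeball the logs, a rough Python sketch like this one works (it assumes Apache's "combined" log format and a file named access.log, both of which are my assumptions; adjust the regex to whatever your host actually writes):

    import re
    from collections import defaultdict

    # Matches the request path, response size, and user-agent fields of
    # the Apache "combined" log format (adjust for your host's format)
    LINE = re.compile(r'"(?:GET|HEAD) (?P<url>\S+)[^"]*" \d{3} (?P<size>\d+|-) '
                      r'"[^"]*" "(?P<agent>[^"]*)"')

    pages = defaultdict(dict)  # url -> {"mozilla": bytes, "classic": bytes}

    with open("access.log") as log:  # placeholder filename
        for line in log:
            m = LINE.search(line)
            if not m or "Googlebot" not in m.group("agent") or m.group("size") == "-":
                continue
            bot = "mozilla" if m.group("agent").startswith("Mozilla") else "classic"
            pages[m.group("url")][bot] = int(m.group("size"))

    for url, seen in pages.items():
        if len(seen) == 2 and seen["mozilla"] > 0:
            ratio = seen["classic"] / seen["mozilla"]
            print(f"{url}: {seen['classic']} -> {seen['mozilla']} bytes "
                  f"({ratio:.1f}x smaller for the Mozilla bot)")

URLs where the Mozilla bot's transfer is 3 to 4 times smaller are your compressed pages.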
There has been a great tool online for years to test web hosts, but today when I checked, it was gone. Had it been there, I would have sticky mailed it.
Wait, just found another commercial site with a good test. I can sticky mail it. (Google serves compressed content)
I wonder if compressed content is considered another minor "atta boy" by Google for ranking?