
Google Spider Suggestion

Googlebot + Compression

3:46 pm on Dec 18, 2002 (gmt 0)

Junior Member

joined:Dec 8, 2002
votes: 0

A couple of years ago we moved all our sites to dedicated servers, which, by the way, was a great move for us. Since then we've been able to customize our content in many new ways, and we're not going back to individual hosts.

Well, the only drawback is that we now pay for data center management and also pay by the gigabyte of transfer. Our hosting company offers first-class service, but their bandwidth is very expensive.

Guess who's one of our biggest consumers of bandwidth? Googlebot. Just this month Googlebot cost us over US$600.00 in bandwidth.

I would appreciate it if Googlebot 2.1 were improved to 2.2 by adding support for gzip compression.

It would add a little overhead to the spider to decompress the retrieved content before archiving it, but it would save the world a few billion dollars in bandwidth every year.

This year our average is US$500 a month in Googlebot bandwidth, so by the end of 2002 we'll have paid about US$6,000.00 to our bandwidth provider for Google to send us "free" traffic.

Would Google please consider accepting gzipped content through Googlebot? Thanks for your time.

3:54 pm on Dec 18, 2002 (gmt 0)

New User

10+ Year Member

joined:Nov 26, 2002
votes: 0

Either you have really huge content, or your bandwidth pricing is ridiculous.
7:29 pm on Dec 18, 2002 (gmt 0)

New User

10+ Year Member

joined:Dec 6, 2002
votes: 0

Wow -- what kind of bandwidth is Google pulling out of your site(s)? You must at least get tons of Google traffic if you have so many pages that Googlebot costs you $500.00 a month. It's a problem I wish I had. :) Of course, you can solve this expense very simply by using your robots.txt to keep the little bot away from your pages.
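For anyone following along, a robots.txt along these lines (the paths are made up for illustration) is all it takes to keep Googlebot out of the heaviest directories:

```
User-agent: Googlebot
Disallow: /archive/
Disallow: /images/
```

The trade-off, of course, is that disallowed pages drop out of the index along with their bandwidth bill.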
7:43 pm on Dec 18, 2002 (gmt 0)

Full Member

10+ Year Member

joined:July 10, 2002
votes: 0

Googleguy was also talking recently about setting up your server so that Googlebot only came looking for new pages. That should cut down your bill, unless you update every page every month. I can't remember the technical term for this, nor can I find the thread, but perhaps someone else remembers.
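The technique being described sounds like the HTTP conditional GET: the server honours the If-Modified-Since header and answers 304 Not Modified, with no body at all, for pages that haven't changed since the crawler's last visit. A rough Python sketch of that server-side decision (the handler and page content are hypothetical):

```python
# Sketch of a conditional GET check: reply 304 (no body) when the page
# hasn't changed since the timestamp the crawler sends in If-Modified-Since.
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime

def respond(last_modified, if_modified_since):
    """Return (status, body) for a page last changed at `last_modified`."""
    if if_modified_since:
        since = parsedate_to_datetime(if_modified_since)
        if last_modified <= since:
            return 304, b""              # nothing new: the transfer is saved
    return 200, b"<html>full page</html>"

changed = datetime(2002, 12, 1, tzinfo=timezone.utc)
crawl = format_datetime(datetime(2002, 12, 15, tzinfo=timezone.utc))

assert respond(changed, crawl) == (304, b"")   # unchanged since last crawl
assert respond(changed, None)[0] == 200        # first visit: send everything
```

For a site where most of a million pages are static month to month, this saves even more than compression does, since unchanged pages cost almost nothing at all.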
7:49 pm on Dec 18, 2002 (gmt 0)

Preferred Member

10+ Year Member

joined:Jan 29, 2002
votes: 0

Here it is, MeditationMan:


8:17 pm on Dec 18, 2002 (gmt 0)

Junior Member

joined:Dec 8, 2002
votes: 0

Thanks for the replies ;)

Actually, they're all my own sites, which I've run since 1996; we rented a server farm and moved the sites onto several servers. There's plenty of work there, and well over (way, way, way over) a million pages.

We do get a heap of Google traffic, as our subjects range from needles to satellites (both fictitious; we don't do needles OR satellites). MOST of our pages don't sell a thing. And no, I'm not Slashdot.

My suggestion is not just for my own good; it's for the overall well-being of the web. Our revenues from selling our widgets pay for the bandwidth (there's even a little left over for Friday beer ;))

But after I installed mod_gzip on my server farm (most browsers support gzip), the average page size went from 20k to 2k (amazing, huh!). Googlebot transfers, however, remain at the 20k average, and I've tcpdumped Googlebot's requests: they indeed do not send Accept-Encoding: gzip.

If Googlebot used gzip, it would cut Googlebot's bandwidth spending on my site from US$500.00 to US$50.00 a month, or from US$6,000.00 to US$600.00 a year, just from adding gzip support. Painless, easy, quick!

Imagine what this would do for Google themselves, they pay for bandwidth too!

How could Google benefit from compression?

- Google would cut bandwidth costs
- If enough servers installed compression, they'd spider the web much faster (if 100% of the world's Apache users installed mod_gzip, I'd guess Google could spider the web about 7 times faster, no kidding!)
- Compression is already used for images and video, so why not for text? In case you're not a file-format expert: JPEG, MPEG, GIF, MP3, PNG and friends are just names of compression standards combined with a file format!
- Compression is standard and widely accepted in computer science; gzip causes no information loss, and it is very fast on Linux, which Google runs on cheap hardware
- Hardware is cheaper than bandwidth

How would Google stimulate enough webmasters to install gzip?

Instead of just terrorizing SEOs with their webmaster pages, Google should add something like:


Other engines would follow suit, and the internet would be faster (about 10 times faster). Webmasters who didn't have gzip installed would install it just to have their pages load faster. Believe me, mod_gzip made my sites load about 5x faster on AVERAGE, sometimes even reaching a 10x speed increase.

Microsoft showed an extremely good attitude by adding gzip support to all Internet Explorer software, so the 99% of my visitors who use Internet Explorer or Mozilla already see my pages load about 7 times faster WITHOUT MY CHANGING A THING IN MY CONTENT.

Googleguy, as if you didn't get enough free consulting on this great forum, here's a post I'd really appreciate you reading. Help make the Internet a better place and make Googlebot gzip-compliant: in C it's one library you have to link to, in Python it's a module you load, in Perl it's a one-minute change to your code. I don't know what Googlebot is written in (I'd bet C/C++, based on AltaVista's Scooter), but it shouldn't take more than a day to change this!
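On the "module you load" point: in Python, for example, decompressing a gzip response body is a few lines of standard library, even streamed chunk by chunk the way a crawler reads from a socket (the sample body here is made up):

```python
# "In Python it's a module you load": zlib's streaming decompressor handles
# a gzip body incrementally, the way a crawler reads bytes off a socket.
import gzip
import zlib

body = gzip.compress(b"<html>" + b"content " * 1000 + b"</html>")

d = zlib.decompressobj(16 + zlib.MAX_WBITS)   # 16+MAX_WBITS: expect gzip header
out = b"".join(d.decompress(body[i:i + 512]) for i in range(0, len(body), 512))
out += d.flush()

assert out == gzip.decompress(body)           # streamed result matches one-shot
```

Streaming matters for a spider because a compressed page can be decompressed as it arrives, without buffering the whole response first.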

8:48 pm on Dec 18, 2002 (gmt 0)

Senior Member

WebmasterWorld Senior Member googleguy is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Oct 8, 2001
votes: 0

That's a good suggestion, psoares. I think one of our crawls does support that; I'll ask how hard it would be to accept gzip more widely as we crawl.