Google Windows Web Accelerator


Scarecrow - 4:46 pm on May 5, 2005 (gmt 0)


I run two servers hosting several domains between them. All are nonprofit.

I've checked my logs for Google's accelerator scraping. There are about 200 GETs on each server for today only. One server shows most of them from 72.14.192.* and the other server shows most of them from 72.14.194.*. I also have about 20 from 64.233.172.* and another 20 from 64.233.173.*.
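A rough sketch of that kind of log check, for anyone who wants to repeat it. The four IP prefixes are the ones listed above; everything else is an assumption: the log is taken to be in the common Apache style (client IP as the first field, request line in quotes), and the sample lines are made up for illustration.

```python
from collections import Counter

# Prefixes reported in the post. The trailing dot keeps "72.14.192."
# from matching something like "72.14.1920.x".
ACCELERATOR_PREFIXES = (
    "72.14.192.",
    "72.14.194.",
    "64.233.172.",
    "64.233.173.",
)


def count_accelerator_gets(lines):
    """Count GET requests per matching /24 prefix.

    Assumes each line starts with the client IP and contains the
    request line in double quotes (common Apache-style access log).
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        if not fields or '"GET ' not in line:
            continue
        ip = fields[0]
        for prefix in ACCELERATOR_PREFIXES:
            if ip.startswith(prefix):
                counts[prefix + "*"] += 1
                break
    return counts


# Hypothetical log lines, not taken from any real server.
sample = [
    '72.14.192.5 - - [05/May/2005:16:40:01 +0000] "GET /a.html HTTP/1.1" 200 512',
    '72.14.192.9 - - [05/May/2005:16:40:02 +0000] "GET /b.html HTTP/1.1" 200 734',
    '64.233.172.3 - - [05/May/2005:16:40:03 +0000] "GET /c.html HTTP/1.1" 200 120',
    '10.0.0.7 - - [05/May/2005:16:40:04 +0000] "GET /d.html HTTP/1.1" 200 99',
    '72.14.194.2 - - [05/May/2005:16:40:05 +0000] "POST /form HTTP/1.1" 200 10',
]

if __name__ == "__main__":
    for prefix, hits in sorted(count_accelerator_gets(sample).items()):
        print(prefix, hits)
```

To run it against a real log, pass an open file object instead of `sample`. Matching on string prefixes is crude but is enough for a quick count per address block.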

One thing that disturbs me is that every single page on all of my domains has carried the NOARCHIVE meta tag for years now. Google does not treat this tag as a prohibition for the accelerator. There is no opt-out to save your bandwidth! I don't see the accelerator checking for robots.txt either. To put it bluntly, Google considers this latest scraping to be a non-search function, and none of the old standards apply. Of course, you can be sure they save everything they grab for future use.

The other thing that disturbs me is that on one server (remember, this is only for the last 12 hours), I saw 8 different accelerator GETs within a single second. With 130,000 static pages on this site, should I be worried about load problems from Google? And for what -- so that Google can collect more information on people who want to access my sites?
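One way to find a burst like that is to group the accelerator GETs by their one-second timestamp and take the largest group. This is only a sketch under the same assumptions as any quick log script: client IP first on the line, timestamp in square brackets with one-second resolution, request line in quotes. The function name and the sample lines are hypothetical.

```python
import re
from collections import Counter

# Matches the bracketed timestamp in an Apache-style log line.
TIMESTAMP = re.compile(r"\[([^\]]+)\]")


def busiest_second(lines, prefixes):
    """Return (timestamp, hits) for the second with the most GETs
    from addresses starting with any of `prefixes`, or (None, 0)."""
    per_second = Counter()
    wanted = tuple(prefixes)
    for line in lines:
        fields = line.split()
        if not fields or not fields[0].startswith(wanted):
            continue
        if '"GET ' not in line:
            continue
        match = TIMESTAMP.search(line)
        if match:
            per_second[match.group(1)] += 1
    if not per_second:
        return None, 0
    return per_second.most_common(1)[0]


# Hypothetical log lines for illustration only.
burst_sample = [
    '72.14.192.5 - - [05/May/2005:16:40:01 +0000] "GET /a.html HTTP/1.1" 200 512',
    '72.14.192.6 - - [05/May/2005:16:40:01 +0000] "GET /b.html HTTP/1.1" 200 734',
    '72.14.192.7 - - [05/May/2005:16:40:01 +0000] "GET /c.html HTTP/1.1" 200 120',
    '72.14.192.8 - - [05/May/2005:16:40:02 +0000] "GET /d.html HTTP/1.1" 200 99',
    '10.0.0.7 - - [05/May/2005:16:40:01 +0000] "GET /e.html HTTP/1.1" 200 10',
]

if __name__ == "__main__":
    print(busiest_second(burst_sample, ["72.14.192.", "72.14.194."]))
```

A peak count on its own does not say whether the server was under strain; it just shows how tightly the requests were clustered.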

If the Googlebot is hitting me this hard, I put up with it because I want the referrals from Google. I can also exercise control with robots.txt and NOARCHIVE. But the accelerator running in addition to the Googlebot is where I have to draw the line.


Thread source: http://www.webmasterworld.com/google/29319.htm