
Home / Forums Index / Google / Google SEO News and Discussion
Forum Library, Charter, Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

Googlebot brings my server down every day :(
Tags: googlebot, ddos, dos attack
calande · msg:3305112 · 7:09 pm on Apr 7, 2007 (gmt 0)

I have a Dual Xeon with 4GB of RAM, and I still have downtime problems. Further investigation showed that the culprit is Googlebot: it seems to grab the whole site at once, spawning a huge number of simultaneous httpd processes. The load average can reach 50, and then I need to reboot the poor server.

I need this web site indexed, but I don't want Google to take the server down every day. What can I do? Are there special rules in robots.txt that I could use?

Thanks,
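[A note on the robots.txt question: there is a Crawl-delay directive, but Googlebot does not honor it; at the time it was respected by some other crawlers (e.g. Yahoo's Slurp and MSNbot), while Google's rate had to be set through Webmaster Tools instead. A minimal sketch, with an illustrative 10-second value:]

```
# robots.txt — Crawl-delay is a non-standard directive.
# Googlebot ignores it; use Google Webmaster Tools to slow Google's crawl.
User-agent: *
Crawl-delay: 10
```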

 

tedster · msg:3305135 · 7:25 pm on Apr 7, 2007 (gmt 0)

Here's what Google suggests:

Q: Googlebot is crawling my site too fast. What can I do?

A: Please contact us with the URL of your site and a detailed description of the problem. Please also include a portion of the weblog that shows Google accesses so we can track down the problem quickly.

Google Webmaster Support [google.com]


theBear · msg:3305167 · 8:44 pm on Apr 7, 2007 (gmt 0)

calande,

A dual Xeon with 4 GB of RAM should be able to handle Googlebot when it comes calling, unless you have scripts that use a lot of CPU or a process running that thrashes the cache.

Ma2T · msg:3305174 · 8:56 pm on Apr 7, 2007 (gmt 0)

I would have thought your server could handle it too.

You could tell Google to crawl your site more slowly via the Google webmaster console (Sitemaps).

Also, if you're using PHP, you could try a PHP cache such as XCache or APC.

DamonHD · msg:3305185 · 9:15 pm on Apr 7, 2007 (gmt 0)

Hi,

Googlebot used to bring my main sites and Internet connection down repeatedly ~8 years ago.

1) I sent them an email, and they tweaked things to be less toxic. You can still do the same now AFAIK.

2) In their Webmaster services you can ask the bot to crawl more slowly than it otherwise would. I do that on one of my sites that is really only a fallback, for example.

3) On the grounds that no single remote entity should be able to bring your site down casually, put in behaviour-based controls that throttle the traffic/load any one remote /24 (Class C) or similar set of addresses/hosts/users can impose. This will save you from all sorts of other DoS grief too. (It won't help with DDoS, but not much will.)

4) Your system is more powerful than most of mine, and yet I survive Googlebot plus lots of other less well-behaved bots/scrapers/idiots. How? Partly (3), and partly by tuning the site code to keep the cost of most operations down and caching the results of others. What is your normal page generation time: seconds or milliseconds? If seconds, then (a) you'll be irritating humans, and (b) you won't keep up with most spiders' demands either.
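[The per-/24 throttling in point (3) can be sketched as a token bucket keyed on the client IP's first three octets. Everything below — the class name, the rate/burst numbers, and the decision to key on the dotted-quad prefix — is an illustrative assumption, not something from the post:]

```python
import time


class Per24Throttle:
    """Token-bucket limiter keyed on the /24 network of the client IP.

    rate:  tokens (requests) added per second per /24 — illustrative value.
    burst: bucket capacity, i.e. the largest allowed burst — illustrative value.
    clock: injectable time source, so the limiter is testable.
    """

    def __init__(self, rate=2.0, burst=10, clock=time.monotonic):
        self.rate, self.burst, self.clock = rate, burst, clock
        self.buckets = {}  # "/24 prefix" -> (tokens, last_update)

    def allow(self, ip):
        prefix = ip.rsplit(".", 1)[0]  # "66.249.66.1" -> "66.249.66"
        now = self.clock()
        tokens, last = self.buckets.get(prefix, (self.burst, now))
        # Refill proportionally to elapsed time, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self.buckets[prefix] = (tokens - 1.0, now)
            return True   # serve the request
        self.buckets[prefix] = (tokens, now)
        return False      # throttle: e.g. answer 503 with Retry-After
```

[In practice this would sit in front of the expensive page-generation path; a 503 with Retry-After is the polite answer to a well-behaved bot that trips the limit.]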

Rgds

Damon

BillyS · msg:3305224 · 10:13 pm on Apr 7, 2007 (gmt 0)

What do you have, like a billion pages? With that server I'd think you'd survive for quite some time, even under a prolonged attack.

Keniki · msg:3305240 · 10:37 pm on Apr 7, 2007 (gmt 0)

My advice is to sign up to Webmaster Tools; there you can select a slower crawl rate. But to be honest, if my hosting couldn't handle Googlebot, think what happens when you have a few visitors. Your hosting should be your priority, not Google.

yodokame · msg:3305373 · 3:10 am on Apr 8, 2007 (gmt 0)

Are you sure it's the Googlebot, and not some problem with your site's software or server?

We get nailed constantly by Google's robots in a triple whammy: the indexer, the Google News bot (we're a news source), and the AdWords bot. And we're on a shared server at Pair.com with hundreds of other users. No problems at all.

If you're using PHP 3 or some really old, inefficient software, maybe that's it. Or try caching, such as jpcache.
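[The caching idea behind tools like jpcache — serve a stored copy of a rendered page until it expires — can be sketched as a small TTL wrapper around whatever function generates the page. `make_cached` and the `render(path) -> str` signature are hypothetical names for illustration, not part of any library mentioned in the thread:]

```python
import time


def make_cached(render, ttl=60, clock=time.monotonic):
    """Wrap an expensive page-rendering function with a TTL cache.

    render: hypothetical page generator, render(path) -> str.
    ttl:    seconds a cached copy is served before re-rendering (illustrative).
    clock:  injectable time source, so expiry is testable.
    """
    cache = {}  # path -> (expires_at, body)

    def cached(path):
        now = clock()
        hit = cache.get(path)
        if hit and hit[0] > now:
            return hit[1]              # fresh cache hit: no rendering cost
        body = render(path)            # miss or expired: render once
        cache[path] = (now + ttl, body)
        return body

    return cached
```

[With something like this in front of the slow path, a bot requesting the same pages repeatedly costs one render per TTL window instead of one render per hit.]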

calande · msg:3308831 · 9:57 pm on Apr 11, 2007 (gmt 0)

Thank you, guys. I asked Google to go lighter on my server, and it seems to have solved the problem: I don't get those deluges of simultaneous Googlebot connections anymore.

WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved