
Can we schedule crawl times for googlebot?



6:19 pm on Nov 5, 2010 (gmt 0)

5+ Year Member

Short version: I would like to disable Google's crawling and indexing during certain days/times. I am also concerned about changing my robots.txt file and what the long-term effects would be (will Googlebot stop coming back?)

My company hosts monthly sales on our website, during which our traffic increases dramatically. Yesterday, the need for heavy load testing and optimization was pushed to the forefront by the perfect storm of heavy use and a Google crawl.

In just a couple of hours, during our heaviest period of use, Google downloaded well over a gigabyte of data (not to mention the stress on the SQL server). That was more than enough to push the system over the tipping point into horrible performance (roughly 1000% of our typical load times).

Does anyone have experience with similar problems? Were you able to find acceptable solutions while still maintaining good search results? We are currently top 3 on all our important search terms and phrases and I would hate to lose that. But if our site doesn't work, that is worse.


8:42 pm on Nov 5, 2010 (gmt 0)

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member

Hello JustBarno, and welcome to the forums.

I've been looking for the Google reference and can't locate it right now, but the essence of the answer is that no, it's not a good idea. This video gets close: [youtube.com...]

Usually the crawl team does a good job with allocating crawl resources in a way that doesn't hurt the server. You can ask googlebot to crawl more slowly, but that often has other, negative repercussions.

From your description of the problem, it sounds like Google needs to retrieve the full page for every request - database calls and all that. Have you considered server-side caching and then replying with a 304 status if the page hasn't changed?
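
Not from any official reference, just to illustrate the idea: a minimal sketch in Python/Flask, where last_modified_for() and render_page() are hypothetical stand-ins for your own lookup and rendering code.

from flask import Flask, Response, request

app = Flask(__name__)

@app.route("/auction/<int:auction_id>")
def auction_page(auction_id):
    # Hypothetical helper: when did this page's data last change
    # (e.g. the newest bid or close time)? Assumed to return a UTC datetime.
    last_mod = last_modified_for(auction_id)

    # Werkzeug parses If-Modified-Since into a datetime (or None) for us.
    ims = request.if_modified_since
    if ims is not None and last_mod <= ims:
        # Nothing changed since the crawler's last visit: skip the SQL work
        # and send an empty 304 so Googlebot reuses its cached copy.
        return Response(status=304)

    resp = Response(render_page(auction_id))  # hypothetical full render, DB calls and all
    resp.last_modified = last_mod             # lets the next request be conditional
    return resp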


10:40 pm on Nov 5, 2010 (gmt 0)

5+ Year Member

Thanks Tedster, that video was very informative. Normally I think you're right that it wouldn't hurt our server, but the problem is that we were already near the tipping point. I'll look into the server-side caching, but our pages are almost constantly changing (new high bids, closed auctions, etc.).


12:10 am on Nov 6, 2010 (gmt 0)

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member

Was this a one-time problem, a once-in-a-while problem, or a regular problem? If it happens more than a little, Google will also be experiencing the server delays and should adjust their crawl rate without you taking any action. At least that's the way it's supposed to work, and it often does.


7:04 pm on Nov 6, 2010 (gmt 0)

WebmasterWorld Senior Member sgt_kickaxe is a WebmasterWorld Top Contributor of All Time 5+ Year Member

Optimize, optimize, optimize!

Your own site, that is. Employ caching, "minify" CSS and HTML, enable gzip, minimize image file sizes, minimize image use, get rid of clunky code, consolidate JavaScript into one file and load it at the end of the page, and so on.

I'm sure you've done most of that already, but run Firebug and PageSpeed to double-check. Before doing anything else, you want to reduce the size of... everything.


10:13 am on Nov 7, 2010 (gmt 0)

5+ Year Member

As far as I know, Googlebot regularly re-reads robots.txt before crawling your pages. Check your log files to see how often Google fetches it. If it happens several times a day, you might be helped by a script that overwrites robots.txt during certain hours to add a crawl-delay instruction, and then clears the rule once the heaviest (peak) time has passed, so the robot can continue indexing at its usual speed.
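
For illustration, that kind of swap script might look like this rough sketch (Python, run hourly from cron; the robots.txt path, base rules, and peak hours are made up for the example, and Googlebot may not honor Crawl-delay anyway):

from datetime import datetime

ROBOTS_PATH = "/var/www/html/robots.txt"   # hypothetical document root
PEAK_HOURS = range(12, 20)                 # hypothetical sale hours (server local time)

BASE_RULES = "User-agent: *\nDisallow:\n"  # stand-in for the site's normal robots.txt
PEAK_EXTRA = "Crawl-delay: 30\n"           # Googlebot may ignore this directive

def rewrite_robots():
    rules = BASE_RULES
    if datetime.now().hour in PEAK_HOURS:
        rules += PEAK_EXTRA
    with open(ROBOTS_PATH, "w") as f:
        f.write(rules)

if __name__ == "__main__":
    rewrite_robots()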


5:22 pm on Nov 7, 2010 (gmt 0)

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member

That might seem like an option, but it's exactly the kind of thing Matt Cutts warns about in the video I linked to above: Can I use robots.txt to optimize Googlebot's crawl? [youtube.com]


10:25 am on Nov 8, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member

Well, if you are really desperate, you could do some IP/agent sniffing and serve a content-free 503 to googlebot during those peak times.
A bit risky, though...
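
For illustration only, that could look something like this rough sketch (Python/Flask; the user-agent sniff is naive, a real check should verify Googlebot's IP via reverse DNS, and the peak hours are made up):

from datetime import datetime
from flask import Flask, Response, request

app = Flask(__name__)

PEAK_HOURS = range(12, 20)  # hypothetical sale window (server local time)

@app.before_request
def throttle_googlebot():
    ua = request.headers.get("User-Agent", "")
    # Naive agent sniff; returning a response here short-circuits normal handling.
    if "Googlebot" in ua and datetime.now().hour in PEAK_HOURS:
        resp = Response("Service temporarily unavailable", status=503)
        resp.headers["Retry-After"] = "3600"  # hint: come back in an hour
        return resp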
