Forum Moderators: phranque


googlebot problems


jake58

10:10 am on Jul 21, 2007 (gmt 0)

10+ Year Member



Hi:

I have been getting a lot of traffic from Googlebot. I am using an IP traffic monitor, and the IP 66.249.72.166 has been using a lot of bandwidth, downloading about 7 to 10 MB every few minutes.

I found this code in an old post here and added it to my .htaccess:

<Files *>
Order Deny,Allow
Deny from 66.249.72.166
</Files>

The problem is that Googlebot still hits me every 30 seconds for another 700 bytes.

Is there another way to get rid of this bot, or will it eventually go away once it stops getting any content?

Thanks,

John

jdMorgan

3:53 pm on Jul 21, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



That is a legitimate Googlebot, so I wouldn't advise denying it access as you've done.
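One quick way to check that an IP in the 66.249.x.x range really is Googlebot is a reverse-DNS check: genuine Googlebot IPs reverse-resolve to a *.googlebot.com hostname (which in turn forward-resolves to the same IP). A minimal sketch, with the lookup result hardcoded for illustration; in practice you would get the hostname from `host 66.249.72.166`:

```shell
ip="66.249.72.166"
# Assumed reverse-DNS result for this IP; in practice obtain it with:
#   host 66.249.72.166
ptr="crawl-66-249-72-166.googlebot.com"

# Genuine Google crawlers reverse-resolve under googlebot.com or google.com
case "$ptr" in
  *.googlebot.com|*.google.com)
    echo "$ip looks like a genuine Googlebot" ;;
  *)
    echo "$ip is NOT verified as Googlebot" ;;
esac
```

For a full verification you would also forward-resolve the hostname and confirm it maps back to the original IP, which guards against spoofed reverse-DNS records.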

Look at your raw server logs, and determine what URL-paths it is crawling. If you don't want all of those URLs in the Google index, then use robots.txt to Disallow access to each of them, or to any common subdirectory paths that they share.
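Pulling the crawled URL-paths out of the raw logs can be done with a one-liner. A sketch, assuming a combined-format Apache access log (field 1 is the client IP, field 7 the request path); a small inline sample stands in for your real log file here:

```shell
# Inline sample in place of a real /var/log/apache2/access.log
cat > sample_access.log <<'EOF'
66.249.72.166 - - [21/Jul/2007:10:01:02 +0000] "GET /forums/thread-1.html HTTP/1.1" 200 8192
66.249.72.166 - - [21/Jul/2007:10:01:33 +0000] "GET /forums/thread-2.html HTTP/1.1" 200 9034
10.0.0.5 - - [21/Jul/2007:10:01:40 +0000] "GET /index.html HTTP/1.1" 200 1024
66.249.72.166 - - [21/Jul/2007:10:02:05 +0000] "GET /forums/thread-1.html HTTP/1.1" 200 8192
EOF

# Count hits per URL-path for the Googlebot IP, busiest paths first
awk '$1 == "66.249.72.166" { hits[$7]++ }
     END { for (p in hits) print hits[p], p }' sample_access.log | sort -rn
```

The output shows which paths dominate the crawl, which tells you what a robots.txt Disallow would need to cover.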

For example, if you have 10,000 URLs in your forums at example.com/forums, then

User-agent: Googlebot
Disallow: /forums

will tell Googlebot not to crawl any of them.

It is likely that 700 bytes is the size of your 403 error page, which is what gets served in response to every denied request. So the deny rule stops the large downloads, but not the requests themselves.

Jim

g1smd

6:53 pm on Jul 21, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



That is an insane amount of bandwidth if that is continuous 24/7 activity.
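Before reporting it, it's worth totalling bytes served per client IP straight from the access log to confirm one crawler really dominates. A sketch, assuming common/combined log format (field 1 is the IP, field 10 the response size in bytes); a small inline sample stands in for the real log:

```shell
# Inline sample in place of a real access log
cat > sample_access.log <<'EOF'
66.249.72.166 - - [21/Jul/2007:10:01:02 +0000] "GET /forums/t1.html HTTP/1.1" 200 7340032
66.249.72.166 - - [21/Jul/2007:10:04:10 +0000] "GET /forums/t2.html HTTP/1.1" 200 9437184
10.0.0.5 - - [21/Jul/2007:10:05:00 +0000] "GET /index.html HTTP/1.1" 200 1024
EOF

# Sum response bytes per IP and report in MB, heaviest consumers first
awk '{ bytes[$1] += $10 }
     END { for (ip in bytes) printf "%s %.1f MB\n", ip, bytes[ip]/1048576 }' \
    sample_access.log | sort -k2 -rn
```

If one IP accounts for most of the total, that's solid evidence to attach to a report to Google.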

I would report that as a problem back to Google, or look at setting your crawl rate through Webmaster Tools... or both.