Deepbot okay, freshbot go away

Are they both "googlebot"?

         

Kackle

4:10 pm on Apr 20, 2003 (gmt 0)



Our site doesn't update often, and between deepbot crawls we frequently see freshbot needlessly grabbing old pages at a rate of 8,000 per day.

We'd just as soon disallow freshbot. Do both the deepbot and freshbot respond to the User-agent of "googlebot," or is there a way to distinguish between the two in robots.txt?

Oaf357

5:14 pm on Apr 20, 2003 (gmt 0)

10+ Year Member



Do a search for Freshbot and you'll find its IP addresses. Yes, they both have a user agent of googlebot.

Kackle

6:07 pm on Apr 20, 2003 (gmt 0)



I know their IP addresses. If I could block 64.68.82.* only and let 216.239.46.* through, I'd be happy.

But can you guarantee that blackholing 64.68.82.* at the router won't affect the behavior of 216.239.46.* on our site?

I don't think you can guarantee this. That's why I was hoping that the User-agent was different. Google should use independent User-agents for these in the robots.txt. Our site alone could save them a lot of useless crawling if they did this.
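To measure how much traffic each crawler generates (rather than blocking either one), the two ranges quoted above can be separated in the access logs. This is only a rough sketch: the /24 ranges are taken from this thread alone and may change at any time, and the names `CRAWLER_RANGES`, `classify`, and `count_hits` are made up for illustration.

```python
import ipaddress
from collections import Counter

# Assumption: these labels and ranges come only from the addresses quoted
# in this thread (64.68.82.* and 216.239.46.*). They are not an official list.
CRAWLER_RANGES = {
    "freshbot": ipaddress.ip_network("64.68.82.0/24"),
    "deepbot": ipaddress.ip_network("216.239.46.0/24"),
}


def classify(ip):
    """Return the crawler label for an IP string, or "other" if no range matches."""
    addr = ipaddress.ip_address(ip)
    for name, network in CRAWLER_RANGES.items():
        if addr in network:
            return name
    return "other"


def count_hits(ips):
    """Count requests per crawler label for an iterable of client IP strings."""
    return Counter(classify(ip) for ip in ips)
```

Feeding it the client IP column from a day's log gives a per-crawler request count without touching the router or robots.txt.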

BigDave

6:13 pm on Apr 20, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Make sure that your pages respond properly to If-Modified-Since. That way, when a page hasn't changed, the server sends back a short 304 Not Modified and you won't actually be serving up the pages.

GoogleGuy

2:02 am on Apr 21, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I definitely agree with BigDave on this one. IP addresses are subject to change, so if you block by IP address you could end up out of the main index some day. If your server is set up to handle If-Modified-Since correctly, then you should see a much lighter load on your servers.
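For anyone wondering what handling If-Modified-Since amounts to, here is a minimal sketch of the decision a server makes on each request. It assumes you already know each page's last-modified time as a timezone-aware datetime. The function name `conditional_get_status` is hypothetical and not part of any web server's API.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_get_status(if_modified_since, last_modified):
    """Return 304 if the client's cached copy is still current, else 200.

    if_modified_since: raw header value from the request, or None if absent.
    last_modified: timezone-aware datetime of the page's last change.
    """
    if not if_modified_since:
        return 200  # no conditional header: send the full page
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable date: ignore the header and send the page
    if since.tzinfo is None:
        # Treat a date without a usable zone as UTC
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds first
    if last_modified.replace(microsecond=0) <= since:
        return 304  # unchanged: reply with headers only, no body
    return 200
```

With a 304 the crawler still makes the request, but the response is just headers, which is where the lighter load comes from.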