Allow major spiders in site

Is there a way with .htaccess?

ogletree

7:22 pm on Jul 29, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I plan on doing some aggressive banning of IPs on my site. I want to make sure that the major spiders (Googlebot, Yahoo, and the AdSense bots) can get in without a problem. I want to do it by IP because the user agent can be spoofed. I saw a post at [webmasterworld.com...] that had a long list of Google IPs. Would that cover Google, or do I need more, or is this impossible? Is there a list of all the IPs Google owns?

jdMorgan

7:36 pm on Jul 29, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The list you found was a pretty good one in its day, but these lists must be carefully maintained.

There is no "bulletproof" way of guarding against a legitimate search engine spider accessing your site from a new IP address block in the future. Generally, if you block the major spiders for a short time, they will retry later with no ill effects. But any strict restriction method you use will require that you monitor your 403 errors carefully to ensure that these unintentional blocks of legitimate spiders are short-term, in order to avoid getting dropped or "showing up on the radar," as the case may be.
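As a rough illustration of that kind of 403 monitoring, here is a minimal Python sketch. It assumes your server writes the Apache "combined" log format. The user-agent tokens in SPIDER_TOKENS are examples picked for illustration, not a complete list, and the function name blocked_spider_hits is made up for this example. Because user agents can be spoofed, a hit only tells you which IPs deserve a closer look, not that a genuine spider was blocked.

```python
import re
from collections import Counter

# Assumption: Apache "combined" log format, e.g.
# 192.0.2.10 - - [29/Jul/2004:19:22:00 +0000] "GET / HTTP/1.1" 403 210 "-" "Googlebot/2.1"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Illustrative tokens only; extend this for the spiders you care about.
SPIDER_TOKENS = ("googlebot", "mediapartners-google", "slurp")


def blocked_spider_hits(lines):
    """Count 403 responses per IP where the user agent claims to be a major spider."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        if match.group("status") != "403":
            continue  # only forbidden responses matter here
        agent = match.group("agent").lower()
        if any(token in agent for token in SPIDER_TOKENS):
            hits[match.group("ip")] += 1
    return hits


if __name__ == "__main__":
    # Made-up sample lines using documentation-only IP addresses.
    sample = [
        '192.0.2.10 - - [29/Jul/2004:19:22:00 +0000] "GET / HTTP/1.1" 403 210 "-" "Googlebot/2.1"',
        '198.51.100.7 - - [29/Jul/2004:19:23:00 +0000] "GET /a.html HTTP/1.1" 200 5120 "-" "Mozilla/4.0"',
    ]
    for ip, count in blocked_spider_hits(sample).most_common():
        print(ip, count)
```

Run daily against the access log, a report like this would show quickly whether something claiming to be a major spider is being turned away, so the IP can be checked and unblocked if it is genuine.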

There are services you can sign up with (related to cloaking) to provide you with up-to-date IP addresses for major spiders, but I've never investigated them deeply.

Jim

ogletree

9:59 pm on Jul 29, 2004 (gmt 0)

I found a good list that seems to be kept up to date. It lists several search engines, but not Yahoo or Slurp. It has:

InfoSeek
Alta Vista
Lycos
Inktomi
Excite
Google
Northern Light

Are any of these Yahoo? I can do without Yahoo if I have to; they never put me in anyhow.

Also, are you saying that if I only allow a list of IPs from Google and Google comes in from a new IP, it is no big deal because Google will come back some other way?

jdMorgan

11:01 pm on Jul 29, 2004 (gmt 0)

> Also, are you saying that if I only allow a list of IPs from Google and Google comes in from a new IP, it is no big deal because Google will come back some other way?

No, I'm saying they will come back on the same or a similar new IP address on another day, giving you a window of time to "unblock" their new IP address range. In other words, blocking them temporarily is not necessarily a disaster, but you must "keep up" with the current IP addresses in use, so that the inadvertent block is only in place for a short time (I'd be nervous about more than 24 hours, personally).
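To make "keeping up" a bit more concrete, here is a minimal Python sketch that checks observed IP addresses against an allow list of CIDR ranges. The ranges shown are placeholder documentation blocks, not real spider addresses, and the names ALLOWED_RANGES, is_allowed, and uncovered are made up for this example. You would substitute whatever maintained list you trust.

```python
import ipaddress

# Placeholder ranges taken from documentation-only address blocks.
# These are NOT real search engine ranges; replace them with your maintained list.
ALLOWED_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in ("192.0.2.0/24", "198.51.100.0/25")
]


def is_allowed(ip, ranges=ALLOWED_RANGES):
    """Return True if the address falls inside any allowed range."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in ranges)


def uncovered(ips, ranges=ALLOWED_RANGES):
    """Return the distinct addresses that no allowed range covers, sorted as strings."""
    return sorted({ip for ip in ips if not is_allowed(ip, ranges)})


if __name__ == "__main__":
    # Hypothetical addresses seen in the logs claiming to be a spider.
    seen = ["192.0.2.55", "198.51.100.200", "203.0.113.9"]
    for ip in uncovered(seen):
        print("not in allow list:", ip)
```

Any address this reports is a candidate for a new range to add, after you have confirmed for yourself that it really belongs to the search engine.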

I'm refraining from saying you should or should not use this method, just pointing out what side-issues must be considered.

Jim

ogletree

11:11 pm on Jul 29, 2004 (gmt 0)

I'm going to use a service.