There is no "bulletproof" way of guaranteeing that a legitimate search engine spider won't access your site from a new IP address block in the future. Generally, if you block the major spiders for a short time, they will retry later with no ill effects. But any strict restriction method you use will require that you monitor your 403 errors carefully to ensure that these unintentional blocks of legitimate spiders are short-term, in order to avoid getting dropped or "showing up on the radar", as the case may be.
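As a rough illustration of that kind of 403 monitoring, here is a minimal Python sketch. It assumes your server writes the Apache "combined" log format, and the user-agent substrings in SPIDER_HINTS are only illustrative guesses, not an authoritative list. The function name blocked_spider_hits is made up for this example. Keep in mind that user-agent strings can be forged, so treat the output as a list of addresses to investigate, not as proof that a real spider was blocked.

```python
import re
from collections import defaultdict

# Assumption: Apache "combined" log format; adjust the pattern for other formats.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Assumption: illustrative user-agent substrings only; extend to suit your logs.
SPIDER_HINTS = ("googlebot", "slurp")


def blocked_spider_hits(lines):
    """Return {ip: [timestamps]} for 403 responses whose user-agent looks like a spider."""
    hits = defaultdict(list)
    for line in lines:
        m = LOG_LINE.match(line)
        if not m or m.group("status") != "403":
            continue  # skip unparseable lines and anything that was not blocked
        agent = m.group("agent").lower()
        if any(hint in agent for hint in SPIDER_HINTS):
            hits[m.group("ip")].append(m.group("time"))
    return dict(hits)
```

You would feed it the lines of your access log (for example, an open file object) and then check each reported IP address by hand before deciding whether to unblock its range.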
There are services you can sign up with (related to cloaking) to provide you with up-to-date IP addresses for major spiders, but I've never investigated them deeply.
Jim
InfoSeek
Alta Vista
Lycos
Inktomi
Excite
Google
Northern Light
Are any of these Yahoo? I can do without Yahoo if I have to; they never put me in anyhow.
Also, are you saying that if I only allow a list of IPs from G, and G then uses a new IP, it is no big deal because G will come back some other way?
No, I'm saying they will come back on the same or a similar new IP address on another day, giving you a window of time to "unblock" their new IP address range. In other words, blocking them temporarily is not necessarily a disaster, but you must "keep up" with the current IP addresses in use, so that the inadvertent block is only in place for a short time (I'd be nervous about more than 24 hours, personally).
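To make the "short window" idea concrete, here is a small Python sketch of how one might flag blocked addresses that have gone unreviewed for too long. Everything in it is an assumption for illustration: the ranges in ALLOWED_RANGES are documentation-only placeholder addresses, not real spider IPs, the names is_allowed and overdue_blocks are made up, and the 24-hour limit is just my personal comfort level from above, not a known search engine rule.

```python
import ipaddress
from datetime import datetime, timedelta

# Assumption: placeholder allow list using documentation-only ranges,
# NOT real spider addresses. Replace with the ranges you actually allow.
ALLOWED_RANGES = [ipaddress.ip_network("192.0.2.0/24")]


def is_allowed(ip):
    """True if the address falls inside any range on the allow list."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in ALLOWED_RANGES)


def overdue_blocks(first_blocked, now, limit=timedelta(hours=24)):
    """Return blocked IPs (not on the allow list) whose first 403 is older than `limit`.

    `first_blocked` maps an IP string to the datetime of its first observed 403.
    """
    return sorted(
        ip for ip, seen in first_blocked.items()
        if not is_allowed(ip) and now - seen > limit
    )
```

For example, calling overdue_blocks with a dictionary of first-seen 403 times and datetime.now() gives you the addresses that have been shut out for more than a day, which are the ones to look at first.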
I'm refraining from saying you should or should not use this method, just pointing out what side-issues must be considered.
Jim