Nothing. That's not the point... the point is to at least catch all "good" bots.
However, legit bot tracking is much easier to do now that the big 4 all have round trip DNS implemented.
If the reverse DNS hostname for the IP ends in one of the following:
.googlebot.com
.crawl.yahoo.net
.search.live.com
.ask.com
You know it's a legit spider from the big 4, provided a forward DNS lookup on that hostname returns the same IP address.
No spider spoofing, no proxy hacking, no worrying about the exact UA, nada.
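
For anyone who wants to script this, here's a rough Python sketch of the round trip check. The function names (`reverse_lookup`, `forward_lookup`, `is_verified_spider`) are my own for illustration, the suffix list is just the four above, and it only handles IPv4 through the standard `socket` module. Treat it as a starting point under those assumptions, not a finished filter.

```python
import socket

# Reverse DNS suffixes for the big 4, as listed above.
TRUSTED_SUFFIXES = (
    ".googlebot.com",
    ".crawl.yahoo.net",
    ".search.live.com",
    ".ask.com",
)


def reverse_lookup(ip):
    """Return the PTR hostname for an IP, or None if there is none."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:  # covers socket.herror and socket.gaierror
        return None


def forward_lookup(hostname):
    """Return the list of IPv4 addresses a hostname resolves to."""
    try:
        return socket.gethostbyname_ex(hostname)[2]
    except OSError:
        return []


def is_verified_spider(ip, reverse=reverse_lookup, forward=forward_lookup):
    """Round trip DNS: the PTR name must end in a trusted suffix AND
    the forward lookup of that name must include the original IP."""
    hostname = reverse(ip)
    if not hostname:
        return False
    hostname = hostname.rstrip(".").lower()
    if not hostname.endswith(TRUSTED_SUFFIXES):
        return False
    return ip in forward(hostname)
```

Call `is_verified_spider(ip)` with the remote address of the request. The `reverse` and `forward` arguments are only there so a cached or stubbed resolver can be swapped in, since the check costs two DNS queries per address.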
Others that I know off the top of my head that have implemented round trip DNS include Exabot, Furlbot, Twiceler, VoilaBot, BecomeBot and tailrank.com.
Even Gigabot is implementing this and we finally have confirmation that it's really Gigabot crawling from this range:
Performance Systems International Inc. COGENT-NB-0002 (NET-38-112-0-0-1)
38.112.0.0 - 38.119.255.255
host 38.114.104.36
36.104.114.38.in-addr.arpa domain name pointer ss26.dal0.gigablast.com
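
If you want to sanity-check the numbers in that output offline, a few lines with Python's standard `ipaddress` module will do it. This is only a sketch using the addresses quoted above; the variable names are mine. It confirms the crawler IP sits inside the allocated range and shows how the in-addr.arpa name is built from the IP.

```python
import ipaddress

# Range and crawler IP copied from the whois/host output above.
first = ipaddress.ip_address("38.112.0.0")
last = ipaddress.ip_address("38.119.255.255")
ip = ipaddress.ip_address("38.114.104.36")

# Collapse the start/end pair into CIDR blocks.
networks = list(ipaddress.summarize_address_range(first, last))
print(networks)      # [IPv4Network('38.112.0.0/13')]

# Is the crawler IP inside the allocation?
in_range = any(ip in net for net in networks)
print(in_range)      # True

# The PTR query name is the octets reversed plus .in-addr.arpa
ptr_name = ip.reverse_pointer
print(ptr_name)      # 36.104.114.38.in-addr.arpa
```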
Wonder who else will get with the round trip DNS plan?