frontpage - 5:28 pm on Dec 30, 2010 (gmt 0)
I am thinking that the 'white list' idea is a very good one.
That is only if you actually trust spiders to respect your robots.txt. The Spider Forums here are replete with tales of spiders that ignore robots.txt.
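For anyone who wants to see what a whitelist robots.txt does for a well-behaved spider, here is a rough sketch using Python's standard urllib.robotparser. Googlebot as the one allowed agent, the "SomeScraper" name, and the allowed() helper are all my own assumptions for illustration. It only shows what a compliant spider would conclude and does nothing to stop one that ignores the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical whitelist: one named spider is allowed, everyone else is shut out.
# An empty "Disallow:" means "nothing is disallowed" for that agent.
WHITELIST_ROBOTS_TXT = """\
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
"""

def allowed(user_agent: str, path: str = "/") -> bool:
    """Return True if a spider that honors robots.txt may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(WHITELIST_ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)

if __name__ == "__main__":
    for agent in ("Googlebot", "SomeScraper"):
        print(agent, allowed(agent, "/page.html"))
```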
I just ban them via firewall or ModSecurity.
I have lots of website hosting/colo IP ranges banned; it makes life more pleasant.
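The range-ban idea boils down to checking whether a visitor's address falls inside a banned block. Here is a minimal sketch of that check with Python's standard ipaddress module. The ranges are placeholder documentation blocks from RFC 5737, not real hosting/colo ranges, and the is_banned() helper is hypothetical. The actual blocking would still be done by the firewall or ModSecurity, not by a script like this.

```python
import ipaddress

# Placeholder ranges (RFC 5737 documentation blocks), NOT real hosting/colo ranges.
BANNED_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]

def is_banned(ip: str) -> bool:
    """Return True if `ip` falls inside any banned range."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in BANNED_RANGES)

if __name__ == "__main__":
    for ip in ("192.0.2.55", "203.0.113.9"):
        print(ip, "banned" if is_banned(ip) else "ok")
```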