You can combine multiple methods to get about 99% of the way there:
- disallow via robots.txt
- disallow via the robots meta tag in the page head
- block the IPs of known robots
- block known spiders by their user-agent string
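To illustrate the first bullet, here is a rough sketch of how a well-behaved spider reads robots.txt rules, using Python's standard `urllib.robotparser`. The bot name `BadBot`, the `/private/` path, and the helper `is_allowed` are made up for the example. Keep in mind that robots.txt is only a request: spiders that ignore it need the blocking methods from the later bullets.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out one named bot completely and
# keep every other bot out of /private/.
ROBOTS_TXT = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if a compliant spider with this user agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(is_allowed("BadBot", "http://example.com/index.html"))
    print(is_allowed("GoodBot", "http://example.com/index.html"))
    print(is_allowed("GoodBot", "http://example.com/private/page.html"))
```

The same text pasted into a robots.txt file at the site root is what the spiders would actually request.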
IP blocking is probably the most powerful of these, because you can be sure the blocked spiders can't connect to your site at all. But your list has to be kept up to date for this to work.
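The logic behind IP blocking and user-agent blocking can be sketched roughly like this. This is only an illustration in Python, not the .htaccess rules you would actually deploy. The address ranges are reserved documentation ranges used as placeholders, and the bot names and the function `should_block` are invented for the example.

```python
import ipaddress

# Placeholder blocklists. These are documentation-only address ranges and
# invented bot names, not a real list of known spiders.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.17/32"),
]
BLOCKED_AGENT_SUBSTRINGS = ("badbot", "evilscraper")


def should_block(remote_ip, user_agent):
    """Return True if the request comes from a blocked IP or a blocked user agent."""
    addr = ipaddress.ip_address(remote_ip)
    # The IP check cannot be faked by the spider, unlike the user-agent string.
    if any(addr in net for net in BLOCKED_NETWORKS):
        return True
    agent = (user_agent or "").lower()
    return any(name in agent for name in BLOCKED_AGENT_SUBSTRINGS)


if __name__ == "__main__":
    print(should_block("192.0.2.55", "Mozilla/5.0"))
    print(should_block("203.0.113.9", "BadBot/1.0"))
    print(should_block("203.0.113.9", "Mozilla/5.0"))
```

The user-agent check is the weaker of the two, since a spider can send any string it likes, which is why the IP list matters more.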
There are some comprehensive, mostly complete lists of such IP addresses, and .htaccess files that use them, available in this forum. Please use the site search to locate them.
The Search Engine Spider Identification Forum [webmasterworld.com] is a good place to look on this topic.