Pfui - 1:05 pm on Jul 16, 2011 (gmt 0)
lucy, try whitelisting robots. I use a CGI script to serve the 'complete' robots.txt (which matches my sitemap.xml) only to the majors, and then I use mod_rewrite so they can hit only the pages indicated. Everyone else gets a generic full Disallow (and no access to sitemap.xml) unless and until I give them some leeway -- which I rarely do, because too many are totally untrustworthy.
Oh, also: Bingbot, and the other majors, hit many, many times from many, many different servers. So while it may look like they're not retrieving anything beyond robots.txt, they are, just not from that server at that time.
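
For anyone wanting to try the whitelisting idea, here is a rough sketch of what such a CGI could look like in Python. It is not the actual script described above. The bot tokens, the rules in FULL_ROBOTS, and the example.com sitemap URL are all made-up placeholders, and the mod_rewrite side (routing /robots.txt to the script and limiting which pages whitelisted bots can reach) isn't shown. Matching on the User-Agent string alone is also an assumption, and a UA can be spoofed.

```python
import os
import sys

# Hypothetical whitelist of crawler tokens; the post only says "the majors".
WHITELISTED_BOTS = ("googlebot", "bingbot")

# Illustrative 'complete' rules for whitelisted crawlers (paths are placeholders).
FULL_ROBOTS = (
    "User-agent: *\n"
    "Disallow: /cgi-bin/\n"
    "Sitemap: https://example.com/sitemap.xml\n"
)

# Generic full Disallow for everyone else, with no mention of the sitemap.
DENY_ALL_ROBOTS = "User-agent: *\nDisallow: /\n"


def is_whitelisted(user_agent):
    """Return True if the User-Agent contains a whitelisted bot token."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in WHITELISTED_BOTS)


def robots_txt_for(user_agent):
    """Pick the robots.txt body to serve for this User-Agent."""
    return FULL_ROBOTS if is_whitelisted(user_agent) else DENY_ALL_ROBOTS


def cgi_response(user_agent):
    """Build a minimal CGI response: header line, blank line, then the body."""
    return "Content-Type: text/plain\r\n\r\n" + robots_txt_for(user_agent)


if __name__ == "__main__":
    # CGI passes the request's User-Agent in the HTTP_USER_AGENT variable.
    sys.stdout.write(cgi_response(os.environ.get("HTTP_USER_AGENT")))
```

A missing or unknown User-Agent falls through to the deny-all version, so anything not explicitly whitelisted is shut out by default.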