enigma1 - 10:53 am on Dec 30, 2011 (gmt 0)
pages I did not even know existed
That's strange. Is this your site or not?
robots.txt is only a set of guidelines. Bots follow whatever links they find, whether external or internal to your domain. If you don't want pages to be found, you need to protect them or not expose them at all. And btw, robots.txt itself exposes them, because restricting a path means listing it. You cannot stop someone from reading robots.txt and deliberately publishing its contents on some external page as hard-coded links. Guess what happens next.
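To illustrate how little effort it takes to turn a robots.txt into a list of "hidden" URLs, here is a minimal Python sketch. The function name `disallowed_paths` and the sample file contents are made up for the example; it is a simplified reading of the file, not a full robots.txt parser.

```python
def disallowed_paths(robots_txt: str) -> list[str]:
    """Return every path named in a Disallow line, in order of appearance."""
    paths = []
    for line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "disallow":
            value = value.strip()
            if value:  # an empty Disallow restricts nothing
                paths.append(value)
    return paths


# Hypothetical robots.txt content, used only for this example.
SAMPLE = """\
User-agent: *
Disallow: /admin/
Disallow: /private/reports.php  # not linked anywhere
Disallow:
"""

print(disallowed_paths(SAMPLE))
```

Anyone, bot or human, can do the same against the public file, so a path that appears only in a Disallow line is still a published path.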
If the robots choose to ignore his instructions and access the files anyway, then perhaps they fall foul of that act.
I would say fix your code instead of blaming spiders.
also deny directives and RewriteRules detecting IPs and UAs.
That's basically cloaking: serving different content to different visitors. You don't know what will happen if the IP is reassigned or the UA changes.
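A rough Python sketch of why IP and UA rules are fragile. The blocklist values and the `is_blocked` helper are hypothetical, and the addresses come from the reserved documentation ranges; this only models the matching logic, not any particular server's deny or rewrite syntax.

```python
# Hypothetical blocklists, for illustration only.
BLOCKED_UA_SUBSTRINGS = ("badbot",)
BLOCKED_IPS = {"203.0.113.7"}


def is_blocked(user_agent: str, ip: str) -> bool:
    """Return True if the request matches a blocked IP or UA substring."""
    ua = user_agent.lower()
    return ip in BLOCKED_IPS or any(s in ua for s in BLOCKED_UA_SUBSTRINGS)


# The bot announces itself honestly: blocked.
print(is_blocked("BadBot/1.0", "198.51.100.2"))

# Same bot, same address, browser-like UA string: the rule no longer matches.
print(is_blocked("Mozilla/5.0", "198.51.100.2"))

# A regular visitor who inherits a listed address: blocked by mistake.
print(is_blocked("Mozilla/5.0", "203.0.113.7"))
```

The rule only sees what the client claims to be and where it connects from today, and neither is stable.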