klark0 - 8:32 pm on Nov 22, 2012 (gmt 0)
They actually do. Robots.txt tells them not to crawl a URL; it says nothing about indexing it. As a result, if they find out about the URL from links on other sites, they may index it without ever crawling it.
Or, if a URL was previously allowed in robots.txt and later blocked, it may still remain indexed.
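
To make the crawl vs. index distinction concrete, here is a rough sketch in Python. It uses the standard library's urllib.robotparser to answer the only question robots.txt can answer ("may I fetch this?"), and then a toy index_entry function to show how a blocked URL can still end up as a URL-only entry. The robots.txt contents, the example.com URLs, the "ExampleBot" agent name, and index_entry itself are all made up for illustration; this is not how any real search engine is implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for example.com (an assumption for this sketch).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def index_entry(url, anchor_text, agent="ExampleBot"):
    """Toy model of an indexer: returns what it could store for a URL
    that it learned about from a link on some other site."""
    if parser.can_fetch(agent, url):
        # Allowed: the crawler may fetch the page and index its content.
        return {"url": url, "crawled": True, "source": "page content"}
    # Blocked: the page is never fetched, but the URL is still known
    # from the external link, so a URL-only entry can still be stored.
    return {"url": url, "crawled": False, "source": "anchor text: " + anchor_text}


blocked = index_entry("https://example.com/private/report.html", "quarterly report")
allowed = index_entry("https://example.com/public/page.html", "public page")

print(blocked)
print(allowed)
```

Note that can_fetch only ever returns "allowed to crawl" or "not allowed to crawl". Nothing in the robots.txt check says anything about whether the URL itself may appear in an index, which is the gap described above.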