Robert_Charlton - 8:38 am on Sep 5, 2012 (gmt 0)
Thanks for that! That snippet - yes, it's been discussed at length. I don't know why G should even index the URL-only version even though we have blocked it. G only knows!
shaunm and MikeNoLastName - Let's give it one more shot, as you haven't "blocked it" in the sense of keeping references to the page out of the index.
To repeat what many here have said: by using robots.txt, you've kept the contents of the page from being crawled.
Because the page isn't crawled, Google never sees the meta robots noindex tag in the page contents.
Therefore, Google may still index the URL of the page, and create "snippets" / titles / whatever it can, from references it finds elsewhere on the web.
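As a rough illustration of the point above, here's a minimal Python sketch that uses the standard library's urllib.robotparser to ask whether a crawler is allowed to fetch a page at all. The robots.txt rules, the example.com URLs, and the helper name crawler_can_read_page are all made up for the example; this is not how Googlebot is implemented, just the same protocol logic in miniature.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def crawler_can_read_page(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if robots.txt lets the crawler fetch the page.

    If this returns False, the crawler never downloads the HTML,
    so any meta robots noindex tag inside the page goes unseen.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in ("http://example.com/private/page.html",
                "http://example.com/public/page.html"):
        print(url, crawler_can_read_page(ROBOTS_TXT, url))
```

For the /private/ URL the check comes back False, which is exactly the situation where a noindex tag on the page can't help you: the crawler is never allowed to read it.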
Google did not invent these protocols. It is simply following them.
For another take on this... here's a relevant section from an Official Google Blog article on the topic...
Using the robots meta tag
Official Google Webmaster Central Blog
If you use both a robots.txt file and robots meta tags
If the robots.txt and meta tag instructions for a page conflict, Googlebot follows the most restrictive. More specifically:
• If you block a page with robots.txt, Googlebot will never crawl the page and will never read any meta tags on the page.
• If you allow a page with robots.txt but block it from being indexed using a meta tag, Googlebot will access the page, read the meta tag, and subsequently not index it.
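The two cases in the quote can be sketched as a small decision function. This is only a model of the behavior described above, under my own assumptions: the class name RobotsMetaParser, the function index_outcome, and the outcome strings are hypothetical labels, and real crawlers handle many more cases.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Records whether a page carries <meta name="robots" content="...noindex...">."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        # The content attribute is a comma-separated list of directives.
        directives = [d.strip() for d in (attr.get("content") or "").lower().split(",")]
        if "noindex" in directives:
            self.noindex = True


def index_outcome(blocked_by_robots_txt: bool, html: str) -> str:
    """Model the two cases from the quoted article (labels are illustrative)."""
    if blocked_by_robots_txt:
        # Page is never fetched, so the HTML (and any meta tag) is never read.
        return "not crawled; URL-only listing still possible"
    parser = RobotsMetaParser()
    parser.feed(html)
    if parser.noindex:
        return "crawled; not indexed"
    return "crawled; may be indexed"
```

Note that when the first argument is True, the html argument is ignored entirely. That's the whole point: to get a page dropped via noindex, robots.txt has to allow the crawl.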
PS... I've been exactly where you're at on this. It is initially confusing. It may take some work and some reading to understand it.