pippo - 4:27 pm on Aug 5, 2013 (gmt 0)
"it will index it unless it has been explicitly told not to."
...adding to that: robots.txt does NOT work as a method of saying "do not index". It only controls crawling. The OP seems to be confused about this.
If I link to your robots.txt-excluded pages, Google will probably add your URLs to the index and rank them based on what it can learn about them without crawling them directly (the anchor text of inbound links, for instance). So not even robots.txt plus leaving them out of the sitemap will do what you want - you have to use a NOINDEX directive somewhere.
Amusingly, in the above example, a NOINDEX wouldn't work while the robots.txt exclusion is in place, because Google has been instructed (by you) not to crawl the page, so it can't know what the META ROBOTS directives are. I.e., robots.txt actually works against you here: the page has to be crawlable for the NOINDEX to be seen at all.
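To make the catch concrete, here is a rough sketch using Python's standard urllib.robotparser. The robots.txt rules, the example.com URLs and the helper name noindex_can_be_seen are all made up for illustration. It only models the crawl permission check, not how Google actually decides what to index.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks crawling of everything under /private/
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def noindex_can_be_seen(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the crawler is allowed to fetch `url`.

    A NOINDEX in a META ROBOTS tag lives in the page itself, so a crawler
    can only read it if robots.txt lets it fetch the page first.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.com/private/page.html",
        "https://example.com/public/page.html",
    ):
        if noindex_can_be_seen(ROBOTS_TXT, url):
            print(url, "-> crawlable, a NOINDEX on the page can be read")
        else:
            print(url, "-> blocked, a NOINDEX on the page is never read")
```

So if the goal is to keep a page out of the index, the page has to stay crawlable and carry the NOINDEX itself, rather than being listed under Disallow.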