aristotle - 1:32 pm on Nov 9, 2012 (gmt 0)
However, I'd pay more attention to the "total indexed" figure, especially if it is higher than the total number of pages you have. That number seems the hardest to bring down.
On one of my sites, my submitted sitemap has 38 URLs but Webmaster Tools shows the "total indexed" as 41. Those three extra pages exist on the server, but from the beginning I've blocked them in robots.txt and left them out of the submitted sitemap, because I didn't want them crawled or indexed. Google seems to have indexed them anyway: a site:domain check shows them in the results, but says
"A description for this result is not available because of this site's robots.txt – learn more."
So despite my efforts to keep these pages out of the index, Google indexed them anyway. I think it's because other sites link to them, and Google can index a URL it finds through links even when it isn't allowed to crawl the page.
Edit P.S. I forgot to say that those three extra indexed pages also have noindex meta tags in the head section, but since robots.txt blocks them from being crawled, Google never fetches the pages and so can't see the noindex tags.
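
For anyone wanting to check their own site for this conflict, here is a minimal sketch using Python's standard urllib.robotparser. It only tests whether a crawler is allowed to fetch a URL; if it isn't, any noindex tag on that page goes unread. The robots.txt contents, the example.com URLs, and the helper name noindex_is_visible are all made up for illustration, and this is not a statement of how Google itself evaluates the rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def noindex_is_visible(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if robots.txt lets the crawler fetch the URL.

    A noindex meta tag can only take effect if the page is fetched,
    so a False result means the tag will never be read.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical page URLs.
pages = [
    "https://example.com/private/page1.html",
    "https://example.com/about.html",
]

for url in pages:
    if noindex_is_visible(ROBOTS_TXT, url):
        print(f"{url}: crawlable, a noindex tag would be seen")
    else:
        print(f"{url}: blocked by robots.txt, a noindex tag would go unread")
```

If the goal is to get such pages out of the index, the usual approach is to drop the Disallow rule for them so the crawler can fetch the page and read the noindex tag.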