not2easy - 8:34 pm on Sep 3, 2012 (gmt 0)
True, lucy24: the only way to prevent indexing is to have a robots meta tag in the page's head section; you can't noindex from robots.txt. Still, if the page shows up in your sitemap they may index it anyway, because the stated purpose of a sitemap is to list the pages you want indexed. I found out the hard way a long time ago that a sitemap should contain only pages you do want indexed: in my experience, a noindex meta tag on a page gets ignored when they find that page in the sitemap. I am reminded of it again whenever I retire an old page, put a noindex meta tag on it, and forget to remove it from the sitemap.
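For anyone who wants to catch that mistake before submitting a sitemap, here is a rough sketch in Python. It assumes you already have the sitemap XML and the HTML of each page as strings (fetching them is left out), and the function names `sitemap_urls`, `has_noindex` and `conflicting_urls` are made up for this post, not part of any tool. It only looks at the robots meta tag, nothing else.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Namespace used by the standard sitemap protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(sitemap_xml):
    """Return every <loc> URL listed in a sitemap XML string."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


class RobotsMetaParser(HTMLParser):
    """Sets self.noindex when a <meta name="robots"> tag contains noindex."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        directives = [d.strip().lower() for d in attr_map.get("content", "").split(",")]
        if "noindex" in directives:
            self.noindex = True


def has_noindex(html):
    """True if the HTML string carries a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.noindex


def conflicting_urls(sitemap_xml, pages):
    """List sitemap URLs whose page HTML (pages: url -> html) is marked noindex."""
    return [
        url
        for url in sitemap_urls(sitemap_xml)
        if url in pages and has_noindex(pages[url])
    ]
```

Feed `conflicting_urls` the sitemap and a dict of URL to page source; anything it returns is a page that is both listed in the sitemap and marked noindex, so one of the two needs to change.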
Now, if anyone knows a way to prevent them from using an antique version of a sitemap, that would be helpful. I submit new sitemaps and still see 404s for pages that have not existed for two years and are not in any current sitemap. I appreciate that I can now mark them as "Fixed", but I know they will be back.