My sitemap is generated from the shop's database and links to all articles, e.g. [example.com...] When widget 2004 replaces widget 2003, the link to widget-2003.html is no longer in the sitemap: it becomes an orphan. But when Googlebot requests widget-2003.html, it still sees a page with the title "widget-2003" and the body text "widget-2003 is no longer available".
That could lead to a situation where, over the course of a year, all (say) 100 articles are replaced by new ones, and Google finds 100 orphan pages with old articles. They differ in title and body, but I am afraid Google might treat them as orphan pages with duplicate content. Is that a risk of being banned, or will Google drop orphan pages on its own?
A possible workaround with [example.com...] and a robots.txt entry for 2003 is not a solution, because my product range spans several years and some products are up to 6 years old. If there is a risk of banning, I would have to drop the year from the URLs and work with: [example.com...] Thanks, maggy
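(For reference, the robots.txt entry I mean would look roughly like this, assuming the old articles sit under a /2003/ directory:

User-agent: *
Disallow: /2003/

It blocks crawling of everything under /2003/, which I can't really do for a range that still includes products several years old.)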
Well, what you can do is set up a permanent (301) redirect at the directory level.
For example, redirect www.example.com/2003/ to www.example.com/2004/; that way all the files that were under /2003/ forward immediately to the /2004/ version, and Google will stop looking for the old pages. Keep in mind, though, that Google may still revisit the old URLs if people link to them from their own pages.
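With an Apache server (just an assumption; the exact directive depends on your setup), such a directory-level 301 could be sketched in .htaccess roughly like this:

# send every request under /2003/ to the same path under /2004/ with a 301
Redirect permanent /2003/ http://www.example.com/2004/

So a request for /2003/widget-2003.html would be forwarded to /2004/widget-2003.html, assuming the new directory keeps the same file names.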