Msg#: 4551364 posted 1:52 pm on Mar 5, 2013 (gmt 0)
I'm in the process of blocking thousands of files on a dynamic site with meta robots noindex.
I want these removed from the index as soon as possible.
The site has a few hundred thousand pages indexed, but is not getting crawled too heavily by Google. I have a feeling that it will take Google months/years to hit some of these files and discover the noindex.
The files I'm blocking are all URLs with parameters, so I can't remove them in batches via Google Webmaster Tools, since it is not possible to remove a directory using wildcards.
So: I thought of creating an XML sitemap with a dump of all the URLs I blocked with noindex, hoping it will speed up the removal process.
Is this a legitimate approach? Will Google actually hit the files and notice the noindex, or will these files (or the entire XML sitemap) just get ignored?
If this is not a valid approach, does anyone have any ideas on how to speed up the removal process?
Would it make sense to create an HTML index to these files instead?
Msg#: 4551364 posted 5:06 pm on Mar 7, 2013 (gmt 0)
It's a perfectly reasonable approach. People confuse sitemaps with the idea of presenting Google with a complete 'view' of your site that they will follow. In fact, sitemaps are just a way of adding on data to the usual crawl process. So it doesn't matter whether you have a 'positive' or 'negative' reason to submit.
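If it helps, here is a rough sketch of how such a sitemap could be generated from a URL dump. It is only an illustration, not a tested production script: the function name `build_sitemaps` and the example.com URLs are made up, and it assumes you already have the noindexed URLs as a plain list. The sitemap protocol caps each file at 50,000 URLs, so the sketch splits the list into several files when needed.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# The protocol allows at most 50,000 URLs per sitemap file.
MAX_URLS_PER_FILE = 50000


def build_sitemaps(urls, max_urls=MAX_URLS_PER_FILE):
    """Return a list of sitemap XML strings, each holding at most max_urls URLs."""
    documents = []
    for start in range(0, len(urls), max_urls):
        urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
        for url in urls[start:start + max_urls]:
            entry = ET.SubElement(urlset, "url")
            # ElementTree escapes "&" in parameterized URLs as "&amp;".
            ET.SubElement(entry, "loc").text = url
        documents.append(ET.tostring(urlset, encoding="unicode"))
    return documents


# Hypothetical parameterized URLs standing in for the real dump.
noindexed_urls = [
    "http://www.example.com/page.php?id=1&sort=asc",
    "http://www.example.com/page.php?id=2&sort=asc",
]
sitemaps = build_sitemaps(noindexed_urls)
```

Each string in `sitemaps` would be written to its own file and submitted in GWT like any other sitemap. The pages themselves still need to be crawlable (not blocked in robots.txt), otherwise the noindex tag can't be seen.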
The other method would be to create a page with links, as you suggest. If you do so, I would recommend using the "submit to index" feature in GWT. But to be honest, creating something with enough equity to get speedy crawling of thousands of URLs would not be a good idea IMO.