| Using XML Sitemap to speed removal of blocked pages?
|
spiral

msg:4551366 | 1:52 pm on Mar 5, 2013 (gmt 0) | I'm in the process of blocking thousands of files on a dynamic site with meta robots noindex. I want these removed from the index as soon as possible. The site has a few hundred thousand pages indexed, but is not getting crawled too heavily by Google. I have a feeling that it will take Google months or years to hit some of these files and discover the noindex. The files I'm blocking are all parameter variations on URLs, so I can't remove them in batches via Google Webmaster Tools, since it is not possible to remove a directory using wildcards. So I thought of creating an XML sitemap with a dump of all the URLs I blocked with noindex, hoping it will speed up the removal process. Is this a legitimate approach? Will Google actually hit the files and notice the noindex, or will these files (or the entire XML sitemap) just get ignored? If this is not a valid approach, does anyone have any ideas on how to speed up the removal process? Would it make sense to create an HTML index of these files instead?
|
Andy Langton

msg:4552194 | 5:06 pm on Mar 7, 2013 (gmt 0) | It's a perfectly reasonable approach. People confuse sitemaps with the idea of presenting Google with a complete 'view' of your site that they will follow. In fact, sitemaps are just a way of adding data to the usual crawl process, so it doesn't matter whether you have a 'positive' or 'negative' reason to submit. The other method would be to create a page with links, as you suggest. If you do so, I would recommend using the "submit to index" feature in GWT. But to be honest, creating a page with enough link equity to get thousands of URLs crawled quickly would not be a good idea, IMO.
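For anyone wanting to try the sitemap route, here is a minimal Python sketch of turning a dump of noindexed URLs into sitemap files. The function name `build_sitemaps` and the example.com URLs are hypothetical, made up for illustration. The 50,000-URL cap per file comes from the sitemaps.org protocol. Nothing here guarantees faster removal; it only produces files you could submit.

```python
import xml.etree.ElementTree as ET

# Namespace and per-file URL cap defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50000


def build_sitemaps(urls, max_urls=MAX_URLS_PER_FILE):
    """Return a list of sitemap XML strings, one per chunk of URLs.

    `urls` is assumed to be a list of absolute URLs that already serve
    a meta robots noindex tag (hypothetical input for this sketch).
    """
    documents = []
    for start in range(0, len(urls), max_urls):
        urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
        for url in urls[start:start + max_urls]:
            entry = ET.SubElement(urlset, "url")
            # ElementTree escapes '&' in query strings as '&amp;' on output.
            ET.SubElement(entry, "loc").text = url
        body = ET.tostring(urlset, encoding="unicode")
        documents.append('<?xml version="1.0" encoding="UTF-8"?>\n' + body)
    return documents


# Hypothetical parameterised URLs standing in for the real dump.
example_urls = [
    "http://www.example.com/list.php?sort=price&page=2",
    "http://www.example.com/list.php?sort=name&page=7",
]
for document in build_sitemaps(example_urls):
    print(document)
```

Each returned string can be written to its own file and submitted like any other sitemap. With a few hundred thousand URLs, the chunking matters, since a single file over the cap would not be valid.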
|
Robert Charlton

msg:4552257 | 6:58 pm on Mar 7, 2013 (gmt 0) | "The files I'm blocking are all parameters on URLs, so I can't remove them in batches via Google Webmaster Tools, since it is not possible to remove a directory using wildcards."
This current thread discusses dealing with irrelevant parameters, and you might want to take a look at it: Canonical Question - About multiple querystring with similar content http://www.webmasterworld.com/google/4551376.htm