
Using XML Sitemap to speed removal of blocked pages?

     
1:52 pm on Mar 5, 2013 (gmt 0)

Junior Member

10+ Year Member

joined:Nov 21, 2004
posts:61
votes: 0


I'm in the process of blocking thousands of files on a dynamic site with meta robots noindex.
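
For reference, the tag I'm adding to the <head> of each blocked page is the standard one:

    <meta name="robots" content="noindex">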

I want these removed from the index as soon as possible.

The site has a few hundred thousand pages indexed, but is not getting crawled too heavily by Google. I have a feeling it will take Google months or years to hit some of these files and discover the noindex.

The files I'm blocking are all parameterised URLs, so I can't remove them in batches via Google Webmaster Tools, since the URL removal tool doesn't support wildcards, only exact URLs or whole directories.

So:
I thought of creating an XML sitemap with a dump of all the URLs I blocked with noindex, hoping it will speed up the removal process.
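
Something like this, with placeholder example.com URLs standing in for my real parameterised ones (a single sitemap file can hold up to 50,000 URLs, so thousands fit comfortably in one file):

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url><loc>http://www.example.com/products.php?sort=price&amp;page=2</loc></url>
      <url><loc>http://www.example.com/products.php?color=red</loc></url>
      <!-- ...one <url> entry per noindexed URL... -->
    </urlset>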

Is this a legitimate approach? Will Google actually hit the files and notice the noindex, or will these files (or the entire XML sitemap) just get ignored?

If this is not a valid approach, does anyone have any ideas on how to speed up the removal process?

Would it make sense to create an HTML index to these files instead?
5:06 pm on Mar 7, 2013 (gmt 0)

Senior Member from GB

andy_langton

joined:Jan 27, 2003
posts:2987
votes: 37


It's a perfectly reasonable approach. People confuse sitemaps with the idea of presenting Google with a complete 'view' of the site that it will then follow. In fact, sitemaps are just a way of feeding extra URLs into the usual crawl process, so it doesn't matter whether you have a 'positive' or 'negative' reason to submit.
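
You submit it like any other sitemap, either through GWT or with a Sitemap: line in robots.txt (the filename here is just an example):

    Sitemap: http://www.example.com/noindexed-urls.xml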

The other method would be to create a page with links, as you suggest. If you do so, I would recommend using the "submit to index" feature in GWT. But to be honest, creating something with enough equity to get speedy crawling of thousands of URLs would not be a good idea, IMO.
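
For what it's worth, such a page needs nothing fancier than a flat list of plain links (the URLs here are again placeholders):

    <ul>
      <li><a href="http://www.example.com/products.php?sort=price">sort=price</a></li>
      <li><a href="http://www.example.com/products.php?color=red">color=red</a></li>
      <!-- ...and so on for each URL you want recrawled... -->
    </ul>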
6:58 pm on Mar 7, 2013 (gmt 0)

Moderator This Forum from US

robert_charlton

joined:Nov 11, 2000
posts:11315
votes: 166


The files I'm blocking are all parameterised URLs, so I can't remove them in batches via Google Webmaster Tools, since the URL removal tool doesn't support wildcards, only exact URLs or whole directories.

There's a current thread discussing how to deal with irrelevant parameters, and you might want to take a look at it....

Canonical Question - About multiple querystring with similar content
http://www.webmasterworld.com/google/4551376.htm [webmasterworld.com]
 
