Forum Moderators: Robert Charlton & goodroi
So, for a big site with hundreds of thousands of pages, is there any way to do this? Any third-party tool or service?
Where is the option in the Google Webmaster Tools account to remove pages from the index and cache?
[edited by: crobb305 at 3:50 pm (utc) on Mar 21, 2011]
Do NOT submit pages that you intend to keep on your site because they will be removed from the index for at least 90 days.
You can reinclude your content at any time during the 90-day period by following these steps:
1. On the Webmaster Tools Home page, click the site you want.
2. Under Site configuration, click Crawler access.
3. Click the Remove URL tab.
4. Select the Removed content tab, and then click Reinclude next to the content you want to reinclude in the Google index.
Pending requests are usually processed within 3-5 business days.
[google.com...]
Google improved the URL Removal process a while ago to allow speedy reinclusion, because after all, stuff happens.
For some reason, the one piece of advice that Google keeps giving is: delete or redo the 'bad/thin' pages, as they will hurt your entire site, then wait for Google to reindex the site and later recalculate.
This update is designed to reduce rankings for low-quality sites—sites which are low-value add for users, copy content from other websites or sites that are just not very useful. At the same time, it will provide better rankings for high-quality sites—sites with original content and information such as research, in-depth reports, thoughtful analysis and so on.
One thing that is very important to our users (and algorithms) is high-quality, unique and compelling content. Looking through that site, I have a hard time finding content that is only available on the site itself. If you do have such high-quality, unique and compelling content, I'd recommend separating it from the auto-generated rest of the site, and making sure that the auto-generated part is blocked from crawling and indexing, so that search engines can focus on what makes your site unique and valuable to users world-wide.
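For anyone wanting to sanity-check that the auto-generated part of a site really is blocked from crawling, here is a minimal sketch using Python's standard `urllib.robotparser`. The `/auto/` directory, the example.com URLs and the `is_crawlable` helper are all assumptions made up for illustration, not anything from Google or from the site discussed above. Note that it only tests robots.txt rules; it says nothing about whether already-indexed pages will drop out of the index.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: "/auto/" stands in for wherever the
# auto-generated pages live on your own site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /auto/
"""


def is_crawlable(url, robots_txt=ROBOTS_TXT, agent="Googlebot"):
    """Return True if the given robots.txt rules allow `agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Auto-generated section: should be blocked by the Disallow rule.
print(is_crawlable("http://www.example.com/auto/page-123.html"))
# Unique content outside /auto/: should remain crawlable.
print(is_crawlable("http://www.example.com/articles/original-research.html"))
```

On a big site you could run the same check over a sample of URLs from each section before and after changing robots.txt, to confirm that only the pages you intended are blocked.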
Google maintains that its criteria for evaluating sites are based strictly on what best serves users.
As many have observed, since Panda they have stopped using the word 'content' in their statements and replaced it with 'quality', which is not exactly the same thing.
Yet they explicitly define low quality as "copied content".
This Panda update is sickening. Copyright violators are still ranking, stolen content on Blogspot is still ranking and eHow is still ranking for partly re-written content.