| 2:21 am on Nov 20, 2006 (gmt 0)|
Google has three suggestions for removing unwanted URLs. [google.com]
You can use Google's URL removal tool - I've used it to remove URLs that I didn't want indexed (thanks to the Google toolbar). The instructions are pretty clear, but some people have removed their entire sites by accident. It's probably the fastest method.
That process (URL removal) does eventually expire, though, so you really need to make sure those pages are gone, or block them with robots.txt (which I did).
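For example, blocking a hypothetical /removed/ folder (the path is just an illustration) takes only two lines in robots.txt:

    # keep crawlers out of the retired pages
    User-agent: *
    Disallow: /removed/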
| 3:15 am on Nov 20, 2006 (gmt 0)|
As I explained before, URL removal is not my preferred option, since I might decide to add more info to some of the pages and make them active again. Right now I have disabled/deleted all the dubious pages and will revisit them as time goes on.
| 3:45 am on Nov 20, 2006 (gmt 0)|
If you don't want the waiting period associated with the removal tool, then the fastest way would be to return a 410 Gone.
IMO Googlebot will keep revisiting a 404 because that condition could be temporary, but a 410 is not.
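On Apache you could do that with mod_alias in .htaccess - a minimal sketch, with hypothetical filenames:

    # return "410 Gone" for pages that are permanently removed
    Redirect gone /old-page.html
    Redirect gone /widgets/retired-page.html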
| 4:17 am on Nov 20, 2006 (gmt 0)|
Theoretically, the 410 "should" do it. But Vanessa Fox confirmed earlier this year that Google treats 404 and 410 identically. And given the amount of slipshod server configuration that fills the web from all angles, I can understand that.
| 4:48 am on Nov 20, 2006 (gmt 0)|
I think this is a supplemental index problem. Try publishing some original content on those pages, or forget about them. From your post, I think you clearly understood why you hit a penalty. When producing database-driven content, it's important that pages are unique. Don't worry, you are in good company - I made the same mistakes I suspect you have.
| 5:30 am on Nov 20, 2006 (gmt 0)|
I only use a database for ease of maintenance and to help with the different templates and updates:
EACH data field differs for EACH page, and EVERYTHING is painstakingly added manually. This is not a datafeed or a computer-generated scrape.
However, I feel that I may not have enough data on some pages, so I am removing them - for now. Once this is cleared up I will review them one by one and add more info. I think links did me in, combined with borderline thin content on enough pages. It all started when I linked from another site of mine. I linked only from the homepage, but a PostNuke rewrite screwup made it seem like 8000+ pages (relative links, and all the bogus folders and files showed the front page :)). Who knows, but I have to try. Maybe the links earned me and others a manual site review.
Tedster, I remember reading the 410 comment too, I think from Matt. I was shocked, but I am sure they have their reasons. How fast would a noindex work, and would Google still count the pages (even if they don't display them)?
| 3:44 pm on Nov 20, 2006 (gmt 0)|
The noindex meta tag takes effect within days to weeks for URLs in the normal index - typically just days after the URL is next crawled.
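For reference, it's one line in the <head> of each page you want dropped:

    <meta name="robots" content="noindex">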
| 6:02 pm on Nov 20, 2006 (gmt 0)|
I managed to remove thousands of duplicated pages on my sites simply by adding a Disallow: statement to robots.txt. It took 2-3 weeks in my case; all of them were removed from Google's index, and the site regained all its rankings fine.
| 6:07 pm on Nov 20, 2006 (gmt 0)|
I second g1smd's suggestion to use a dummy page with a noindex tag (see the sketch after this list). Not only does this cause your page to disappear from the index almost immediately after Googlebot has fetched it, it also has some nice side effects which might be handy:
- The URL won't go supplemental, which can happen when you just delete it and return a 404/410.
- Google keeps crawling the URL as long as a link is pointing to it. You mentioned that you might reuse the URL in the near future; removing the noindex tag will cause the URL to return to the visible index after the next fetch by Googlebot.
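A bare-bones placeholder could look like this - just a sketch, the wording and markup are up to you:

    <html>
    <head>
    <title>Page removed</title>
    <meta name="robots" content="noindex">
    </head>
    <body>
    <p>This page has been taken down for now. Please start again from the <a href="/">home page</a>.</p>
    </body>
    </html>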
| 11:26 pm on Nov 20, 2006 (gmt 0)|
Cool. Now I need to spend some time recreating the files for the noindex. Google picks up my pages fairly quickly.
When a page goes supplemental (because of a 404), does that still count as far as Google is concerned?
| 11:31 pm on Nov 20, 2006 (gmt 0)|
Count for what?
If the page does not exist, then Google hangs on to a copy of it for a year as a Supplemental Result. They do this simply so that someone who looked at that page a few weeks or months ago can still find it in the SERPs. They can then view the old cache even though the page no longer exists within your site.
Your custom 404 error page should be there to tell the visitor that the page has gone and to present them with a set of links to the major content sections of your site.
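On Apache, pointing the server at your custom page (hypothetical filename) is one line in .htaccess:

    # serve a friendly error page while still returning a 404 status
    ErrorDocument 404 /notfound.html

Use a local path rather than a full URL there - with a full URL Apache redirects instead, and the 404 status is lost.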
| 12:01 am on Nov 21, 2006 (gmt 0)|
Assuming that page X caused you a problem rank-wise: you delete it, yet Google moves it to the supplementals after a week. How does Google treat the page then? Will it still hurt your rankings?