
Google SEO News and Discussion Forum

    
How long does Google take to remove 404 pages from index?
suratmedia




msg:4146082
 4:48 am on Jun 3, 2010 (gmt 0)

Webmaster Friends,

I have been observing this for the last 6 months.

One of our sites generated 8000+ canonical copies due to mod_rewrite bugs, so on 3rd Dec, 2009 our site got filtered.

From 3rd Dec, 2009 to 6th March, 2010, I saw Googlebot collecting all the 404 pages and marking them as deleted by removing the "cached" link from the SERP result.

From 6th March onwards, Google started its automatic 404 cleanup process, and we have noted a significant reduction in the number of those 404 pages in the Google index.

Today, after 6 months, I can still find 400+ of those orphan (non-cached) copies in the SERPs, and our rankings are affected because of them.

So it has now been more than 180 days since Googlebot started getting a 404 status for those 8000 copies.

At most, how much time does Google need to clean up all the garbage 404 canonical copies?

Any idea?

 

tedster




msg:4146310
 2:50 pm on Jun 3, 2010 (gmt 0)

It can take months and months - but once Google spiders the URL and verifies the 404, it is not hurting your rankings.

suratmedia




msg:4146707
 2:47 am on Jun 4, 2010 (gmt 0)

Actually, all the "example.com/categories/" pages got 400+ canonical copies at the garbage canonical URL "example.com/categories/page/[\d+]/item-url-slug.html"

So "example.com/categories/page/[\d+]/item-url-slug.html" got higher priority than "example.com/categories/"

Then we deleted (404) all those canonical URLs on 6th Dec 2009. So traffic came down to 5-8%.

I thought we would be back in the SERPs within 6 months, but still no luck...

It's the longest penalty I have ever seen for internal canonical issues.

TheMadScientist




msg:4146708
 2:59 am on Jun 4, 2010 (gmt 0)

Hmmmmm...

I think one plan would be to 301 instead of 404.
Another plan would be to use a canonical link rel instead of a 404.
A third plan would be to 410 instead of 404.

Personally, I would use a 301 or a canonical link relationship on the pages over removing the content. IOW: I would try to combine rather than remove, and if I had to remove I would use a 410 (permanently, purposely removed) rather than a 404 (not found, could be either temporary or permanent).
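To make the three options concrete, here is a minimal Python sketch, not anyone's actual server setup. It assumes the duplicate URLs follow the pattern quoted earlier in the thread and that each one should collapse to its parent "/categories/" style URL; the names DUPLICATE_RE, canonical_target, respond and canonical_link_tag are made up for illustration.

```python
import re

# Assumed shape of the duplicate URLs from the thread:
#   /categories/page/<number>/item-url-slug.html
DUPLICATE_RE = re.compile(r"^/(?P<section>[^/]+)/page/\d+/[^/]+\.html$")


def canonical_target(path):
    """Return the assumed canonical path for a duplicate URL, or None."""
    match = DUPLICATE_RE.match(path)
    return f"/{match.group('section')}/" if match else None


def respond(path, combine=True):
    """Pick a status code and headers for a requested path.

    combine=True  -> 301 redirect to the canonical URL (combine the pages)
    combine=False -> 410 Gone (permanently, purposely removed)
    Paths that are not duplicates get a plain 200.
    """
    target = canonical_target(path)
    if target is None:
        return 200, {}
    if combine:
        return 301, {"Location": target}
    return 410, {}


def canonical_link_tag(host, path):
    """Build the link element for the 'canonical link rel' option."""
    target = canonical_target(path) or path
    return f'<link rel="canonical" href="http://{host}{target}">'
```

For example, respond("/categories/page/12/item-url-slug.html") picks a 301 pointing at "/categories/", and the same call with combine=False picks a 410. Whether the parent URL is really the right target for every duplicate is an assumption you would have to check against your own URL structure.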

suratmedia




msg:4165107
 7:26 am on Jul 6, 2010 (gmt 0)

A little update on this.

The affected domain is a subdomain of my site.

- It has now been over 7 months.
- Remaining 404 copies left in the SERPs: approx. 130-145.
- Googlebot completely stopped crawling those 404 pages after 15th June, 2010 (exactly after 6 months).

Whenever I query "site:example.com", all my subdomains except this penalized one are listed, while this subdomain only appears beyond result 900+. But whenever I try the following queries on the root domain, the affected subdomain shows up in the top 10.

site:example.com *** -asdf
site:example.com **** -asssdsd


Whenever I query "example.com", all my subdomains except this penalized one are listed, while this subdomain only appears beyond result 900+. But whenever I try the following query on the root domain, the affected subdomain shows up in the top 10.

example.com+-asdf+-asdf+-asdf+-asdf+-asdf+-asdf+-asdf+-asdf+-asdf+-asdf+-asdf+-asdf+-asdf+-asdf+-asdf+-asdf


That means that, due to the massive number of 404s, Google has moved my entire subdomain into the supplemental results.




Now I need an answer from the experts.
=========================

So, now that Googlebot is no longer crawling those 404 pages, will the subdomain come out of the supplemental results on some kind of time-out basis (i.e. after 8 or 9 months), or will Google re-rank the entire site like a new site?
