|Cleaning Out Old, Superseded Content|
Removing Trash from Google
My site has been penalized for 54 weeks. A programming firm I hired 15 months ago triggered duplicate content, programmed thin content, used incorrect error handling, and caused a myriad of other problems.
On December 18, we completely removed all the programmers' work and replaced it with 81,107 webpages of unique, excellent content with unique titles and descriptions. We used 404s and 410s (where applicable) to clean up the old, bad content.
On December 14--on the one-year anniversary of the penalty--I filed a reinclusion request detailing the numerous things we had cleaned up and advising Google of the new, unique content. We also put up a new Google sitemap. Google has crawled aggressively ever since.
A site:mysite.com search of DC [22.214.171.124...] shows 75,900 webpages. Sadly, all but 67 are supplemental. None of the new content is in the index although Google has clearly crawled most of it.
--Is there any way to expedite Google's cleaning out this old content?
--Is there any way to expedite Google's showing the new content in the index instead of the old superseded content?
--Will my site remain dead until Google cleans out this old superseded content?
--If so, for how long?
--Should I use the URL removal tool to expedite cleaning out the old, bad URLs?
--What else should I do to get back on track?
All recommendations will be greatly appreciated.
Still looking for answers. Can anybody help out here?
It's a great question, and one I am afraid there is little answer to in your case. The only suggestion I could make now is to 301 any supplemental page (old content) to the equivalent clean content. Had I been involved from the start, there is no way I would have 404'd or 410'd your old content; whoever told you to do that should be hung, drawn and quartered.
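To make the 301 suggestion concrete, here is a minimal sketch of the decision a server could make for each retired URL. It is only an illustration, not the poster's setup: the paths, the `REDIRECTS` and `RETIRED` tables, and the `resolve` function are all hypothetical names. A retired page with a clean equivalent gets a 301 to it, a retired page with no equivalent gets a 410, and anything unknown gets a 404.

```python
# Hypothetical mapping from retired URLs to their closest new equivalents.
REDIRECTS = {
    "/old/hotel.php?id=123": "/hotels/example-hotel",
}

# Hypothetical retired URLs that have no equivalent new page.
RETIRED = {
    "/old/thin-page.php?id=999",
}


def resolve(path):
    """Return (status_code, location) for a requested old path."""
    target = REDIRECTS.get(path)
    if target is not None:
        # Permanent redirect to the equivalent clean content.
        return 301, target
    if path in RETIRED:
        # Deliberately removed, nothing to redirect to.
        return 410, None
    # Never existed as far as this table knows.
    return 404, None
```

For example, `resolve("/old/hotel.php?id=123")` returns `(301, "/hotels/example-hotel")` with the sample table above. In practice the same table would normally be expressed in the web server's own redirect configuration rather than in application code.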
Watching Google is like looking out the window...nothing you can do about what you see...standing on your head won't make it safe to play golf...forget about Google and just accept the fact that the next few months might require some extra deep breaths...stay calm and get zen...there are always changes coming...nothing lasts...even a Google dump! Happy 2007!
It's hard to make recommendations without a close examination of the specific site.
- I would have deleted all the 'bad pages' and then waited a bit
- I would *NOT* have added all 81000 pages in one hit
- the reinclusion request was the right approach
- how long has it been.... about a fortnight? I wouldn't expect results until about now. They can be slow to actually pop into the SERPs.
You have killed the old URLs and used new ones, haven't you? I would not recommend reuse of those old URLs.
jwc2349 - First of all I'd study the long thread on cleaning up duplicate content and methods recommended.
It sounds like you're having to deal with a lot of neglect, and that may mean you have to study your situation more deeply - sorry to hear it, and I hope you can restore your situation quickly. Some of the best info has already been supplied in g1smd's and tedster's input, with lots of members contributing further:
Step 2 is speedy recovery and there is difficulty getting a clear idea of what's involved as only a few members with varying circumstances have shared their experiences thus far.
Adding 81000 pages in one hit is not good.
I can't imagine how you can "just add" 81000 pages of quality, unique content.
Thanks for your reply and posting the threads. I am familiar with both and, like you, have a very high respect for tedster and g1smd. Hopefully, they will chime in also. Once Google digests the new content and flushes out the old crap, hopefully a speedy recovery will ensue. After 55 weeks, I would sure need one.
I am in the travel sector and we used an XML interface to add the new, unique content for 80,000 hotels. Therefore, it was relatively easy once my programmer designed the landing page and tied together the programming. He then just pulled the XML in. We should have done that 6 months ago rather than trying to fix the crap the programming firm put up. Mea culpa!
|XML interface to add the new, unique content for 80,000 hotels |
And neither hotels themselves nor other affiliates publish that content?
|On December 18, we completely removed all the programmers' work and replaced it with 81,107 webpages of unique |
Too big of a change to do all in one fell swoop...
|Sadly, all but 67 are supplemental. None of the new content is in the index although Google has clearly crawled most of it. |
Google does not like "big changes fast". However, it may take some time for Google to digest this.
In the future I recommend a lower-key approach: try rolling out your changes incrementally over a period of several months.
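As a rough sketch of what an incremental rollout could look like, the snippet below splits a list of new URLs into weekly batches and works out how many weeks the rollout would take. The function names and the batch size are assumptions for illustration only; nothing here reflects a limit Google has published.

```python
import math


def weekly_batches(urls, per_week):
    """Split a list of URLs into consecutive batches of at most per_week."""
    return [urls[i:i + per_week] for i in range(0, len(urls), per_week)]


def weeks_needed(total_pages, per_week):
    """Number of weekly batches needed to publish total_pages."""
    return math.ceil(total_pages / per_week)
```

With the 81,107 pages mentioned in this thread and a hypothetical 5,000 pages per week, `weeks_needed(81107, 5000)` gives 17 weeks, i.e. roughly four months.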
|Google does not like "big changes fast". However, it may take some time for Google to digest this. |
According to Matt Cutts' video, such sites get "flagged". Do you know how long the release period is set to [if at all]?
How do you relaunch big sites?
5000 pages a week per Matt's directive will take 2 years for the site described over here.
I'd like to believe there's a way to communicate with Google on this via the reinclusion request.
[edited by: Whitey at 10:57 pm (utc) on Jan. 4, 2007]