Forum Moderators: Robert Charlton & goodroi


Server failure causes duplicate content penalty



warth0g

9:02 pm on Jan 29, 2007 (gmt 0)

10+ Year Member



I have a site that has been running and building traffic for about 3 years. By December of 2006 it had reached 3 million page views and over 250,000 unique visitors a month.

On Dec 5th Googlebot crawled my site and crashed my server. The DNS agent redirected Google to the failover server. Unfortunately, the site was in the middle of a massive navigation revamp, and the failover server had a mix of old and new source files, including the .htaccess.

This broken configuration created duplicate content across multiple subdomains. The next day the site was dropped from all the top listings for several very important search terms. Traffic dropped from 25,000 page views to 1,000 overnight.

I realized what had happened and disabled the failover server immediately. I also made sure there were no glitches in the production code that would cause duplicate content. Once it was all checked, I submitted a reinclusion request to Google.
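For anyone checking their own setup: the usual .htaccess safeguard against this kind of subdomain duplication is a canonical-host 301 redirect. A minimal sketch, with example.com standing in for the actual domain (which isn't given here), assuming Apache with mod_rewrite enabled:

```apache
# Illustrative only: force every stray subdomain/host variant
# to the canonical www host with a permanent (301) redirect
RewriteEngine On
RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```

With a rule like this in place on both production and failover, a DNS switch can't expose the same pages under multiple hostnames.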

The next day all the cached pages I had in Google reverted to September 2006; then, about 3 weeks later, the cache updated to current and all the old URLs were removed.

It has been 6 weeks since I submitted the request, and except for the re-indexing, nothing has changed. The home page now only comes up for search terms containing the words in the domain name, traffic has stayed horribly low, and my Alexa rank dropped from 15,000 to 99,000.

The site had grown steadily from 300 visitors a day in January 2005 to 10,000 a day in Dec 2006.
Traffic has inched upward from an average of 1,000 per day in December to 1,800 a day this month (January), but it isn't even close to the old levels.

Should I submit another reinclusion request? I read somewhere that a duplicate content penalty lasts a minimum of 6 months; I read somewhere else that they are more lenient about technical problems.

Any advice would be appreciated; this is costing me an incredible amount of money and stress.

theBear

1:47 am on Jan 30, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Ouch,

One question: are all of the duplicated pages completely out of the index, with only the actual, true pages remaining?

Verify this by searching on a several-word text fragment from five or so of your formerly high-ranking pages.

ashear

2:24 am on Jan 30, 2007 (gmt 0)

10+ Year Member



How often are you getting crawled?

Has your crawl rate changed?

Add your failover to Google Sitemaps, see how many pages are still indexed, and see how quickly they are dropping out. I have seen cases in which this can take several months if you're dealing with a large volume of URLs.

Also check the error messages in Sitemaps; you can see a lot of data there.

Beyond that, grep through your log files and look for errors that Google may be seeing.

You can do a one-liner like this:

grep Googlebot logfile.txt | grep -E ' (404|301|302) '

and keep looking for something way off.
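To go a step further, you can tally every status code Googlebot is getting rather than hunting for specific ones. A sketch, assuming a standard Apache combined-format access log named access.log (the filename is just an example; field 9 is the status code in that format):

```shell
# Count HTTP status codes served to Googlebot, most frequent first
# (assumes Apache combined log format, where field $9 is the status)
grep Googlebot access.log | awk '{print $9}' | sort | uniq -c | sort -rn
```

A sudden spike of 404s or 302s in that summary is usually the first visible symptom of a misconfigured failover.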

Without diagnosing this by hand, it's difficult.

warth0g

8:33 pm on Jan 30, 2007 (gmt 0)

10+ Year Member



Googlebot still crawls every day. All the old URLs have been removed for weeks; only current, accurate URLs are indexed.