falsepositive - 3:53 pm on Feb 26, 2011 (gmt 0)
To me it smacks of something like this: both individual page factors and a sitewide penalty are involved. When enough individual page factors are triggered, that page's ranking drops, which is evident when, say, one page is heavily duped. Enough of these pages and your entire site is penalized and brought down. Some tests I did:
1. For my new pages, I cut off the RSS feed. Those pages show up fine and rank with no problem, albeit LOWER than I'd usually expect for their keywords.
2. Old pages, heavily duped without my consent (copyright infringement). Scrapers are way, way ahead; my article is much lower, sometimes nowhere to be seen. The difference between me and the scrapers? I have links out and in. Scrapers copy my work verbatim and toss out all the links, or keep only minimal links (some with a link back to my site). For my lost income, I am of a mind to SUE these people for the money I've lost. I don't believe I have ever seen a scraper outrank me in the past. Well, now I have.
3. I am knocked down a few ranks on a keyword from my domain name, an unimportant one, but a main word nevertheless. For example, if my site name were "The Content Farm Victim" (snark), my site used to rank #2 for "Content Farm". Now it's #6. So there's a sitewide issue here. My guess is that the multitude of pages affected in #2 triggered enough of a sitewide penalty to cause #3. If we fix #2, then #3 may be fixed as well.
4. Some of my best pages are just not visible anywhere. No scrapers are either. This is weird. They are indexed but just not ranked, even when I search for the full title of the page. Many of these pages are my pride and joy, and I have no idea what's up with them. Again, perhaps they are hit because of the sitewide penalty. Or maybe, because they are older pages, they have stale links or no links.
5. My huge authority site was hit. Small sites of friends in the same niche were NOT touched. My backlink profile is excellent. I have links from outstanding places, and perhaps from spammers as well. I am on DMOZ. Small sites from others? Not so much. They have a minimal link profile.
6. It doesn't look like overall link profile is a big factor, but on-page links possibly are. I am totally white hat, always afraid I would be slapped by Google for the mere insinuation of a paid link. I've never gone there, so this is a bigger stab in the chest than it should be. It's truly a "punishing of the good".
All in all, it's a lot of work if we are to fix this. Starting with: go after your scrapers? I would think Google should already know what to do. They were supposed to release this algo along with the "duplicate screening algo". Maybe these algos work hand in hand. I know that the first phase of this thing involved clearing out the scrapers. I saw them disappear within a week or two. But I was not affected then. This time, I am seriously affected, and scrapers abound. So does this mean that the scrapers will disappear in a week or two?
Re figuring out the "quality" of a page via visitors: scrapers have my content word for word, as well as the work of others. Wouldn't they have great user experience too? What if they also have great site design, making them percolate to the top? There have to be other things involved, like link profile, age(!), authority, and the links back to my site that I placed in the content and that some of these thieves still retain.
Could this algorithm be one to sniff out the "original content source"? If an even bigger site copied you word for word, how would this algo recognize that?
There are certainly patterns you can see here, but they are not yet totally clear.
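For anyone who does want to go after their scrapers, here is a rough sketch of how you could measure how much of your page a suspect page has copied verbatim. It compares overlapping runs of words ("shingles"). To be clear, this is only a do-it-yourself check, not a claim about how Google's algo works, and it cannot tell you who the original source is. The function names and the 5-word run length are my own assumptions.

```python
import re

def shingles(text, k=5):
    """Return the set of k-word runs (shingles) found in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Text shorter than k words: treat the whole thing as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def overlap(original, suspect, k=5):
    """Fraction of the original's shingles that also appear in the suspect page."""
    a, b = shingles(original, k), shingles(suspect, k)
    if not a:
        return 0.0
    return len(a & b) / len(a)
```

You would paste in the plain text of your article and of the suspect page. A score near 1.0 means almost all of your text shows up word for word on the other page; a score near 0.0 means essentially none of it does.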