Duplicate content comes in two very different flavors: cross-domain and same-domain. Then those two flavors each have sub-flavors.
Cross-domain, we've got syndication (including quotation) and scraping.
Same-domain, we've got intentional duplication and technical accidents, such as canonical issues.
So we need to think very clearly in this area and not just talk about "duplicate content". You can bet that Google doesn't do that.
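To make the "technical accidents" flavor concrete: the same page is often reachable at several URLs (http vs https, www vs non-www, trailing slash, tracking parameters), and each variant can get crawled as a separate document. Here is a rough Python sketch of URL normalization for spotting those same-domain duplicates. The domain, the parameter list, and the normalization rules are illustrative assumptions on my part, not anything Google has published:

# A minimal sketch of how "technical accident" duplicates show up:
# one page, many crawlable URL variants. Normalizing makes them visible.
from urllib.parse import urlsplit, parse_qsl, urlencode, urlunsplit

# Tracking parameters that commonly create duplicate URLs;
# the exact list is site-specific and assumed here for illustration.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid", "ref"}

def normalize(url: str) -> str:
    """Reduce a URL to a canonical-ish form for duplicate detection."""
    scheme, netloc, path, query, _fragment = urlsplit(url)
    netloc = netloc.lower()
    if netloc.startswith("www."):            # treat www and non-www as one host
        netloc = netloc[4:]
    scheme = "https"                          # treat http and https as one scheme
    if path.endswith("/") and path != "/":    # trailing slash vs no slash
        path = path.rstrip("/")
    # drop tracking parameters, keep the rest in a stable order
    params = sorted((k, v) for k, v in parse_qsl(query) if k not in TRACKING_PARAMS)
    return urlunsplit((scheme, netloc, path, urlencode(params), ""))

urls = [
    "http://www.example.com/widgets/",
    "https://example.com/widgets?utm_source=newsletter",
    "https://example.com/widgets",
]
# All three collapse to the same key - one page, three crawlable URLs.
print({normalize(u) for u in urls})

Once you can see the variants collapsing to one key, the fix is the usual one: pick a single version and point the rest at it with a 301 or rel="canonical".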
This new algo is not something that Google built once and will live with from here on. They NEVER work that way; they iterate and iterate and iterate, rather than aiming for perfection right at launch. Google is guided by long-term vision, not short-term actions that merely favor the immediate or expedient. In this case, they ran the algorithm, found it agreed 86% or so with their human input, and decided that was good enough for a first step. And yes, they were already talking about "layer 2" almost immediately.
If your site took a hit and you are really confident that you have an excellent offering that fell into that 14% mismatch area, then I'd say keep improving for your users, not for Google. It is Google's job to recognize what your visitors already see in your site.
But if you don't have that certainty, then I'd focus on the areas that Matt and Amit described [webmasterworld.com] - even telegraphed - to us. I would not chase after Panda based on anything else right now: not any article from any industry "authority", and certainly not just any old post on a forum somewhere.
Why do I say this? Because it's clear to me that no one actually knows anything for sure right now. If we chase after what we think Panda is in this moment, then very soon it will have shifted... and shifted again. Better to understand what Google wants the algorithm to be measuring and do that. And what Google is describing sounds to me like what visitors want, too. This is how I am guiding my own work and my clients.