Robert_Charlton - 7:59 pm on Apr 17, 2011 (gmt 0)
danijelzi - There's a recent Matt Cutts video on YouTube that makes a valiant attempt at addressing the issue, including patterns similar to those you ask about, but it also suggests that some of these problems "don't happen that often". That may be true for the web overall, but I think they are happening a lot.
How can I make sure that Google knows my content is original?
Clearly, Google is aware of these problems, and I think they know their current approaches are inadequate in some areas, but I don't think they can say so publicly.
The latency and scaling issues in a database the size of Google's make this an extremely difficult problem to solve, particularly when articles are scraped piece by piece. In some situations, I feel, the "scraper update" that preceded Panda actually made the problem worse.
Even when the original source is cited and the content is rewritten, the societal implications of distributing digital content on the internet are complex... and, IMO, they go well beyond Google and have scarcely been addressed.