kaled - 2:23 pm on Aug 8, 2010 (gmt 0)
Actually, it's a pitifully simple problem in principle, but Google are just too dumb or too stubborn to fix it properly.
Doing it well at scale is tough, I'll give them that, but the results used to be better than they are now.
The only way, and I really do mean the only way, to distinguish between duplicate and original content algorithmically in the general case is by the age of the URL: whichever copy was seen first is the original. In the few cases where scrapers are faster than Googlebot, site owners could trigger an automated test: the submitted page is scanned immediately, and if the same content later turns up on a scraper site, all duplicate content on that site is deemed non-original.
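To illustrate what I mean by age-based attribution, here is a minimal sketch (not anything Google actually runs; the field and function names are invented for the example): group pages that share an identical content fingerprint and treat the earliest-crawled URL in each group as the original.

from dataclasses import dataclass
from datetime import datetime

@dataclass
class CrawledPage:
    url: str
    content_hash: str     # fingerprint of the page body (hypothetical)
    first_seen: datetime  # when the crawler first fetched this URL

def classify_by_age(pages):
    """Mark the earliest-crawled URL in each identical-content group as the
    original; every later copy is treated as a duplicate."""
    groups = {}
    for page in pages:
        groups.setdefault(page.content_hash, []).append(page)

    verdict = {}
    for copies in groups.values():
        copies.sort(key=lambda p: p.first_seen)
        verdict[copies[0].url] = "original"
        for dup in copies[1:]:
            verdict[dup.url] = "duplicate"
    return verdict

# Example: the scraper's copy is first seen a day after the original.
pages = [
    CrawledPage("http://example.com/article", "abc123", datetime(2010, 8, 1)),
    CrawledPage("http://scraper.example/copy", "abc123", datetime(2010, 8, 2)),
]
print(classify_by_age(pages))
# {'http://example.com/article': 'original', 'http://scraper.example/copy': 'duplicate'}

The whole scheme obviously stands or falls on how reliable the first-seen timestamps are, which is exactly why scrapers that get crawled before the original are the hard case.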
However, what you have to understand is that Google don't care. They have always demonstrated complete and total disregard for the property rights and privacy of others. At the very most, they do only what is required by law, and many people would argue that they don't even come close to that.