I wonder if, over time, Google detects that the higher-PR domain is just scraping and lowers its PR?
I can understand that in the case of, say, a press release that loads of sites have republished word for word with permission, Google would rank them in PR order. But for content that's just informational, I would think Google would want to rank the original above any scrapers.
If so, the problem may be not so much that Google can't identify the original as that originality is not as important as other aspects of the algorithm. If Google is, say, equally concerned about originality and PageRank, then just making sure they know who's the original might not cut it.