Or are you talking about content that other sites syndicate and republish, with the page and site navigation at those other places all being different (so the page code is not an exact match across versions)?
Whatever the case, [webmasterworld.com...] still applies.
If Google finds identical or very similar pages, one will likely be indexed normally, the others will probably end up as supplemental results, and some may not be indexed at all.
There is no visible logic that predicts which one will get indexed; it may be the oldest, the newest, the one with the highest PageRank, or the lowest. It's probably a matter of which direction the spider was moving when it hit the pages. Who knows.
People frequently point out that many duplicates ARE listed - yes, that's true. One reason is time: a new page from an otherwise OK site will get listed, but over a few cycles of spidering, if it's a dupe, it may get dropped. (In support of this, news searches tend to show huge numbers of dupes as a story spreads around the world; they don't all last.)
Another key factor is that Google looks at the page code, not just the visible page - so an exact clone has a very high chance of being dropped or supplementalized (dreaded word). But a page with very similar visible content yet very different navigation and template code may get away with it.
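To make that concrete, here's a rough Python sketch (my own toy comparison, not anything Google has published) that scores two pages twice: once on the raw source and once on the extracted visible text. The file names are just placeholders for two pages you want to compare.

# Toy near-duplicate check: word shingles + Jaccard similarity,
# computed on raw HTML and on visible text separately.
# 'page_a.html' and 'page_b.html' are hypothetical local files.
import re
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.chunks = []
    def handle_data(self, data):
        self.chunks.append(data)

def visible_text(html):
    p = TextExtractor()
    p.feed(html)
    return " ".join(p.chunks)

def shingles(text, k=5):
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0

html_a = open("page_a.html").read()
html_b = open("page_b.html").read()

code_sim = jaccard(shingles(html_a), shingles(html_b))
text_sim = jaccard(shingles(visible_text(html_a)), shingles(visible_text(html_b)))

print(f"raw-code similarity:     {code_sim:.2f}")
print(f"visible-text similarity: {text_sim:.2f}")

A straight clone scores high on both; a syndicated article wrapped in a completely different template scores high on visible text but much lower on raw code, which is roughly the distinction described above.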
Also, of course, pages built from a template with little unique content are at greater risk, because the unique-to-shared content ratio is poor.
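A quick way to eyeball that ratio yourself (again, just a back-of-the-envelope sketch, nothing official) is to compare a page's words against a sibling page built from the same template; the file names here are hypothetical.

# Rough unique:shared ratio: words this page shares with a sibling page
# from the same template are treated as boilerplate, the rest as unique.
import re

def words(html):
    text = re.sub(r"<[^>]+>", " ", html)   # crude tag stripping
    return set(re.findall(r"\w+", text.lower()))

page = words(open("article.html").read())
sibling = words(open("other_article.html").read())

shared = page & sibling   # likely template / navigation words
unique = page - sibling   # words only this page carries

ratio = len(unique) / max(len(shared), 1)
print(f"unique words: {len(unique)}, shared words: {len(shared)}")
print(f"unique:shared ratio ~ {ratio:.2f} (low = mostly boilerplate)")

If the unique pile is tiny compared with the shared pile, the page is mostly template, and that's exactly the kind of page most likely to go supplemental.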
Finally, other factors, such as meta tags, increasingly matter in these duplicate decisions.