WebGuerrilla - 4:19 pm on Aug 2, 2002 (gmt 0)
I agree with Jim that identifying duplicate content is a difficult task, although I think Google's method of detection is more along the lines of the system AltaVista received a patent for [164.195.100.11]. A combination of file name and link structure analysis would be enough to catch most dupe and near-dupe content.

I also think that Google now understands most of the points in Rich's post and has backed off on actually penalizing sites that turn up as duplicates. The only thing that seems to be happening now is that the PR and inbound links are being merged, and only one version is being shown.

As an example, in June I came across a domain name that was once owned by the manufacturer of a specific product. Having a client that operated a site about that product, I bought the domain and put up a duplicate version of the client site on a separate IP. The original site had a PR of 6, and the new domain had an existing PR of 5 due to all the links that still exist. After the last crawl, our positions for the original site were replaced by the new domain. Both sites now show a PR 6, and a backlink check on both domains returns an identical list of sites (even though the two sites resolve to different IPs). Although the original site still displays a solid PR6 in the Toolbar, it is completely nonexistent in any related SERPs.
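To make the "file name and link structure" idea above a bit more concrete, here is a rough Python sketch of how two crawled sites could be compared. It is not Google's or AltaVista's actual method, which is not public in this thread. The input format (a dict mapping each page URL to the URLs it links to), the function names, and the 50/50 weighting are all assumptions made up for illustration.

```python
from urllib.parse import urlparse


def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b| (1.0 if both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def site_signature(pages):
    """Reduce a crawled site to host-independent file paths and internal link edges.

    `pages` is a hypothetical crawl result: {page_url: [linked_url, ...]}.
    Only links that stay on the same host are kept, as (from_path, to_path) pairs.
    """
    paths, edges = set(), set()
    for url, links in pages.items():
        src = urlparse(url)
        src_path = src.path or "/"
        paths.add(src_path)
        for link in links:
            dst = urlparse(link)
            if dst.netloc == src.netloc:
                edges.add((src_path, dst.path or "/"))
    return paths, edges


def duplicate_score(pages_a, pages_b, path_weight=0.5):
    """Blend file-name overlap and link-structure overlap into a 0..1 score.

    The weighting is an arbitrary assumption, not a known ranking parameter.
    """
    paths_a, edges_a = site_signature(pages_a)
    paths_b, edges_b = site_signature(pages_b)
    return (path_weight * jaccard(paths_a, paths_b)
            + (1 - path_weight) * jaccard(edges_a, edges_b))
```

Two sites with the same file names and the same internal linking score close to 1.0 even when they sit on different domains and IPs, which is the situation in the example above. Page content is not examined at all here, so this would only be a cheap first pass.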
Great post, Rich.