Forum Moderators: Robert Charlton & goodroi


"Similar" Sites Vs Duplicate Content

what can cause trouble....

         

janejanejane

9:07 pm on Jun 24, 2005 (gmt 0)

10+ Year Member



Duplicate content can cause a problem, of course, but what about sites that look and are structured "similarly" (where the content is not duplicated) but where the inventory is the same (all operated by the same vendor)?

Is that considered spam, or worthy of a (manual) penalty in the new era of Google? Curious if anyone has any thoughts on the technical distinction between the "duplicate" and the "similar"...

bbcarter

3:45 am on Jul 1, 2005 (gmt 0)

10+ Year Member



Google can't solve the problem simply by demanding more and more backlinks; too many high-quality sites have relatively few backlinks (e.g., government and educational sites that don't engage in SEO).

If they're high quality, someone will link to them. I'm not the only person who has noticed that pages with links and descriptions do better than your average page, other things being equal. This is Google using humans who already pick out sites.

Plus, it's actually easier for spammers to create thousands of (not really legitimate) inbound links than it is for the typical webmaster to attract hundreds of (relevant) inbound links.

Yes, that would be true if they counted all links, but they don't. Google (check out marketleap.com's popularity checker) has stringent criteria for 'quality' links, probably based on PageRank - maybe PR3 and higher only. So only backlinks from quality sites/pages matter.

Unfortunately, any spammer who already has PageRank may be considered 'quality' de facto.
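Just to make that concrete, here's a rough sketch of the kind of cutoff I mean - the PR3 number and the scoring are pure speculation on my part, not anything Google has published:

# Purely illustrative: only count a backlink if the linking page clears
# a speculative PageRank threshold. Neither the cutoff nor the scoring
# reflects anything Google has confirmed.

MIN_QUALITY_PR = 3  # the speculative "PR3 and higher" cutoff from above

def count_quality_backlinks(backlinks):
    """backlinks: list of (source_url, toolbar_pagerank) tuples."""
    return sum(1 for _url, pr in backlinks if pr >= MIN_QUALITY_PR)

links = [
    ("http://example.edu/resources.html", 6),
    ("http://spam-blog.example/page1.html", 0),
    ("http://partner.example/links.html", 3),
]
print(count_quality_backlinks(links))  # -> 2, the PR0 link is ignored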

An obvious solution would be to attempt to weight inward links with respect to the likelihood that the link has been created and verified by a human being (not affiliated with the recipient of the link).

Right, and if you read the recent Google patent, you'll see indications that they will be looking at whois info to see if backlinks are from sites you also own. But you can shield ownership... so that's not good enough either.

Certainly all of these things are indicators, and G could weight them programmatically, but it's messy and sloppy, and they'll make mistakes - punishing some who shouldn't be punished and letting some of the spammers through.
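For what it's worth, "weighting them programmatically" would look something like this toy sketch - every signal, weight, and the whois-overlap check here is my own guess, not the patent's actual method:

# Toy example of folding several link-quality indicators into one weight.
# The signals and the numbers are invented purely for illustration.

def link_weight(source_pr, same_whois_owner, looks_human_placed):
    weight = min(source_pr / 10.0, 1.0)   # normalize PageRank to 0..1
    if same_whois_owner:
        weight *= 0.1                     # heavily discount links from sites you also own
    if not looks_human_placed:
        weight *= 0.5                     # discount links that look machine-generated
    return weight

# An editorial link vs. a link from a site with the same whois registrant:
print(link_weight(source_pr=5, same_whois_owner=False, looks_human_placed=True))  # 0.5
print(link_weight(source_pr=5, same_whois_owner=True,  looks_human_placed=True))  # 0.05

Shift any of those numbers a little and a different set of sites gets punished - which is exactly where the mistakes come from.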

That's why I doubt that the algorithmic approach works anymore.

It seems some spammers are always a step ahead... and with the complexity of language (nonlinear, asymmetric, illogical, not reducible to an algorithm) plus the spammers' ability to use randomization... I just don't see it happening.

But I remain open-minded.

B

This 31 message thread spans 2 pages.