We use template-driven sites (à la eBay and Amazon), so the content is somewhat similar. Secondly, we operate several sites and they are all linked to each other.
I was "told" that templated pages appearing within the same domain/web site are not considered duplicate content, BUT that similar pages on SEPARATE domains that link to each other are considered duplicate content and are subject to a penalty.

Can anyone clear up this confusion and define duplicate content as they understand it?
I don't think a template alone would be enough to cause a duplicate content issue. As long as each page has different content, the code will be different enough. Content is the biggest factor in duplicate content cases.
The cross-linking can be a problem if you are not very careful. For example, I would not have every site link to every other site. That's just too far on the radar. There are many ways to interlink sites; one such way is to have site "a" link to site "b" and site "b" link to site "c", etc. That way none of the links are reciprocal, and you have a lesser chance of being discovered. To be honest, though, if the links are on topic then you have less to worry about.
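To make the chain idea concrete, here's a small sketch (site names are made up) that checks a link map for reciprocal pairs. In the a → b → c layout described above, no two sites link back to each other, even though the chain eventually closes:

```python
# Hypothetical link map for illustration: each site links only to
# the next one in the chain, never back the way it came.
links = {
    "site-a.com": ["site-b.com"],
    "site-b.com": ["site-c.com"],
    "site-c.com": ["site-a.com"],  # chain closes, but no pair is reciprocal
}

def reciprocal_pairs(link_map):
    """Return site pairs that link to each other in both directions."""
    pairs = set()
    for src, targets in link_map.items():
        for dst in targets:
            if src in link_map.get(dst, []):
                pairs.add(tuple(sorted((src, dst))))
    return pairs

print(reciprocal_pairs(links))  # chain layout -> empty set
```

If instead site "a" and site "b" linked to each other, the function would flag that pair, which is exactly the obvious reciprocal footprint the post suggests avoiding.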
This seems like a good place to throw this in...

I have a couple of "gateways" from back in the old days, when I used to move the physical content around a bit. They are of the how.to/cjb.net variety and load my site inside a full-window frame.

Google has indexed all three. My gateways do pretty well, too, and amazingly are indexed with the correct subfolders, like ****.how.to/blog/, etc.
Question: should I kill these?

It's not duplicate content as such; it's the same content, just different ways to get to it. I'd be gutted to lose my Google rankings over a couple of now-unnecessary gateways. They are kinda cool, though, and I'd prefer to leave them up if possible.
I've been wondering about this question too: how much duplication is considered duplicate? I run a bunch of sites that most definitely have duplicate content on most of their pages. I need to get that fixed, since Google spotted it and the sites are all off the SERPs now, but I want to change as little as possible while still avoiding the duplication penalty. Does anyone know the real numbers for what triggers duplicate content penalties?
Currently the filename, part of the title, the main page header, and all of the content are duplicated on most pages, which is obviously very easy to spot. The navigation is different enough, I think, not to count as duplicate.

As far as I can tell, the structural HTML is pretty much ignored by Google, as it should be, so the problem seems to lie in the text content alone.
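Nobody outside Google knows the real threshold, but the usual way to reason about "how much overlap" between two pages' text is a shingle-based similarity score: break each page's visible text into overlapping word n-grams and compare the sets. This is a rough illustrative sketch, not Google's actual algorithm; the sample page text is made up:

```python
def shingles(text, k=5):
    """Split text into overlapping k-word shingles (lowercased)."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(max(1, len(words) - k + 1))}

def similarity(a, b, k=5):
    """Jaccard similarity between the shingle sets of two texts (0.0 to 1.0)."""
    sa, sb = shingles(a, k), shingles(b, k)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

# Two near-identical product blurbs: most shingles match, so the score is high.
page1 = "our widgets are the best widgets money can buy today online"
page2 = "our widgets are the best widgets money can buy right now online"
print(round(similarity(page1, page2, k=3), 2))
```

The point of the sketch is that shared boilerplate (templates, navigation markup) would be stripped out before comparison, so pages rise or fall on their unique text, which matches the observation above that the structural HTML seems to be ignored.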