Forum Moderators: open
Paul
And how on earth is Google going to determine who is the originator of the content? Or are they going to ban the original work and let one of the dupers stay in?
They can't determine that with an algorithm, so I think this duplicate content story is bogus. Either that, or they'll get a lot of people really pissed. And I can't see them winning a lawsuit against someone whose site got banned for content he originated, either.
Oraqref
You are right to be concerned, because duplication must be an issue Google is looking into. The general rule from my experience is that if you have 15% original content per page you are OK, but this may rise. The current limit is probably that you must have at least 10%.
BUT, IMHO the following may be happening:
1) Google is detecting templates, so your navigation template etc. will not be included in the duplicate content equation. So if 20% of your pages are template, this will be detected and ignored; the remaining 80% must have a reasonable chunk of original content, I would guess at least 20% to be safe.
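The template-stripping idea in point 1) can be sketched roughly: treat any line that repeats across most pages of a site as template, then measure originality only on what remains. Everything below (the 80% threshold, the function names, the sample pages) is my own illustrative guess at the kind of check being described, not anything Google has published.

```python
# Rough sketch of the "ignore the template, then measure original content"
# idea. All thresholds and names here are illustrative assumptions.

from collections import Counter

def split_template(pages):
    """Treat any line appearing on most pages of the site as template."""
    line_counts = Counter(line for page in pages for line in set(page.splitlines()))
    threshold = len(pages) * 0.8  # a line on >=80% of pages counts as template
    return {line for line, n in line_counts.items() if n >= threshold}

def original_ratio(page, template, known_text):
    """Share of non-template lines that do not appear in already-known text."""
    body = [ln for ln in page.splitlines() if ln not in template and ln.strip()]
    if not body:
        return 0.0
    original = [ln for ln in body if ln not in known_text]
    return len(original) / len(body)

pages = [
    "MENU\nHOME | ABOUT\nUnique poem one\nCopied stanza",
    "MENU\nHOME | ABOUT\nUnique poem two\nAnother fresh line",
    "MENU\nHOME | ABOUT\nCopied stanza\nCopied stanza two",
]
template = split_template(pages)          # picks up MENU and HOME | ABOUT
known = {"Copied stanza", "Copied stanza two"}  # text seen elsewhere on the web
for p in pages:
    print(round(original_ratio(p, template, known), 2))
```

On this toy data the three pages score 0.5, 1.0, and 0.0 original content after the shared navigation lines are discounted, which is the shape of the "X% original per page" rule of thumb discussed above.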
2) Google will, rightly or wrongly, make a guess at who had the content first. There is lots of speculation as to how, but PR probably plays a part, and perhaps the age of the pages. I have wondered how they do this as well: duplicate content within a site is easy to detect, but comparing across domains introduces lots of problems for them. However, many people seem to have had pages dropped for duplicating other sites' content. There are loads of affiliate sites churning out content pinched from other sites, so Google must be addressing this problem big time.
Oraqref
Given how many times GG has mentioned a "SARS"-type freshness approach lately, I would make your site emulate a news site; it sounds like it already is one anyway. Credit your sources, keep it fresh and different, and I think you're fine.
Yes, you are right, and I can't see how they get round that one either. I would love to agree that 'duplicate content penalties' are a myth, but it seems people have suffered from this. Within a site it is fine for Google to ignore or drop duplicate pages, but I agree that once they start comparing the content of two different sites they could get it badly wrong!
However, the poet could always sue for copyright infringement, then get his poems on the web. Google is not responsible for the content of the sites it lists, and will choose on a 'first come' basis and/or on PR etc. If they worried about the legal validity of every page they index, they would not list anybody! So I suppose they take the view that it is up to the aggrieved party to pursue stolen content; it is not their problem. They just cannot list hundreds of identical pages, so they have to pick one and let any legal issues be sorted out elsewhere.
it is up to the aggrieved party to pursue stolen content
I agree but it looks like they won't look the other way if you're proactive:
[google.com...]
Substantially duplicate pages - Google will filter out the duplicate pages to keep the SERPs clean. It will not affect the rest of your site; they will only remove the duplicate page. This is not a penalty.
Substantially duplicate sites - These sites will simply be removed as spam. This is a penalty if they think the site is trying to spam, and they might even remove *all* copies of the site if they are obviously all related.
It is not possible for google to compare every page to every other page. They will compare pages that are most likely to be the same. I don't know how they come up with their criteria for what sites to compare.
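One standard way around the "can't compare every page to every other page" problem is shingling: break each page into short overlapping word windows, index pages by those shingles, and only compare pairs that share at least one shingle. This is a textbook near-duplicate technique and purely my own illustration of how such criteria could work, not a description of Google's actual system; all names and data below are made up.

```python
# Illustrative shingling sketch: find candidate duplicate pairs without
# comparing every page against every other page. Not Google's real method.

from collections import defaultdict

def shingles(text, k=4):
    """All overlapping k-word windows of the text, lowercased."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}

def candidate_pairs(docs, k=4):
    """Index pages by shingle; only pages sharing a shingle become candidates."""
    index = defaultdict(set)
    for name, text in docs.items():
        for sh in shingles(text, k):
            index[sh].add(name)
    pairs = set()
    for names in index.values():
        ordered = sorted(names)
        for i in range(len(ordered)):
            for j in range(i + 1, len(ordered)):
                pairs.add((ordered[i], ordered[j]))
    return pairs

def jaccard(a, b, k=4):
    """Similarity of two pages as overlap of their shingle sets."""
    sa, sb = shingles(a, k), shingles(b, k)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

docs = {
    "orig": "the quick brown fox jumps over the lazy dog near the river",
    "copy": "the quick brown fox jumps over the lazy dog near the stream",
    "other": "completely different text about search engine indexing today",
}
for a, b in sorted(candidate_pairs(docs)):
    print(a, b, round(jaccard(docs[a], docs[b]), 2))
```

Here only "orig" and "copy" ever get compared (they share shingles; "other" shares none), which is exactly the kind of candidate filtering that makes web-scale duplicate detection feasible.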
They can't determine that with an algorithm so I think this duplicate content story is bogus.
I suggest you read the forums including some old threads. Do a site search for duplicate content and you will get some good info to digest.
I can confirm this:
Google algos pick up pure duplicate content.
I have seen many sites where duplicate content has been penalized.
Example one: a news website. It promoted its website under a different name. After two years they had some problems, so they switched to their original brand name and copied the whole site onto the new domain. All the duplicate pages got the Grey Bar penalty. But this was a big-brand site, so they gradually got good one-way incoming links. As a result a site-wide ban was not applied, and the new site has some pages with PR. But the home page and many other important pages still show a grey bar. They are not aware of this problem.
Ignorance is bliss...
Example two: a corporate website. For branding reasons they bought a good domain name and duplicated the old site on the new one. The whole new site has a Grey Bar penalty. They don't care about the ranking of this site, so it will remain grey-barred.
Now these were cases of 100% duplicate content, very easily picked up by Google, in the first case on a page-by-page basis.
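The 100%-duplication case described above is the easy one to automate: normalize each page and compare content hashes across the two domains. The snippet below is my own sketch of that idea, with made-up page data; it is not a description of Google's actual pipeline.

```python
# Illustrative exact-duplicate check across two domains, as in the
# "100% duplication" examples above. A sketch with made-up data.

import hashlib

def page_fingerprint(text):
    """Normalize whitespace and case, then hash the result."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode()).hexdigest()

old_site = {"/about": "Our Company\nFounded 1990.", "/news": "Latest news here."}
new_site = {"/about": "Our   company\nfounded 1990.", "/contact": "Email us."}

old_hashes = {page_fingerprint(t): path for path, t in old_site.items()}
for path, text in new_site.items():
    fp = page_fingerprint(text)
    if fp in old_hashes:
        # flags /about on the new domain as a copy of /about on the old one
        print(f"{path} duplicates {old_hashes[fp]}")
```

Because hashing each page once is cheap and lookups are constant time, a copied domain stands out immediately, which fits the observation that the page-by-page cases were "very easily picked up".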
If it is not 100% duplication (the templates are different, or there is some tweaking), then it becomes a matter of probability: you may or may not be penalized. I have just shared my experiences; make your own decisions. HTH. :)
Except if he doesn't have a problem with the fact that people like his poems so much that they put them on their websites. That's one of the liberties the poet has. And believe me, there are plenty of people who feel that way; as a semi-professional poet, I am one of them. Poems of mine have appeared on other websites without permission, but I tend to feel flattered by that.
And what about, say, 18th-century translations of the Iliad? Public domain. Let's say there are five different sites that each carry a different translation with slight variations. Penalized. Result: people who look for these must wade through tons of garbage first, because Google wanted to play editor instead of search engine.
Oraqref
Thanks for the welcome. I wasn't arguing that Google cannot spot duplicate content; I was arguing that they can't determine who created such content. A duplicate content penalty creates big problems, some of which the rather technical people at Google have probably not even considered. News sites are a good example: it will be really hard to find good news sites if they all get penalized for using duplicate content. There are many more such examples. What about a collection of public-domain poetry? Penalized, because the individual poems all appear elsewhere. And so on. I don't really feel that's the direction Google should go, even if there are people daft enough to put loads of duplicate sites on the web to catch visitors. The fact that an algorithm can't determine such nuanced cases spells disaster for loads of people.
Oraqref
Duplicate content plays a big role for sites. Suppose you are using content from some other site and that site holds the copyright; they can certainly sue you. As for getting banned in Google: if the site you took the content from has already been listed in Google for a long time, your site may have trouble getting listed at all. So try to avoid duplicate content.