Duplicate content: What exactly is it? - Content, Writing and Copyright forum at WebmasterWorld - WebmasterWorld

Forum Moderators: not2easy


Duplicate content: What exactly is it?

What if you allow someone to republish an original article from your site?

Webwork

8:42 pm on Mar 2, 2004 (gmt 0)

WebmasterWorld Administrator

10+ Year Member


I think it's unnatural for an informative article to sit in only one location on the WWW for long. That's not to say that I expect, invite or endorse copyright violations. What I expect are requests for republication, and I expect, in most cases, to consent on my terms.

So, is anyone aware of anyone from any SE speaking "with authority" on the subject of "what exactly is duplicate content"?

How do the SEs differentiate between duplicate content "as spam" and duplicate content as flattery, the organic and natural redistribution of information on the WWW?

If a SE is able to distinguish the two, how exactly does it treat the difference between flattery and spammery?

How does a SE ever tell "which came first"? By checking and comparing log file dates? Can't be that because someone could simply backdate file dates. So how?

Are duplicate content penalties an impediment to the organic redistribution of information on the WWW?

If this has been discussed at length before just whack me gently and point me to the source material ;-)

jpell

9:21 pm on Mar 2, 2004 (gmt 0)

10+ Year Member

Webwork,
I have been wondering the same thing. I was considering allowing some of my articles to be reproduced freely, but worried that it would somehow diminish their authority. I have been noticing more of it being done, though, as well as sites inviting others to contribute their work. This is the route I'm taking; however, I am stressing that all articles must be original and not published anywhere else, including the author's website.
I see it as a natural progression from reciprocal linking, which is tedious and doesn't get you notable traffic. An article with your byline on a related website is much more effective. It's also what the internet is supposed to be about... a free flow of related information through a community of websites.
JPell

rogerd

10:01 pm on Mar 2, 2004 (gmt 0)

WebmasterWorld Administrator

10+ Year Member

The dupe content most likely to get nailed is true dupe content, i.e., the exact same page reproduced more than once. Elimination by creation date isn't possible, so dupe content is more likely to be eliminated by PR or other algorithmic factors. "First found" might work, too, though it might not always reflect the truly original work.

An article that is presented with different formatting, additional page content, a different page title, inclusions, etc., is far less likely to be automatically detected as a dupe, although I presume Google et al. are working to enhance their dupe detection.
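
To make the distinction between an exact dupe and a reworked copy concrete, here is a rough sketch in Python. It is not how Google or any other SE actually works (nobody in this thread knows that); it only assumes a detector that compares a hash of the normalized text for exact dupes and the overlap of three-word "shingles" for near dupes. The function names, the shingle size and the sample sentences are all made up for illustration.

```python
import hashlib
import re


def normalize(text):
    """Lowercase the text and collapse anything that is not a letter or digit."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def fingerprint(text):
    """Hash of the normalized text; equal hashes suggest an exact duplicate."""
    return hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()


def shingles(text, k=3):
    """Set of overlapping k-word sequences taken from the normalized text."""
    words = normalize(text).split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b, k=3):
    """Share of k-word shingles the two texts have in common (0.0 to 1.0)."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


# Hypothetical sample texts, invented for this sketch.
original = "How to prepare a resume that gets you an interview"
reformatted = "How to prepare a RESUME,   that gets you an interview!"
tailored = "How to prepare a health care resume that gets nurses an interview"

# Same words with different punctuation and spacing: fingerprints match.
print(fingerprint(original) == fingerprint(reformatted))
# A tailored rewrite shares only some shingles with the original.
print(jaccard(original, tailored))
```

Under these assumptions, reformatting alone would not hide a copy, while changed wording lowers the overlap score. Where a real SE would draw the line between "dupe" and "different enough" is anyone's guess.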

A greater danger for the content supplier, perhaps, is getting displaced in the SERPs by a page that outranks the original.

Instead of pure article syndication, I like the idea of customization whenever possible. If a nursing site wants to print your article about preparing a resume, for example, it would probably be quite easy to make it about preparing resumes in health care. A few strategic alterations would probably be sufficient. There's no duplication, then, and the article is far more useful and relevant to the site and its visitors.

Another approach that avoids full duplication is permitting the other site to reproduce only a portion of the article followed by a "read more" link back to the original page. There are obvious benefits from a linkage and traffic standpoint, but the value of the article is reduced somewhat due to its incomplete nature.
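
For anyone who wants to hand partner sites a ready-made excerpt, the "portion plus read more link" idea could be sketched like this. It is only an illustration: the function name, the 60-word default and the example.com URL are assumptions, not something any site here is known to use.

```python
def make_excerpt(article_text, source_url, max_words=60):
    """Return the first max_words words of an article plus a link to the original."""
    words = article_text.split()
    if len(words) <= max_words:
        excerpt = " ".join(words)
    else:
        # Mark the cut so readers can see the article continues elsewhere.
        excerpt = " ".join(words[:max_words]) + " ..."
    return f"{excerpt}\nRead more: {source_url}"


# Hypothetical article text and URL, for illustration only.
article = "Preparing a resume starts with a clear summary of your experience " * 20
print(make_excerpt(article, "https://www.example.com/resume-article", max_words=12))
```

The word limit is the knob to negotiate with the republishing site: a longer excerpt is more useful to their visitors, a shorter one leaves more reason to click through.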

You can check some past discussion of duplicate content here, too: google.com/search?hl=en&ie=UTF-8&oe=UTF-8&q=duplicate+content+site%3Awebmasterworld.com

Teshka

10:18 pm on Mar 2, 2004 (gmt 0)

10+ Year Member

I allow some of my articles to be reprinted indefinitely. As far as I know, Google isn't dinging any of the pages where they appear (look at all the people swiping huge chunks of Wikipedia). Even if the duplication is someday detected and those pages are zeroed in the SE, so what? Most of us who let our articles be reprinted do it for a link back to our page, and that link is there to attract people, not bots. It works, too.

I think people worry too much about this. Now, if your entire site is nothing but reprinted articles, you might have cause for concern...

John_Shaw

10:27 pm on Mar 2, 2004 (gmt 0)

10+ Year Member

"I think it's unnatural for an informative article to sit in only one location on the WWW for long."

It's the same in the print world. Think "Reader's Digest". Aside from that special case, articles are often published more than once. In fact, some writers will refuse to sell "all rights" and sell only "first print and electronic rights" so they can resell their articles.

I have had many articles published in multiple trade and technical magazines. So I think the same should apply to the web.

As far as search engines and their avoidance of duplicate content go, the trick is editing each publication. If you write an article for one online publication and another asks for rights to republish it, immediately let them know that you would like to edit the work first. Change some words, add or change headings, tailor the work for the particular publication, change the order of paragraphs if possible, etc.

If you are a publisher and you have the right to publish something already on the web, do some editing. Add a few headings, ask the author for some edits, etc.

One tutorial I have on my site, with a PR 5, has been published on several other sites. At first glance, they all look the same. But if you look deeper, there are differences.