Duplicate content question

newsletter archives

         

shasan

3:42 am on Nov 19, 2003 (gmt 0)

10+ Year Member



I'm going to be sending out a newsletter and have it set up so that people can view the HTML version online in an 'archives' fashion.

The newsletter will usually include net-new content, but some of it may appear on other parts of my site (i.e. the normal articles area).

Will I be smacked down for duplicate content? The newsletter archives present the whole newsletter (including featured articles) all on one page, just like in email form, whereas the article engine has a separate page for each article.

Please advise, thx much.

shasan

troels nybo nielsen

5:48 pm on Nov 20, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



AFAIK there should not be any risk of being penalised, but if one article is regarded as partly duplicate content of another you may expect it to rank low.

AjiNIMC

8:31 pm on Nov 20, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Any idea how Google checks for duplicate content? Does it check one big sentence or different words?

I have thought of many algos, but banning a site for duplicate content is still tough. Google is a big boss with a big algo, but I still doubt whether Google penalises for duplicate content, as duplicate content has no proper definition.

Aji

troels nybo nielsen

9:01 pm on Nov 20, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



AFAIK banning is an individual decision based on a handcheck, exactly for the reasons you mention. But downgrading duplicate content most likely is automatic and done by the algo.

shasan

10:22 pm on Nov 20, 2003 (gmt 0)

10+ Year Member



Ok, I can live with the dupe content being ranked lower. I just don't want to get banned for spamming (I don't think I am).

I just want my newsletter archives available to people, and the archives may contain items published (previously) on other parts of my website. Would that stand up to a 'handcheck'? Should I even worry about it?


troels nybo nielsen

10:33 pm on Nov 20, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It doesn't sound like something worth worrying about. If it actually makes your site better and more useful for your visitors, I cannot imagine your website being banned after a handcheck.

shasan

10:51 pm on Nov 20, 2003 (gmt 0)

10+ Year Member



Thanks, I was thinking the same thing. But you can't be too careful with the big G. :)

matrix_neo

6:32 pm on Nov 21, 2003 (gmt 0)

10+ Year Member



Will publishing the same article on various websites and on my own site lead to a duplicate content issue?

troels nybo nielsen

7:22 pm on Nov 21, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



A couple of days ago I read exactly the same article from AP on several newspaper websites. I don't think they run any risk of being banned. But one might guess that those specific pages with that same article may rank rather low for their keywords.

matrix_neo

4:22 am on Nov 22, 2003 (gmt 0)

10+ Year Member



If so, do you guys have any insight into how Google differentiates between articles published on many sites and content stolen and published as a new site (which Google penalises for duplicate content)?

troels nybo nielsen

10:14 am on Nov 22, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



If Google finds two or more pages with basically the same content, one of those pages will be regarded as the "original" and given the highest rating, while the other(s) will be regarded as duplicate(s) and given a lower rating.

It is unclear how Google's algo decides which page to prefer. We have had quite a few discussions about that here at WW. It seems that age will often be a considerable factor but not necessarily the one determining factor.

I personally know of at least two articles that have been published by their authors on the web in two different places. In one case the older version ranks high in Google while the newer one is practically invisible. In the other case it's the other way around.

The algo does not distinguish between stolen content and legit content. That will have to be an individual decision made by a human.

matrix_neo

5:33 pm on Nov 22, 2003 (gmt 0)

10+ Year Member



Thanks for sharing your knowledge, nielsen! I was a bit worried about being banned by Google because of duplicate content. Now I will go ahead and publish the articles on my website as well.

kaled

6:03 pm on Nov 22, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You also have the option of excluding single pages from the index using the robots meta tag, and of excluding directories using the robots.txt file.

Having added a hidden div (popup help) on a page, I decided to play ultra-safe and added a robots meta tag to exclude it from Google's index. There is a 99% chance that Google would not have worried about this as hidden text; however, the page was unimportant (for search engines), so caution seemed sensible.

If you have duplicate content that you don't want/need indexed, just exclude it by one of the methods above.

Kaled.
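
As a rough sketch of the two exclusion methods mentioned above: the snippet below holds a robots.txt rule and a robots meta tag as plain strings, and uses Python's standard urllib.robotparser to confirm that the rule blocks the archive directory. The /newsletter/archives/ path, the example.com URLs and the is_crawlable helper are assumptions made for illustration, not details from this thread.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep all crawlers out of the archive directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /newsletter/archives/
"""

# Robots meta tag for the <head> of a single page that should stay out of
# the index while its links may still be followed.
ROBOTS_META = '<meta name="robots" content="noindex,follow">'

# Parse the rules from the string above instead of fetching them over HTTP.
parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawlable(url, agent="Googlebot"):
    """Return True if the parsed robots.txt rules allow `agent` to fetch `url`."""
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in (
        "http://www.example.com/newsletter/archives/2003-11.html",
        "http://www.example.com/articles/duplicate-content.html",
    ):
        print(url, "->", "allowed" if is_crawlable(url) else "blocked")
```

Note that the two methods work differently: robots.txt asks crawlers not to fetch the listed paths at all, while the meta tag is read on a page that has been fetched and asks that the page not be indexed.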

AjiNIMC

7:42 pm on Nov 22, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



How much duplicate content makes my page duplicate?

If someone manipulates the sentences a bit and then publishes them on a site, will that be treated as duplicate content?

Is 33% unique content enough for a page to get credibility?

Aji

troels nybo nielsen

7:58 pm on Nov 22, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> How much duplicate content makes my page duplicate?

Only Google knows. And there is no doubt that Google is constantly refining its filters so that they catch as much duplicate content as possible while avoiding the suppression of genuine content. This may mean that if you write a long article that has a rather long quote from another article, that quote, and only that quote, will be filtered as duplicate content and downgraded in the SERPs.
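
Nobody in this thread knows how Google measures duplication, so the following is only an illustration of one generic, well-known approach to near-duplicate detection (overlapping word "shingles" compared with Jaccard similarity), not a description of Google's filter. The function names, the shingle size of 3 and the sample sentences are all assumptions.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Text shorter than one window: treat the whole text as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(text_a, text_b, k=3):
    """Jaccard similarity of the two shingle sets: 0.0 = disjoint, 1.0 = identical."""
    a, b = shingles(text_a, k), shingles(text_b, k)
    union = a | b
    if not union:
        return 0.0  # nothing to compare
    return len(a & b) / len(union)


if __name__ == "__main__":
    original = "the quick brown fox jumps over the lazy dog"
    reworded = "the quick brown fox leaps over the lazy dog"
    unrelated = "newsletter archives are kept on a separate page"

    print(jaccard(original, original))
    print(jaccard(original, reworded))
    print(jaccard(original, unrelated))
```

Changing a single word lowers the score without driving it to zero, which is why lightly reworded text can still look like a near-duplicate under this kind of measure. Where a real search engine would draw the line, if it uses anything like this at all, is not something the sketch can tell you.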

Maxie

8:37 pm on Nov 22, 2003 (gmt 0)

10+ Year Member



> The newsletter will usually include net-new content, but some of it may appear on other parts of my site (i.e. the normal articles area).

I have a similar situation where one page lists many items and other pages each list one of those items. When I search for some keywords from the text I find both listed, general page first, detail page later, perhaps because the detail page has a longer URI or because some of the keywords appear more often on the general page.

One time I'm in the top 5 with 18 total results; another time the detail page isn't considered relevant even though there are only 2 relevant results out of 4 found in total.

So it's hard to understand precisely what is going on.
I don't get the impression that there is a penalty issue, though.

Max