Consequences of Duplicate Content

Am I in danger of getting banned or just not indexed?

         

jsnively

11:58 pm on Mar 7, 2004 (gmt 0)

10+ Year Member



I am publishing reviews for a certain category of electronic products (each product has its own page). To keep the pages consistent, I'm using the same format with all of them. While there will obviously be some differences between pages, a lot of it will be the same, especially for similar models of product.

My question is, will duplicate content put me in danger of getting my domain banned, or would it be that just those pages in question will not be indexed or rank highly?

HayMeadows

1:51 pm on Mar 9, 2004 (gmt 0)

10+ Year Member



No worries, you will not be banned. Just do what's best for your users and you will be okay. Play your cards right and make enough changes (unsure of %) and you might even get all your pages in.

Brett_Tabke

11:33 am on May 21, 2004 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



When Google finds two pages that are the same, it considers the first page it runs into as the Original Document. This is why Google spiders so often and so aggressively.

You can NOT be banned or kicked out of Google for duplicate content. Only the page itself will be "downgraded". The page will be pushed down in the rankings (this is NOT via a PR Zero; although the page can have a PR0, duplicate content is not the source).

wanna_learn

11:50 am on May 21, 2004 (gmt 0)

10+ Year Member



"When Google finds two pages that are the same, it considers the first page it runs into as the Original Document. This is why Google spiders so often and so aggressively."

Brett,
How similar is "the same"? Could you please tell me which of the cases below invite the duplicate content penalty?
1- Exactly Mirrored .... same HTML
2- Same Content, same Design, same navigation...but Title and H1 change
3- Same Content, different Design and Layout (plus hosted with same company)
4- Same Content, different Design and Layout (plus NOT hosted with same company)
5- Content with Tweak, but Same Design and Layout

mipapage

12:27 pm on May 21, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



this is NOT via a PR Zero; although the page can have a PR0, duplicate content is not the source

That's a hard one to swallow. A site belonging to a friend of mine has dup content from another site (he has your typical "write something for one site, post it on yours after it's a little stale" agreement), and that is the only reason I can see for his PR getting grey-barred after every PR update.

I e-mailed Google, and the day they replied that nothing weird is up, 'poof' his PR is back. Then shortly thereafter it disappears.

In addition to the PR, his Cat listing also goes missing. Hundreds of high quality organic backlinks too.

birdstuff

12:43 pm on May 21, 2004 (gmt 0)

10+ Year Member



I have written several articles that I make available to other webmasters from a "free content" page. Every article includes a credit in the resource box, including a link to my site.

Typing "mydomain.com" -site:www.mydomain.com into Google shows that over 1000 sites (an estimate) have used my articles. The vast majority made my link into a hotlink. Some of the pages are showing PR but many show PR0.

My question: Am I getting the anchor text benefit from all of these links or just the ones on pages with PR>0?

Vince

1:44 pm on May 21, 2004 (gmt 0)

10+ Year Member



Here’s another question about dup content…

I have two sites (A and B), both offering different products. Is it ok to display A product pictures on the B site but link back to the A site for the product information? Would this be considered dup content or possibly get either/both sites penalized?

mipapage

7:37 pm on May 21, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Some of the pages are showing PR but many show PR0.

Well, that does it then. Should I get my buddy to keep Googlebot off of these 'dup-content' pages?

I don't mean to hijack the thread, but it's on topic, and I have posted about this before with nary a whisper of input...
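For anyone who does decide to keep Googlebot off the syndicated pages, here is a rough sketch of how a robots.txt rule could be checked before it goes live. It uses Python's standard `urllib.robotparser`. The `/syndicated/` directory and the example.com URLs are made up for illustration, and whether blocking these pages actually helps with the PR problem is still an open question in this thread.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keeps Googlebot out of one directory only.
# The "/syndicated/" path is an assumption, not taken from any real site.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /syndicated/
"""


def googlebot_may_fetch(url, robots_txt=ROBOTS_TXT):
    """Return True if the given robots.txt lets Googlebot fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Googlebot", url)


if __name__ == "__main__":
    for url in ("http://example.com/syndicated/article1.html",
                "http://example.com/index.html"):
        print(url, googlebot_may_fetch(url))
```

Only the pages under the blocked directory are affected; the rest of the site stays crawlable.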

nuevojefe

12:37 am on Jun 1, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



We have some sites with a layout and navigation structure that is so large (in terms of percentage of each file's content) that I think many of our internal pages are not getting PR (even though they show backlinks) because they're seen as duplicate content.

I think we are going to have to put far more content onto every page to get the page similarity percentages down, which is going to be tough.
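Nobody outside Google knows how it actually scores similarity, so treat the following purely as a way to get a rough number for your own pages. It is a minimal sketch that assumes word-shingle overlap (Jaccard similarity), which is a common near-duplicate measure but not a confirmed Google method. The function names and the sample strings are made up for illustration.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word runs) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Too short for a full shingle: treat the whole text as one unit.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def similarity(page_a, page_b, k=3):
    """Jaccard similarity of two texts: shared shingles / all shingles."""
    a, b = shingles(page_a, k), shingles(page_b, k)
    if not a and not b:
        return 1.0  # two empty pages are trivially identical
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    # Same navigation text on both pages, different body text.
    nav = "home products reviews contact about us site map privacy policy "
    page_one = nav + "this widget has a blue case and a long battery life"
    page_two = nav + "that gadget ships with a red cover and two cables"
    print("similarity: {:.0%}".format(similarity(page_one, page_two)))
```

The more of each page the shared navigation takes up, the higher the number climbs, which is the effect described above. Adding unique text to each page pulls it back down.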

It's just weird, because all the pages are indexed, all show internal backlinks from the homepage (which has PR), and all of the pages get traffic. So what is the point in not awarding the pages PageRank?

Has anyone else experienced this? I mean, I have sites that don't have anything on them yet and just show the directory listing (they don't even have index.htm), and those have PR. Why wouldn't these pages?

MovingOnUp

4:07 pm on Jun 10, 2004 (gmt 0)

10+ Year Member



You can NOT be banned or kicked out of Google for duplicate content. Only the page itself will be "downgraded". The page will be pushed down in the rankings (this is NOT via a PR Zero; although the page can have a PR0, duplicate content is not the source).

I hate to disagree, but I think you may be wrong, Brett. I believe excessive duplication CAN cause a PR0 penalty. I have one site that went to PR0 where that's the only possible explanation.

The site was several levels deep, and many of the pages had some duplicate content on them. Google totally stopped crawling the site. The pages with the duplicate content moved down in the rankings and many of them disappeared. The pages that were linking to the pages with duplicate content started going next. It dominoed through the site until virtually every page on the site (including many with original content that were five years old with incoming links from thousands of sites) disappeared from Google. Traffic from Google dwindled to nothing. Finally, the display PR dropped to 0.

I had never done anything questionable on the site, other than the content that Google may have considered to be duplicate content. No links to bad neighborhoods. No hidden content. No cloaking. Nothing. The only explanation I can come up with is that the PR0 came from duplicate content. It's been that way for five months now, so it doesn't appear to be temporary.

nuevojefe

10:06 pm on Jun 10, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Any crosslinking?

MovingOnUp

10:25 pm on Jun 10, 2004 (gmt 0)

10+ Year Member



No, no crosslinking.

Rick_M

11:57 pm on Jun 10, 2004 (gmt 0)

10+ Year Member



Has the domain been registered to the same person since the time it started to get backlinks, or has it changed ownership?

MovingOnUp

3:08 am on Jun 11, 2004 (gmt 0)

10+ Year Member



No changes in ownership or registration.

t2dman

3:46 am on Jun 11, 2004 (gmt 0)

10+ Year Member



Being part of an seo competition has highlighted some interesting duplicate content issues.

I have found that Google can recognise not just pages that have duplicate content, but paragraphs that are the same. A friend came up with 2 great paragraphs and I "copied" them onto my site. If I search for a particular two word phrase on that paragraph, Google shows the original page, then "repeat the search with the omitted results included." Showing the omitted results shows my page.

A competitor copied my page exactly (exact HTML), then added 20% of unique content onto the bottom. This was enough for my page to be treated as the duplicate, and its ranking dropped substantially. It has taken 4 weeks for the page to get back its proper ranking.

A number of sites link to me in the footer, or in the same place on every page of their site. When I search for site:abc.com term, Google shows some but not all of the pages, then says "In order to show you the most relevant results, we have omitted ...". Another site that has links in the left border only shows one page, then the message.

I came across an interesting article in web pro news LondonLinkBuildingAndDomainNameIssues (many links looking the same can also be seen as duplicate content).

I have just begun a forum where, even with unique titles and meta descriptions, the rest of the top of each page is the same (navigation structure). When I do a site:domain.com search, it shows only the first 30 of over 300 pages (then the "In order to show..." message). I will be moving unique content higher on the page.

So Google is very aware of ANYTHING that is the same, be it whole pages, or any part of a page. So really unique content is very necessary.

rickbender1940

6:15 am on Jun 11, 2004 (gmt 0)

10+ Year Member



Something happened today that brought me here wondering.

Our domain is customizedwidgets.com, however we've also registered customisedwidgets.com for UK customers who might just remember the name but not the American spelling. customisedwidgets isn't linked from anywhere at all.

Today, we got listed in Yahoo UK's list of top 100 women's sites...using customisedwidgets! Somebody must have mentally recalled the name and used their British spelling. Thing is, now I'm afraid Google might follow that link, see AN ENTIRE DUPLICATE WEBSITE and drop our main site from the index.

Any suggestions?

t2dman

6:28 am on Jun 11, 2004 (gmt 0)

10+ Year Member



Do a 301 permanent redirect from one to the other, so there is only one site, but two routes to get there.

This is also the best advice for those with domain.com and also domain.co.country, or other misspellings.
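To make the 301 idea concrete, here is a minimal sketch of what the spare domain would answer with. In practice this is normally a line or two of web server configuration rather than a script; the Python version below (standard library only) is just to show the response. The canonical host is taken from the question above, and the port, function names and handler name are made up for illustration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# The one "real" site; every request to the spare domain is sent here.
CANONICAL = "http://customizedwidgets.com"


def redirect_target(path):
    """Build the canonical URL for a requested path, keeping any query string."""
    if not path.startswith("/"):
        path = "/" + path
    return CANONICAL + path


class RedirectHandler(BaseHTTPRequestHandler):
    """Answers every GET/HEAD with a 301 pointing at the canonical site."""

    def do_GET(self):
        self.send_response(301)  # 301 = Moved Permanently
        self.send_header("Location", redirect_target(self.path))
        self.end_headers()

    do_HEAD = do_GET


def make_server(port=8080):
    """Create (but do not start) a server for the spare domain."""
    return HTTPServer(("", port), RedirectHandler)
```

Calling `make_server().serve_forever()` on the host that serves the alternate spelling would send each page to the same path on the main domain, so only one copy of the site is there to be indexed.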

Remember that Google can also pick up a domain for normal spidering via AdSense (personal experience).

With two totally duplicate sites, Google is more likely to drop the site that has the fewest text links/PR/PR via text links to it for a particular query. I have seen duplicate content domains appear at #1 in the SERPs for alternately different search terms, with the alternate domain nowhere to be found. The only explanation has been that the external links have been different.

artdog

4:36 pm on Jun 11, 2004 (gmt 0)

10+ Year Member



I have to say I'm puzzled about dup content. I know of a site that is state specific. They show every county in the state at the bottom of every page as a link. Every county page is identical except in four places where they show the name of the county. There are about 70 of these pages and all are PR4. The index page is PR5.

How is this possible with all the above postings?

t2dman

4:54 pm on Jun 11, 2004 (gmt 0)

10+ Year Member



The duplicate content filter shows the most relevant result, and can hide the duplicate. Therefore, the site you mention should show a different page for each county query, depending on which county the page has been optimised for. So duplicate content within a site doesn't need to be a concern. However, if there was a paragraph on your site that was copied from another website, and the query string was in the middle of it, the other site may be shown instead of yours.

artdog

5:31 pm on Jun 11, 2004 (gmt 0)

10+ Year Member



Just to be clear, dup content within a site is no problem whatsoever, just two or more sites that contain the same content?

airpal

11:27 pm on Jun 23, 2004 (gmt 0)

10+ Year Member



Do any other experts wish to elaborate on this increasingly important topic? Please share any details you may have so that we can narrow down the exact effects of dupe content penalties! And if anybody has found a foolproof way to get a site ranking higher again after being penalized for dupe content, please send me a private message, and there could be something in it for you. Thanks.

isitreal

12:15 am on Jun 24, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I've been trying to figure this duplication question out too. I have about 8-10 sites that blatantly share content: part of the page titles, most headers, and page content. Navigation is sometimes different, sometimes identical. I went in just now and checked the PageRank for them: no PR0s anywhere. Forget that theory; they've been sharing content since about mid-January.

The page will be pushed down in the rankings (this is NOT via a PR Zero; although the page can have a PR0, duplicate content is not the source).

So sounds like Brett Tabke has this one right.

Sites all have unique ip addresses, most file names are the same, domain names are of course different, but otherwise they are pretty much carbon copies of each other (it's a rough draft of a basic content management system, sort of).

Some sites have slight differences scripted in depending on the site, but it's more or less all the same exact words.

However, in the last few months all the sites have been plummeting in the SERPs and hits are way down. Google definitely caught these, but it hasn't affected PageRank, which isn't very high: 3-4, 5 at best. This strikes me as a confirmation of those who claim PageRank is irrelevant, although I don't pretend to know this for a fact.

I too am wondering just what it is that triggers whatever is being triggered in Google, since I should fix these sometime. It's not a high priority at the moment (the client doesn't really care, and I don't really care), but I would like to know what is happening and how much duplication to remove before they are once again considered unique.

<added>Oh, I did find one site with page rank 0, but I'm not sure if that site even has any other site linking to it.

donstar

3:45 am on Jun 24, 2004 (gmt 0)

10+ Year Member



Many of us are regularly updating our sites with new, original content. How do we prevent our content from being stolen?
Are there ways to stop a website copier from downloading and copying our site? Any preventive measures?

t2dman

4:10 am on Jun 24, 2004 (gmt 0)

10+ Year Member



Beware of using the noarchive tag. While it does stop people taking content via the Google API, it can mean that Google visits your site less often.

You could use all sorts of encryption and other technology on your site, but most of these measures can be circumvented and are not good for Google's bots or for human visitors.

The best way is to lodge a Digital Millennium Copyright Act action with Google, and use it and the threat of such action to pressure both the website owner and the host to take the site offline.

[edited by: t2dman at 4:12 am (utc) on June 24, 2004]

donstar

4:11 am on Jun 24, 2004 (gmt 0)

10+ Year Member



Thanks t2dman :)

MovingOnUp

11:42 pm on Jun 24, 2004 (gmt 0)

10+ Year Member



I've been trying to figure this duplication question out too. I have about 8-10 sites that blatantly share content: part of the page titles, most headers, and page content. ... Forget that theory; they've been sharing content since about mid-January.

Just because you haven't been dropped down to PR0 doesn't mean that others haven't been. I have one site that dropped from PR7 to PR0, and the only possible explanation is that Google considered the pages on the site to be duplicate content.

isitreal

2:49 am on Jun 25, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



although the page can have a PR0, duplicate content is not the source)

Brett seems to be pretty firm on this point. I'm not going to claim I know what's up for a fact, but I can tell you that only one of the sites has a PR0; all the others are unchanged in PR, and they are absolutely duplicated. I'd like to get a better idea of what I can get away with before putting much work into this problem. It's a low priority for me, though obviously not for your site.