My question is, will duplicate content put me in danger of getting my domain banned, or would it be that just those pages in question will not be indexed or rank highly?
You can NOT be banned or kicked out of Google for duplicate content. Only the page itself will be "downgraded". The page will be pushed down in the rankings (this is NOT via a PR Zero, although the page can have a pr0, duplicate content is not the source).
Brett,
How similar counts as "the same"? Could you please tell us which of the cases below invite the duplicate content penalty?
1- Exactly mirrored: same HTML
2- Same content, same design, same navigation, but the title and H1 change
3- Same content, different design and layout (hosted with the same company)
4- Same content, different design and layout (NOT hosted with the same company)
5- Tweaked content, but same design and layout
this is NOT via a PR Zero, although the page can have a pr0, duplicate content is not the source
That's a hard one to swallow. A site belonging to a friend of mine has duplicate content from another site (he has your typical "write something for one site, post it on yours after it's a little stale" agreement), and that is the only reason I can see for his PR getting grey-barred after every PR update.
I e-mailed Google, and the day they replied that nothing weird was up, 'poof', his PR was back. Shortly thereafter it disappeared again.
In addition to the PR, his Cat listing also goes missing. So do hundreds of high-quality organic backlinks.
Typing "mydomain.com" -site:www.mydomain.com into Google shows that over 1000 sites (an estimate) have used my articles. The vast majority made my link into a hotlink. Some of the pages are showing PR but many show PR0.
My question: Am I getting the anchor text benefit from all of these links or just the ones on pages with PR>0?
I think we are going to have to put far more content onto every page to get the page similarity percentages down, which is going to be tough.
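For anyone who wants to put a number on "page similarity", here is a minimal sketch in Python. Nobody outside Google knows how it actually measures this, so the approach (Jaccard overlap of 3-word shingles) and the names shingles and similarity_percent are assumptions of mine, useful only for comparing your own pages against each other.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word sequences (shingles) found in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return set()
    if len(words) < k:
        # Very short text: treat the whole thing as a single shingle.
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def similarity_percent(page_a, page_b, k=3):
    """Jaccard similarity of the two pages' shingle sets, as a percentage.

    This is an illustrative measure only, not Google's method.
    """
    a, b = shingles(page_a, k), shingles(page_b, k)
    union = a | b
    if not union:
        return 0.0  # nothing to compare
    return 100.0 * len(a & b) / len(union)


if __name__ == "__main__":
    template = "home products contact about us"
    page_1 = template + " red widgets for sale today"
    page_2 = template + " blue gadgets shipped next week"
    print(round(similarity_percent(page_1, page_2), 1))
```

Adding more unique text to each page makes the shared template a smaller part of the whole, so the percentage falls. That is the effect being aimed for above.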
It's just weird, because all the pages are indexed, all show internal backlinks from the homepage (which has PR), and all of the pages get traffic. So what is the point in not awarding the pages PageRank?
Has anyone else experienced this? I mean, I have sites that don't have anything on them yet and just show the directory listing (they don't even have an index.htm), and those have PR. Why wouldn't these pages?
You can NOT be banned or kicked out of Google for duplicate content. Only the page itself will be "downgraded". The page will be pushed down in the rankings (this is NOT via a PR Zero, although the page can have a pr0, duplicate content is not the source).
The site was several levels deep, and many of the pages had some duplicate content on them. Google totally stopped crawling the site. The pages with the duplicate content moved down in the rankings and many of them disappeared. The pages that were linking to the pages with duplicate content started going next. It dominoed through the site until virtually every page on the site (including many with original content that were five years old with incoming links from thousands of sites) disappeared from Google. Traffic from Google dwindled to nothing. Finally, the display PR dropped to 0.
I had never done anything questionable on the site, other than the content that Google may have considered to be duplicate content. No links to bad neighborhoods. No hidden content. No cloaking. Nothing. The only explanation I can come up with is that the PR0 came from duplicate content. It's been that way for five months now, so it doesn't appear to be temporary.
I have found that Google can recognise not just pages that have duplicate content, but paragraphs that are the same. A friend came up with 2 great paragraphs and I "copied" them onto my site. If I search for a particular two word phrase on that paragraph, Google shows the original page, then "repeat the search with the omitted results included." Showing the omitted results shows my page.
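To check your own pages for this kind of paragraph-level overlap, something along these lines works as a rough sketch. The function names (paragraph_fingerprints, shared_paragraphs) and the normalization rules are made up for illustration, and it only catches paragraphs that match after lowercasing and stripping punctuation. It says nothing about how Google does its matching.

```python
import hashlib
import re


def paragraph_fingerprints(page_text):
    """Map a hash of each normalized paragraph to the original paragraph."""
    prints = {}
    # Paragraphs are assumed to be separated by blank lines.
    for para in re.split(r"\n\s*\n", page_text):
        normalized = " ".join(re.findall(r"[a-z0-9']+", para.lower()))
        if normalized:
            digest = hashlib.sha1(normalized.encode("utf-8")).hexdigest()
            prints[digest] = para.strip()
    return prints


def shared_paragraphs(page_a, page_b):
    """Return the paragraphs of page_a that also appear in page_b."""
    a = paragraph_fingerprints(page_a)
    b = paragraph_fingerprints(page_b)
    return [text for digest, text in a.items() if digest in b]


if __name__ == "__main__":
    original = "Intro by my friend.\n\nWidgets are great, and cheap!\n\nBye."
    mine = "My own intro.\n\nwidgets are GREAT and cheap\n\nMy own ending."
    print(shared_paragraphs(original, mine))
```

Anything this returns is a paragraph worth rewriting if you want the page to stand on its own.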
A competitor copied my page exactly (exact HTML), then added 20% of unique content onto the bottom. This was enough for my page to be treated as a duplicate and for its ranking to drop substantially. It has taken 4 weeks for the page to get back its proper ranking.
A number of sites link to me in the footer, or in the same place, on every page of their site. When I search for site:abc.com term, Google shows some but not all of the pages, then says "In order to show you the most relevant results, we have omitted ...". For another site that has the links in the left border, it shows only one page, then the message.
I came across an interesting article on WebProNews, LondonLinkBuildingAndDomainNameIssues (many links that look the same can also be seen as duplicate content).
I have just started a forum where, even with unique titles and meta descriptions, a site:domain.com search shows only the first 30 of over 300 pages (followed by the "In order to show..." message), because the rest of the top of each page (the navigation structure) is the same. I will be moving unique content higher on the page.
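A quick way to see how much of the top of each page is identical is sketched below. This is only a rough diagnostic under my own assumptions: the names shared_prefix_length and prefix_share are invented, the pages are assumed to be plain text already extracted from the HTML, and it makes no claim about what Google looks at.

```python
def shared_prefix_length(pages):
    """Count the leading words that are identical across all pages."""
    word_lists = [page.split() for page in pages]
    count = 0
    # zip stops at the shortest page, so this never runs past the end.
    for words in zip(*word_lists):
        if len(set(words)) != 1:
            break
        count += 1
    return count


def prefix_share(pages):
    """For each page, the fraction of its words taken up by the shared top."""
    n = shared_prefix_length(pages)
    shares = []
    for page in pages:
        total = len(page.split())
        shares.append(n / total if total else 0.0)
    return shares


if __name__ == "__main__":
    pages = [
        "Home Forum Search Login Topic about red widgets",
        "Home Forum Search Login Question on blue gadgets today",
    ]
    print(shared_prefix_length(pages), prefix_share(pages))
```

If the shared part is a large fraction of short pages, moving the unique content up (or trimming the repeated navigation text) changes what the first part of each page looks like.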
So Google is very aware of ANYTHING that is the same, be it whole pages or any part of a page. Truly unique content is essential.
Our domain is customizedwidgets.com, however we've also registered customisedwidgets.com for UK customers who might just remember the name but not the American spelling. customisedwidgets isn't linked from anywhere at all.
Today, we got listed in Yahoo UK's list of top 100 women's sites...using customisedwidgets! Somebody must have mentally recalled the name and used their British spelling. Thing is, now I'm afraid Google might follow that link, see AN ENTIRE DUPLICATE WEBSITE and drop our main site from the index.
Any suggestions?
This is also the best advice for those with domain.com and also domain.co.country, or other misspellings.
Remember that Google can also pick up a domain for normal spidering via AdSense (personal experience).
With two totally duplicate sites, Google is more likely to drop the site that has the fewest text links/PR/PR via text links pointing to it for a particular query. I have seen duplicate content domains each appear at serp=1 for different search terms, with the other domain nowhere to be found. The only explanation I have found is that the external links were different.
How is this possible with all the above postings?
The page will be pushed down in the rankings (this is NOT via a PR Zero, although the page can have a pr0, duplicate content is not the source).
Sites all have unique ip addresses, most file names are the same, domain names are of course different, but otherwise they are pretty much carbon copies of each other (it's a rough draft of a basic content management system, sort of).
Some sites have slight differences scripted in depending on the site, but it's more or less all the same exact words.
However, in the last few months all of the sites have been plummeting in the SERPs and hits are way down. Google definitely caught these, but it hasn't affected PageRank, which isn't very high: 3-4, 5 at best. This strikes me as confirmation for those who claim PageRank is irrelevant, although I don't pretend to know this for a fact.
I too am wondering just what it is that triggers whatever is being triggered in Google, since I should fix these sometime. It's not a high priority at the moment (the client doesn't really care, and I don't really care), but I would like to know what is happening and how much duplication I need to remove before the sites are once again considered unique.
<added>Oh, I did find one site with page rank 0, but I'm not sure if that site even has any other site linking to it.
You could use all sorts of encryption and other technology on your site, but most of these methods can be circumvented, and they are not good for Googlebot or for human visitors.
The best way is to file a Digital Millennium Copyright Act (DMCA) complaint with Google, and to use it, and the threat of such action, to pressure both the website owner and the hosting provider to take the site offline.
[edited by: t2dman at 4:12 am (utc) on June 24, 2004]
I've been trying to figure this duplication question out too. I have about 8-10 sites that blatantly share content: part of the page titles, most headers, and page content. ... Forget that theory, they've been sharing content since about mid-January.
although the page can have a pr0, duplicate content is not the source)
Brett seems to be pretty firm on this point. I'm not going to claim I know what's up for a fact, but I can tell you that only one of the sites has a PR0; all the others are unchanged in PR, and they are absolutely duplicated. I'd like to get a better idea of what I can get away with before putting much work into this problem. It's a low priority for me, though obviously not for your site.