Forum Moderators: open

Duplicate content threshold

what % duplication is safe for G?

     
4:10 pm on Mar 3, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:Aug 22, 2002
posts:153
votes: 0


Does anyone have any experience or knowledge with how much duplicate content a page can have before Google penalises?

tia

4:12 pm on Mar 3, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:Jan 25, 2003
posts:116
votes: 0


I believe if you have the same word on more than one page, you'll be penalised.

I accidentally had the word "the" on two different pages, and I didn't realise.

Now I'm ruined.

4:17 pm on Mar 3, 2003 (gmt 0)

Full Member

10+ Year Member

joined:Oct 30, 2002
posts:236
votes: 0


I can't see it myself - I have "it" on over 30 pages and I have been fine with that - I guess it is only a matter of time before "G" catches up with me though.
4:25 pm on Mar 3, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Jan 7, 2003
posts:1230
votes: 0


hi curiousweb,

i can't exactly say when google penalises and i think google will keep this top secret because they don't want anyone to trick their spider / index.

i myself have the same pages under 2 domain names and google did not penalise the two pages. so this is 100% duplication and it's not penalised. i know this does not answer your question completely, but it may give a hint.

4:28 pm on Mar 3, 2003 (gmt 0)

Full Member

10+ Year Member

joined:Oct 30, 2002
posts:236
votes: 0


Some say 10%. Personally, I would never risk duplicating my content in any way, shape or form. Too risky and only a matter of time before you get reported to google by a competitor and banned.
4:36 pm on Mar 3, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 22, 2001
posts:781
votes: 0


I would aim for less than 10%.

If the page layout and HTML are significantly different I would imagine that the content wouldn't have to be so unique.

Having said that I avoid duplication at all costs :)

4:45 pm on Mar 3, 2003 (gmt 0)

Full Member

10+ Year Member

joined:Dec 5, 2002
posts:219
votes: 0


Total Paranoia,

You seem to deserve your nickname, don't you? ;)

Most of the e-commerce dynamic sites use some kind of templating for their item pages. In that case there is automatically a certain amount of common elements, including text and links, on all those pages. Sometimes the pages only differ by a few characters. What looks more like a given manufacturer's 60GB hard disk than an 80GB disk from the same manufacturer? They only differ in the reference, price and a few characters in the technical specs...
In that case, those 2 pages would have a lot more than 10% in common, but shouldn't be considered as spam.

Am I missing something?

Dan

5:03 pm on Mar 3, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:Aug 22, 2002
posts:153
votes: 0


thanks for the replies

I think that the figure may be a lot higher than 10% but am not sure at all. I'm definitely sure that having the word 'the' on more than one page is ok so if you have a penalty I think it must be for something else.

The content I have is probably about 60-80% the same, depending on the page, as the content of another company that I am a reseller for. I am hoping that this will be OK, since the design and layout of my pages are totally different to theirs.

Do you think it makes a difference to Google if the content is duplicated from another site rather than being duplicated in the same domain?

5:10 pm on Mar 3, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:Oct 5, 2001
posts:109
votes: 0


I was under the impression that duplicate content was more of an issue within a domain. I think Google just ignores the newer site of two sites with duplicate content.

Having "the" on more than one page is either poor sarcasm or a bit extreme.

I think the threshold is definitely much higher than 10%.

5:15 pm on Mar 3, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:Jan 27, 2003
posts:48
votes: 0


Must be higher than 10%. Think of major sites that all list the same news releases, for example. Or online stores featuring product descriptions taken from the manufacturer. A lot of perfectly reasonable duplication.
5:15 pm on Mar 3, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member bigdave is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Nov 19, 2002
posts:3454
votes: 0


So what you are saying is that you do have duplicate content that would fill up the SERPS with substantially the same thing as the other site?

If a human had the two pages open right next to each other, would they consider them to be the same thing? That is the sort of situation that google is trying to combat. You should be worried.

What you should do is either put in the effort to get higher PR than the site you are reselling for, so that their site would be the one to disappear if caught in the filter, or work on producing substantially improved content on your own site.

I'm in favor of providing better content. Most information on products provided by retail sites is garbage anyway, so if you actually give the surfer good content, you might suddenly find your sales improving.

5:23 pm on Mar 3, 2003 (gmt 0)

Full Member

10+ Year Member

joined:Oct 30, 2002
posts:236
votes: 0


I know of somebody who has lots of links to each main area of his website at the bottom of each page, under the heading "search engine information". From that area he links to duplicated pages using numbers only, changing the meta tags and "location" each time for each duplicated page (for example, "service location"). This has taken a lot of work to do as there are hundreds of duplicated pages.

I think this is a brave move but not one I would want to take.

No penalty as yet for this person. All pages have high PR.

IMHO, you should be fine with taking the information from another page so long as your site layout is different and your HTML is set out differently from the merchant's.

5:25 pm on Mar 3, 2003 (gmt 0)

Full Member

10+ Year Member

joined:Oct 30, 2002
posts:236
votes: 0


<<Having "the" on more than one page is either poor sarcasm or a bit extreme>>

Oh come on needsomehelp, it was just a little humor during these difficult times ;)

5:31 pm on Mar 3, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:Oct 5, 2001
posts:109
votes: 0


I must say that the professionalism that once reigned at webmasterworld seems to be lost. Too much joking around. Too many worthless, long threads that get off topic.

Don't get me wrong, the help offered is still great. But we were all in a newbie's shoes once. Advice that seems sarcastic and funny to some doesn't read that way to those who are less experienced than the people offering the joke.

5:35 pm on Mar 3, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:Nov 30, 2002
posts:59
votes: 0


This has reminded me of a nasty thought I had this morning.

One of my sites has the entire text of the Bible on it. Obviously a lot of other sites will have that identical text. The headings/footings will be different, but still 85% of a page would be the same. I have good reasons for wanting to provide this on my website, and for wanting to allow google to search it.

I don't consider it spam of any sort, and it's not duplicated on my site (i.e. it's only on my site once), but am I at risk of being penalised by google? I haven't been so far in the four months it's been up.

5:41 pm on Mar 3, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member bigdave is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Nov 19, 2002
posts:3454
votes: 0


Do not consider the duplicate content filter to be a penalty in the classic sense. Google will just remove the lower PR sites from the SERPs. They will not "penalize" your site for this, they will just remove your pages from the results.

This can be penalty enough if you want those pages to show up in the SERPs, but it is not the same as getting a penalty for spamming.

5:43 pm on Mar 3, 2003 (gmt 0)

Full Member

10+ Year Member

joined:Sept 2, 2002
posts:262
votes: 0


I also think that the structure is at least as important as the text.

I spent some time building a small website in English using mostly html and tables (using just tables can make a big difference in the visual aspect of the page). The code was very unique.
Then I translated all the text to my mother language, keeping the html structure, and put it in a subfolder of the same domain. This translated site was listed by several good sites (including the regional Dmoz), BUT the pagerank has been zero for a few months (PR of the original site is 4 - and it's not in Dmoz).

To me, this is an indication that google is seeing them as duplicated (even though there is no repeated text) and ignoring the translated pages.

5:49 pm on Mar 3, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:Nov 30, 2002
posts:59
votes: 0


Many thanks for that bit of info Dave - in that case it's no penalty realistically. My SEO isn't good enough (yet) for me to even have hopes of top ten for most phrases in the bible! ;)

The phrases it helps me with are phrases in which I do well at present, so i can sleep easy - thanks.

6:07 pm on Mar 3, 2003 (gmt 0)

Preferred Member

10+ Year Member

joined:Dec 7, 2001
posts:579
votes: 0


With similar and related product/information/specification, it is inevitable that there will be "Page Similarity".

I usually try to work on a variance of at least 20%.

Unfortunately there is no measuring stick to determine similarity so it comes down to a judgement call.
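
If you want a rough number to put next to that judgement call, one common self-check is to compare two pages by their overlapping runs of words ("shingles") and take the Jaccard similarity of the two sets. This is only a sketch under my own assumptions: the function names, the 3-word shingle size and the sample product text are all made up for illustration, and nothing here is claimed to be how Google measures similarity.

```python
import re


def shingles(text: str, k: int = 3) -> set:
    """Return the set of k-word shingles (overlapping word runs) in text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < k:
        # Too short to form a full shingle: treat the whole text as one.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def similarity(a: str, b: str, k: int = 3) -> float:
    """Jaccard similarity of the two shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 1.0  # two empty texts are trivially identical
    return len(sa & sb) / len(sa | sb)


# Hypothetical product blurbs that differ in a single word.
page_a = "Acme 60GB hard disk, 7200 rpm, 2MB cache, three year warranty"
page_b = "Acme 80GB hard disk, 7200 rpm, 2MB cache, three year warranty"
print(f"{similarity(page_a, page_b):.0%} similar")
```

A "variance of at least 20%" would then roughly mean keeping this figure under 0.8, but the number moves a lot with the shingle size, so treat it as a personal yardstick only.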

Google is going to be responsible for thousands of ulcers in coming years and duplication plus linking will be the main triggers to this stress induced affliction.

6:12 pm on Mar 3, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:Aug 22, 2002
posts:153
votes: 0


<<So what you are saying is that you do have duplicate content that would fill up the SERPS with substantially the same thing as the other site?>>

Yes, it is product documentation which I either need to show or to direct them to the manufacturer's site, which I don't want to do as then they may not come back... I have tried to modify the content and add in my own unique USPs but a lot of the wording is still the same.

I get a fair amount of visitors to other pages who might be interested in these pages so I'm not too worried but if they showed in the serps as well this would be a bonus.

6:24 pm on Mar 3, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member bigdave is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Nov 19, 2002
posts:3454
votes: 0


These pages won't hurt you if they are marked as duplicate content. You just won't get whatever traffic they would deliver.

If you want to keep them in the SERPs, just make sure they have a higher PR than the manufacturer's version.

The thing is, that it really is duplicate content, and those pages should be removed to keep the SERPs clean.

6:40 pm on Mar 3, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:Aug 22, 2002
posts:153
votes: 0


thanks BigDave

I understand and agree with the principle involved as I can see that if they don't do this then the top search results for many searches will be the same content on different sites which is pretty pointless and frustrating from the surfer's point of view.

Off to find some good links...

7:03 pm on Mar 3, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member bigdave is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Nov 19, 2002
posts:3454
votes: 0


Just a note for those of you that are too concerned about duplicate content:

Google does NOT compare all 3.5 billion pages against all the other 3.5 billion pages!

That would require 3500000000 + 3499999999 + 3499999998 + ... + 1 = 6125000001750000000 page compares.

They probably only concentrate on those sites that return high in the same SERPs, possibly limiting it to the front page of the popular searches, along with those that are caught due to spam reports.
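
For anyone who wants to check the arithmetic, here is a quick Python sketch. The function names are my own, and the page count is just the 3.5 billion figure quoted above. The sum n + (n-1) + ... + 1 has the closed form n(n+1)/2. If you only count pairs of two different pages the figure is n(n-1)/2, which is slightly smaller but the same order of magnitude.

```python
def sum_down_to_one(n: int) -> int:
    """Closed form of n + (n-1) + ... + 1."""
    return n * (n + 1) // 2


def distinct_pairs(n: int) -> int:
    """Number of unordered pairs of two different pages."""
    return n * (n - 1) // 2


PAGES = 3_500_000_000  # index size quoted in the post above

print(sum_down_to_one(PAGES))  # 6125000001750000000
print(distinct_pairs(PAGES))   # 6124999998250000000
```

Either way it is around 6 billion billion compares, which is the point: an all-against-all pass is not practical at that scale.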

4:30 am on Mar 4, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:July 13, 2002
posts:119
votes: 0


curious ...

I think your easiest solution is to have a link to the mfg doc's site that opens in a new window ... you avoid the content dup and keep the visitor ...

I think we've stumbled onto something huge ... maybe the googler hasn't updated because it is still penalizing all pages with ... "the"

 
