What defines "substantially duplicate content"?

         

kire1971

10:58 pm on Jan 3, 2003 (gmt 0)

10+ Year Member



The Google guidelines state, "Don't create multiple pages, subdomains, or domains with substantially duplicate content." But what is considered "substantially duplicate"?

I have several sites with separate domain names and servers. Some of those sites are similar in content and the others are more in-depth versions of the content.

As an example, let's say the similar sites are a directory of Animals divided into subcategories like Cats, Dogs, etc. Each of these similar sites has its own layout and design, but the main portion of content, the directory listings with descriptions, is basically the same. In addition, the titles and descriptions of each site focus on keyword synonyms (one site will be titled Dogs, the similar page on the second site will be Canines, etc.) to cover the different ways people may search.

Is creating similar sites for synonym keywords against any of the Google guidelines? Is this considered substantially duplicate content?

These sites were created over a year ago and have been happily listed ever since. They're well designed and useful for whoever finds them. However, each Google update makes me nervous that they'll be delisted.

Chris_R

11:20 pm on Jan 3, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You won't get a surefire answer on this.

A good rule of thumb is: are you doing this for the users or the search engines?

If the answer is the latter, you can't very well expect that Google will like it.

I wouldn't worry about getting banned unless you are being abusive. Most likely, if Google's filter picks it up, they will drop one of the pages.

Tons of people and sites have duplicate content. There is not necessarily anything wrong with this. Google just tries not to show more than one copy.

If you type in a few words from "The Raven", for example, they just don't give you 10 million copies of The Raven.

compare:

[google.com...]

with:

[search.lycos.com...]

If your title is different and a decent part of the content is different - you should be ok. Layout can be the same - most sites have the same layout.

Google isn't going to let you know the exact amount - as it is a secret. Just keep the principles in mind and you should be ok.

eyeinthesky

11:26 pm on Jan 3, 2003 (gmt 0)

10+ Year Member



<< Is creating similar sites for synonym keywords against any of the Google guidelines? Is this considered substantially duplicate content? >>

I've got the same problems here, not synonyms but in other languages. I've translated these pages "wholesale" and Google seems to think they are duplicates.

Any way around this?

Thanks and a happy and successful 2003 to all!

my3cents

11:34 pm on Jan 3, 2003 (gmt 0)

10+ Year Member



You may want to keep in mind that if your competitors see that you have several websites that are very closely related, they will be able to pretty easily compile a list and report this to the directories. I do this all the time, because it's pretty clear that most people do this to try to get around what the automated filters get rid of. The bottom line is: are you doing this so that you can increase your backlinks and PR? Are you doing this so that you can get around the obstacles the SEs have in place to filter duplicate listings? Are you doing this as an attempt to manipulate your SE position? All of these things seem to be against the rules; Google seems to state them pretty clearly in saying not to spread your content over multiple domains. If so, it may work great for you now, but I wouldn't recommend it.

kire1971

11:46 pm on Jan 3, 2003 (gmt 0)

10+ Year Member



The reason for the similar sites is really for the searchers and the search engines. As per my example, most of the searchers will search for "Dog" but some will search for "Canine". It's very difficult to get a single page to rank well for two phrases, so I created additional sites for synonym keywords and keyphrases. This way, whichever way someone searches, they'll find one of the pages. I'm not trying to dominate the search results with 5 out of 10 listings for the same phrase, I'm just trying to cover all the bases of how someone will find my pages.

Hardwood Guy

11:54 pm on Jan 3, 2003 (gmt 0)

10+ Year Member



Kire:

I know where you're coming from. I have two sites that have a few pages that are basically duplicates and I eventually want to remove the ones from the original site for fear of being penalized. Unfortunately that site is the one that is getting 80% of the traffic as the other is only two months old. Once the new catches up with the old I plan a complete makeover.

my3cents...I like that! I've used my .06 many times and mentioned inflation as a reason, but nobody notices. Good originality!

my3cents

11:55 pm on Jan 3, 2003 (gmt 0)

10+ Year Member



I can understand how you are justifying it, but the fact is, Google states that they don't want you to do this. They warn against it, and trust me, if a list were compiled and sent to the directories, they would remove you. I know from personal experience. Also, it is not hard for a site to do well for many different phrases, provided that the site is large enough and well promoted enough. It sounds like, instead of developing more content to make one site bigger and therefore targeting more search phrases, you are asking if it's OK for you to copy the same content over and over and make a few minor wording changes, and whether you would get in trouble for it. Some very sound advice: spend your time developing more content, not copying your content across several domains. If you develop enough targeted content for each keyword as sections of your main site, you will enjoy the same benefits of showing up for several different search results. If you simply duplicate your content and change the keywords, you will eventually be caught.

[edited by: my3cents at 12:04 am (utc) on Jan. 4, 2003]

my3cents

11:56 pm on Jan 3, 2003 (gmt 0)

10+ Year Member



thanks hardwood guy. :)

gibbon

12:06 am on Jan 4, 2003 (gmt 0)

10+ Year Member



We have multiple domains, with a bit of duplicate content spread across each domain (terms & conditions, contact info, etc.) as well as a few product pages.

Reading this thread worries me. Should we consider this duplicate content? And if so, should we put a robots noindex tag on all instances of duplicate content?
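
For anyone who does go the noindex route, here is a minimal sketch of how one might check which pages actually carry a robots noindex meta tag. It assumes the tag takes the usual `<meta name="robots" content="...">` form; the class name `RobotsMetaParser`, the helper `has_noindex`, and the sample pages are made up for illustration, and it says nothing about how any search engine treats the tag.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # attrs is a list of (name, value) pairs; value may be None.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        for part in attr.get("content", "").split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def has_noindex(html):
    """Return True if the page's robots meta tag includes 'noindex'."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical pages: a shared terms page marked noindex, and a product page.
terms_page = (
    '<html><head><meta name="robots" content="noindex, follow"></head>'
    "<body>Terms and conditions</body></html>"
)
product_page = "<html><head><title>Widgets</title></head><body>Widget</body></html>"

print(has_noindex(terms_page), has_noindex(product_page))
```

Running something like this over a list of the shared pages would at least confirm the tag is where you think it is before the next crawl.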

Marcia

12:07 am on Jan 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I've run into this, being surprised to find out about additional domain names I wasn't told about pointing at a site - the same identical pages, returning a 200, too.

I've heard a few times about an 80% similarity cap, though less is probably safer, with different page structure.
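
To put a rough number on "percent similar", here is a minimal sketch of one common way to compare two pages: split each text into overlapping three-word "shingles" and take the Jaccard overlap of the two sets. This is only an assumption about how such a figure could be measured; nobody outside Google knows what their filter actually computes, and the function names and sample sentences are made up for illustration.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Too short for a full window: treat the whole text as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def similarity(a, b, k=3):
    """Jaccard similarity of the two shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 1.0  # two empty texts are trivially identical
    return len(sa & sb) / len(sa | sb)


# Hypothetical page text: identical except for one synonym swap.
page_dogs = "A directory of dogs with breed listings and short descriptions"
page_canines = "A directory of canines with breed listings and short descriptions"

print(f"{similarity(page_dogs, page_dogs):.0%}")
print(f"{similarity(page_dogs, page_canines):.0%}")
```

Note how much a single swapped word moves the score on a short text: it changes three of the eight shingles in each sentence. On a full page of shared directory listings, a few swapped keywords would barely dent the number.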

my3cents

12:12 am on Jan 4, 2003 (gmt 0)

10+ Year Member



Just to elaborate on my post above: if you decide to develop more content and optimize several sections of one website, you will have internal pages rank high for the search phrases that you target with those internal sections. This brings visitors into your site on the pages most likely to pertain to what they are looking for. This can work very well and also increase your internal PR without leaving you at risk of someone reporting you. If you have a pet site and you develop sections for different animals, add content to each section and group together pages that target variations of the search phrase. I do this with my websites and it works great. I see a lot of my internal sections and pages outranking my competitors, and my main page is #1 for 4 of the variations, simply because it's the main page of an ever-growing website that is linked to sections and pages that each target a handful of different variations of the search phrases. I think that people originally started spreading their content out across multiple domains so that they could achieve multiple directory listings, since the directories don't seem to want to add "deeplinks" to sites that are very diverse in content. IMHO, you are breaking the rules. The algo probably will not catch you, but one of your competitors may. Either way, you would be better off building one big site and optimizing each page and each section to target visitors searching different phrases, and you won't be at risk of a ban.

Hope this info helps.

SlyOldDog

12:15 am on Jan 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



There's no good reason to use subdomains anymore. Looksmart is the last search engine that used URL names to help your rank in any discernible way.

Using subdomains or separate URLs will help you short term because you can get several listings on one Google page and you can boost your PageRank artificially, but you may get a slap on the wrist for it too if you get reported.

There's no problem optimising one site for all the keywords you please. Just make a separate page for each set of keywords.

We only have 2 domains per subject area, and the second one is disaster backup only in case the main site gets dropped. We don't promote the second site and it normally doesn't make the page 1 SERPS. If we get dropped, we'll ask the backward linking sites to move their links to the new site. This really is the only way to sleep at night. :)

kire1971

12:16 am on Jan 4, 2003 (gmt 0)

10+ Year Member



The words are synonyms. How does one create multiple pages of original content for synonyms? By definition, it's the same thing. Should I fill an additional page with complete BS just so I have some different content? Seems that's a worse offense than what I'm doing.

It seems to me that what I did is what doorway pages were originally used for... basically to direct people to a relevant page when they're searching using different keywords. Google doesn't like doorway pages because "Google does not encourage the use of doorway pages. We want to point users to content pages, not to doorways or splash screens." Instead, I have skipped the doorway page and created a page of content where a doorway page would have been. Isn't this a good thing? The intent is not to fool a search engine or a searcher, it's merely to get my content to those who are looking for it in different ways.

my3cents

12:23 am on Jan 4, 2003 (gmt 0)

10+ Year Member



gibbon, I wouldn't worry about those types of pages. I don't think that someone is going to report you because you have similar terms & conditions, contact info, etc. The fact is that the current algo is not finding most of the duplicate content that is breaking the rules, and I doubt that the rules were intended to find sites that have the same terms and conditions. Most Yahoo stores use the default privacy statements, etc. Also consider that people aren't searching to see how many websites have similar terms and conditions. Where you get into trouble is when you blatantly duplicate content and make a few minor changes with the intent of having several domains show up for common search terms, especially in commercial areas, where these tactics may be taking sales away from competitors. There is a good chance that one of the competitors is going to find this, and how will it look when they see the same content on dozens of domains, all owned by one company? I'm sure that would look much worse than seeing sites that have the same terms and conditions.

I think it's common sense, people: if you're trying to cheat the search engines, and you are spreading content across multiple domains for the sake of getting better rankings and targeting more phrases, it is going to look like spam or at least put you at risk of being reported. I have had very good luck just building lots of content on a single site and having my internal pages outrank competitors for common phrases. It really does work, and the larger your website is and the more pages of content it has, the better internal PR you have. All of these factors help boost your main page to the top on several variations.

gibbon

12:25 am on Jan 4, 2003 (gmt 0)

10+ Year Member



It would have been great if we could have made one big site rather than a few smallish category sites.

However, this is not how it has worked in practice. We tried the one big site approach, but it has not been as successful as targeting niches with a little bit of content crossover.

This is why we have multiple domains. It has been done in the customers' best interests, and we do not crosslink either. Surely this is not a bannable offence?

my3cents

12:30 am on Jan 4, 2003 (gmt 0)

10+ Year Member



kire1,
I understand that you are not intending to be dishonest, and I understand that this is the justification most people who do this have. To answer your question in the terms you used before: create a section of articles or content on your main site for dogs. Develop several pages of unique content. On some pages, title it canines and target that phrase; on other pages use the word mutts and optimize for that phrase, etc. If you do this instead of just taking one content page and rewording it over and over, you will have better results. Plus, your main website will quickly grow; as it becomes bigger, your backlinks will increase each month and you will gain higher PR over time. The side effect is that your internal pages will also gain PR from your own relevant pages, and you will find that when you search for mutts, your internal pages will outrank your competitors. Plus, when people search for mutts, they will find your page, which is using the term that is in their head. I'm telling you (and I may be opening myself up for abuse here), I report exactly what you are doing when my competitors do it, and they get removed. I happen to love dogs, and animals in general. I suppose if you had used casinos or something as an example, my attitude would not be to provide sound advice. But really, building one large site with original content is much more effective and safer than just duplicating and rewording content across multiple domains.

my3cents

12:39 am on Jan 4, 2003 (gmt 0)

10+ Year Member



hey gibbon,
It doesn't sound to me like you are at risk. I mean, if you sell a diverse line of products, there may be good reason for you to have multiple sites, especially if there is very little crossover. I own several websites, and they are completely unrelated. It would not make sense for me to market these products together; it would only divide things further and, you're right, would not be effective. If you sell hair brushes on one site and lawn mowers on another, I can't imagine anyone suggesting that you should combine them into one site. The issue that I am talking about is when people take the same content and simply reword it for the purpose of targeting a slight variation of the search phrase. I have seen thousands of these sites, and they are breaking the rules, and I do report them. They sell hair brushes, and they duplicate the content across multiple domains for the sole intent of getting more search engine traffic. Just as people tell me not to waste my time reporting them, I am saying the same thing: don't waste your time duplicating content, develop original content. There is nothing saying that one person can't own several companies, and if it doesn't make sense to combine them into one website, then have separate websites, but don't just duplicate the content. If someone is selling widgets and they make 100 websites that all basically sell the same thing, each one targeted at a different color widget, they are at risk of being reported, banned, etc.

gibbon

1:10 am on Jan 4, 2003 (gmt 0)

10+ Year Member



my3cents, thanks for that.

as a newbie it is easy to get slightly paranoid around here :)

kire1971

1:19 am on Jan 4, 2003 (gmt 0)

10+ Year Member



Let's look at this another way...

There are thousands of sites using the same ODP data that are not considered duplicates. All they have done is repackage the same directory information in different ways with different keywords and phrases.

There are thousands of sites using the same AP newswire articles that are not considered duplicates. All they have done is repackage the information.

I could go on.

I am considering doing a similar thing with my directory content and provide it to other sites. Are they all then to be considered duplicates? The only difference here would be me owning and operating the sites vs. different people owning and operating the sites. Either way, the majority of the content would be the same.

3cents, I get the feeling this is not as cut and dry an issue as you make it. Any other opinions? Maybe GoogleGuy can weigh in on this?

my3cents

1:37 am on Jan 4, 2003 (gmt 0)

10+ Year Member



kire,
What I'm saying is that if you are duplicating your content and tweaking it to show up for different search results, spreading it across different domains, etc., there is a good chance that someone will report you. I have reported many people who have done this, and they have lost their ODP listings and their Y! listings and dropped out of sight. I don't think it's a good idea to do it and run the risk. My other point is that if you took the time that you are spending duplicating and tweaking your content, buying more domain names, etc., and used it to develop original content that targeted the different variations of your phrases, you would probably get better results and better search engine rankings. Another point is that although there are several sites that duplicate dmoz content, how many of them do you see in search results? I search for my key phrases, and the Google directory and the dmoz directory usually show up within the first few pages; I don't see any of the others showing up in the first 200. Also consider that Google themselves duplicate this directory, so why would they ban duplicates of the ODP? This wouldn't make sense. As far as articles go, yes, many articles are posted on thousands of different websites. I even submit my articles to dozens of websites. Do they have my duplicate content? Yes, but they are in no way connected to me. Do I run the risk of a competitor reporting me? No. Why? Because there is a pretty big difference between ME duplicating my content and others duplicating my content. I agree that it is not a cut and dry situation, and it's true that you can keep doing what you're doing and probably get away with it when it comes to the Google algo. BUT, how comfortable would you be if your competitors had a list of all of your websites?
From personal experience I can tell you that if you were my competitor, it would probably take me less than half an hour to make a list of every website you own and report it, and within a month or two you would probably be in really bad shape. I don't want to make anybody paranoid, and I'm not on here to try to find people to hunt down, but if I see something that looks like it's breaking the rules, I will report it. How do you think a human editor would look at my report? Do you think they could tell the difference between an article that's listed on hundreds of websites and a person who has spread their duplicate content across many domains?

It's up to you how you want to do it. I am speaking from personal experience that duplicate content spread across multiple domains can get you in big trouble, and that building a site filled with original content really works well.

kire1971

2:30 am on Jan 4, 2003 (gmt 0)

10+ Year Member



Are you speaking only of ODP and Yahoo directory listings? I can see where that wouldn't be appropriate with similar sites such as these, but I'm specifically talking about Google search. As for the Googlebot, I don't think it cares, since they are sufficiently different and I wouldn't consider them mirrors at all. They are as similar as if several sites were to use the same ODP data. The only difference is that they are all owned by me. I am only concerned with a Google hand review and whether a Google editor would consider them "substantially duplicate".

Go60Guy

3:49 am on Jan 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



This topic comes up frequently. The example often given is syndicated content (articles) that are picked up and duplicated word for word on many different websites.

So far, the wisdom I've been able to discern is that this sort of content, being wrapped in different navigation and structure, is not at risk of breaching the "substantially duplicate" admonition.

Also, I seem to recall Brett saying that product descriptions provided by manufacturers that go out to all distributors, retailers, etc. and show up in duplicate on multitudes of websites do not run afoul of the substantially duplicate risk.

This area is fairly nebulous, and any analysis of content which covers the same territory for its susceptibility to being a duplicate, is bound to be subjective to a large extent.

I've seen conjectures that anything less than 80% or 50% similar is OK. Take your pick.

BTW, my3cents, welcome to the boards. One thing I might mention, and I hope you'll take this constructively, is that my eyes glaze over when I try to read your posts. It would be very helpful if you could break them into paragraphs, as I feel you have much good input to contribute here and I'd like to read it.

my3cents

11:14 am on Jan 4, 2003 (gmt 0)

10+ Year Member



hey go60, I will try to break it up, thanks for the input and the welcome.

My whole point about this is not whether Google will ban you for it. People with duplicate content can probably slip under Google's radar, simply because of the legitimate reasons for duplicate content that go60 pointed out.

My point is that if you do this, you run the risk that someone will report you. Spreading content across multiple domains, duplicating content and tweaking it to target a different set of search terms, makes you a target and puts you at risk. Google may never find this, but there is a good chance that your competitors will.

If you want to do it go ahead, but you are running a pretty big risk. I'll tell you right now that if I find you, I will report it. I'm not looking for you, but someone else may be. If you are duplicating your content with the intent of increasing PR or increasing SE position, you are breaking the rules, and once someone reports you....

Make whatever choice you want, if you want to justify it, go ahead. If you want to talk yourself into believing that you're doing nothing wrong, fine.

I am telling you from experience that you are gambling. If you are comfortable gambling, that is your choice, but the actions you described in duplicating content are against the rules, and if someone reports you, you will probably get hurt pretty badly.

I wish that everyone who is looking for ways to cheat the system would realize that substantial original content will get you so much further. Seriously, it's not some kind of joke, it really works. Why run the risk? You could just develop a few dozen new pages a month on your main website, make them keyword rich, and target the content and titles to the search terms you want to target. You really will do better than the duplicate sites. Why would anyone want to run the risk of a ban when there is a better way that has been proven a thousand times?

Good_Vibes

8:55 pm on Jan 16, 2003 (gmt 0)

10+ Year Member



I have an interesting problem I could use an answer for.

One of my clients has a French section for his web site. He has also bought a domain name (in French) so that he can refer people to it. On the French domain name, he only has the French home page. All the links then point to the French section on the English domain.

He does this because he can't afford having 2 secure keys for the shopping cart.

Thus the problem: The French home page is duplicated.

In September his English side got a PR 0 (white) from Google. I found the duplicate content and put up a robots.txt on the French domain so that Google would not spider it.

This was done December 12th 2002. Since then, he still has a PR 0 even though there are good sites linking to him.

My question is: Will the robots.txt keep Google from the French domain site, and will it be enough to get the penalty removed from the English site?
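
On the first half of that question, one way to sanity-check what a robots.txt file permits is Python's standard `urllib.robotparser`. The sketch below is only an illustration under stated assumptions: the robots.txt contents, the `example.com` URLs, and the helper name `can_crawl` are all hypothetical, and the result shows only what a well-behaved crawler should do with the file, not whether any penalty gets lifted.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt, url, agent="Googlebot"):
    """Return True if robots_txt permits the given user agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical robots.txt for the French domain: keep every crawler out.
block_everything = """\
User-agent: *
Disallow: /
"""

# Hypothetical alternative that only blocks one directory.
block_one_directory = """\
User-agent: *
Disallow: /private/
"""

print(can_crawl(block_everything, "http://www.example.com/"))
print(can_crawl(block_one_directory, "http://www.example.com/"))
print(can_crawl(block_one_directory, "http://www.example.com/private/page.html"))
```

The file has to sit at the root of the domain it is meant to cover, so the copy that matters here is the one served from the French domain itself.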

Brett_Tabke

9:45 pm on Jan 16, 2003 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



> You won't get a surefire answer on this.

Sure there is.

I've had extensive experience with dupe content over the last year. Sites being moved, sites being pure mirrors - etc etc.

I know one guy with 75 domains that are probably classified as duplicates by Ink and Google. fooatNewYork.com, fooInMiami.com, fooInStLouis.com - etc etc. The only difference is the map to find their store on the home pages.

Google thinks they are dupes. So What? One just gets a lower pr than the next one. No penalty, no drops from listings, etc etc. I've never seen one removed from the index for dupe content.

Dupe content can be a friend if used appropriately.