Duplicate Content Concerns

effect on site as a whole

         

prairie

3:39 pm on Nov 4, 2004 (gmt 0)

10+ Year Member



The motivation for starting this topic comes from the "Hilltop Magic Pill" thread - [webmasterworld.com...]

Perhaps naively, it's just occurred to me that the duplicate content filter may also serve as an affiliation filter. Originally I just saw it as something which ensured a single results page didn't get filled with repeat information.

If a site has X amount of content identical to another site, Google could also take them to be related. This would effectively trim near-identical domains owned by one company in different regions, e.g. .com/.co.uk/.ca.

It isn't a big step from here to suppose that only one page among the duplicates will pass the benefit of any anchor text, be it internal or external.

Until today I'd also only presumed duplicate content to mean on-page text. Duplicate content could also mean crossovers of substantially similar file names/directory structures, page titles, and internal and external linking tendencies. Can anyone think of further possible factors?

Lastly, if you have a site with a large amount of content found on other sites (e.g. standard technical information on products), and you weren't indexed first, could this dampen your ability to rank on anything you have that *is* unique because your site is substantially duplicate content?

Prairie is very worried!

MHes

12:22 am on Nov 5, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hi

I have also thought duplicate content is a signal that a site is an affiliate. However, being an affiliate site does not matter IMHO; it is the duplicated content that causes the problem.

I don't think additional content suffers, as long as it is unique. Being indexed first may not be the rule; the PR of the page and the quality of links to that page may win through.

caveman

12:36 am on Nov 5, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>Prairie is very worried!

Be afraid. Be very afraid. Or, don't worry, be happy. Not sure which is better advice in this environment.

Presumably, from an SE's view anyway, dup filters address all sorts of evils, including but not limited to aff sites with no unique content. And yes, duplication comes in all shapes and sizes. Add WHOIS data to your list of what might be looked at by the SEs, and do a G search of WW. There are lots of threads about dup filters and lost sites.

The good news is that if you cover all your bases, you should have nothing to worry about except ranking well. :-)

<Cliché count: 6+>

GodLikeLotus

10:38 am on Nov 5, 2004 (gmt 0)

10+ Year Member



>if you have a site with a large amount of content found on other sites (e.g. standard technical information on products), and you weren't indexed first, could this dampen your ability to rank on anything you have that *is* unique because your site is substantially duplicate content?

My experience here leads me to believe that Google is unable, or was unable, to decide which site was actually the first site.

A site I built was copied and launched under another domain last December. We then had to battle for position with a copy of our own site. In March/April this year our pages started to drop. They are all still in the Google index, but seem to be confined to much lower rankings, like page 10+. The copy of our site has the ranking we used to enjoy.

Our site has PR 5 on the main pages and PR 4 on the directory pages. We have good links and descriptions in both the DMOZ and Yahoo directories, and the other two, MSN and Yahoo, rank us fine.

Would love an explanation and a what to do guide, because at the moment nothing seems to work.

My biggest concern over dup content is that the drop-down boxes used for certain states like New York, Texas, Florida etc. are very large on their own, up to 50k for a large drop-down menu. Even though I built this navigation, the copied site stole these too. How do I get around many pages being 50% dup even before the content is examined?
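To put a rough number on how much a shared menu inflates page-to-page overlap, here is a minimal sketch using word shingles and Jaccard similarity. This is a generic textbook comparison, not a description of how Google's filter works (nobody in this thread knows that). The state list, the page bodies, and the function names `shingles` and `jaccard` are all made up for illustration.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word runs) in text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: shared items / all items."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical shared navigation (stands in for a large drop-down menu).
nav = ("alabama alaska arizona arkansas california "
       "colorado connecticut delaware florida georgia")

# Hypothetical unique body text for two different pages.
body_a = "widget sellers in widgetville open daily"
body_b = "gadget repair shops near gadget town"

with_nav = jaccard(shingles(nav + " " + body_a), shingles(nav + " " + body_b))
without_nav = jaccard(shingles(body_a), shingles(body_b))

print(f"similarity with shared nav:    {with_nav:.2f}")
print(f"similarity without shared nav: {without_nav:.2f}")
```

The two bodies share nothing, so any overlap reported for the full pages comes entirely from the navigation block. Running the same comparison on your own pages, with and without the menu markup stripped, gives a feel for how much of each page is template.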

Can anyone help, Please?

prairie

10:54 am on Nov 5, 2004 (gmt 0)

10+ Year Member



GodLikeLotus -- was the content they duplicated your own copyright? If so the answer might be to get legal.

GodLikeLotus

11:47 am on Nov 5, 2004 (gmt 0)

10+ Year Member



The majority of the site is a directory split into the different states. However, the directory, with over 23,000 business names and addresses, is not my property; I do not own a business's name and address, although it took nearly a year of hard work to put the directory together into an easy-to-use site. A lot of the main pages' content was duplicated from sites like the FTC (Federal Trade Commission), but I have rewritten everything now and improved the site's overall navigation. Just praying Google will release our site from whatever penalty it is currently under. I do use the word penalty because, from where we are, if a search engine pushes a site down its rankings deliberately, then it can only be a type of penalty.

JuniorOptimizer

12:50 pm on Nov 5, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



GodLikeLotus, what was the PageRank of the site that has the troubles?

GodLikeLotus

12:56 pm on Nov 5, 2004 (gmt 0)

10+ Year Member



My site has PR5 on the index and main pages. The stolen copy has a PR4.

Powdork

4:05 pm on Nov 5, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I am concerned because I am adding a shopping section to my local wedding directory. I don't mind if the shopping section (5-10 pages out of 100 on the site) doesn't rank as long as it doesn't force the rest of the site into 'time out'. Since the shopping section will be largely dupe content I'm probably just going to add noindex, nofollow to it.

GodLikeLotus

5:04 pm on Nov 5, 2004 (gmt 0)

10+ Year Member



Powdork

My site is similar. I don't care about the main pages, as my site's traffic comes from searches like "widget sellers in widgetville" or "Widget Company in Widgetville".

caveman

9:14 pm on Nov 5, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Powdork, I think lots of people are worried about the same thing.

In another thread we've talked about whether or not adding a significant number of new pages can sink an entire site. I have received all sorts of feedback.

What I know: Some previously healthy sites that have made no changes other than adding 10%-30% new pages have been badly hurt in the SERPs. BUT others have done the same with no ill effect at all.

Since the addition of pages in and of itself did not necessarily hurt all sites, it must have something to do with other factors, where the addition of pages was only some sort of red flag (if there is anything to this at all).

Possible issues connected to adding lots of new pages: too many similar pages (dup filters); too little or diffused PR, such that the new pages dropped the site below certain hurdles; rate of growth that was too fast for G's liking; other?

Powdork

11:53 pm on Nov 5, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Yes I have been in that other thread and that is why I would consider the noindex, nofollow. However, G still crawls pages with noindex, nofollow so that may mean that these pages could be used in whatever computation determines a dupe filter or sandbox thingy.
I think instead I will link to these pages with a link called from an external JS file that lies in a robots.txt-protected folder. Will any of this bother the folks at AdSense or AdWords, given that I will probably be using both for the pages?
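On the point that noindex, nofollow pages still get fetched: the directive lives inside the page's own HTML, so any crawler has to download the page before it can read it. A minimal sketch of that, using Python's standard `html.parser`; the sample page and the names `RobotsMetaParser` and `robots_directives` are hypothetical, and this says nothing about what a search engine does with the page afterwards.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Split "noindex, nofollow" into individual lowercase directives.
            self.directives.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )


def robots_directives(html):
    """Return the set of robots meta directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page from the shopping section.
page = """<html><head>
<title>Shopping</title>
<meta name="robots" content="noindex, nofollow">
</head><body>Duplicate product copy here.</body></html>"""

print(robots_directives(page))
```

The whole page has to be in hand before the tag can be seen, which is the difference from a robots.txt rule: that one is checked before the page is requested at all.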

Powdork

5:35 am on Nov 8, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Just so I'm sure.

User-agent: Googlebot
Disallow: /shopping/

That (in robots.txt) will keep Googlebot from seeing the directory that includes the duplicate content, but will allow G to see shopping.htm, which has unique content only. Additionally, it won't keep anyone else from seeing it. Right?
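One way to sanity-check rules like these before uploading them is Python's standard `urllib.robotparser`, which applies the plain prefix-matching rule of the robots exclusion convention. A minimal sketch: the `www.example.com` URLs, the `widgets.htm` file name, and the `SomeOtherBot` user agent are placeholders, and this only shows how the standard rule reads, not how any particular crawler behaves in practice.

```python
from urllib.robotparser import RobotFileParser

# The two lines from the post, as they would appear in robots.txt.
rules = """\
User-agent: Googlebot
Disallow: /shopping/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())


def is_allowed(agent, path):
    """True if the rules above let `agent` fetch `path` on a placeholder host."""
    return parser.can_fetch(agent, "http://www.example.com" + path)


# Anything under the /shopping/ directory is blocked for Googlebot.
print(is_allowed("Googlebot", "/shopping/widgets.htm"))

# /shopping.htm does not start with "/shopping/", so it is not blocked.
print(is_allowed("Googlebot", "/shopping.htm"))

# The record names only Googlebot, so other agents are not restricted.
print(is_allowed("SomeOtherBot", "/shopping/widgets.htm"))
```

The trailing slash is what matters here: `Disallow: /shopping` without it would also match `/shopping.htm`, since matching is by path prefix.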