
How Much Duplicate Content Is Too Much For Panda?

     

Pjman

2:58 pm on Dec 7, 2011 (gmt 0)



I'm doing a thorough review of a Pandalized site. I was able to find a number of problems and suggest corrections.

This site is in an industry that must list the industry standards its work complies with. So in many cases they will list a single sentence that appears on (literally) 10,000 other sites.

On about 20% of their pages they list a single such sentence (duplicated on thousands of industry sites).

Most of their pages are 600 words or more, with only 15 words duplicated on other industry sites.

So can a single sentence of duplicate content (such as standards) get you Pandalized?
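A quick back-of-envelope check puts the numbers above in perspective (the figures are just the ones from this post, not any known Panda threshold):

```python
def duplicate_ratio(duplicated_words: int, total_words: int) -> float:
    """Fraction of a page's words that are shared boilerplate."""
    return duplicated_words / total_words

# 15 boilerplate words on a 600-word page
ratio = duplicate_ratio(15, 600)
print(f"{ratio:.1%}")  # prints 2.5%
```

At 2.5% of the page, the standards sentence is a sliver of the content, which is why most replies below doubt it alone triggers Panda.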

goodroi

4:25 pm on Dec 7, 2011 (gmt 0)

WebmasterWorld Administrator goodroi is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



I doubt a single sentence is causing the trouble. If you are really concerned, you could place that sentence in an image so the search engines do not see it.

I would look at the value of their pages. Is this information unique and useful, or is it just another version of what is regurgitated on 1,000 other sites?

dunivan

8:04 pm on Dec 7, 2011 (gmt 0)



I doubt a single sentence is causing the trouble. If you are really concerned, you could place that sentence in an image so the search engines do not see it.
I share this view, but I do not recommend the image idea.

I wouldn't worry about one sentence. But to satisfy your fear, just re-word it.

Whitey

8:10 pm on Dec 7, 2011 (gmt 0)

WebmasterWorld Senior Member whitey is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



I agree, but an observation of duplicate content ( internally and externally ) and then separately, some consideration to the "value" of the unique content element ratios would be important.

For example, if the one line about "standards" was all those pages had, then they would likely be considered low-value content, and in that context, yes, duplicate and low value.

But I doubt that is what you're asking.

gyppo

5:43 am on Dec 8, 2011 (gmt 0)

5+ Year Member



Put the sentence in an image, that'll sort it!

rowtc2

7:52 am on Dec 8, 2011 (gmt 0)

5+ Year Member



Even if you have original content, if it is not useful (written for Google just "to have original content"), I think you can be affected by Panda. Look at the ezinearticles site.

There are e-commerce sites with descriptions from manufacturers that are not affected by Panda.

Whitey

1:45 pm on Dec 8, 2011 (gmt 0)

WebmasterWorld Senior Member whitey is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



descriptions from manufacturers that are not affected by Panda

I'd like to hear from any that aren't affected, if the content is widely distributed. There's none that I know of in a number of very large affiliate networks using syndicated duplicate content that have escaped.

Pjman

1:50 pm on Dec 8, 2011 (gmt 0)



More and more I am inclined to believe Panda is focused on duplicate content.

Whitey

8:54 pm on Dec 8, 2011 (gmt 0)

WebmasterWorld Senior Member whitey is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



focused on duplicate content

... but these are not the only elements of low quality - although this is important.

santapaws

2:23 pm on Dec 9, 2011 (gmt 0)

5+ Year Member



This makes me so sad; it makes me realise we are living through the death of information. People do not search for great websites, they search for great information. They couldn't care less that the same information is on 1,000 other pages of that website. Duplicate information on a single website does not mean that information is low quality.

Robert Charlton

6:51 pm on Dec 9, 2011 (gmt 0)

WebmasterWorld Administrator robert_charlton is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



this makes me so sad, makes me realise we are living through the death of information.

To paraphrase Mark Twain, I think the reports of the death of information have been greatly exaggerated. More likely, it is the wholesale copying and spinning of information to create websites at the push of a button, or the constant avalanche of billions of pages of mindless drivel, that would lead to its demise.

In any event, I don't think that 15 words of industry boilerplate would cause Pandalization. That said, I'd rewrite it.

I'd look carefully also to see whether the 15 words are symptomatic of a general cookie-cutter or paraphrased approach to the site content as a whole. That could well be a problem.

nippi

10:38 pm on Dec 10, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Ezine articles, unique content?

Actually, looking at ezine articles, it seems to be mainly written-for-AdSense-and-links rubbish, with a huge percentage of their articles published on dozens of article sites.

acee

12:07 pm on Dec 11, 2011 (gmt 0)

10+ Year Member



Is duplicate content even a Panda issue?

Google's SERPs are positively heaving with dupe content. Syndicated, auto-generated, you name it!

Panda is just a smokescreen under which the SERPs can be skewed in favour of improved AdWords clickthrough, whilst suppressing websites on an upward curve of reduced bounce rate, increasing pageviews and better conversion.

I cannot understand why you would apply a quality factor to an entire website. Would it not make more sense to look at how an individual page hangs within a hierarchy of pages that link to it, and to which it links, in order to determine true quality?

tedster

4:24 pm on Dec 11, 2011 (gmt 0)

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Is duplicate content even a Panda issue?


I'd say it is - but one of many areas Panda scores against. I have two reasons for this opinion. First, Panda's initial roll-out was immediately preceded by the Duplicate Content update [webmasterworld.com], commonly called the "Scraper Update".

Additionally, I've seen one site recover to better than pre-Panda traffic levels only after directly taking on the "original author" area and strengthening the signals around their authorship over scrapers and syndication or mash-up sites.

I'm aware that Panda seems to parallel a rise in Adwords income for Google. However I think it vastly overstates the case to say Panda is "just" a smokescreen, or indeed to say it is "just" any one thing.

Pjman

7:17 pm on Dec 11, 2011 (gmt 0)



I'm aware that Panda seems to parallel a rise in Adwords income for Google.


@ tedster

I totally agree, but there is just a little too much coincidence here.

Jan '11

Larry Page named new CEO. G's business philosophy obviously shifts.


Feb 24th

Panda blind sides many people, even us.

Q2 '11

Google reports an insane jump in revenue.

Rest of '11

Many promising but unprofitable G projects shut down.


G's 2011 Business Philosophy

To most people, using Google is like a drug. They have to have it and can't imagine life without it.

Google, just like any good dealer, cuts the quality of its product to make more sales. I guess it was bound to happen?

Whitey

9:57 pm on Dec 11, 2011 (gmt 0)

WebmasterWorld Senior Member whitey is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



I'd say it is - but one of many areas Panda scores against. I have two reasons for this opinion. First, Panda's initial roll-out was immediately preceded by the Duplicate Content update [webmasterworld.com], commonly called the "Scraper Update".


There's a lot said about fixing DMCA issues to convince us all that this was a key area that Pandalized sites needed to fix.

But duplicate content management has always been complex.

Has Google's philosophy changed on duplicate content mixed with unique and added value? MC and others often said adding value to such sites was acceptable.

Or should sites completely remove every last piece of aggregated dupe content, in listings from external sources and internally between levels? Should they eliminate snippets introduced from external sources, for example? E.g. shopping sites.

If some sites did this, they would have too little content to be indexed I'd say.

What's the view out there? Anyone seeing test results on duplicate content tweaks?

tedster

10:21 pm on Dec 11, 2011 (gmt 0)

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Has Google's philosophy changed on duplicate content mixed with unique and added value? MC and others often said adding value to such sites was acceptable.

Even the recently leaked EWOQ document said this - seems to me it obviously can be OK, but that is also a factor that complicates things quite a bit.

I work with one site that sells books, and each book has an author's biography on the page. Every title does not always get a unique biography - far from it, in fact - but this amount of internal duplication has not caused any Google problems I can see.

Whitey

10:38 pm on Dec 11, 2011 (gmt 0)

WebmasterWorld Senior Member whitey is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Are we perhaps talking about sitewide thresholds here?

Like it's OK to have some pages without "added value", but if the site's dominated by duplicate content then it's not.
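One way to make that threshold idea concrete is a word-shingle overlap check. This is only a sketch: the 5-word shingle size and the 50% cutoff are illustrative assumptions, not anything Google has published.

```python
def shingles(text: str, k: int = 5) -> set:
    """Set of k-word shingles, a common way to fingerprint text."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}

def duplicated_share(page: str, boilerplate: str, k: int = 5) -> float:
    """Fraction of a page's shingles that also appear in known boilerplate."""
    page_sh = shingles(page, k)
    if not page_sh:
        return 0.0
    return len(page_sh & shingles(boilerplate, k)) / len(page_sh)

def sitewide_dominated(pages, boilerplate, cutoff=0.5):
    """Fraction of pages where duplicated text dominates (> cutoff)."""
    flags = [duplicated_share(p, boilerplate) > cutoff for p in pages]
    return sum(flags) / len(flags)
```

Under this model, a site where most pages score near zero but a few are pure boilerplate would look very different from a site where duplicated text dominates sitewide, which matches the threshold intuition above.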

Susan Moskwa may provide some insight into Google's universal philosophy on " value add" when responding to a content removal consideration.

As long as your site has worthwhile, original content on it (i.e. it's not just made to put AdSense on it), there's not a golden "text-to-ads ratio" or a word limit for ranking. I worry that you're just looking at the trees and not seeing the forest. Optimizing your site for search isn't about counting the words on a page, it's about making sure that you have useful, usable content, and then making that content accessible to search engines. [google.com...]

Somewhere else I recall Susan warning against the unnecessary elimination of content on sites. She seemed to be saying some sites have gone overboard with content elimination.

Is the tin foil hat I'm wearing correct to assume that this covers added value on duplicate content clips as well?

[edited by: Whitey at 10:57 pm (utc) on Dec 11, 2011]

tangor

10:56 pm on Dec 11, 2011 (gmt 0)

WebmasterWorld Senior Member tangor is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



Some industry standard texts should not be messed with like: "This lawnmower is not designed to trim hedges." :)

A single sentence of that type should not (theoretically) cause a duplicate content problem. However, given the increasing REDUNDANCY of content found across the web, it might be a factor where the SEs arbitrarily pick and choose which sites appear for any query. (10,000 sites, 10 top spots, minus whatever big brands appear first...) That's a pretty small target to get listed.

Whitey

11:06 pm on Dec 11, 2011 (gmt 0)

WebmasterWorld Senior Member whitey is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



"This lawnmower is not designed to trim hedges."

But machine-generated duplicate content with wildcard insertions amongst x characters for ranking purposes is probably going to act against a site.

My gut feel tells me this example is essential to retain, but if it's manipulated with repetitive sitewide keyword variances for ranking purposes, then it's going to fall into another version of duplicate content which is unacceptable to Google, and worse than aggregated content. Any views?

acee

10:33 am on Dec 12, 2011 (gmt 0)

10+ Year Member



Duplicate content is the soft target among on-page factors, the one most likely to get every webmaster wondering if they've had their content copied or if they've over-egged the snippets from other pages.

Google has been fighting duplicate content for many years and must be the foremost authority on the planet on this subject. So why are the search results still full of pages that use variations of one article from the same author, thin pages that contain almost exclusively syndicated content with no value add, the same forum posts across several domains, and sites with a large proportion of the same copy on all pages?

If they were unable to totally remove duplicate content with their algos prior to Panda, I'm pretty sure they would have nailed it afterwards, but the evidence seems to suggest otherwise.

Why are the first three results from Amazon whenever you search for consumer goods, followed by price comparison sites using entirely affiliate feeds, even though you didn't include the words 'price' or 'comparison'?

Because they are engineered, not natural!

If Google wanted to improve the quality of search, it might help if keyword proximity played more of a role than mere keyword presence. Google encouraging webmasters to consolidate content is likely to decrease the relevancy of many pages, because a larger gamut of keywords is present. Surely concise content is the friend of search and verbose content the enemy!

Let's be objective, setting aside a pro- or anti-Google outlook, and ask ourselves if these search results look like (1) they were delivered to your browser purely for your benefit, (2) they were excessively skewed in favour of Google's profits, or (3) they were a compromise that you found satisfactory?
 
