

Google SEO News and Discussion Forum

    
How Much Duplicate Content Is Too Much For Panda?
Pjman



 
Msg#: 4395198 posted 2:58 pm on Dec 7, 2011 (gmt 0)

I'm doing a thorough review of a Pandalized site. I was able to find a number of problems and suggest corrections.

This site is in an industry that must list the industry standards its work meets. So in many cases they will list a single sentence that appears on 10,000 (literally) other sites.

On about 20% of their pages they list a single sentence (duplicated on 1000s of industry sites).

Most of their pages are 600 words or more, with 15 words duplicated on other industry sites.

So can a single sentence of duplicate content (such as standards) get you Pandalized?
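
For scale, here is a minimal sketch of the proportion involved, using the figures from the post (15 duplicated words on a 600-word page). The function name is an illustrative assumption, and this is plain arithmetic, not anything Google has said it measures.

```python
def duplicate_share(duplicated_words: int, total_words: int) -> float:
    """Return the fraction of a page's words that also appear on other sites."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return duplicated_words / total_words


# Figures from the post: one 15-word standards sentence on a 600-word page.
share = duplicate_share(15, 600)
print(f"{share:.1%}")  # prints 2.5%
```

On those numbers the boilerplate sentence is a small slice of each affected page, which is the context for the question.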

 

goodroi

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member, Top Contributors of the Month



 
Msg#: 4395198 posted 4:25 pm on Dec 7, 2011 (gmt 0)

I doubt a single sentence is causing the trouble. If you are really concerned, you could place that sentence in an image so the search engines do not see it.

I would look at the value of their pages. Is this information unique & useful, or is it just another version of what is regurgitated on 1,000 other sites?

dunivan



 
Msg#: 4395198 posted 8:04 pm on Dec 7, 2011 (gmt 0)

"I doubt a single sentence is causing the trouble. If you are really concerned, you could place that sentence in an image so the search engines do not see it."

I share this view, but I do not recommend the image idea.

I wouldn't worry about one sentence. But to satisfy your fear, just re-word it.

Whitey

WebmasterWorld Senior Member, Top Contributor of All Time, 5+ Year Member



 
Msg#: 4395198 posted 8:10 pm on Dec 7, 2011 (gmt 0)

I agree, but it would be important to look at duplicate content (internal and external) and then, separately, to give some consideration to the "value" and ratio of the unique content elements.

For example, if all those pages had was the one line about "standards", then they would likely be considered low-value content, and in that context, yes, both duplicate and low value.

But I doubt that is what you're asking.

gyppo

5+ Year Member



 
Msg#: 4395198 posted 5:43 am on Dec 8, 2011 (gmt 0)

Put the sentence in an image, that'll sort it!

rowtc2

5+ Year Member



 
Msg#: 4395198 posted 7:52 am on Dec 8, 2011 (gmt 0)

Even if you have original content, if it is not useful (written for Google just "to have original content"), I think you can be affected by Panda. Look at the ezinearticles site.

There are e-commerce sites with descriptions from manufacturers that are not affected by Panda.

Whitey

WebmasterWorld Senior Member, Top Contributor of All Time, 5+ Year Member



 
Msg#: 4395198 posted 1:45 pm on Dec 8, 2011 (gmt 0)

"descriptions from manufacturers that are not affected by Panda"

I'd like to hear from any that aren't affected when the content is widely distributed. Across a number of very large affiliate networks using syndicated duplicate content, I know of none that have escaped.

Pjman



 
Msg#: 4395198 posted 1:50 pm on Dec 8, 2011 (gmt 0)

More and more I am inclined to believe Panda is focused on duplicate content.

Whitey

WebmasterWorld Senior Member, Top Contributor of All Time, 5+ Year Member



 
Msg#: 4395198 posted 8:54 pm on Dec 8, 2011 (gmt 0)

"focused on duplicate content"

... but it is not the only element of low quality, although it is an important one.

santapaws

5+ Year Member



 
Msg#: 4395198 posted 2:23 pm on Dec 9, 2011 (gmt 0)

This makes me so sad; it makes me realise we are living through the death of information. People do not search for great websites, they search for great information. They couldn't care less that the same information is on 1000 other pages of that website. Duplicate information on a single website does not mean that information is low quality.

Robert Charlton

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member, Top Contributors of the Month



 
Msg#: 4395198 posted 6:51 pm on Dec 9, 2011 (gmt 0)

"this makes me so sad, makes me realise we are living through the death of information."

To paraphrase Mark Twain, I think the reports of the death of information have been greatly exaggerated. More likely, the wholesale copying and spinning of information to create websites at the push of a button, or the constant avalanche of billions of pages of mindless drivel, would lead to the demise.

In any event, I don't think that 15 words of industry boilerplate would cause Pandalization. That said, I'd rewrite it.

I'd look carefully also to see whether the 15 words are symptomatic of a general cookie-cutter or paraphrased approach to the site content as a whole. That could well be a problem.

nippi

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 4395198 posted 10:38 pm on Dec 10, 2011 (gmt 0)

Ezine articles. Unique content?

Actually, looking at ezine articles, it seems to be mainly rubbish written for AdSense and links, with a huge percentage of its articles published on dozens of article sites.

acee

10+ Year Member



 
Msg#: 4395198 posted 12:07 pm on Dec 11, 2011 (gmt 0)

Is duplicate content even a Panda issue?

Google's SERPs are positively heaving with dupe content. Syndicated, auto-generated, you name it!

Panda is just a smokescreen under which the SERPs can be skewed in favour of improved AdWords clickthrough whilst suppressing websites on an upward curve of reduced bounce rate, increasing pageviews and better conversion.

I cannot understand why you would apply a quality factor to an entire website. Would it not make more sense to look at how an individual page hangs within a hierarchy of pages that link to it, and to which it links, in order to determine true quality?

tedster

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 4395198 posted 4:24 pm on Dec 11, 2011 (gmt 0)

"Is duplicate content even a Panda issue?"


I'd say it is - but one of many areas Panda scores against. I have two reasons for this opinion. First, Panda's initial roll-out was immediately preceded by the Duplicate Content update [webmasterworld.com], commonly called the "Scraper Update".

Additionally, I've seen one site recover to better than pre-Panda traffic levels only after directly taking on the "original author" area and strengthening the signals around their authorship over scrapers and syndication or mash-up sites.

I'm aware that Panda seems to parallel a rise in AdWords income for Google. However, I think it vastly overstates the case to say Panda is "just" a smokescreen, or indeed to say it is "just" any one thing.

Pjman



 
Msg#: 4395198 posted 7:17 pm on Dec 11, 2011 (gmt 0)

"I'm aware that Panda seems to parallel a rise in Adwords income for Google."


@ tedster

I totally agree, but there is just a little too much coincidence here.

Jan '11

Larry Page named new CEO. G's business philosophy obviously shifts.


Feb 24th

Panda blindsides many people, even us.

Q2 '11

Google reports an insane jump in revenue.

Rest of '11

Many promising but non-profitable G projects shut down.


G's 2011 Business Philosophy

To most people, using Google is like a drug. They have to have it and can't imagine life without it.

Google, just like any good dealer, cuts the quality of their product to make more sales. I guess it was bound to happen?

Whitey

WebmasterWorld Senior Member, Top Contributor of All Time, 5+ Year Member



 
Msg#: 4395198 posted 9:57 pm on Dec 11, 2011 (gmt 0)

"I'd say it is - but one of many areas Panda scores against. I have two reasons for this opinion. First, Panda's initial roll-out was immediately preceded by the Duplicate Content update [webmasterworld.com], commonly called the "Scraper Update"."


There's a lot said about fixing DMCA issues to convince us all that this was a key area that Pandalized sites needed to fix.

But duplicate content management has always been complex.

Has Google's philosophy changed on duplicate content mixed with unique and added value? MC and others often said adding value to such sites was acceptable.

Or should sites completely remove every last piece of aggregated dupe content, both in listings from external sources and internally between levels? Should they eliminate, for example, snippets introduced from external sources (e.g. on shopping sites)?

If some sites did this, they would have too little content to be indexed, I'd say.

What's the view out there? Anyone seeing test results on duplicate content tweaks?

tedster

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 4395198 posted 10:21 pm on Dec 11, 2011 (gmt 0)

"Has Google's philosophy changed on duplicate content mixed with unique and added value? MC and others often said adding value to such sites was acceptable."

Even the recently leaked EWOQ document said this - seems to me it obviously can be OK, but that is also a factor that complicates things quite a bit.

I work with one site that sells books, and each book has an author's biography on the page. Not every title gets a unique biography - far from it, in fact - but this amount of internal duplication has not caused any Google problems I can see.

Whitey

WebmasterWorld Senior Member, Top Contributor of All Time, 5+ Year Member



 
Msg#: 4395198 posted 10:38 pm on Dec 11, 2011 (gmt 0)

Are we perhaps talking about sitewide thresholds here?

Like it's OK to have some pages without "added value", but if the site's dominated by duplicate content then it's not.

Susan Moskwa may provide some insight into Google's universal philosophy on "value add" when responding to a content removal consideration:

"As long as your site has worthwhile, original content on it (i.e. it's not just made to put AdSense on it), there's not a golden "text-to-ads ratio" or a word limit for ranking. I worry that you're just looking at the trees and not seeing the forest. Optimizing your site for search isn't about counting the words on a page, it's about making sure that you have useful, usable content, and then making that content accessible to search engines." [google.com...]


Somewhere else I recall Susan warning against the unnecessary elimination of content on sites. She seemed to be saying some sites have gone overboard with content elimination.

With my tin foil hat on: is it correct to assume that this covers added value on duplicate content clips as well?

[edited by: Whitey at 10:57 pm (utc) on Dec 11, 2011]

tangor

WebmasterWorld Senior Member, Top Contributor of All Time, 5+ Year Member, Top Contributors of the Month



 
Msg#: 4395198 posted 10:56 pm on Dec 11, 2011 (gmt 0)

Some industry standard texts should not be messed with like: "This lawnmower is not designed to trim hedges." :)

A single sentence of that type should not (theoretically) cause a duplicate content problem. However, given the increasing REDUNDANCY of content found across the web, it might be a factor where the SEs arbitrarily pick and choose which sites appear for any query. (10,000 sites, 10 top spots minus whatever big brands appear first...) That's a pretty small target to get listed.

Whitey

WebmasterWorld Senior Member, Top Contributor of All Time, 5+ Year Member



 
Msg#: 4395198 posted 11:06 pm on Dec 11, 2011 (gmt 0)

"This lawnmower is not designed to trim hedges."

But machine generated duplicate content with wildcard insertions amongst x characters for ranking purposes is probably going to act against a site.

My gut feel tells me this example is essential to retain, but if it's manipulated with repetitive sitewide keyword variances for ranking purposes, then it's going to fall into another version of duplicate content which is unacceptable to Google and worse than aggregated content. Any views?
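
One rough way to see why templated text with swapped-in keywords still reads as duplicate is to compare overlapping word sequences ("shingles"). The sketch below uses a generic near-duplicate measure, Jaccard similarity over 3-word shingles. It is an assumption for illustration, not a description of how Google or Panda works, and the function names and sample sentences are made up.

```python
def shingles(text: str, k: int = 3) -> set:
    """Return the set of k-word sequences (as tuples) found in the text."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: str, b: str, k: int = 3) -> float:
    """Jaccard similarity of two texts' shingle sets: shared / total distinct."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 1.0  # two texts too short to shingle are treated as identical
    return len(sa & sb) / len(sa | sb)


# Hypothetical template pages that differ only in one inserted keyword.
page_a = "Buy cheap lawnmowers in Boston from our trusted local lawnmower experts today"
page_b = "Buy cheap lawnmowers in Denver from our trusted local lawnmower experts today"

print(round(jaccard(page_a, page_b), 2))
```

Swapping a single word only changes the few shingles that contain it, so the two pages keep most of their shingles in common, and the longer the shared template, the closer the score gets to 1.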

acee

10+ Year Member



 
Msg#: 4395198 posted 10:33 am on Dec 12, 2011 (gmt 0)

Duplicate content is the soft target among on-page factors, the one most likely to get every webmaster wondering if they've had their content copied or if they've over-egged the snippets from other pages.

Google has been fighting duplicate content for many years and must be the foremost authority on the planet on this subject. So why are the search results still full of pages that use variations of one article from the same author, thin pages that contain almost exclusively syndicated content with no value add, the same forum posts across several domains, and sites with a large proportion of the same copy on all pages?

If they were unable to totally remove duplicate content with their algos prior to Panda, I'm pretty sure they would have nailed this afterwards, but the evidence seems to suggest otherwise.

Why are the first three results from Amazon whenever you search for consumer goods followed by price comparison sites using entirely affiliate feeds even though you didn't include the words 'price' or 'comparison'?

Because they are engineered, not natural!

If Google wanted to improve the quality of search, it might help if keyword proximity played more of a role than the mere presence of keywords. Google encouraging webmasters to consolidate content is likely to decrease the relevancy of many pages because a larger gamut of keywords is present. Surely concise content is the friend of search and verbose the enemy!

Let's be objective, setting aside a pro or anti-Google outlook, and ask whether these search results look like they were (1) delivered to your browser purely for your benefit, (2) excessively skewed in favour of Google's profits, or (3) a compromise that you found satisfactory.

