The bad news
Stolen content is common. Once a site ranks, the content will be stolen. It's not a matter of if the content will be stolen. It will be stolen.
I once busted someone who had scraped part of my site and was selling it on eBay as an Instant Website for AdSense package. I once had to send a takedown notice to a university dean because a professor had stolen an entire article, used it at a conference, then posted it online with links to all my competitors. Yes, it's pretty ugly out there.
The good news
It used to be that stolen content would cause our pages to rank less well. The effect is not as dramatic anymore. Google's ability to identify the correct document is vastly improved and I find I don't have to chase infringers anymore.
You might benefit, however, from chasing down the highest-profile infringers, such as infringement on sites with decent PageRank, and especially infringement on .edu sites.
Can stolen content lead to Panda?
Whether this is Panda related, sight unseen, just to hazard a guess and a shot in the dark, I'd say probably not. Stolen content is everywhere. It happens pretty much to every site that has ever ranked for a keyword. Stolen content is so widespread that if that could lead to Pandalization there would be massively widespread collateral damage.
Thanks for your reply and the story about the professor is incredible! Wow!
Let me add a qualifier and see if it changes your opinion.
Q: What if I didn't know the content was being copied for YEARS. Like 10 years. I was just happily bouncing along writing my content - never dreaming it was being copied. And now - only now, 13 years later - I realize this.
These copied versions have been up for 4 or 5 years. Does that make a difference?
The only way to know whether any of this takedown and DMCA work has helped is for Panda to re-run - correct?
Pardon me for interjecting, but the OP is being outranked by the scrapers
My site does NOT rank at all or ranks at the bottom for my OWN content
when I do these searches
My take is that the more Google trusts/knows a site and its writers, the less likely this is.
I am tiptoeing towards editorial content, articles that is, and this has always been something that worried me: spend time and money, then get sunk by the G algo and some friendly scrapers
Wonder if you have any recommendations to avoid such
What you are experiencing in my opinion is an epidemic.
Is this an issue with the other search engine? That might give some insight whether this is Panda or not.
My experiences with this situation are well documented already. I'm in the same situation. In my opinion it is because of all the recent algo changes, that now, copied content from your site can easily outrank you based on who is doing the copying.
There is another thread here with a link to a Google doc through which you can submit directly to them searches that show scraped pages outranking the originals.
There has always been and always will be scraping and copying. I get that. I accept that. It's only an issue when you are being outranked. Right now it appears that if a good PR site decides to copy, they will pretty much outrank any smaller, lesser PR site. A pretty easy search engine to manipulate. Perhaps the playbook is out there right now. No idea but it's broken. I'm going to bet that the "other" search engine isn't having the same difficulty. Please advise regarding that. We make enough noise, GooG will perhaps realize that something very fundamental has been broken here.
I'll let you know if I come back from it. My situation might be slightly more unique, but I've put some measures into place to break from the scrape. Although they rank for my stuff currently, those links aren't accurate and updated in the results, because clicking those results will ultimately take people to my site. It's a framing issue in my situation.
I think anyone who hasn't been taking their own text and searching in GooG is living with their head in the sand. This is a new problem and most people are oblivious to the fact that it's maybe not them that's the issue here. Duplicate content loss of rankings is nasty nasty nasty.
|What if I didn't know the content was being copied for YEARS. |
This is happening to pretty much everybody who ranks for keyword phrases. Most people don't search snippets in search of content theft and are going about their business unawares.
IF copied content led to Pandalization, then the collateral damage would be massively widespread. Clearly, stolen content does not lead to Pandalization.
It is common for content thieves to outrank the original site for snippets, even when the original site ranks well for keyword phrases.
So in my opinion what you are seeing is not the cause of your troubles. It is just a symptom or side-effect of it. Don't confuse a side-effect with a cause. That will lead to confusion.
Your dates seem remarkably similar to mine Frost_Angel, I've just submitted abuse reports to Google and a major registrar.
|It is just a symptom or side-effect of it. |
Whilst I am not disagreeing with this it evidently smacks of an almighty fubar by Google that this could have happened and never been rectified when so many have complained and questioned why they were penalised.
|Is this an issue with the other search engine? |
For me definitely not.
I just checked - and Bing is not giving me the same issues as Google. If my content has been copied in Bing -- I still rank #1 over the copiers. At least that's comforting.
So it does seem to be a Google issue. Almost as if... once a page on your site is copied so many times it gets worked into your site's "scoring" - (negatively) and you get dinged as no longer being the authority? Maybe that's far-fetched. I *DO NOT* profess to be an expert on anything Google. That's why I'm here. But it does seem to me that it will eventually affect your site's rankings.
Is there nothing to do about scrapers - as far as prevention? I use WordPress.
Should I no longer have an RSS feed?
I realize there are ways for someone to download your entire site with a click of a button and then slap it up as their own. You just have to be looking for these people - because I can tell you that Copyscape is basically worthless. You *HAVE* to do the copy/paste check method if you want to know what's going on with your content.
I think with all this authorship markup stuff -- there should be another way or better way to tag a site as the original author. It's stupid to ask us for "quality" content and we give it, only to be shoved right out of the picture by thieves. It's like me walking into a museum and claiming that I painted the Mona Lisa? It's JUST as ridiculous.
This is where I don't understand all this praise for Google engineers and their "smart" algos - it's a bunch of blah blah and pretty worthless if they can't even figure out how to give an original author credit? It seems so basic - almost elementary.
I'm going to seriously question the statement that copied content doesn't lead to Pandalization.
Sounds like a lot of people to me here are hit by Panda or recent Google algo updates. Who is to say what the cause is? Well I can say that in my instance, I assumed Panda was the cause of my rankings loss. Little did I know that in fact some A hole framed my site and now ranks as the authority on my own content.
So please do tell that scraping/copied content isn't a Panda cause. I would suggest that 95% of Pandalized people aren't thinking that their issue might be related to the fact that sites with higher PR and more trust in Google's algo are scraping their content, and thus the "Panda" misdiagnosis.
How about it might not be Panda itself, but it might be that your content has been hijacked/scraped and thus you suffer the Panda effect. So in other words updating your content, 301'ing, rewriting articles, subdirectory additions, etc all may simply be nothing to do with why you are hit in the first place?
How about that? The first step to dealing with Google today is to take your damn content, paste it into search, and see if you're outranked for your own content. That's how you troubleshoot Panda/Google STEP #1. If you skip it, you will be wasting your time and the time of anyone else you are advising on getting out of Panda/Google algo hell.
There are a lot of smart people here, but bad advice is rampant. People are clueless ultimately about this situation. My mistake for 8 months of "Panda" was not realizing that a scraper snatched ownership of my content. I could rewrite and do all the stupid Panda "fix" solutions out there, however the culprit was in fact that a site was re-broadcasting my site and Google decided that they are the authors. That's it. It's not Panda but it is Panda.
So I'm mad yes. Of course Panda is widespread and collateral damage is everywhere! The fact is where are the advice articles that suggest your issue might be what the OP has said here? Nope. Instead the experts are going to suggest you blow up your site to deal with Panda. What a crock of S.
I agree with you.
This is why....
Because I have done EVERYTHING and I mean EVERYTHING to get out of Panda hell. And I listened to "expert" advice. I even had a few experts offer to look at my site. And some even said... "Your loss of traffic overnight on April 11th has nothing to do with Panda."
WHAT? That's one hell of a coincidence.
Only NOW am I looking into MY SITE being seen as the duplicator. I didn't consider it before - because I assumed I was the authority for my own content. Doing the cut/paste on my own content into Google search has been..... heartbreaking. SOOOOO many copies and I don't even rank? Some of the thieves not only rank higher than me - but they rank for stories about my life, my kids.... my family, my career AND, get this.... they are using images from my site and using MY bandwidth to run them! AND THEY RANK HIGHER THAN ME?! Or I don't rank at all?
How is this possible?
And how can this not have panda-ish consequences?
So at this point. I am left with the only solution I know.
1. Find/Identify the thieves using the cut/paste method. Yes... this is tedious BS - but there is NO OTHER WAY.
2. Either send DMCAs or if there are 20+ copies of your work, re-write your article/post and have a small mourning moment for the beautiful, personal, well thought out content you originally wrote that no longer belongs to you - even if it is something you wrote about your own personal life.
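For anyone keeping track of this by hand, the two steps above can be sketched as a small triage helper. This is only a rough illustration: it assumes you have already collected the copying URLs yourself with the cut/paste method, and the names `triage`, `is_own` and `REWRITE_THRESHOLD` are made up for this example. The only rule it encodes is the "20+ copies" cut-off from step 2.

```python
from urllib.parse import urlparse

REWRITE_THRESHOLD = 20  # the "20+ copies" cut-off from step 2 above


def is_own(url, own_domain):
    """True if the URL lives on own_domain or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    own = own_domain.lower()
    return host == own or host.endswith("." + own)


def triage(copies_by_article, own_domain):
    """Map each article to (action, number_of_outside_copies).

    copies_by_article: dict of article name -> list of URLs found by
    pasting snippets into search (collected by hand, not by this code).
    """
    plan = {}
    for article, urls in copies_by_article.items():
        # Ignore hits on your own site and count each copying URL once.
        outside = {u for u in urls if not is_own(u, own_domain)}
        if not outside:
            action = "none"
        elif len(outside) >= REWRITE_THRESHOLD:
            action = "rewrite"
        else:
            action = "dmca"
        plan[article] = (action, len(outside))
    return plan
```

Calling `triage({"my-post": ["http://thief.test/my-post"]}, "example.com")` would flag that post for a DMCA notice; the domain names here are placeholders.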
Google has this wrong. It's a flat out disgrace. I don't know who moderates these forums - I don't know if they work for Google or want to work for Google - not sure -- but there are a lot of regular people like me that lost their income, their work, their career because of Panda.
We are not all whiners with MFA websites crying over Adsense income loss. ::rolls eyes::
We are not all SEO Diva wannabes. Parading around with our big peacock feathers that drip of SEO-epic-ness.
Sometimes we're just regular people with blogs that kick ass. Sometimes nice guys finish last. Sometimes Google is wrong.
|I'm going to seriously question the statement that copied content doesn't lead to Pandalization... and thus the "Panda" misdiagnosis. |
You're contradicting yourself. You start out questioning that copied content does not lead to pandalization and then end up talking about Panda being a misdiagnosis, that it's something other than Panda.
I tend to agree with you that it's possible that some people who thought they were affected by Panda were actually affected by copied content. Two different issues.
Is it possible that the Panda algo opened the door to some sites being more susceptible to copied content? Is it possible that some sites are more susceptible to this effect than others? If so, what makes those sites more susceptible, crap links?
But if it's crap links making them more susceptible to copied content rank loss, isn't it also plausible that the crap links themselves are causing the loss of rankings and that the copied content did not play a role?
Copied content is a given. Those that rank well and those that lost their rankings both have stolen content. So if you're going to state that the stolen content is the root cause, then you will have to also explain why the effect is not universal if stolen content is something that affects pretty much all websites.
Nobody has supplied that explanation so I started the ball rolling for you by jumping to the other side of the argument and throwing it out there that perhaps the difference is crap links.
Ok, now I'm arguing with myself. Here is the question you should be considering: If stolen content is universal to all sites, ranking and those who have lost their rankings, but you believe stolen content is the cause of lost rankings, why was one site more susceptible to the effect over the other?
Is this a stolen content issue unrelated to Panda or is there a connection to Panda? It's possible there's a connection but somebody will have to explain what the connection is. This is the only way to explain why stolen content does not affect those sites still ranking.
That's the big hole in the theory that needs to be thoughtfully discussed.
[edited by: martinibuster at 7:20 pm (utc) on Nov 19, 2012]
Sites like BIG BRANDS - or companies with deep pockets and lawyers on retainer --- I bet they don't worry much about stolen content.
That's why they are all ranking so well?
Just throwing that out there as food for thought.
There is no hole in the theory. I'm not saying Panda only. I'm saying a GooG algo update.
I'm saying that if you tell yourself that it must be Panda for your lost rankings, you're going to find articles from so called experts and pros that have NOTHING to do with whether your site is being copied by another website. No. Experts are going to say your site has this issue or that issue and you need to do this and that.
Well, your efforts might just be a complete waste of time if you consider duplicate content. Not on your site, but just duplicate content.
There is no hole in this because the reason I personally lost ownership of my entire site is because I lack PR and authority. The site who took my "stuff" is a PR god. Big brands have what? PR and authority. That is what's at play here. If you are strong enough, then you can get through this issue without a problem apparently.
You don't hear about this issue because? I'll say it again. It's not what the experts are telling people to look for when their sites tank in Google. That's a fact. Instead people are wasting their time and efforts when in fact they could simply tell GooG via their own form that scrapers are OUTRANKING them for their own content. People can't report what they are unaware of. People can't complain if they aren't being told what they should look for as being a possible culprit in losing all their GooG organic traffic.
This is new and this is pretty much a Goog exclusive. Enough tinkering on an algo and I guess there will be casualties. Until experts suggest starting at square one, then please don't advise.
Step one for loss of Goog traffic is determining whether you are still ranking ahead of scrapers for your own content. I shudder at the thought of what some people have done with their sites when in fact their only issue was that GooG gave a different site authority and ranking for their own content.
To clarify, all the advice here and elsewhere is about looking at your own content and site issue. Well here is the biggie here. It's nothing to do with your site at all. It all has to do with your content being attributed to somebody else who is using your content. External vs. Internal. Where are the suggestions that people look outside of their site as a possible cause for Panda or other GooG effects? This is my point.
Sorry but I'm fired up about this issue. I will go relax with a massive cup of coffee now (full of caffeine).
|To clarify, all the advice here and elsewhere is about looking at your own content and site issue. Well here is the biggie here. It's nothing to do with your site at all. It all has to do with your content being attributed to somebody else who is using your content. External vs. Internal. Where are the suggestions that people look outside of their site as a possible cause for Panda or other GooG effects? This is my point. |
This is why I came and asked this question. Because my site has NO duplicate content on it. NONE. Everything is original.
I don't have a Penguin penalty. I don't have an unnatural linking penalty. There is NO ATF issue, NO EMD issue.
I don't have any mega weirdness going on. I have taken a hatchet to my site trying to recover from Panda. Trying to find "low quality content, thin content, boilerplate content, duplicate content".
There is NONE.
And my site DID rank for 10+ years. So I'm not complaining about never ranking or having crappy content that cannot rank.
So after I am hit AGAIN by Panda 20 on September 28th - I am dumbfounded.
It's like WTH?
The only thing I haven't addressed - is the duplicate content "out there".
Now - can someone tell me.... does someone know....
IS THIS AN ISSUE THAT CAN CAUSE A PANDA PROBLEM?
See - no one really knows. So it could very well be a problem. You think Google is going to tell us if it is? That would be super nice - but I haven't found that gem of info anywhere.
Has anyone had any kind of uptick or recovery after cleaning up scrapers or duplicate content on other sites that stole their content?
Or are some of you saying....in essence... copied content happens all the time, it's the nature of the beast and you just have to let it happen and if it out ranks you, too damn bad?
Because it would seem there is no incentive to write another damn post if so. Because you're writing for scrapers for free.
|copied content happens all the time |
|it's the nature of the beast |
|and you just have to let it happen |
No. There are steps you can take to make it go away.
|and if it out ranks you, too damn bad? |
No. You may want to consider sending out takedown notices and if that fails then you may want to consider your options under the DMCA, if that's something available to you.
Other than Russian sites my written content has never been stolen - not that I can see in the serps anyway. Why? Pictures, yes on a massive scale but written content no.
I rank number one on many terms so why not me. Can I help any of you? Is the subject matter of my sites (predominantly cooking and gardening) relevant? Ask away why my sites haven't been copied.
Frost_Angel, first let me say I set all my WordPress feeds to excerpts instead of full feeds. It's cut way down on scrapers for me.
Second... it's actually possible you've got it backwards here. Whenever I've had scrapers outranking me, I'm convinced it was because Google was unhappy with my page, NOT because they mistook someone else's page for the original. If your articles are years older than your scrapers, it seems unlikely Google could be confused about who's the original author on so many of them.
I have a page that used to be #1 for ages in Google, but after Penguin it now gets outranked by scrapers. So I think sometimes it's part of the penalty, or maybe a byproduct of the penalty, that they let scrapers outrank you.
Now, by no means am I saying don't send out the DMCAs - I send them out myself. I'm just saying the outranking might follow the penalty rather than the penalty following the outranking. So, if you get all those sites removed - and DMCA can do that - then you'll be ranking best for your text snippets... but it might not help with your overall Panda issue. :/
|not that i can see in the serps anyway. |
Not in the SERPs. Not usually. Nobody's scraped your content? Unlikely. You just aren't aware of it.
Cut a six to ten word snippet and paste it into Google to find the sites that have scraped your content.
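If you have a lot of pages to check, the cut-and-paste step can be sped up a little. Below is a minimal sketch, not a finished tool: it pulls a few six to ten word runs out of an article and turns each into a quoted-phrase search URL that you then open in the browser yourself. The function names `pick_snippets` and `exact_match_url` are invented for this example, and it does not fetch or parse any search results.

```python
import random
import re
from urllib.parse import quote_plus


def pick_snippets(text, count=3, min_words=6, max_words=10, seed=None):
    """Return `count` runs of min_words..max_words consecutive words."""
    rng = random.Random(seed)
    words = re.findall(r"\S+", text)
    if len(words) < min_words:
        return []  # too short to cut a useful snippet from
    snippets = []
    for _ in range(count):
        length = rng.randint(min_words, min(max_words, len(words)))
        start = rng.randint(0, len(words) - length)
        snippets.append(" ".join(words[start:start + length]))
    return snippets


def exact_match_url(snippet):
    """Build a search URL that looks for the snippet as a quoted phrase."""
    return "https://www.google.com/search?q=" + quote_plus('"' + snippet + '"')


if __name__ == "__main__":
    article = "Paste the full text of one of your own articles here in place of this line."
    for s in pick_snippets(article, count=2):
        print(exact_match_url(s))
```

Snippets from the middle of an article tend to be more telling than the title or first sentence, which is why the sketch picks starting points at random.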
To answer your original question,
|Is Panda Loss Because of Scraped Content? |
Yes, that was one of the widespread thought processes back when Panda first landed. But with my exp., I can tell you that scraped content outranking the original content owner is an indication of loss of authority of that site.
Google seem to be tagging authority to content ownership and not to published dates and domain names. When Panda was introduced, sites that lost traffic also lost authority to varying degrees and this had the indirect effect of scraped content outranking the pandalized sites, who really were the original content owners.
|Not in the SERPs. Not usually. Nobody's scraped your content? Unlikely. You just aren't aware of it |
You are, of course, correct! I had occasionally checked before, obviously not checked enough.
|But with my exp., I can tell you that scraped content outranking the original content owner is an indication of loss of authority of that site. |
So - what you're telling me - is Panda hit me because my site lost authority?
It has nothing to do with people copying my content a bazillion times. That's just a symptom of losing authority?
If that is the case - why hasn't my site recovered after 18 months of improvements? I never lost pagerank... my site is better than ever.
It now seems that the scraped content has had enough time to become entrenched while I was making improvements. This means that there is no recovery from Panda - unless I re-do my entire site or start over - because you'll never get a leg up on the scrapers?
|So - what you're telling me - is Panda hit me because my site lost authority? |
I think what indy's communicating is that Panda causes your content to lose authority. With the loss of authority comes a loss of its ability to outrank those that copied your content. Which makes sense.
|I think what indy's communicating is that Panda causes your content to lose authority. With the loss of authority comes a loss of its ability to outrank those that copied your content. Which makes sense. |
Sure as hell doesn't make sense to me, this is almost an oxymoron.
One has written an article and was ranked at #1. Then someone scrapes it and serves it up as their own, and the original is de-ranked because it is no longer the authority, since NOW the scraper is... Bunkum. Google, if this is what you are, in fact, doing, then the sooner you're freakin' toast the better.
This begs a far, far bigger question though.
Google de-ranks a formerly authority site/article and awards the origin to the scraper...just why are Bing/DDG/Yandex/Grobe/Zapmeta and Baidu getting it correct and Google not?
Farce, it's a totally corrupt farce.
How did my site initially lose authority? Why did Panda get me in the first place?
It just seems super confusing now. I am not sure this thread has helped me at all.
HuskyPup, I think the word Authority is imperfect. We are referencing it in terms of PageRank.
You can think of it this way, that when a site is hit with Panda, its PageRank, the web page's standing in the algo, is diminished. With the diminution comes its inability to rank for phrases. If a site is unable to rank for phrases, it follows that other pages will come in to take its place.
Naturally, it would be preferable for Google to suppress pages that are duplicates/stolen.
|when a site is hit with Panda, its PageRank, the web page's standing in the algo, is diminished |
Mine never changed through Panda April 2011 and Panda September 2012 - it's stayed a steady PR4 - yet my traffic is at a quarter of what it was prior to Panda hits.
So if my PageRank never changed - did I lose authority?
Have you added your Author Profile in Google? It can send signals to Google that you are the authority and the originator.
You might like to read the two posts Lisa Barone put together from Pubcon.
|Mine never changed through Panda... |
No, no, no. You are misunderstanding me. You are referencing Toolbar PR. That is not what I am discussing. PageRank and Toolbar PR are two different things. The toolbar is something used by Google in a punitive manner to manipulate web publisher behavior. The Google toolbar can be thought of as the belt that's rattled at the dinner table to get the kids to settle down. PageRank, sometimes referred to by Googlers as Internal PageRank, is something more nuanced and is not reflected in what you see in the toolbar. Your actual PageRank can change and it won't necessarily be reflected in the toolbar.
With that in mind, reconsider the statement:
|You can think of it this way, that when a site is hit with Panda, its PageRank, the web page's standing in the algo, is diminished. With the diminution comes its inability to rank for phrases. If a site is unable to rank for phrases, it follows that other pages will come in to take its place. |
Naturally, it would be preferable for Google to suppress pages that are duplicates/stolen.
Search Google for "How I Hijacked Rand Fishkin’s Blog" for a related article.
Every time Google turns a dial making some aspect of SEO more difficult - they make other areas of SEO easier.
Frost_Angel - re your "they are using images from my site and using MY bandwidth to run them!"
Are they still using the images? Why not change the images.... perhaps a new image which suggests their site has stolen the content without permission? Or something worse?!
Won't help much but could be very satisfying.
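Before swapping images, it may help to see which sites are actually pulling them. Here is a rough sketch that counts image requests by outside referrer. It assumes your server writes access logs in the common Apache "combined" format (request line, status, size, then the quoted referrer); the function name `hotlinkers` and the extension list are just choices made for this example, and referrers can be blank or faked, so treat the counts as a hint only.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Matches: "GET /path HTTP/1.1" 200 5120 "http://referrer/..."
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "(?P<referer>[^"]*)"'
)
IMAGE_EXT = (".jpg", ".jpeg", ".png", ".gif", ".webp")


def hotlinkers(log_lines, own_domain):
    """Count image requests whose referrer is a host other than own_domain."""
    own = own_domain.lower()
    counts = Counter()
    for line in log_lines:
        m = LOG_LINE.search(line)
        if not m:
            continue  # not a request line in the assumed format
        path = m.group("path").split("?", 1)[0].lower()
        if not path.endswith(IMAGE_EXT):
            continue
        host = (urlparse(m.group("referer")).hostname or "").lower()
        # Skip empty referrers ("-") and requests from your own pages.
        if not host or host == own or host.endswith("." + own):
            continue
        counts[host] += 1
    return counts
```

Feed it the lines of a log file, for example `hotlinkers(open("access.log"), "example.com").most_common(10)`, where the file name and domain are placeholders for your own.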