Effects of Different Types of Scraped and Duplicate Content
I would like to start a thread about the effect that different types of scraped or duplicate content can have on a website. I have sometimes heard that if your content is duplicated on another website, it won't really affect your rankings and I have also heard that if your content is duplicated on another website, it can have an effect on your rankings.
Below are a few ways that I have seen a website's content scraped and duplicated, and I would really appreciate your thoughts regarding the impact that it can have on your website. I should also mention that when I say content has been duplicated or scraped, I mean a large part of the body text or the entire body text of a page on a website is appearing on another website(s).
(1) If your content is appearing on other site(s) and there is no link back, can it impact your rankings? If it can, are only the rankings for the keywords that the content ranks for affected, or is there an overall effect on your site?
(2) If the content is appearing on other sites but there is a link back to your site, will there be an impact on your rankings? If there is an impact, would it be specific or more overall in nature?
(3) If the content of a page on your site was copied by another site with a link back to your site and the other site also tags the content with many keyword phrases so the content is essentially appearing on many pages of the other site with a link back to your site, would your site be affected? (I know that this question is kind of long, but it is a situation that I am facing)
(4) If your site's content is appearing on other websites, but when you do a search for a unique string of text in quotes on a search engine and your site appears first in the SERP, does it mean that although the content of your site has been duplicated, it is not impacting you?
If I were them (Google/Bing), I would give a high (+) ranking score to pages that were scraped, even more than for incoming links. These pages are probably useful, otherwise why would people have scraped them in the first place?
From what I've seen so far on my sites that were scraped in different creative ways [(1)-(4) and more], it has never caused any issue. I guess it helps in one way or another.
You asked so I'll tell.
The effect of scraped content. Let me speak to that.
All you WordPress users out there. Hear ye, hear ye. Take a sentence or more from one of your posts. Paste it into Google. Results? If anyone is ranking above you for that content (if it is in fact a duplicate of your text/article, in part or in its entirety), then you should be filling out the Google scraper doc form and letting them know where a scraper is outranking the original content.
Do it from another snippet or post. Some older, some newer. It's the only way you know for sure what is going on.
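For anyone who wants to make that check less tedious, here is a rough sketch of the first half of it in Python. It only pulls a few longer sentences out of a post and builds exact-phrase (quoted) search URLs that you then open in your browser yourself; it does not fetch or parse any results. The function names, the word-count thresholds and the sample text are all made up for illustration, so adjust to taste.

```python
import re
from urllib.parse import quote_plus


def pick_snippets(post_text, min_words=8, max_words=20, limit=3):
    """Pick up to `limit` sentences long enough to be distinctive.

    The thresholds are arbitrary assumptions, not anything Google documents.
    """
    sentences = re.split(r"(?<=[.!?])\s+", post_text.strip())
    snippets = []
    for sentence in sentences:
        words = sentence.split()
        if len(words) >= min_words:
            # Trim very long sentences so the query stays manageable.
            snippets.append(" ".join(words[:max_words]))
        if len(snippets) == limit:
            break
    return snippets


def exact_match_url(snippet, base="https://www.google.com/search?q="):
    """Build a search URL with the snippet wrapped in double quotes."""
    # Drop any quotes inside the snippet so they do not break the phrase.
    cleaned = snippet.replace('"', "")
    return base + quote_plus('"' + cleaned + '"')


if __name__ == "__main__":
    # Hypothetical post text, only here to show the usage.
    post = (
        "Short intro. "
        "This is a longer sentence taken from one of my own older posts. "
        "Done."
    )
    for snippet in pick_snippets(post):
        print(exact_match_url(snippet))
```

Run it against a mix of older and newer posts, open each URL, and note who shows up in first place.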
Okay, let's speak to the impact of somebody creatively using your RSS feed for their own content. Maybe it's in a frame on their site, but they are certainly displaying as much content as you provided to them via RSS. Is that called scraping, or using what you made available to them?
I can tell you that for me, the content that was being duplicated (which came from an RSS feed and was a small percentage of my site content overall) actually became "their" content in Google rankings. In fact, whatever happened, my entire site was washed from Google. Cleansed, as it were. A drop of what, 90%, 95%, 98% of organic traffic.
I suppose this really depends on who it is taking/scraping your stuff. If they possess high PR then guess what? My theory is that's why I actually don't own the content anymore in Google's eyes. That creative scraper is held in high regard for some reason and as a result, this algo gives them ownership at some point. I'm not saying in all cases, but in my case and in other cases this is what happens.
If you take your text, paste it and you rank #1 for that exact text as you wrote it, then there is nothing to worry about. I'm sure there are losers scraping some of my RSS feeds right now, but the fact is, those sites are deemed junk by Google and can't steal ownership from me.
You can ask whether linking back works etc., but based on this experience, I would never ever publish an RSS feed from my site. Period.
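To keep a record of that "do I still rank #1 for my own text" test, something like the following could work. It is only a sketch: it assumes you have copied the result URLs, in order, by hand from the results page for one quoted snippet, and every name and domain in it is hypothetical. It says nothing about why a site outranks you; it just reports your position and who is above you.

```python
from urllib.parse import urlparse


def domain_of(url):
    """Return the lowercased host of a URL, without a leading 'www.'."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def ownership_check(result_urls, my_domain):
    """Report where my_domain sits in an ordered list of result URLs.

    result_urls: URLs in the order they appeared for one quoted snippet
                 (assumed to be collected by hand).
    Returns a dict with the 1-based position (None if absent) and the
    domains listed above it.
    """
    domains = [domain_of(u) for u in result_urls]
    my_domain = my_domain.lower()
    if my_domain not in domains:
        return {"position": None, "outranked_by": domains}
    index = domains.index(my_domain)
    return {"position": index + 1, "outranked_by": domains[:index]}


if __name__ == "__main__":
    # Hypothetical results for one snippet.
    results = [
        "https://www.scraper.example/copied-post",
        "https://mysite.example/original-post",
    ]
    print(ownership_check(results, "mysite.example"))
```

A position of 1 means there is nothing to worry about for that snippet; anything else gives you the list of sites to look into.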
So based on what I've experienced on this particular issue with the "new" Google, ownership theft is here. That would be site killing btw. My #1 income site in 2011 disappeared in 2012. Dead dead dead. The key thing to do is be proactive. Take "your stuff", Google it and make sure that you still own it. People are talking a bit about higher PR sites being able to "hijack", but I simply call it people stealing ownership of your content. The problem right now is people are oblivious to checking their own content by Googling it. It's a must do. It's a part of the daily check ups for me now.
Oh, so in my instance, the creative scrape has links back etc., but it simply doesn't matter. It's their content now.
Agreed. It depends on who scraped your content and how your site is ranked compared to the scraper.
If, for example, WSJ takes things from your site and publishes them, it will surely outrank your site.
I once had some issues with LinkedIn and eHow, but neither is seen as competition any longer.
I have posted a thread about a slightly different but related question involving dupe content that might also be of interest, about whether to change pages that have been scraped a lot but are still ranking...
Would You Improve Well Ranked Pages in Google?
Above all, many of these pages were copied/scraped thousands of times all around the web. Making some modifications could confuse the Google algo so that it no longer recognizes the original.
[edited by: Robert_Charlton at 9:31 pm (utc) on Nov 15, 2012]
[edit reason] removed off-topic comments [/edit]