|DMCA... and how to treat copied content for Panda? |
I'm new to filing DMCA requests and had a question for all of you experts.
In an effort to recover a site from Panda, we've run our site through Copyscape and have been finding all of the other sites that have stolen our content over the years. Some sites have just taken snippets, while some have taken full articles with images and everything. Some of these provide links back to us, though most don't. Which of these should I be filing DMCA requests against?
Also, what about ehow.com? Some of their pages have copied some of our articles almost word for word. Some of these articles have links to us and some do not. Should I be filing DMCAs against them (or other larger sites like eHow), or does that not seem to help at all? What about the articles that have been copied but provide a link? Is it beneficial to just leave them and keep the link back? Or what if they mention us as a reference but do not provide a link?
If it's just a snippet and maybe one image with a link that makes it clear people should click through for the source - the way LifeHacker features sites - then I don't think a DMCA request will be successful. That practice is just too common, and I believe it's all considered "fair use" (you can Wiki that for more info).
If they're using snippets WITHOUT links, a DMCA may work. I haven't really had that situation myself.
If they're just copying full articles with or without images, OR using several images, the DMCA will work very well. I issue these for a few of my pages that people periodically keep stealing outright, and Google wipes them out of the index within days (two weeks, tops).
If they steal a whole page but leave a link to me, I still DMCA them. That link is never going to send you traffic or constitute a quality inbound. They're just making money off your content.
As for eHow, I'd say give it a shot - you'll know within days whether it worked. Rewording is, sadly, enough to get around the DMCA, but how much rewording? It may be that Google requires exactly the same language before they'll act on the DMCA, but there may be wiggle room if it's only slightly reworded. I think it's worth checking into. I'd also love to hear your results!
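If you want a rough number for "how much rewording" before deciding whether to file, something like the following Python sketch could help. It measures what fraction of your article's five-word runs also appear in the suspect page. The function names and the five-word window are my own assumptions for illustration; this is not how Google or Copyscape is known to compare pages.

```python
import re


def shingles(text, size=5):
    """Return the set of overlapping word runs ("shingles") in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Text shorter than one window: treat it as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def copied_fraction(original, suspect, size=5):
    """Fraction of the original's shingles that also occur in the suspect."""
    orig = shingles(original, size)
    if not orig:
        return 0.0
    return len(orig & shingles(suspect, size)) / len(orig)
```

A value near 1.0 suggests a wholesale copy, a small value suggests a snippet or heavy rewording. Where to draw the line is a judgment call; the sketch only gives you a number to sort your Copyscape hits by.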
Panda has nothing to do with duplicated content.
I'm not convinced that Panda has "nothing" to do with duplicate content - it is a very complex algorithm.
Remember when Panda 1.0 was originally called the "Farmer Update"? The step that Google took the month before was called the Scraper Update - something considered necessary to make Farmer/Panda work out properly.
eHow links are nofollow, so they're worthless. There's also a meta tag you can use to stop them from scraping content. I have only just implemented it myself, so I've yet to see whether it's any use.
<meta name="ehow" content="noclip" />
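To check whether a copied page's link back to you is nofollow, a minimal Python sketch using only the standard library might look like this. The class and function names are hypothetical, and you would need to fetch the page HTML yourself; it simply lists links pointing at your domain and flags the nofollow ones.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class BacklinkParser(HTMLParser):
    """Collects <a> links that point at a given domain (hypothetical helper)."""

    def __init__(self, domain):
        super().__init__()
        self.domain = domain.lower()
        self.links = []  # list of (href, is_nofollow) tuples

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href") or ""
        host = (urlparse(href).hostname or "").lower()
        # Match the bare domain and any subdomain such as www.
        if host == self.domain or host.endswith("." + self.domain):
            rel = (attr.get("rel") or "").lower().split()
            self.links.append((href, "nofollow" in rel))


def find_backlinks(html, domain):
    """Return (href, is_nofollow) for every link in html that targets domain."""
    parser = BacklinkParser(domain)
    parser.feed(html)
    parser.close()
    return parser.links
```

An empty result means the page does not link to you at all; a result where every entry is flagged True means the links are all nofollow.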
tedster - the first thing I worked on was duplicated content, to beat Panda, but then Matt Cutts said that Panda has nothing to do with duplicated content.
I'm fairly confident duplicate content (as in content that also exists on other sites) is a major factor in Panda, but having duplicate content doesn't necessarily mean you'll suffer from Panda.
Getting scraped content taken down and/or taking ownership is essential for Panda sufferers, I believe, especially if your content is being used by well-presented, well-known sites like eHow.
|the first thing I worked on was duplicated content, to beat Panda, but then Matt Cutts said that Panda has nothing to do with duplicated content. |
I know - but that doesn't mean something isn't going on that Matt couldn't foresee. For example, he said there was no Sandbox at first, but later discovered a complex interaction in the algorithm that was creating the Sandbox effect.
In this case, I know someone who did a lot of Panda analysis and they regularly found a higher level of scraped content for Pandalyzed sites than for sites that weren't affected. That is just a correlation, of course, and not a proof of cause and effect. But it was enough for me to take a look and then take some DMCA action as well as to rebuild trust and authority for an affected site. And, it did recover.
Again, I can't PROVE anything here - it's only one example and something else might have done the trick. But with any complex system, like Panda, people can't always predict all the effects that will emerge from the complexity. It seems certain that Panda hurt more than just content farms - whether intended or not.
Some companies, especially on YouTube, are using automated bots to crawl sites/videos and find duplicate content, which then causes the bot to report the URL to Google.
You can catch a bunch of them by downloading, for example, a Wikimedia file of a NASA rocket launch and uploading it to your YouTube account. The Wikimedia/NASA video is not copyrightable on several levels. The bots from news agencies will then trip all over the footage, wrongly thinking they are the copyright owners because they published parts of the video earlier in their news reports.
Unfortunately, that means using any part of even freely usable public content may yield DMCA complaints against you, and that's not a good thing.
I'd recommend doing it manually and targeting repeat offenders and the most blatant companies (yes, companies scrape more than individuals nowadays). Slap your site logo on any image/video/Flash files.