Remember a few months ago when someone posted here about a decline in earnings, then got serious about fighting thieves who were using her content with AdSense? Her efforts resulted in AdSense earnings north of $4K per month.
Have you ever considered what the impact on our earnings might be if the AdSense accounts of everyone using stolen content were disabled tomorrow? There is no magic bullet to make that happen, but we all can still make an impact.
I've suggested previously that every publisher take a little time to find just one content thief per week and then submit the DMCA information to AdSense as they suggest.
If just the publishers who post here on WW would do that (minus any thieves lurking around here), it could have an impact. Statistically speaking, from time to time some of us are going to find one of those publishers with tens of thousands or hundreds of thousands of pages of content. And if that AdSense account is disabled, the ads disappear from all the pages in that account, the pages with the stolen content as well as the legit content.
I'm not claiming this will solve all of anyone's earnings problems, but it's something we all could easily do that would benefit us individually and as a group.
And if for no other reason, it's informative and somewhat entertaining to find the places where your content shows up.
Say I post an original article on my site. My site does not have commenting available (call me old fashioned) so my page remains as it is since the day it's posted.
However, say there is a site which uses a significant portion of my article as a verbatim citation, with perhaps just a line of original text around it and either a linkback or just a mention. Pretty legit use of content I'd say.
BUT, here's the interesting thing: that particular site has commenting available, and over time, gets a barrage of informed/uninformed/good/bad comments on something that I WROTE in the first place on my site.
And by virtue of that, Google ranks THAT site's page higher in SERPs because to Google that particular page has more to offer than my original page in terms of freshness and quantity of added material.
I know what the logical conclusion is: I should enable commenting. But maybe I don't want my content to be commented on (I don't have the time/energy to moderate, so I'd rather do without it). Besides, most of my content is now years old, and allowing commenting only NOW makes it look even more desperate and too-late on my part.
Fighting stolen content is, for the majority, a complete waste of time. While you waste time fighting one, another just pops right up.
Not when you factor in the AdSense component. When you get their AdSense account closed, they tend to not copy your pages anymore as the incentive is lost and they know you aren't going to sit idly by and let it go.
By the way, "a complete waste of time" is what some publishers used to say about fighting/reporting certain AdSense business models. Now the word is out that it's risky and a lot of those sites have disappeared.
Fighting stolen content is, for the majority, a complete waste of time.
What's the purpose of creating new content and letting others use it against you?
When I cracked down on content theft a few years ago my revenues nearly tripled in a year.
Besides, you can file a mass copyright for your entire website for less than $50, which allows you to sue for statutory damages of up to $150K in the US.
So if you must let them steal, plan to profit from it.
I'm with FarmBoy, nail the AdSense scofflaws and it's more for the rest of us.
[edited by: incrediBILL at 8:37 pm (utc) on June 9, 2009]
I've suggested previously that every publisher take a little time to find just one content thief per week and then submit the DMCA information to AdSense as they suggest
It would be a hell of a lot easier if you could file your DMCA by email, and I know of people who can. That is for both AdSense and Google.
Faxing from Australia is incredibly expensive and snail mail is an absolute pain.
My content thieves also usually leave my copyright notices in. Work that one out.
I have a Google Web Alert for: allinurl:www.example.com. Mostly that turns up scraper directories.
Also there is, in Google Webmaster Tools [I think], a facility to seek pages with similar content. I haven't used it for a while though.
Then when the mood takes me, I get Google to search for certain phrases unique to me that I have used in my topics.
That last one works every time.
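For anyone who wants to automate the phrase search, here is a minimal Python sketch. It assumes that long sentences are the most distinctive ones, which is only a rough heuristic, and the function name `unique_phrase_queries` is made up for illustration. It just builds quoted exact-match queries you can paste into a search engine; it does not run the searches.

```python
import re

def unique_phrase_queries(text: str, count: int = 3, min_words: int = 8) -> list[str]:
    """Build exact-match (quoted) search queries from the longest sentences."""
    # Split on whitespace that follows sentence-ending punctuation.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # Short sentences are rarely unique, so keep only the longer ones.
    candidates = [s for s in sentences if len(s.split()) >= min_words]
    # Longest first: assumed (not proven) to be the most distinctive.
    candidates.sort(key=lambda s: len(s.split()), reverse=True)
    # Drop trailing punctuation and wrap in quotes for an exact-phrase search.
    return ['"' + s.rstrip(".!?") + '"' for s in candidates[:count]]

if __name__ == "__main__":
    article = "Short one. This sentence has exactly eight words in it. Tiny."
    for query in unique_phrase_queries(article):
        print(query)
```

Picking a handful of sentences per page, rather than one, helps when a thief has only lifted part of the article.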
How do you typically find out that your content has been placed on another site?
I have CSS hidden keywords scattered throughout the pages that not only ID them as mine, but finger the IP address of the visitor that scraped them.
They tend to remove my CSS and just display the text and VOILA! There it is. The search engine crawls it, and I check to see what's newly indexed every now and then.
They really have no excuse when I can directly tie them back to the scraping event.
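A rough idea of how such a tag could work, sketched in Python rather than PHP or Perl. This is not the poster's actual script: the `wmx` prefix, the `trk` CSS class, and the function names are all hypothetical. The tag is the visitor's IPv4 address in hex behind a site-specific prefix, so a tag found on a scraper's page can be decoded back to the IP that fetched the original.

```python
import ipaddress

SITE_PREFIX = "wmx"  # hypothetical marker; pick a string that appears nowhere else

def make_tag(ip: str) -> str:
    """Encode an IPv4 address as prefix + 8 hex digits."""
    number = int(ipaddress.IPv4Address(ip))
    return f"{SITE_PREFIX}{number:08x}"

def decode_tag(tag: str) -> str:
    """Recover the IPv4 address from a tag found on another site."""
    if not tag.startswith(SITE_PREFIX):
        raise ValueError("not one of our tags")
    return str(ipaddress.IPv4Address(int(tag[len(SITE_PREFIX):], 16)))

def hidden_span(ip: str) -> str:
    """HTML fragment to embed in the page; the 'trk' class is assumed to be hidden via CSS."""
    return f'<span class="trk">{make_tag(ip)}</span>'

if __name__ == "__main__":
    tag = make_tag("203.0.113.7")
    print(tag, "->", decode_tag(tag))
```

Searching for the prefix then turns up every copied page, and each tag on those pages names the IP that did the scraping.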
Also, I have scripts installed that thwart bots after a few pages so there's not much copying going on anymore, not like there used to be.
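One common way to do the "after a few pages" part is a per-IP sliding window. The Python sketch below is a guess at the general approach, not the script described above. The class name and the limits (5 pages in 10 seconds) are assumptions you would tune for your own traffic.

```python
import time
from collections import defaultdict, deque

class PageRateLimiter:
    """Refuse an IP once it requests too many pages inside a time window."""

    def __init__(self, max_pages: int = 5, window_seconds: float = 10.0):
        self.max_pages = max_pages
        self.window = window_seconds
        self.hits = defaultdict(deque)  # ip -> timestamps of recent allowed page views

    def allow(self, ip: str, now=None) -> bool:
        # 'now' can be injected for testing; otherwise use a monotonic clock.
        now = time.monotonic() if now is None else now
        recent = self.hits[ip]
        # Forget page views that have aged out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.max_pages:
            return False  # too fast to be a human reader; refuse the page
        recent.append(now)
        return True

if __name__ == "__main__":
    limiter = PageRateLimiter(max_pages=3, window_seconds=10)
    print([limiter.allow("198.51.100.9", now=t) for t in (0, 1, 2, 3)])
```

In practice the store would need to live somewhere shared (a database or cache) if the site runs on more than one process.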
It still makes me boil inside every time.
One of the worst culprits is Blogger, and it's one of the most difficult because you can't contact the author and Google's procedures are more complicated than most. People still use faxes? You get the worst AdSense publisher infringing scum on this blog network.
One of the worst culprits is Blogger, and it's one of the most difficult because you can't contact the author and Google's procedures are more complicated than most. People still use faxes? You get the worst AdSense publisher infringing scum on this blog network
In my personal experience, 100% correct in every respect mentioned.
...and it's one of the most difficult because you can't contact the author
I don't want to contact the thief. When AdSense takes a look in response to the DMCA submission, I want them to see the AdSense ads right there beside the stolen content. If I contact the thief myself first, that gives him time to clean things up for a while.
If it works for you guys, more power to you.
The efforts of other publishers can benefit all publishers, including you, in two ways:
1. If I get a thief's AdSense account shut down and he has also stolen some of your content, he is no longer making money off my content or yours
2. As more AdSense accounts get shut down and the word spreads, more thieves will be hesitant to engage in future content theft, especially those who have some stolen content and some original content
Fighting stolen content is, for the majority, a complete waste of time.
I have personally lost significant amounts of AdSense income because of copyright infringers. Three months after taking down over a dozen high level infringers (including a university lecturer's web page at a major university), the money is flowing back to me.
Three months after taking down over a dozen high level infringers (including a university lecturer's web page at a major university)...
I've been amazed how often the thief is associated with a university or K-12 school system. Back when I used to contact the site owners directly, they usually pleaded poverty that they had to take content from others (without attribution) because they had no funds to develop their own.
Meanwhile the poverty-stricken universities were building $100,000 stone entrance gates where a sidewalk entered the campus and the poverty-stricken K-12 schools were spending upwards of $10K per student per year.
Over the last 9 years I have periodically changed a word or two on my home page... Of course, when I then search for an exact sentence, I miss many stolen pages from months or years ago.
So, my question is: does, say, a 9-line paragraph have to be identical on the other site, or can it be considered stolen if a couple of words are different?
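On the detection side only (this says nothing about the legal question), a fuzzy comparison will still flag a paragraph where a couple of words were swapped. A minimal Python sketch using the standard library's `difflib`; the function name is made up, and where you set the cutoff is a judgment call.

```python
from difflib import SequenceMatcher

def word_similarity(a: str, b: str) -> float:
    """Similarity of two passages compared word by word (0.0 to 1.0)."""
    words_a = a.lower().split()
    words_b = b.lower().split()
    # ratio() is 2 * matching_words / total_words across both passages.
    return SequenceMatcher(None, words_a, words_b).ratio()

if __name__ == "__main__":
    mine = "the quick brown fox jumps over the lazy dog near the river bank"
    theirs = "the quick brown fox leaps over the lazy dog near the river shore"
    print(round(word_similarity(mine, theirs), 2))
```

Comparing against each older version of a page, if you kept them, would also catch copies made before you changed the wording.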
Heh! I'll give that some thought. Any ideas or suggestions?
I imagine a resources section regarding copyrights law, a mission statement about authors getting fed up with being copied on the web, some educational pages for victims, helping them find their stolen content on the web, but also some clear pages explaining to infringers why what they did was wrong so it can easily be linked when sending a complaint, maybe even some pages for web hosts on how to deal with infringers on your network, possibly a blacklist of hosts that need to wake up and be more proactive, a forum to coordinate collective efforts toward specific massive infringers (I sometimes notify other authors to help bring someone down), etc. Just a site to promote "the cause" ;) A lot of webmasters aren't aware of being copied.
I wonder if any of you guys were among them :)
With a very large site, how would I even begin to try to protect myself from thieves/scrapers?
Whitelist access to your server to just allowed bots and validated browser user agents:
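As a very rough illustration of the allowlist idea, here is a Python sketch that checks only the User-Agent string. The bot names and the browser pattern are assumptions, not a vetted list. Since a User-Agent is trivially spoofed, this is only the first filter; properly validating a bot needs more than the string.

```python
import re

# Assumed allowlist; substitute the crawlers you actually want.
ALLOWED_BOTS = ("googlebot", "bingbot")
# Loose shape of a typical browser User-Agent: "Mozilla/5.0 (platform details) ..."
BROWSER_PATTERN = re.compile(r"^Mozilla/\d\.\d \(.+\)")

def is_allowed_agent(user_agent: str) -> bool:
    """Allow named bots and browser-looking agents; refuse everything else."""
    ua = user_agent.strip()
    if not ua:
        return False  # a blank User-Agent is refused outright
    lowered = ua.lower()
    if any(bot in lowered for bot in ALLOWED_BOTS):
        return True
    return bool(BROWSER_PATTERN.match(ua))

if __name__ == "__main__":
    for ua in ("curl/7.68.0", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"):
        print(ua, "->", is_allowed_agent(ua))
```

Default deny is the point: anything not recognised gets refused, rather than trying to list every bad agent.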
Then set up a DMZ security zone, which many of us in the Spider forum do: set up massive deny lists of all known data centers. Since real visitors don't live in data centers, we block them and punch holes in the firewall as needed for the various sites we allow.
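The deny-list-with-holes logic can be sketched in a few lines of Python with the standard `ipaddress` module. The CIDR ranges below are reserved documentation ranges used as placeholders, not real data center blocks; you would load your own lists.

```python
import ipaddress

# Placeholder ranges (documentation blocks), standing in for data center lists.
DENY = [ipaddress.ip_network(c) for c in ("198.51.100.0/24", "203.0.113.0/24")]
# Holes punched in the deny list for services you do want to let in.
ALLOW = [ipaddress.ip_network(c) for c in ("203.0.113.16/28",)]

def is_blocked(ip: str) -> bool:
    """True if the address falls in a denied range and not in an allowed hole."""
    addr = ipaddress.ip_address(ip)
    if any(addr in net for net in ALLOW):
        return False  # allow rules win over deny rules
    return any(addr in net for net in DENY)

if __name__ == "__main__":
    for ip in ("198.51.100.20", "203.0.113.20", "192.0.2.1"):
        print(ip, "blocked" if is_blocked(ip) else "allowed")
```

Checking the allow list first is what makes the holes work: a narrow allowed range inside a wider denied one still gets through.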
Also, blocking specific countries you don't do business with, if you're an ecommerce site or such, can solve a lot of problems. One of the spider forum members only allows the US to visit his site, that's it.
Then you can manage the scrapers that don't want to be caught, the ones hiding out as browsers, using server-side PHP or Perl scripts that stop bot behavior.
Last but not least, tag your content with hidden tracking bugs easily found in SEs.
Almost all of this can be done with automation using PHP or PERL and can be run before each page viewed, even if you have a static site.
It's basically installing a very elaborate port 80 firewall, loads of fun and very educational too!
Even some so-called gurus are encouraging people to use content like this...
It is really a serious problem, and I think it is a good idea if we all report at least one website with stolen content to stop this...