


Are DMCAs the answer to Panda?

     
2:34 pm on June 14, 2011 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:May 14, 2006
posts:656
votes: 5


For the past 14 days I have been very busy checking every page of my website for copyright violations. Result: more than 500 DMCAs (!)

Conclusions drawn from this investigation:
My pages that have been copied were the hardest hit by Panda. Some pages that were copied multiple times and that used to be on page one of the serps simply disappeared from the serps after Panda.

Five-year-old pages were outranked by websites, and especially (spam) blogs, that had copied my own content.

I'm afraid that Panda doesn't take ownership of the content into account, and it's easy to get outranked by your own content.

By "thin content" Google might mean "duplicate content". The problem, however, is that Panda doesn't take the original source into account and thus penalizes content owners too.
3:15 pm on June 14, 2011 (gmt 0)

Senior Member

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:May 26, 2000
posts:37301
votes: 0


There's no doubt that for some sites, scraped content is a big problem with Panda. Unfortunately, even those who successfully handled most of their scraper issues still did not recover.

There was a sequence of updates here. A month before Panda 1.0 there was a Scraper Update [webmasterworld.com] that was supposed to lay the groundwork by getting attribution correct. I'd say that was the algo that messed up.

At least Matt Cutts has publicly admitted that there is a problem here, and he said that Google will be addressing it. In the meantime, I guess a little DMCA practice can be a valuable experience. Checking out the scraper landscape can certainly open the eyes!
3:18 pm on June 14, 2011 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:May 14, 2006
posts:656
votes: 5


Checking out the scraper landscape can certainly open the eyes!


I agree, but don't do it if you have a weak heart.
3:28 pm on June 14, 2011 (gmt 0)

Senior Member

joined:Dec 29, 2003
posts:5428
votes: 0


chrisv1963, it's your content, so go ahead with the DMCAs.

Also look for proxies doing essentially the same thing by caching your pages. I had 5 of them; two even outranked my home page with filter=0.

I still wonder if this is the cause or the symptom, but no one should have your content anyway.
3:49 pm on June 14, 2011 (gmt 0)

Junior Member

5+ Year Member

joined:Feb 25, 2011
posts: 176
votes: 0


@Walkman,
I found proxies caching my pages too. What is the best way to fight those? Does a Google DMCA or spam report do the job?
3:55 pm on June 14, 2011 (gmt 0)

Senior Member

joined:June 3, 2007
posts:6024
votes: 0


My pages that have been copied were the hardest hit by Panda. Some pages that were copied multiple times and that used to be on page one of the serps simply disappeared from the serps after Panda.


Yep, welcome to my world. I've been shocked at how many sites have copied my stuff, not just text: one site from about 7-8 years ago copied an entire site's coding and images as well.

I know it's from 7-8 years ago by the on-page text etc.
4:20 pm on June 14, 2011 (gmt 0)

Full Member

5+ Year Member

joined:Sept 14, 2010
posts: 205
votes: 0


Yep, it's a big issue. I recently checked my <12 month old site and it had been scraped loads. Over half of the 'sites' my content was on were Google Blogger blogs.

Overall Google are good at reviewing the DMCAs and taking prompt action, but one case really annoyed me.

A Google Blogger blog had 7 pages; 6 of them were my content (and the other was stolen from another website). I reported each of its 6 pages, and Google took action and removed the pages.

However, Google really should have removed the blog, full stop. It was clearly a spam blog, and the DMCAs covering 6 of its 7 pages make that clear.

But then the user simply re-added all the pages! So I filed more DMCAs, and also e-mailed Google Blogger asking them to use common sense and remove the blog (in a nicer way than that, of course!). They ignored my e-mail but removed the pages.

A week later, the user had re-added the pages.

I've just given up. Google will take minor action when they receive a DMCA, but they're too lazy to actually take down the blog. And I very much doubt that their system doesn't record how many DMCAs a blog gets (relative to its number of pages), so it's probably a case of Google doing the bare minimum to fulfil the DMCA.

Ah well. I'll keep doing some DMCAs, but in this site's case I've given up. They aren't outranking me, so it's not the end of the world, I guess. Still annoying, though.

In short, I'd definitely do DMCAs, but as above, it's not for the faint of heart! Especially since so much of the stolen content is hosted by Google!
4:33 pm on June 14, 2011 (gmt 0)

Senior Member

WebmasterWorld Senior Member planet13 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:June 16, 2010
posts:3813
votes: 29


Especially since so much of the stolen content is hosted by Google!


I wonder if someone could sell some of their content to a copyright lawyer in Santa Clara who would be interested in suing Google. Can that be done in cases where Google has shown repeated neglect even when they have been informed of copyright violations?
4:49 pm on June 14, 2011 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:May 14, 2006
posts:656
votes: 5


Google will take minor action when they receive a DMCA


I agree. They do take action ... but really the minimum. They will only remove the page with the infringing content. The rest of the spam blog is left untouched, even when there are plenty of adult banners, multiple pop-ups (the worst I've seen is 10 pop-ups on one page) and attempts to install trojans on your computer. Be sure to have an updated antivirus program when you check Blogspot for infringements. It's a jungle!
4:51 pm on June 14, 2011 (gmt 0)

Senior Member

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:May 26, 2000
posts:37301
votes: 0


I found proxies caching my pages too. What is the best way to fight those?

Usually this happens when googlebot crawls your content via the proxy address. In other words, the proxy server doesn't really keep a cached copy on its own server; googlebot actually crawls your content through the proxy.

In those cases your best bet is to do the "double reverse" verification for googlebot [webmasterworld.com]. Verification for googlebot is good to have in your permanent setup anyway; many scrapers also come calling with a fake googlebot user agent.

When googlebot crawls via the proxy, the user agent will say googlebot but the IP address will belong to the proxy server. Don't serve your content in that case: return a 403 or whatever you want. That way Google is no longer confused into thinking the proxy owns your content.
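
If it helps anyone, here is roughly what that double-reverse check looks like in code. This is only a minimal sketch in Python under my own assumptions: the function name is made up, your stack has to be able to run a per-request check, and in practice you would cache the results rather than hit DNS on every request. The googlebot.com / google.com host names are the ones Google documents for its genuine crawlers.

import socket

def is_real_googlebot(ip_address):
    # Step 1: reverse DNS on the visiting IP. Genuine googlebot IPs resolve
    # to host names under googlebot.com or google.com.
    try:
        host = socket.gethostbyaddr(ip_address)[0]
    except (socket.herror, socket.gaierror):
        return False
    if not host.endswith((".googlebot.com", ".google.com")):
        return False
    # Step 2: forward DNS on that host name. It must map back to the same IP,
    # otherwise the reverse record could simply be faked.
    try:
        forward_ips = socket.gethostbyname_ex(host)[2]
    except socket.gaierror:
        return False
    return ip_address in forward_ips

A request that claims a googlebot user agent but fails this check, for example one arriving from a proxy's IP, can then be answered with a 403 instead of your content.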
4:59 pm on June 14, 2011 (gmt 0)

Senior Member

WebmasterWorld Senior Member wheel is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Feb 11, 2003
posts:5069
votes: 12


Result: more than 500 DMCAs (!)

Wow. 500? I did a bunch (nowhere near 500) over the past month or so. The end result? I just called the last guy and told him I'm not doing a DMCA, too much work: either remove the content or I call a copyright lawyer. And that's where I'm going in the future. I'll use DMCAs if they're hosted in the US; otherwise I'm going right to the lawyer. I don't have time to do 1000 DMCAs.
6:44 pm on June 14, 2011 (gmt 0)

Junior Member

5+ Year Member

joined:Apr 13, 2011
posts: 154
votes: 0


My pages that have been copied were the hardest hit by Panda. Some pages that were copied multiple times and that used to be on page one of the serps simply disappeared from the serps after Panda.


That's exactly the same situation I'm in. I have greatly reduced the number of scraped pages by blocking RSS scrapers' IPs (where possible), setting up a bot trap for non-RSS scrapers that use bots, and filing DMCAs.
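
For anyone who hasn't built one, the bot trap part can be very simple: a hidden link points at a URL that is disallowed in robots.txt, so well-behaved crawlers never request it, and anything that does fetch it gets its IP recorded and refused from then on. Below is a minimal sketch, assuming a Python/WSGI site; the /bot-trap/ path and the blocked_ips.txt file are made-up example names, not part of anyone's actual setup.

BLOCKLIST_FILE = "blocked_ips.txt"

def load_blocklist():
    # Read previously trapped IPs, one per line; start empty if none yet.
    try:
        with open(BLOCKLIST_FILE) as f:
            return set(line.strip() for line in f if line.strip())
    except FileNotFoundError:
        return set()

blocked_ips = load_blocklist()

def bot_trap_middleware(app):
    # Wrap any WSGI app: record IPs that request the trap URL, and refuse
    # every request from an IP that has already been trapped.
    def wrapper(environ, start_response):
        ip = environ.get("REMOTE_ADDR", "")
        if environ.get("PATH_INFO", "").startswith("/bot-trap/"):
            blocked_ips.add(ip)
            with open(BLOCKLIST_FILE, "a") as f:
                f.write(ip + "\n")
        if ip in blocked_ips:
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden"]
        return app(environ, start_response)
    return wrapper

Disallow /bot-trap/ in robots.txt and link to it somewhere invisible to human visitors; legitimate crawlers respect the disallow, so only rogue bots end up on the list.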

I also filed Google Search spam reports, but that wasn't effective.

Finally, I noticed today that the Google Custom Search on search.icq.com displays scraper-free results for all the terms I investigate. On search.icq.com my original content ranks very similarly to the rankings that scrapers hold on Google.com with that same content of mine.

[edited by: tedster at 7:46 pm (utc) on Jun 14, 2011]

8:20 pm on June 14, 2011 (gmt 0)

Senior Member from CA 

WebmasterWorld Senior Member 10+ Year Member

joined:June 18, 2005
posts:1733
votes: 18


Same situation as you guys. My site most affected by Panda was really copied all over the place, and articles where I eliminated most detectable copies with DMCAs started to rank better.

The worst culprit is Google Blogger / Blogspot. One blog consisted exclusively of articles copied from my site, with Adsense everywhere. Google do remove infringing pages (with a copy of your complaint sent to Chilling Effects, ugh) but will not close down the blog, even after having removed all the content twice because the blog owner republished it! There couldn't be a more obvious spam and MFA site.

I got the feeling Google had zero respect for original content creators and was just trying to protect itself. Most other hosts seem to send a strong warning to plagiarists once they get caught and they don't dare do it again.

If I were black hat, I'd simply copy my competitors' content to Google Blogger and watch them tank.
9:10 pm on June 14, 2011 (gmt 0)

Senior Member from GB 

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Apr 29, 2005
posts:1935
votes: 61


Maybe I'm a bit naive, but I think the reason my main site has never been scraped is that the majority of pages are .asp. Old technology, I know, but more modern technology has the same effect.

The .asp handles pages dependent on previous user input, in my case mainly location.

The key effect is that any scraper doesn't know what the content of the page will be at any one time. What they are scraping may be rubbish or at the very least entirely location based. Any thoughts?
10:19 pm on June 14, 2011 (gmt 0)

Senior Member from GB 

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:May 14, 2008
posts:3121
votes: 3


Yes, naive, I think. :)

I use ASP Classic for all my and my clients' sites. Some pages have the same filename with different querystrings, others have different filenames, others are more mixed. In some cases content is fed from a database and in others not. Where content is sourced from a database, usually only the searched data differs for a given filename.

You can get exactly the same result using PHP or any other site-building language. To the browser or bot they ALL look like HTML (or XML or whatever) unless you have a bug.

The ONLY way of preventing content scraping is to block user-agents and IPs (and a few other things I will not tell you!). You may still get a small amount of scraping, but the biggest scrapers will not get you without a lot of proxy work, and even proxies can usually be blocked.
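
To make the user-agent and IP blocking concrete, here is a minimal sketch in Python, assuming you maintain your own lists. The user-agent fragments and the network range below are placeholders for illustration only, not a recommended blocklist.

import ipaddress

BLOCKED_UA_FRAGMENTS = ["scrapy", "python-requests", "curl"]   # example entries only
BLOCKED_NETWORKS = [ipaddress.ip_network("198.51.100.0/24")]   # example documentation range

def should_block(remote_ip, user_agent):
    # Block on a matching user-agent fragment or a blocklisted IP range.
    ua = (user_agent or "").lower()
    if any(fragment in ua for fragment in BLOCKED_UA_FRAGMENTS):
        return True
    address = ipaddress.ip_address(remote_ip)
    return any(address in network for network in BLOCKED_NETWORKS)

Call it early in your request handling and return a 403 on a match; the lists need constant tending, because scrapers rotate both user agents and IPs.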

The only scraper you may then have trouble with is google. :)
10:52 pm on June 14, 2011 (gmt 0)

Preferred Member from GB 

5+ Year Member

joined:Sept 29, 2009
posts:444
votes: 19


A couple of questions about DMCAs to Google: do they still insist on the DMCA being faxed, or do they now accept mailed DMCA notices?

Is there a quicker process for blogspot? Where are you sending your DMCAs?
11:14 pm on June 14, 2011 (gmt 0)

Senior Member from CA 

WebmasterWorld Senior Member 10+ Year Member

joined:June 18, 2005
posts:1733
votes: 18


is there a quicker process for blogspot?


Essentially it all goes to the same place, where they ask you to select a few options before deciding where you should go, but Blogger also has a convenient "report this site" link at the top.
11:30 pm on June 14, 2011 (gmt 0)

Junior Member

5+ Year Member

joined:Apr 13, 2011
posts: 154
votes: 0


My experience is that it's better to fight Blogspot scrapers by filing a DMCA through google.com/dmca.html than to do "report this site".

I have done multiple "report this site" reports, but after they remove the Blogspot page, it continues to rank in the SERPs even without any content for a long time (for example, one reported and removed Blogspot page remained on the first SERP page for almost two months).

DMCA via google.com/dmca.html removes scrapers as soon as they process a removal request.
5:49 am on June 15, 2011 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:May 14, 2006
posts:656
votes: 5


DMCA via google.com/dmca.html removes scrapers as soon as they process a removal request


Log in to your Google account when you submit a DMCA and Google will process it quite fast. I have the feeling that you have to build up some "trust" when submitting DMCAs.

My first DMCA was processed after a week, and the more DMCAs I submitted (with all infringements approved by Google), the faster they were processed. The last DMCA I submitted took only about 3 hours.
8:00 am on June 15, 2011 (gmt 0)

Full Member

10+ Year Member

joined:Mar 31, 2004
posts: 202
votes: 0


A few questions about DMCA:

* "What Google product does your request relate to?": what choice is most effective here, Adsense, Blogger or Web Search?

* Does the copyright violator know where the DMCA is coming from? What if he decides to take revenge and starts click bombing Adsense ads on your site?
8:13 am on June 15, 2011 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:May 14, 2006
posts:656
votes: 5


A few questions about DMCA:

* "What Google product does your request relate to?": what choice is most effective here, Adsense, Blogger or Web Search?

* Does the copyright violator know where the DMCA is coming from? What if he decides to take revenge and starts click bombing Adsense ads on your site?



For Blogspot infringements, a Blogger DMCA is the most effective.

For other infringements, a Web Search DMCA is the most effective, plus a DMCA to the infringer's hosting provider.

When the infringer is violating Adsense policies, I always report this too. In the event that he gets his Adsense account disabled, there's less need to copy content from others ... :-)

I think that the violator does know where the DMCAs come from, but I don't mind. If he knows that I report violators successfully, then he might think twice before stealing content from my website again. I'm sure that Adsense detects click bombing and ignores those clicks. Otherwise it would be extremely easy to get the accounts of all your competitors blocked.
8:56 am on June 15, 2011 (gmt 0)

Senior Member

joined:Dec 29, 2003
posts:5428
votes: 0


* Does the copyright violator know where the DMCA is coming from? What if he decides to take revenge and starts click bombing Adsense ads on your site?

Not only do they, but the entire world does, possibly 'forever.' Google posts them on [chillingeffects.org...], so be careful what you write.

It will be interesting to see if this helps anyone move up in the SERPs.
10:11 am on June 15, 2011 (gmt 0)

Senior Member from GB 

WebmasterWorld Senior Member piatkow is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Apr 5, 2006
posts:3329
votes: 22



a DMCA to the infringer's hosting provider.

If my hosting provider acted on a DMCA, I would have him in court PDQ, as my contract with him is under English, not US, law. Not that I do anything that would warrant a DMCA.