

Google SEO News and Discussion Forum

    
Are DMCAs the answer to Panda?
chrisv1963




msg:4325948
 2:34 pm on Jun 14, 2011 (gmt 0)

For the past 14 days I have been very busy checking every page of my website for copyright violations. Result: more than 500 DMCAs (!)

Conclusions drawn from this investigation:
My pages that have been copied were the hardest hit by Panda. Some pages that were copied multiple times and that used to be on page one of the serps simply disappeared from the serps after Panda.

Five-year-old pages were outranked, with my own content, by websites and especially (spam) blogs that copied it.

I'm afraid that Panda doesn't take the ownership of the content into account and it's easy to get outranked by your own content.

With "thin content" Google might mean "duplicate content". The problem however is that Panda doesn't take the original source into account and thus penalizes content owners too.

 

tedster




msg:4325963
 3:15 pm on Jun 14, 2011 (gmt 0)

There's no doubt that for some sites, scraped content is a big problem with Panda. Unfortunately, even those who successfully handled most of their scraper issues still did not recover.

There was a sequence of updates here. A month before Panda 1.0 there was a Scraper Update [webmasterworld.com] that was supposed to lay the groundwork by getting attribution correct. I'd say that was the algo that messed up.

At least Matt Cutts has publicly admitted that there is a problem here, and he said that Google will be addressing it. In the meantime, I guess a little DMCA practice can be a valuable experience. Checking out the scraper landscape can certainly open the eyes!

chrisv1963




msg:4325967
 3:18 pm on Jun 14, 2011 (gmt 0)

Checking out the scraper landscape can certainly open the eyes!


I agree, but don't do it if you have a weak heart.

walkman




msg:4325973
 3:28 pm on Jun 14, 2011 (gmt 0)

chrisv1963, it's your content so go ahead with DMCA.

Also look for proxies doing essentially the same by caching your pages. I had 5 of them; two even outranked my home page with filter=0.

I still wonder if this is the cause or the symptom, but no one should have your content anyway.

falsepositive




msg:4325977
 3:49 pm on Jun 14, 2011 (gmt 0)

@Walkman,
I found proxies caching my pages too. What is the best way to fight those? Does a Google DMCA or spam report do the job?

HuskyPup




msg:4325980
 3:55 pm on Jun 14, 2011 (gmt 0)

My pages that have been copied were the hardest hit by Panda. Some pages that were copied multiple times and that used to be on page one of the serps simply disappeared from the serps after Panda.


Yep, welcome to my world, I've been shocked at how many sites have copied my stuff, not just text but one site from about 7-8 years ago has copied an entire site's coding and images as well.

I know it's from 7-8 years ago by the on-page text etc.

tristanperry




msg:4325987
 4:20 pm on Jun 14, 2011 (gmt 0)

Yep, it's a big issue. I recently checked my <12 month old site and it had been scraped loads. Over half of the 'sites' my content was on were Google Blogger blogs.

Overall Google are good at reviewing the DMCAs and taking prompt action, but one case really annoyed me.

A Google Blogger blog had 7 pages; 6 of them were my content (and the other was stolen from another website). I reported each of its 6 pages, and Google took action and removed the pages.

However Google really should have removed the blog full stop - it was clearly a spam blog, and the DMCAs for 6/7 of its pages clearly tell this.

But then the user simply re-added all the pages! So I filed more DMCAs - and also e-mailed Google Blogger asking for them to use common sense and remove the blog (in a nicer way than that, of course!) - and they ignored my e-mail but removed the pages.

A week later, the user had re-added the pages.

I've just given up. Google will take minor action when they receive a DMCA, but they're too lazy to actually take down the blog. And I very much doubt that their system doesn't record how many DMCAs a blog gets (relative to its number of pages), so it probably is a case of Google doing the bare minimum to fulfil the DMCA.

Ah well. I'll keep doing some DMCAs, but in this site's case, I've given up. They aren't outranking me, so it's not the end of the world I guess. Still annoying though.

In short, I'd definitely do DMCAs - but as above, it's not for the faint-hearted! Especially since so much of the stolen content is hosted by Google!

Planet13




msg:4325997
 4:33 pm on Jun 14, 2011 (gmt 0)

Especially since so much of the stolen content is hosted by Google!


I wonder if someone can sell some of their content to a copyright lawyer in Santa Clara who would be interested in suing Google. Can that be done in cases where Google has shown repeated neglect even when they have been informed of copyright violations?

chrisv1963




msg:4326020
 4:49 pm on Jun 14, 2011 (gmt 0)

Google will take minor action when they receive a DMCA


I agree. They do take action ... but really the minimum. They will only remove the page with the infringing content. The rest of the spam blog is left untouched, even when there are plenty of adult banners, multiple pop-ups (the worst I've seen is 10 pop-ups on one page) and attempts to install trojans on your computer. Be sure to have an updated antivirus program when you check Blogspot for infringements. It's a jungle!

tedster




msg:4326023
 4:51 pm on Jun 14, 2011 (gmt 0)

I found proxies caching my pages too. What is the best way to fight those?

Usually this happens when googlebot crawls your content via the proxy address. In other words, the proxy server does not really cache a copy on their server, but googlebot actually crawls your content via the proxy.

In those cases your best bet is to do the "double reverse" verification for googlebot [webmasterworld.com] - and verification for googlebot is good to have in your permanent setup anyway. Many scrapers also come calling faking a googlebot user agent.

When googlebot crawls via the proxy, the user agent will say googlebot but the IP address will belong to the proxy server. Don't serve your content in that case - say 403 or serve whatever you want. Then Google is no longer confused about the proxy owning your content.
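A rough sketch of the "double reverse" check described above, in Python. The hostname suffixes, the function names and the injectable lookup parameters are assumptions made for illustration, not anything from the linked thread: reverse-resolve the visiting IP, make sure the hostname sits under a Google crawler domain, then forward-resolve that hostname and confirm it maps back to the same IP. The forward lookup used here is IPv4 only.

```python
import socket

# Hostname suffixes assumed for Google's crawlers in this sketch.
GOOGLEBOT_SUFFIXES = (".googlebot.com", ".google.com")


def is_verified_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Return True only if ip passes the "double reverse" DNS check."""
    # Default to real DNS lookups; callers (or tests) may inject their own.
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        host = reverse_lookup(ip)  # step 1: IP -> hostname
        if not host.lower().rstrip(".").endswith(GOOGLEBOT_SUFFIXES):
            return False
        return ip in forward_lookup(host)  # step 2: hostname -> IPs, must include ip
    except OSError:
        # Failed lookups (no PTR record, unknown host) count as unverified.
        return False


def status_for_request(user_agent, ip, **lookups):
    """Hypothetical helper: 403 for a fake or proxied googlebot, 200 otherwise."""
    if "googlebot" in user_agent.lower() and not is_verified_googlebot(ip, **lookups):
        return 403
    return 200
```

A request arriving through a proxy carries the googlebot user agent but the proxy's IP, so it fails step 1 and gets the 403. Since DNS lookups are slow, a real setup would probably cache the result per IP.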

wheel




msg:4326032
 4:59 pm on Jun 14, 2011 (gmt 0)

Result: more than 500 DMCAs (!)

Wow. 500? I did a bunch (nowhere near 500) over the past month or so. The end result? I just called the last guy and told him I'm not doing a DMCA, too much work. Either remove the content or I call a copyright lawyer. And that's where I'm going in the future. I'll use DMCAs if they're hosted in the US, otherwise I'm going right to the lawyer. I don't have time to do 1000 DMCAs.

danijelzi




msg:4326082
 6:44 pm on Jun 14, 2011 (gmt 0)

My pages that have been copied were the hardest hit by Panda. Some pages that were copied multiple times and that used to be on page one of the serps simply disappeared from the serps after Panda.


That's exactly the same situation I'm in. I have greatly reduced the number of scraped pages by blocking RSS scrapers' IPs (where possible), setting up a bot trap for non-RSS scrapers that are using bots, and filing DMCAs.

Also did Google Search spam reports, but that wasn't effective.

Finally, I noticed today that the Google Custom Search on search.icq.com displays scraper-free results for all terms I investigate. On search.icq.com I rank with my original content very similarly to rankings that scrapers hold on Google.com with that same content of mine.
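For anyone wondering what a bot trap like the one mentioned above might look like, here is a minimal sketch in Python. The trap path, the robots.txt text and the `BotTrap` class are all hypothetical, not the poster's actual setup. The idea is to link a URL invisibly, disallow it in robots.txt, and block any IP that requests it anyway.

```python
TRAP_PATH = "/trap/"  # hypothetical URL, linked invisibly from your pages

# Well-behaved crawlers read this and stay out of the trap.
ROBOTS_TXT = "User-agent: *\nDisallow: /trap/\n"


class BotTrap:
    """Remember IPs that request the trap URL and refuse them afterwards."""

    def __init__(self):
        self.blocked = set()

    def handle(self, ip, path):
        """Return the HTTP status to send for this request."""
        if ip in self.blocked:
            return 403
        if path.startswith(TRAP_PATH):
            # Only clients that ignore robots.txt should ever land here.
            self.blocked.add(ip)
            return 403
        return 200
```

This keeps the blocklist in memory only. A real site would need to persist it and probably expire old entries, since IPs get reassigned.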

[edited by: tedster at 7:46 pm (utc) on Jun 14, 2011]

koan




msg:4326122
 8:20 pm on Jun 14, 2011 (gmt 0)

Same situation as you guys. My site most affected by Panda was really copied all over the place, and articles where I eliminated most detectable copies with DMCAs started to rank better.

The worst culprit is Google Blogger / Blogspot. One blog consisted exclusively of articles copied from my site, with Adsense everywhere. Google do remove infringing pages (with a copy of your complaint sent to Chilling Effects, ugh) but will not close down the blog, even after removing all content twice because the blog owner republished it! There couldn't be a more obvious spam and MFA site.

I got the feeling Google had zero respect for original content creators and was just trying to protect itself. Most other hosts seem to send a strong warning to plagiarists once they get caught and they don't dare do it again.

If I were black hat, I'd simply copy my competitors' content on Google Blogger and watch them tank.

nomis5




msg:4326136
 9:10 pm on Jun 14, 2011 (gmt 0)

Maybe I'm a bit naive but I think the reason my main site has never been scraped is that the majority of pages are .asp - old technology I know but more modern technology has the same effect.

The .asp handles pages dependent on previous user input - in my case mainly location.

The key effect is that any scraper doesn't know what the content of the page will be at any one time. What they are scraping may be rubbish or at the very least entirely location based. Any thoughts?

dstiles




msg:4326161
 10:19 pm on Jun 14, 2011 (gmt 0)

Yes, naive, I think. :)

I use ASP Classic for all my and my clients' sites. Some pages have the same filename with different querystrings, others have different filenames, others are more mixed. In some cases content is fed from a database and in others not. In the case of database sourcing only searched data is usually different for a given filename.

You can get exactly the same result using PHP or any other site-building language. At the browser/bot they ALL look like HTML (or XML or whatever) unless you have a bug.

The ONLY way of preventing content scraping is to block user-agents and IPs (and a few other things I will not tell you!). You may still get a small amount of scraping but the biggest scrapers will not get you without a lot of proxy work, and even proxies can usually be blocked.
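A minimal sketch of blocking by user-agent and IP, in Python rather than ASP. The network range and agent fragments below are placeholder examples only, and `is_blocked` is a made-up name; real lists would come out of your own server logs.

```python
import ipaddress

# Illustrative entries only; build real lists from your own logs.
BLOCKED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]
BLOCKED_AGENT_FRAGMENTS = ("httrack", "wget")


def is_blocked(ip, user_agent):
    """Return True if the request should be refused."""
    addr = ipaddress.ip_address(ip)
    if any(addr in net for net in BLOCKED_NETWORKS):
        return True
    ua = user_agent.lower()
    # An empty user-agent is treated as suspicious in this sketch.
    return not ua or any(frag in ua for frag in BLOCKED_AGENT_FRAGMENTS)
```

User-agents are trivially faked, so the IP ranges do most of the work here. Matching on network ranges rather than single addresses is what makes it practical to block whole hosting or proxy blocks at once.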

The only scraper you may then have trouble with is google. :)

ChanandlerBong




msg:4326177
 10:52 pm on Jun 14, 2011 (gmt 0)

A couple of questions about DMCAs to Google: do they still insist on the DMCA being faxed, or do they now accept mailed DMCA notices?

Is there a quicker process for Blogspot? Where are you sending your DMCAs?

koan




msg:4326181
 11:14 pm on Jun 14, 2011 (gmt 0)

is there a quicker process for blogspot?


Essentially it all goes to the same place where they ask you to select a few options before deciding where you should go, but Blogger has also a convenient "report this site" link at the top.

danijelzi




msg:4326182
 11:30 pm on Jun 14, 2011 (gmt 0)

My experience is that it's better to fight blogspot scrapers by filing a DMCA through google.com/dmca.html than to do "report this site".

I have done multiple "report this site" reports, but after they remove the blogspot page, it continues to rank in SERPs even without any content for a long time (for example, one reported and removed blogspot page remained on the first SERP page for almost two months).

DMCA via google.com/dmca.html removes scrapers as soon as they process a removal request.

chrisv1963




msg:4326252
 5:49 am on Jun 15, 2011 (gmt 0)

DMCA via google.com/dmca.html removes scrapers as soon as they process a removal request


Log in to your Google account when you submit a DMCA and Google will process it quite fast. I have the feeling that you have to build up some "trust" when submitting DMCAs.

My first DMCA was processed after a week and the more DMCAs I submitted (with all infringements approved by Google), the faster it got processed. The last DMCA I submitted took only about 3 hours.

dirkji




msg:4326285
 8:00 am on Jun 15, 2011 (gmt 0)

A few questions about DMCA:

* "What Google product does your request relate to?": what choice is most effective here, Adsense, Blogger or Web Search?

* Does the copyright violator know where the DMCA is coming from? What if he decides to take revenge and starts click bombing Adsense ads on your site?

chrisv1963




msg:4326290
 8:13 am on Jun 15, 2011 (gmt 0)

A few questions about DMCA:

* "What Google product does your request relate to?": what choice is most effective here, Adsense, Blogger or Web Search?

* Does the copyright violator know where the DMCA is coming from? What if he decides to take revenge and starts click bombing Adsense ads on your site?



For Blogspot infringements, a Blogger DMCA is the most effective.

For other infringements a Web Search DMCA is the most effective + a DMCA to the infringer's hosting provider.

When the infringer is violating Adsense policies, I always report this too. In the event that he gets his Adsense account disabled, there's less need to copy content from others ... :-)

I think that the violator does know where the DMCAs come from but I don't mind. If he knows that I report violators successfully, then he might think twice before stealing content from my website again. I'm sure that Adsense detects click bombing and ignores those clicks. Otherwise it would be extremely easy to get the accounts of all your competitors blocked.

walkman




msg:4326317
 8:56 am on Jun 15, 2011 (gmt 0)

* Does the copyright violator know where the DMCA is coming from? What if he decides to take revenge and starts click bombing Adsense ads on your site?

Not only do they, but the entire world does, possibly 'forever.' Google posts them on [chillingeffects.org...] so be careful what you write.

It will be interesting to see if this helps anyone move up in the SERPs.

piatkow




msg:4326335
 10:11 am on Jun 15, 2011 (gmt 0)


a DMCA to the infringer's hosting provider.

If my hosting provider acted on a DMCA I would have him in court PDQ as my contract with him is under English not US law. Not that I do anything that would warrant a DMCA.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved