Moderators: incrediBILL & jatar k & martinibuster

Google AdSense Forum

This 149 message thread spans 5 pages.
AdSense Policy Update Puts Scrapers on Notice: DMCA May Be Invoked?
incrediBILL




msg:3223295
 7:42 am on Jan 18, 2007 (gmt 0)

Website publishers may not display Google ads on web pages with content protected by copyright law unless they have the necessary legal rights to display that content. Please see our DMCA policy for more information.

Ladies and Gentlemen, fire up your word processor and start firing off DMCA [google.com] letters if you've been scraped.

Account Termination

Many Google Services do not have account holders or subscribers. For Services that do, Google will, in appropriate circumstances, terminate repeat infringers. If you believe that an account holder or subscriber is a repeat infringer, please follow the instructions above to contact Google and provide information sufficient for us to verify that the account holder or subscriber is a repeat infringer.

It would appear that if enough people file against the scrapers, they will lose their accounts.

It's about time.



see also: [webmasterworld.com...]

[edited by: Brett_Tabke at 12:38 pm (utc) on Jan. 25, 2007]

 

tallguy




msg:3223317
 8:36 am on Jan 18, 2007 (gmt 0)

Yeah, about time.

And those who are sitting there and copying sites!

Time to whack the content thieves.

Hobbs




msg:3223319
 8:40 am on Jan 18, 2007 (gmt 0)

if enough people file against the scrapers, they will lose their accounts

You of all people IB know that there aren't enough people to report scrapers out of existence; there are too many of them (and of spammers), and too few people ready to dedicate resources to closing them down. This is an algo job, human assisted, yes, but at their scale, only the code can take them on.

incrediBILL




msg:3223750
 4:27 pm on Jan 18, 2007 (gmt 0)

You of all people IB know that there aren't enough people to report scrapers out of existence

I know I may not be able to stop all scrapers, but the AdSense scrapers will be a good start!

What I've learned is that ONE scraper has many many sites, so what looks like many scrapers to you may be a single scraper with MULTIPLE VIOLATIONS which, according to AdSense, will now get their accounts cancelled.

It's just like botnets, as there are too many compromised machines to kill a botnet. However, I recently took a huge chunk out of a botnet by sending AUP violation reports to their hosts and cut one group attacking me to about 1/3 of its former self. If someone else could just get rid of the other 1/3, it would be a good day ;)

Besides, some hosts will honor an AUP violation report for a scraper too so we have more than one tool to use in the war on scrapers.

elsewhen




msg:3223868
 5:33 pm on Jan 18, 2007 (gmt 0)

bill, i am not sure that your interpretation is right. i think the operative phrase is "content protected by copyright law"

scrapers sidestep copyright law by relying on "fair use" - they only take small snippets from a particular website.

now, for sites that do wholesale copying, that is a different story. in several instances, we have sent dmca requests to google-adsense (and google search), stating that other sites also running adsense had copied our content. after waiting for a few weeks, we get a response back from google stating that the problem is resolved. the sites are still there, but no adsense on them - i presume that in these cases either their accounts had been disabled or they were asked by google to remove adsense ads from the sites.

in any case, this was happening before the tos change.

jomaxx




msg:3223890
 5:44 pm on Jan 18, 2007 (gmt 0)

This point has been made before, but it's unlikely that Google will ever interpret scrapers using snippets to be in violation of copyright law because they do the exact same thing Google itself does.

ytswy




msg:3223894
 5:46 pm on Jan 18, 2007 (gmt 0)

Ladies and Gentlemen, fire up your word processor and start firing off DMCA letters if you've been scraped.

I think I'd lose half my backlinks if I did that..

And as mentioned above, these sites seem to just produce SERP type pages - if these are a copyright infringement then so is any real search engine. Yes they should be removed from both the Google index and the Adsense program on quality grounds, but I can't see it as a copyright violation.

Sunshine1




msg:3223904
 5:52 pm on Jan 18, 2007 (gmt 0)

Google has copied everything I have ever written online and put it in their cache. Never once did they ask, or attempt to ask.
They reserve the right to do whatever they like with it at a later time.

This is American exceptionalism at its worst. Google excepts itself from the objective rules it tries to make for everyone else.

LifeinAsia




msg:3223912
 5:59 pm on Jan 18, 2007 (gmt 0)

Actually, if you do not have a META CACHE-CONTROL tag on your pages, then you are implicitly allowing Google (and other search engines) to cache your site.

Since it's so upsetting to you that Google has cached your content and linked to your site, perhaps you should ask them to remove your site from their index? Please write back and let us know how that goes.

oddsod




msg:3223935
 6:14 pm on Jan 18, 2007 (gmt 0)

>>Since it's so upsetting to you that Google has cached your content and linked to your site, perhaps you should ask them to remove your site from their index?

Why remove? Just block Google from caching. They do honour [209.85.135.104] that meta-tag.
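
For reference, the tag Google documents for opting out of its cached copy is the robots meta tag with a noarchive value, rather than a Cache-Control tag. Below is a minimal Python sketch of how one might check a page for it. The names RobotsMetaParser and blocks_caching are made up for illustration, and the sketch only inspects meta tags in the HTML, not HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute names arrive lowercased; values may be None.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            for directive in attr.get("content", "").split(","):
                directive = directive.strip().lower()
                if directive:
                    self.directives.add(directive)


def blocks_caching(html):
    """Return True if the page carries a noarchive robots directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noarchive" in parser.directives


# Hypothetical page: indexable, but asks search engines not to show a cached copy.
page = '<html><head><meta name="robots" content="index, NOARCHIVE"></head></html>'
print(blocks_caching(page))
```

A page without such a tag gets cached by default, which is the opt-out arrangement being argued about in this thread.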

fischermx




msg:3223944
 6:21 pm on Jan 18, 2007 (gmt 0)

I don't see how this is related to scrapers.
They can safely argue fair-use. And even though I don't like them, nor support them, I don't think you can accuse them of copyright violation. That's if we're talking about those who put up snippets of web pages, just like the real search engines do.
If you mean people copying entire pages, then, well, those are not what we usually call "scrapers".

But instead you can report them to the AdSense team for putting ads on pages with no real value or content.

Mods: the thread title should be modified.

incrediBILL




msg:3223950
 6:27 pm on Jan 18, 2007 (gmt 0)

They can safely argue fair-use.

Not true at all.

scrapers sidestep copyright law by relying on "fair use" - they only take small snippets from a particular website.

Not all scrapers use small snippets, and regardless of how much is used it can't always be 'fair use' where they make money off your content unless it's used as part of news reporting, criticism, satire, etc.

Besides, it's not up to Google to decide what is or isn't fair use as they are not part of the judiciary system nor the copyright holder on the material being infringed.

Read this Nolo Press [nolo.com] article on Fair Use.

Violations often occur when the use is motivated primarily by a desire for commercial gain.

Also see this article from Nolo Press [nolo.com]

Lawsuits are even more likely if you stand to make any money off the use, such as posting copyrighted song lyrics on your site to increase traffic and attract advertisers. Uses like this are likely to bring record companies knocking.

Basically, you only lose the rights over your copyright when you give them up, and giving scrapers free rein by assuming they have fair use on their side is a wrong assumption.

FYI, I used to work for a book publisher that published a software directory in the early 80s and they were sued for ONE 3-line paragraph in an 1,100 page book that you would think was 'fair use', as it literally constituted the same use as scrapers on the 'net, and they LOST.

[edited by: incrediBILL at 6:34 pm (utc) on Jan. 18, 2007]

justageek




msg:3223958
 6:32 pm on Jan 18, 2007 (gmt 0)

Actually, if you do not have a META CACHE-CONTROL tag on your pages, then you are implicitly allowing Google (and other search engines) to cache your site.

This logic must allow scrapers then.

If this logic were true, and actually legal, then the only way a scraper could get into trouble would be if they included content from a site that Google was not able to crawl due to a meta tag.

JAG

jomaxx




msg:3223961
 6:34 pm on Jan 18, 2007 (gmt 0)

The for-profit nature of the use is ONE element to be considered. But since Google and pretty much all other search engines are also in it to make money, the distinction is slight at best.

Anyway, the words have changed but I don't see anything in them that suggests that Google's actual interpretation or enforcement of DMCA reports has changed.

incrediBILL




msg:3223980
 6:43 pm on Jan 18, 2007 (gmt 0)

I think you're all missing the point that we ALLOW search engines to crawl, or not, via robots.txt, which scrapers do not honor. There's a HUGE difference between fair use which is authorized, even if it's monetized, and the same behavior done by scraping without permission and masking as IE browsers or FF to avoid detection.
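
To make the robots.txt point concrete, here is a rough Python sketch of the check a well-behaved crawler performs before fetching a page, using the standard library's urllib.robotparser. The robots.txt contents, the ExampleScraperBot user agent, and the example.com URLs are all invented for illustration; a scraper that ignores the file or poses as a browser simply never runs a check like this.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: Googlebot may crawl everything except /private/,
# every other robot is asked to stay out entirely.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow: /
"""


def may_fetch(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(may_fetch(ROBOTS_TXT, "Googlebot", "http://example.com/articles/page.html"))
print(may_fetch(ROBOTS_TXT, "Googlebot", "http://example.com/private/notes.html"))
print(may_fetch(ROBOTS_TXT, "ExampleScraperBot", "http://example.com/articles/page.html"))
```

Nothing in the protocol enforces the answer. Honoring it is voluntary, which is the distinction being drawn here between search engines and scrapers.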

The intent is completely different as one is being done honestly and the other is NOT!

jomaxx




msg:3224001
 6:52 pm on Jan 18, 2007 (gmt 0)

So if a scraper were to obey the robots.txt file, that would make it OK? I actually think this is a plausible position, but I thought you were arguing for more than that.

incrediBILL




msg:3224012
 6:55 pm on Jan 18, 2007 (gmt 0)

If the scraper obeyed robots.txt they wouldn't be scraping my site, end of story!

wrgvt




msg:3224017
 6:56 pm on Jan 18, 2007 (gmt 0)

So if a scraper were to obey the robots.txt file, that would make it OK? I actually think this is a plausible position, but I thought you were arguing for more than that.

It's not the scraping or crawling of the site that is the problem. Google and other search engines crawl our sites and then display links to our sites in their SERPS. The problem with scrapers is that they steal that same content and pass it off as theirs. It's the displaying of the scraped content that is the problem, not the actual retrieval of it (besides the bandwidth considerations).

poisonerbg




msg:3224029
 7:08 pm on Jan 18, 2007 (gmt 0)

What about AdSense on sites like YouTube and other video sites?

ytswy




msg:3224039
 7:12 pm on Jan 18, 2007 (gmt 0)

I think you're all missing the point that we ALLOW search engines to crawl, or not, via robots.txt, which scrapers do not honor. There's a HUGE difference between fair use which is authorized, even if it's monetized, and the same behavior done by scraping without permission and masking as IE browsers or FF to avoid detection.

The intent is completely different as one is being done honestly and the other is NOT!

The problem is that copyright law AFAIK has not really advanced this far. Legal recognition of robots.txt (and a corresponding legal obligation to use robots.txt rather than go straight to law) would make a lot of sense IMO, but I don't believe that there have been any cases in the US that have expressed an opinion on this issue (I'm British, but let's face it, it's US law that is important here).

That said, I still find it hard to think of copyright law as the proper solution to (snippet) scrapers - copyright does not, and should not IMO, give you the right of exclusive use to every sentence you have ever uttered. And that's all a snippet really is.

The real problem with scrapers is that they get traffic from SEs. At the end of the day it won't be until this stops that they won't be a problem. But once it does they will be a complete irrelevance.

justageek




msg:3224044
 7:16 pm on Jan 18, 2007 (gmt 0)

If the scraper obeyed robots.txt they wouldn't be scraping my site, end of story!

Maybe. But how many people actually do this? How many people know how to do it correctly?

It's not the scraping or crawling of the site that is the problem. Google and other search engines crawl our sites and then display links to our sites in their SERPS.

Really? So as long as I link to Stephen King's web site I can post any of his novels on my site?

For the record - I do not like scrapers but this topic is interesting enough to examine both sides equally.

JAG

incrediBILL




msg:3224054
 7:21 pm on Jan 18, 2007 (gmt 0)

BREAKING NEWS...

I've been checking some major scraping operations I found that I reported to Google over a month ago and NONE of them are running AdSense at this time. I haven't checked every site as there were almost 10K of them, but spot checking many of the links shows that most of them are now using some other affiliate networks.

Additionally, I've been scanning Yahoo for about an hour now, spot checking the hundreds of other sites Yahoo has indexed which scraped my site and ran AdSense, and it's the same situation.

Were they banned, or did they merely switch on their own?

The interesting part is the scraped content no longer shows up on the sites as they're running full page ads, so it may be just cloaked.

The upside is AdSense has definitely vanished from many thousands of scraper sites.

FYI, I showed a few of these lists of scraper sites to martinibuster when AdSense was on them and now that it's gone, he can confirm my observations.

martinibuster




msg:3224060
 7:27 pm on Jan 18, 2007 (gmt 0)

Yep, they're gone.

arubicus




msg:3224068
 7:35 pm on Jan 18, 2007 (gmt 0)

In my opinion it comes down to HOW the content is used and what, and how much, content is considered FAIR use.

In a directory style scraper I don't have much problem when content is minimal/nothing about that content has changed. Pretty much like any search engine does.

When they pass content off as their own and/or change words in the content to plop keywords in...I get a bit annoyed.

When they take complete works or the heart of the work without reason other than just to use to create another ad laced page. (A reason would be to use my work as an example, reference, etc. within their own work) Again I have a problem.

Actually, if you do not have a META CACHE-CONTROL tag on your pages, then you are implicitly allowing Google (and other search engines) to cache your site.

I know this is the argument by which Google gets away with doing what they do. To me this is half a..ed backwards. In most cases permission is granted FIRST by the owner. It's like someone making a robot and sending it into your unlocked home to make copies of the stuff in your file cabinet. I guess you implied that it must be OK for them to enter your home and make the copies, as well as distribute the copies in part and in whole, because no mainstream directive was placed and the door was left unlocked.

To me there should be an allow FIRST and directives to allow and disallow parts of the site. If not allow is found then no copy or cache. If the meta/robots are so main stream then what is the problem making the switch.
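
As a thought experiment only, the "allow FIRST" idea could look something like the Python sketch below: a default-deny reading of robots.txt in which a path may be copied only if an explicit Allow line covers it. No search engine is known to work this way. The function name opt_in_allows is hypothetical, the longest-matching-prefix tie-break is an assumption, and user-agent groups are ignored for brevity.

```python
def opt_in_allows(robots_txt, path):
    """Default-deny check: True only if an explicit Allow prefix covers path
    and no longer (more specific) Disallow prefix overrides it."""
    allows, disallows = [], []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        if not value:
            continue
        if field.lower() == "allow":
            allows.append(value)
        elif field.lower() == "disallow":
            disallows.append(value)

    # Length of the longest matching prefix of each kind; -1 means no match.
    best_allow = max((len(p) for p in allows if path.startswith(p)), default=-1)
    best_disallow = max((len(p) for p in disallows if path.startswith(p)), default=-1)
    return best_allow >= 0 and best_allow >= best_disallow


# Hypothetical site that opts in for articles but not for drafts.
rules = "Allow: /articles/\nDisallow: /articles/drafts/"
print(opt_in_allows(rules, "/articles/widgets.html"))
print(opt_in_allows("", "/articles/widgets.html"))
```

Under this reading a site with no robots.txt at all, or one with only Disallow lines, would never be copied or cached.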

Even at that there are legal TOS/TOU that may or may not be even remotely considered in the scheme of things.

A robot has no understanding of a TOS/TOU; it cannot infer anything, nor can it enter into a contract. The humans who run it must do such things for it. Since they don't visit the site FIRST or periodically, I believe it must be implied that the robots shouldn't enter, copy, or cache the site unless allowed verbally, in writing, or by directive. Google has a webmaster resource center that is mainstream enough that directives could be placed there, along with lists of accessible, crawlable, and cacheable pages.

[edited by: arubicus at 7:51 pm (utc) on Jan. 18, 2007]

europeforvisitors




msg:3224071
 7:37 pm on Jan 18, 2007 (gmt 0)

I don't see how this is related to scrapers.
They can safely argue fair-use.

They can argue "fair use," but Google isn't obliged to accept their argument.

What's more, if Google wants to get serious about purging scrapers, it doesn't need an excuse to close the scrapers' AdSense accounts. (It needs an excuse only to keep unpaid monies at the time of cancellation.)

trinorthlighting




msg:3224093
 7:49 pm on Jan 18, 2007 (gmt 0)

I agree Bill,

We are going through to see who we can report that are scraping our sites as well. It might not get rid of the scrapers, but it will definitely have a positive impact for all of us adsensers....

ytswy




msg:3224095
 7:50 pm on Jan 18, 2007 (gmt 0)

To me there should be an allow FIRST and directives to allow and disallow parts of the site. If not allow is found then no copy or cache. If the meta/robots are so main stream then what is the problem making the switch.

But if this was enforced (and it pretty much is current law) there would be no Google. Only a tiny percentage of the web would have the correct ALLOWs setup, and most of those who didn't would only omit them because of ignorance - you put up a page, it's because you want people to read it after all..

arubicus




msg:3224103
 7:56 pm on Jan 18, 2007 (gmt 0)

But if this was enforced (and it pretty much is current law) there would be no Google. Only a tiny percentage of the web would have the correct ALLOWs setup, and most of those who didn't would only omit them because of ignorance - you put up a page, it's because you want people to read it after all..

Ahhh. It is their argument that the robots directives and meta directives are so mainstream that any webmaster should know how to disallow. This argument has been used in a case recently.

Now, by their argument of being mainstream, a switch in directives would be known by the vast majority. Google makes one announcement and it is picked up by virtually every publication known to man.

Now anyone who no longer sees their site in Google...they can find out why by digging on their site for the answer. You must allow now before we enter your site.

Besides, all they've got to do is tell About, Wiki, and Amazon, which make up the bulk of their results, and they would hardly lose anything :)

you put up a page, it's because you want people to read it after all

You put up the page for HUMANS to read, yes. But a robot cannot infer what you do or do not want. Just because you want it read by humans does not mean you want it automatically distributed without your direct permission FIRST. And if you do want it distributed automatically, that does not mean that everyone feels the same.

Scrapers under fair use is a whole ball of wax to contend with. Google and us.

beren




msg:3224114
 8:10 pm on Jan 18, 2007 (gmt 0)

I don't understand why this thread was started. Is there a new policy in place? AdSense has had a rule against its ads appearing on sites that violate copyright for a long time.

I've been filing DMCAs on AdSense sites for years. This is not new. I DO NOT file them on scraper sites, though. If a site only copies one or two paragraphs, I let it go. There are too many scraper sites to handle.

For those who don't know: Google has different offices for handling DMCA complaints against sites in its index and sites that use AdSense. You can file a DMCA for the Google index, but that isn't particularly satisfying. What's really great is if the copying site uses AdSense. Then you file a DMCA directly to AdSense and Google shuts off their ads until they respond with a letter promising they won't do it again. This is one of the most fun things you can do on the web: make plagiarists scramble like frightened rabbits when threatened with a cutoff of AdSense.

mojomike




msg:3224128
 8:18 pm on Jan 18, 2007 (gmt 0)

I think I understand the final outcome of the newest Google rules.

I think that the quality score of my site will be validated since I won't get scraped anymore, the quality of the conversion for the advertisers will improve, and the income stream should increase since I feel that my leads convert better for the advertisers.

Since I don't run a lot of sites, it's a breeze for me to hit every scraper and knock them out of AdSense.

I would venture to think that everyone who makes money with Google would do their hardest to "legally" knock out scrapers. The end results are websites removed from the index.

mojomike

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
© Webmaster World 1996-2014 all rights reserved