Forum Moderators: martinibuster


Will Google eventually be dumping scraper sites?

Interesting email from Google I received in response

         

javahava

8:24 pm on May 16, 2005 (gmt 0)

10+ Year Member



So I, like many others, am quite frustrated about AdSense scraper sites for a variety of reasons: e.g., they're diluting the revenue pool using other people's content (and are perhaps partly responsible for large dips in EPC), encouraging site owners to spend time creating junk instead of user-oriented content, and spamming up Google's index. I sent all these concerns to Google, with the note that I'd probably start making such sites myself if they were allowable under the TOS. I asked out of frustration, point blank, if such sites were OK if I started producing them. This is the response I got:

<paraphrase> We understand the concern regarding sites that appear to be scraper sites.

As the content owner, you may file a DMCA complaint with Google.

Publishers also must adhere to the webmaster guidelines [google.com...]

I highly suggest that you do not participate in these practices as they are violations of our policies.

We will take steps against other sites not adhering to our policies, but because we respect the confidentiality of all publishers, we cannot disclose additional details about them.
</paraphrase>
------------

Do you guys think some kind of tech or manual screen will be applied at some point? Is it worth filing a DMCA complaint? Here's to hoping the situation improves.

[edited by: Jenstar at 8:28 pm (utc) on May 16, 2005]
[edit reason] paraphrased email quote; actual quotes not allowed as per TOS [/edit]

Marketing Guy

10:50 am on May 17, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Has anyone personally reported a scraper and seen an actual response from Google? I.e., seen the adverts removed?

Saw quite a few sites removed from the index on the last PR / backlinks update, but others (same network) remain. Quite a few large scraper sites (400k+ pages) too. Good to see the removal, but the remainder of the network was still running Adsense.

I'm not sure, but could there be legal issues surrounding enforcing bans from Adsense for "spam" sites? Given that the level of spam is becoming more subtle and Adsense account holders have detailed contracts, maybe there is a legal grey area causing Google issues with banning people from Adsense. (They can kick people from their index more easily because there are no contracts involved.)

Just speculation though - no hard evidence to back it up.

Freedom

10:59 am on May 17, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I've always thought of scraper sites as those sites that act as phony search engines or directories. They "scrape" the SERPs of legitimate search engines, then present the results as their own.

When AS began, ad units were not allowed on "search engines." However, since they changed this rule, they may find it difficult to back track and enforce it again.

However, a simple solution would be to write a new rule stating that only Ad Links can be allowed on "search engines" and "directories." Since Publishers are only allowed 1 adlink per page, this would greatly reduce the amount of real estate they can devote to income generating, which further reduces the incentive for these webmasters.

Another point is that surfers who click on Ad Links have to go through a 2 step process to get to the advertiser. Surfers that have to click on the first Ad Link, and then the ad that appeals to them the most, are more likely to convert to a sale, IMHO.

The point of creating a phony search engine or scraper site is to suck people in and trap them with no real content. In an effort to get out of the page, they will often click on an Adsense ad, whether it's out of curiosity or because they find their back button was disabled by the page.

Enforcing an Ad Link unit only policy for scrapers would result in higher conversions (because of the 2 step process) for advertisers and reduce the incentive for webmasters to mass produce these low content websites.

Of course, I'd like to see Google disallow all ads on these scrapers, but I don't think that is realistic anymore given their non-action so far and the backpedaling they would have to do.

That's my advice, I hope it makes the rounds in the AS office and does some good.

kokaroach

11:51 am on May 17, 2005 (gmt 0)

10+ Year Member



After carefully reading this thread I've come to the conclusion that what I'm doing with several of my sites is ok, well, according to half of you.

I have a few niche portals. They contain a searchable directory of related topics that are mostly imported from dmoz, articles from article submissions sites where republishing rights are given, news pages where I include headlines and summaries, etc. And original (my own writing) content is added to these sites on a daily basis utilizing blogs.

I have knowledge, opinions, and an interest in the topic of each site. I would never simply grab content just for the sake of another page of web real estate to use as a billboard for ads.

Each site is also set up with human visitors in mind and as a valuable resource for those visitors.

Would I have put those sites online if Adsense didn't exist? Probably not, but I'm proud of my efforts to provide my visitors with organized and relevant info, and adwords advertisers the opportunity to benefit from my traffic.

I HAVE been accused of being a 'Scrapermaster' though and even after trying to explain my position as I did above, these people were still adamant about it.

My sites are still ranking high in the major SEs though, and on average 6% of visitors are bookmarking.

My point? Even though 95% of my content comes from somewhere else, it's presented in such a way as to be a benefit for the end user, and I'm generously rewarded with a nice adsense check at the end of the month.

K

MrSpeed

12:26 pm on May 17, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



kokaroach-
I don't consider that a scraper site, or any dmoz clones.

To me a scraper site is created using some fairly well known programs where you enter a list of keywords and it creates "content" from search results for the keyword.

crescenta

1:20 pm on May 17, 2005 (gmt 0)

10+ Year Member



Before I'd even realized that I could use Adsense on my sites, I had encountered scraper sites and learned to hate them.

Scraper sites have no intelligence behind them. You search for "how to make blue widgets" and you'll be directed to a scraper site with a title "How to make blue widgets." Great, you think. Just what I am looking for. But the listings in the scraper page entitled "How to make blue widgets" will have pages for "buying green widgets" or "how to make blue knick-knacks" and other stuff that has absolutely *nothing* to do with your original search.

A "directory" site--one that is created by a thinking individual--would not include "how to make blue knick-knacks" or "buying green widgets" on a directory page titled "How to make blue widgets." That would make no sense. But many scraper sites do, which makes them cheap and useless to me.

Sure, sometimes a scraper site will have a link to something I want, but in their listings are often a lot of static (not relevant listings). Furthermore, the "snippets" they often scrape are obviously not included with any intelligence.

For instance, when I see my own sites scraped, often the "snippet" of text they use from my site is merely my top navigation menu (I have an all-text links menu at the top of each page). So the "descriptive snippet" from my site's listing might be: ¦ Blue Widgets ¦ Green Widgets ¦ Orange Widgets ¦ Purple Widgets ¦

*Obviously* this is complete B.S. That's no descriptive snippet. That's the kind of mindlessness that comes from a scraper site.

Other scraper sites swipe almost a whole (short) article from my site. It's not "fair use" when they republish the significant majority of your content, but I've seen that happen too.

jetteroheller

2:04 pm on May 17, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I reported a site with only artificial content and two 300x250 ad blocks to Google.

I made a typing mistake in a search and this page was no. 1 in the SERP.

I watched the page and now it's around position 50 for this search.

I think Google tries to solve the problem with search algorithms.

europeforvisitors

2:28 pm on May 17, 2005 (gmt 0)



People are really complaining because they don't like competition, not because their content is being jacked.

Oh, please, not that old canard. It's one of the most tired cliches at Webmaster World. Usually it's served up by members who think they're being original and clever when they define "spam" as "Sites Positioned Above Mine" in the search results.

Fact is, some of us actually use the Web as more than a place to dump our own pages. We use search every day, and we find it annoying when the clutter of affiliate pages and scraper pages makes it hard to find the information that we're looking for.

To Google's credit, affiliate spam seems to have been brought under a semblance of control (at least compared to a year ago), and these days the action has moved to scraper spam and template-based "button-pusher" sites. Perhaps the "TrustRank" concept (see the Google News forum) will help to solve these newer problems.

Getting back to the topic of "what's a scraper site?", I think that anyone who's at all rational and who isn't a troll can tell the difference between Google and a scraper or DMOZ and a scraper. It's also worth pointing out that some scrapers don't just scrape content pages; some scrape search results from Google and other SEs. Obviously, it isn't in Google's interest to have its SERPs direct users to clones of those same SERPs, so I think we can safely assume that the Google Search team has a strong desire to bring scrapers to heel.

Finally, we should remember that--like all large corporations--Google is a collection of groups and teams that have different objectives. The search team is responsible for delivering the best possible search results; the AdWords/AdSense team is responsible for maximizing revenue. Without strong direction from the top (something that may be difficult to maintain in America's decentralized, team-based corporate cultures), conflicting priorities are inevitable.

drall

2:47 pm on May 17, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Agree with EFV on this.

People can justify scraper sites any way they want but in the end they are spam loaded junk that feeds off the hard work of honest webmasters.

Google will never drop their ads though on these sites as they are probably accounting for a large part of their revenue.

They can't stop the HTML spam and they know it, so why not profit from it is what they have most likely concluded.

From a publisher point of view though this is very upsetting for a lot of us because we have no doubt that this is leaving a bad taste in the advertisers' mouths and is probably one of the major direct contributors to the massive drop in EPC across all of our sites over the last 20 months.

Scraper/spam HTML eats up ad inventories, drives down EPC and produces horrible ROI for advertisers. In the end I believe this will be what separates the competing Overture/MSN products coming out shortly, and it will pull many major advertisers out of Google's program forever if they haven't already left.

europeforvisitors

3:08 pm on May 17, 2005 (gmt 0)



Google will never drop their ads though on these sites as they are probably accounting for a large part of their revenue.

Yes, Google is making money off scraper sites, but "smart pricing" is probably resulting in low earnings per click for most such sites and for Google.

On the other hand, who knows--maybe Google has a way of identifying scraper sites and keeping a bigger chunk of their revenues. (Google has never published its compensation formula or promised a specific percentage split, so a "starve 'em out" approach could be one way to minimize the incentive for scraping while profiting from the scrapers' greed.)

Also, Google may prefer to leave the scraper problem to the search team. If scraper pages can be shoved far enough down in the SERPs, they'll no longer affect search quality or waste significant quantities of advertisers' money. And if Google Search has moved to a data-mining approach (as has been suggested in the Supporters and Google News forum), there may be value in leaving the scraper sites in place while the "black box" software learns how to detect and deal with them.

drall

3:29 pm on May 17, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



True points EFV, I do not know any guys that do this but I have to assume there is some large financial reward for some of the more sophisticated setups we have seen coming out lately.

When I read the adwords forum here though it really is upsetting to see so many advertisers with such a negative view of the content network as a result of these scraper/html spam sites.

We are just keeping on the same path we have been on for 7 years now and continue to grow and will be here long after this has passed but my hope is that G is/will do some of the better points you have mentioned but I cannot help but think this has already hurt the content network on a large scale.

rover

3:30 pm on May 17, 2005 (gmt 0)

10+ Year Member



Also, Google may prefer to leave the scraper problem to the search team. If scraper pages can be shoved far enough down in the SERPs, they'll no longer affect search quality or waste significant quantities of advertisers' money.

It seems to me that this is the way they are trying to deal with it. I see links to my site from 'scrapers', but as a searcher on Google, I really don't see the problem much. Maybe it's the type of searches I do.

Can someone give me an example of a search phrase for google that will turn up a scraper site within the first 10 - 20 results? (By scraper, I don't mean a human-edited directory, but rather one of those that copies google/yahoo/etc. search results, or has taken content snippets in an automated way).

david_uk

3:53 pm on May 17, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



For all any of us know, Google may make more money out of scrapers than legit sites. As their principal obligation is to the shareholders, if this is the case then they are in a difficult position. One assumes that Google intends long term survival, yet they may not be able to do anything that damages shareholders short term gains.

I guess that by introducing the new features for advertisers to pick and choose sites to advertise on, they are trying to let market forces decide, and let market forces push made-for-AdSense sites and scrapers out.

Ultimately they will have to resolve it. People switched to Google because it was fast and provided good results on searches. If they are no longer the best at doing this then people will use another search engine. We've seen the demise of biggies before (Alta Vista) and there isn't any reason that the same can't happen to Google if their search engine isn't producing relevant results on searches.

flyerguy

4:08 pm on May 17, 2005 (gmt 0)

10+ Year Member



All I gotta say is, If you feel your site is threatened by a bunch of copied search result pages and a list of keyphrase-oriented results, then you have to be thinking about your own site's design and strategy.

Yes, it is 'Sites Placed Above Mine', because if you guys are getting beaten by garbage pages, then this is obviously the situation. If you weren't getting beaten by them, there'd be no problem, right?

I work hard on my flagship site and have in 2 years seen no tide of scraper pages approaching my number one position. I don't use dirty tricks, I don't buy links, I just focus on building a proper site.

europeforvisitors

4:16 pm on May 17, 2005 (gmt 0)



All I gotta say is, If you feel your site is threatened by a bunch of copied search result pages and a list of keyphrase-oriented results, then you have to be thinking about your own site's design and strategy.

Yes, it is 'Sites Placed Above Mine', because if you guys are getting beaten by garbage pages, then this is obviously the situation. If you weren't getting beaten by them, there'd be no problem, right?

Not all of us are being "beaten" by scraper sites, and yes, there is a problem when search results are cluttered with garbage.

It never hurts to take off the "me" glasses occasionally and look at the larger picture.

birdstuff

4:34 pm on May 17, 2005 (gmt 0)

10+ Year Member



Yes, Google is making money off scraper sites, but "smart pricing" is probably resulting in low earnings per click for most such sites and for Google.

My experience suggests that scraper sites are not being hit by smart pricing at all. And why should they? Ads on those sites are by far the most targeted ads you'll find on most any site. Great targeting leads to higher conversion rates. Contrary to most of the posts on the subject in this forum, the advertisers are happy (at least the ones who are primarily concerned with ROI) and Google is happy.

There are serious ethical problems IMO with running scraper sites, but with few exceptions, smart pricing does not affect them any more than it affects any other type of site. In fact, smart pricing seems to be more or less random at hitting sites, and ironically it seems to be missing most scrapers with its scattergun approach.

david_uk

4:37 pm on May 17, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Not all of us are being "beaten" by scraper sites, and yes, there is a problem when search results are cluttered with garbage.

In my case, it's definitely not about "Me". Just did a Google search on my main keywords, and I note I've moved up to position 2, have another entry at position 19 and not a scraper in sight on the first two pages.

However, I'm also a web user and find myself getting very frustrated when the first two pages contain mostly spam sites, and spam ads.

europeforvisitors

5:14 pm on May 17, 2005 (gmt 0)



Contrary to most of the posts on the subject in this forum, the advertisers are happy (at least the ones who are primarily concerned with ROI) and Google is happy.

Really? That isn't what I've been hearing in the AdWords forum.

crescenta

5:41 pm on May 17, 2005 (gmt 0)

10+ Year Member



Like david_uk and others here, I rank #1 or #2 in my main keywords. I'm not griping about scraper sites because I am losing out to them in the search engine listings. As far as I can tell, I'm not.

It's just that I hate searching for stuff and finding a bunch of non-relevant *garbage* listed first--stuff that just ends up wasting my time. Such garbage sites have always annoyed me, way before I signed up with Adsense, and during a time when I was oblivious of my own sites' search engine rankings.

So, there's no "sour grapes" going on with me. I just hate the time-wasting crap quality of scrapers.

birdstuff

8:18 pm on May 17, 2005 (gmt 0)

10+ Year Member



Really? That isn't what I've been hearing in the AdWords forum.

What we hear in the Adwords forum is basically what we hear in the AdSense forum - the people who are happy don't post that often. The ones who aren't post all the time.

europeforvisitors

8:38 pm on May 17, 2005 (gmt 0)



Well, the fact that AdWords/AdSense now allows advertisers to block up to 25 domains says something.

Also, your suggestion that "great targeting leads to higher conversion rates" isn't necessarily true. Many scraper pages are disguised as SERPs, with multiple large AdSense rectangles that blend into the page and push the scraped search listings "below the fold." In such cases, it's hard to argue credibly (or even with a straight face) that clicks on the ads are likely to be clicks by qualified prospects.

Scraper spam, like affiliate spam, is bad for Google and other search engines (never mind AdWords/AdSense) because it lowers the quality of the SERPs and therefore weakens the appeal of the SE's core product. Google can afford to tolerate scraper sites only if it can keep them from dominating its SERPs for secondary as well as major search terms. It's possible, of course, that Google sees a competitive advantage in allowing scraper sites as long as they pollute Yahoo and MSN's SERPs instead of Google's. :-)

flyerguy

8:38 pm on May 17, 2005 (gmt 0)

10+ Year Member



"What we hear in the Adwords forum is basically what we hear in the AdSense forum - the people who are happy don't post that often. The ones who aren't post all the time"

Makes sense. If I was making as much bread as I wanted to I would be going for another martini by the pool rather than posting messages in a forum.

Atticus

12:36 am on May 18, 2005 (gmt 0)



I can't imagine that advertising on scraper sites has much value at all. Scraper sites are notoriously untargeted. They hardly ever rank well for short, specific, common, profitable phrases.

Scrapers succeed by producing thousands of pages, some of which will appear 'somewhere' in the SERPS for various multi-word, less common search terms. Scrapers do this by optimizing for everything at once and really nothing at all.

Scrapers are the opposite of targeted results. I'd be more likely to pay NOT to appear on one -- hey, there's an idea...

zeus

11:10 am on May 28, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I also told Google that another site had copied mine totally. They said I should fill out a DMCA form, but that's not what I was asking; I was asking why they sponsor a site like that and whether they will do anything about it.

I don't think they care!

beer234

7:15 am on May 29, 2005 (gmt 0)

10+ Year Member



Yeah, it's called the Bourbon Update.

ckc1227

8:41 pm on May 29, 2005 (gmt 0)

10+ Year Member



People can justify scraper sites any way they want but in the end they are spam loaded junk that feeds off the hard work of honest webmasters.

Yep, and Google is the king of the scraper sites, with Yahoo running a close second, no matter how YOU wish to justify THEIR actions. Spinning it won't change that. And the "robots.txt" argument is weak at best. If I leave my car unlocked, does that give john doe the right to take it for a spin? Of course not. Just because you're a billion dollar corporation doesn't mean you are above it all.

100 scraper sites going online this week, thanks for the motivation. :) muahahahahahahahahhah.......

And just so you know, yes, I AM proud to be a bottom feeding scraper scum. If it's good enough for the big G, heck, it's good enough for me! ;)

ownerrim

1:07 am on May 30, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"I can't imagine that advertising on scraper sites has much value at all. Scraper sites are notoriously untargeted. They hardly ever rank well for short, specific, common, profitable phrases."

And since they can seldom be found in a user's search, it reallllllyyyyy makes you wonder about the legitimacy of the clicks that are "earned" by such sites.

I'm sure EFV is on target. Most advertisers probably don't want to appear on scraper sites. For that matter, more advertisers would turn the content network on if there were more quality sites available. I'm certain this plays into the development of the whole site targeting-CPM model.

kokaroach

1:28 am on May 30, 2005 (gmt 0)

10+ Year Member



Yep, and Google is the king of the scraper sites, with Yahoo running a close second, no matter how YOU wish to justify THEIR actions.

I wouldn't knock them too much if I were you. They're the ones sending you the .10c worth of traffic to your crapper sites every day.

K

zeus

7:00 pm on May 30, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It must be a real pain for Google, or is it? After AdSense was released the scrapers went nuts; they're almost everything you see in the rankings as soon as you use two or more keywords. As I said before, I even told Google that a site had totally copied my site and uses AdSense. They didn't care about that; it was not their problem. That must be a sign that they don't care what sites AdSense is on. Also, all those scraper sites out there: we know they know about the problem, but nothing is done.

drall

7:39 pm on May 30, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



We are all very happy for you, ckc1227, and your 100 websites of pure trash you just launched. Please keep populating the web with your mindless drivel.

In the meantime we will keep producing high quality content for our visitors as we approach 2 million unique users a month. Our websites will be around in 5 years, as for your junk... well that will be another story.

guitaristinus

5:36 pm on Jun 1, 2005 (gmt 0)

10+ Year Member



Did a search for "sound recording websites" (no quotes) on Google and got a crap site first off. Just a bunch of links to other sites. Nothing useful on the first page of results. I'm looking for info on how to make sound files for a site.