Forum Moderators: martinibuster
<paraphrase> We understand the concern regarding sites that appear to be scraper sites.
As the content owner, you may file a DMCA complaint with Google.
Publishers must also adhere to the webmaster guidelines [google.com...]
I highly suggest that you do not participate in these practices as they are violations of our policies.
We will take steps against other sites not adhering to our policies, but because we respect the confidentiality of all publishers, we cannot disclose additional details about them.
</paraphrase>
------------
Do you guys think some kind of tech or manual screen will be applied at some point? Is it worth filing a DMCA complaint? Here's hoping the situation improves.
[edited by: Jenstar at 8:28 pm (utc) on May 16, 2005]
[edit reason] paraphrased email quote; actual quotes not allowed as per TOS [/edit]
Has anyone personally reported a scraper and seen an actual response from Google? i.e., seen the adverts removed?
Saw quite a few sites removed from the index on the last PR / backlinks update, but others (same network) remain. Quite a few large scraper sites (400k+ pages) too. Good to see the removal, but the remainder of the network was still running Adsense.
I'm not sure, but could there be legal issues surrounding enforcing bans from Adsense for "spam" sites? Given that the level of spam is becoming more subtle and Adsense account holders have detailed contracts, maybe there is a legal grey area causing Google issues with banning people from Adsense. (They can kick people from their index more easily because there are no contracts involved.)
Just speculation though - no hard evidence to back it up.
When AS began, ad units were not allowed on "search engines." However, since they changed this rule, they may find it difficult to backtrack and enforce it again.
However, a simple solution would be to write a new rule stating that only Ad Links can be allowed on "search engines" and "directories." Since Publishers are only allowed 1 adlink per page, this would greatly reduce the amount of real estate they can devote to income generating, which further reduces the incentive for these webmasters.
Another point is that surfers who click on Ad Links have to go through a 2 step process to get to the advertiser. Surfers that have to click on the first Ad Link, and then the ad that appeals to them the most are more likely to convert to a sale, IMHO.
The point of creating a phony search engine or scraper site is to suck people in and trap them with no real content. In an effort to get out of the page, they will often click on an Adsense ad whether it's out of curiosity or because they find their back button was disabled by the page.
Enforcing an Ad Link unit only policy for scrapers would result in higher conversions (because of the 2 step process) for advertisers and reduce the incentive for webmasters to mass produce these low content websites.
Of course, I'd like to see Google disallow all ads on these scrapers but I don't think that is realistic anymore given their non-action so far and the backpedaling they would have to do.
That's my advice, I hope it makes the rounds in the AS office and does some good.
I have a few niche portals. They contain a searchable directory of related topics that are mostly imported from dmoz, articles from article submission sites where republishing rights are given, news pages where I include headlines and summaries, etc. And original (my own writing) content is added to these sites on a daily basis utilizing blogs.
I have knowledge, opinions, and an interest in the topic of each site. I would never simply grab content just for the sake of another page of web real estate to use as a billboard for ads.
Each site is also set up with human visitors in mind and as a valuable resource for those visitors.
Would I have put those sites online if Adsense didn't exist? Probably not, but I'm proud of my efforts to provide my visitors with organized and relevant info, and adwords advertisers the opportunity to benefit from my traffic.
I HAVE been accused of being a 'Scrapermaster' though and even after trying to explain my position as I did above, these people were still adamant about it.
My sites are still ranking high in the major SEs though, and on average 6% of visitors are bookmarking.
My point? Even though 95% of my content comes from somewhere else, it's presented in such a way as to be a benefit for the end user, and I'm generously rewarded with a nice adsense check at the end of the month.
K
Scraper sites have no intelligence behind them. You search for "how to make blue widgets" and you'll be directed to a scraper site with a title "How to make blue widgets." Great, you think. Just what I am looking for. But the listings in the scraper page entitled "How to make blue widgets" will have pages for "buying green widgets" or "how to make blue knick-knacks" and other stuff that has absolutely *nothing* to do with your original search.
A "directory" site--one that is created by a thinking individual--would not include "how to make blue knick-knacks" or "buying green widgets" on a directory page titled "How to make blue widgets." That would make no sense. But many scraper sites do, which makes them cheap and useless to me.
Sure, sometimes a scraper site will have a link to something I want, but in their listings are often a lot of static (not relevant listings). Furthermore, the "snippets" they often scrape are obviously not included with any intelligence.
For instance, when I see my own sites scraped, often the "snippet" of text they use from my site is merely my top navigation menu (I have an all-text links menu at the top of each page). So the "descriptive snippet" from my site's listing might be: ¦ Blue Widgets ¦ Green Widgets ¦ Orange Widgets ¦ Purple Widgets ¦
*Obviously* this is complete B.S. That's no descriptive snippet. That's the kind of mindlessness that comes from a scraper site.
Other scraper sites swipe almost a whole (short) article from my site. It's not "fair use" when they republish the significant majority of your content, but I've seen that happen too.
People are really complaining because they don't like competition, not because their content is being jacked.
Oh, please, not that old canard. It's one of the most tired cliches at Webmaster World. Usually it's served up by members who think they're being original and clever when they define "spam" as "Sites Positioned Above Mine" in the search results.
Fact is, some of us actually use the Web as more than a place to dump our own pages. We use search every day, and we find it annoying when the clutter of affiliate pages and scraper pages makes it hard to find the information that we're looking for.
To Google's credit, affiliate spam seems to have been brought under a semblance of control (at least compared to a year ago), and these days the action has moved to scraper spam and template-based "button-pusher" sites. Perhaps the "TrustRank" concept (see the Google News forum) will help to solve these newer problems.
Getting back to the topic of "what's a scraper site?", I think that anyone who's at all rational and who isn't a troll can tell the difference between Google and a scraper or DMOZ and a scraper. It's also worth pointing out that some scrapers don't just scrape content pages; some scrape search results from Google and other SEs. Obviously, it isn't in Google's interest to have its SERPs direct users to clones of those same SERPs, so I think we can safely assume that the Google Search team has a strong desire to bring scrapers to heel.
Finally, we should remember that--like all large corporations--Google is a collection of groups and teams that have different objectives. The search team is responsible for delivering the best possible search results; the AdWords/AdSense team is responsible for maximizing revenue. Without strong direction from the top (something that may be difficult to maintain in America's decentralized, team-based corporate cultures), conflicting priorities are inevitable.
People can justify scraper sites any way they want but in the end they are spam-loaded junk that feeds off the hard work of honest webmasters.
Google will never drop their ads though on these sites as they are probably accounting for a large part of their revenue.
They can't stop the HTML spam and they know it, so they have most likely concluded: why not profit from it?
From a publisher's point of view though this is very upsetting for a lot of us, because we have no doubt that this is leaving a bad taste in advertisers' mouths and is probably one of the major direct contributors to the massive drop in EPC across all of our sites over the last 20 months.
Scraper/spam HTML eats up ad inventories, drives down EPC and produces horrible ROI for advertisers. In the end I believe this will be what separates the competing Overture/MSN products coming out shortly, and it will pull many major advertisers out of Google's program forever, if they haven't already left.
Google will never drop their ads though on these sites as they are probably accounting for a large part of their revenue.
Yes, Google is making money off scraper sites, but "smart pricing" is probably resulting in low earnings per click for most such sites and for Google.
On the other hand, who knows--maybe Google has a way of identifying scraper sites and keeping a bigger chunk of their revenues. (Google has never published its compensation formula or promised a specific percentage split, so a "starve 'em out" approach could be one way to minimize the incentive for scraping while profiting from the scrapers' greed.)
Also, Google may prefer to leave the scraper problem to the search team. If scraper pages can be shoved far enough down in the SERPs, they'll no longer affect search quality or waste significant quantities of advertisers' money. And if Google Search has moved to a data-mining approach (as has been suggested in the Supporters and Google News forum), there may be value in leaving the scraper sites in place while the "black box" software learns how to detect and deal with them.
When I read the adwords forum here though it really is upsetting to see so many advertisers with such a negative view of the content network as a result of these scraper/html spam sites.
We are just keeping to the same path we have been on for 7 years now. We continue to grow and will be here long after this has passed. My hope is that G is doing, or will do, some of the better things you have mentioned, but I cannot help but think this has already hurt the content network on a large scale.
Also, Google may prefer to leave the scraper problem to the search team. If scraper pages can be shoved far enough down in the SERPs, they'll no longer affect search quality or waste significant quantities of advertisers' money.
It seems to me that this is the way they are trying to deal with it. I see links to my site from 'scrapers', but as a searcher on Google, I really don't see the problem much. Maybe it's the type of searches I do.
Can someone give me an example of a search phrase for google that will turn up a scraper site within the first 10 - 20 results? (By scraper, I don't mean a human-edited directory, but rather one of those that copies google/yahoo/etc. search results, or has taken content snippets in an automated way).
I guess that by introducing the new features for advertisers to pick and choose sites to advertise on, they are trying to let market forces decide, and let market forces push made-for-Adsense sites and scrapers out.
Ultimately they will have to resolve it. People switched to Google because it was fast and provided good results on searches. If they are no longer the best at doing this then people will use another search engine. We've seen the demise of biggies before (Alta Vista) and there isn't any reason that the same can't happen to Google if their search engine isn't producing relevant results on searches.
Yes, it is 'Sites Placed Above Mine', because if you guys are getting beaten by garbage pages, then this is obviously the situation. If you weren't getting beaten by them, there'd be no problem, right?
I work hard on my flagship site and in 2 years have seen no tide of scraper pages approaching my number one position. I don't use dirty tricks, I don't buy links, I just focus on building a proper site.
All I gotta say is, if you feel your site is threatened by a bunch of copied search result pages and a list of keyphrase-oriented results, then you have to be thinking about your own site's design and strategy.
Yes, it is 'Sites Placed Above Mine', because if you guys are getting beaten by garbage pages, then this is obviously the situation. If you weren't getting beaten by them, there'd be no problem, right?
Not all of us are being "beaten" by scraper sites, and yes, there is a problem when search results are cluttered with garbage.
It never hurts to take off the "me" glasses occasionally and look at the larger picture.
Yes, Google is making money off scraper sites, but "smart pricing" is probably resulting in low earnings per click for most such sites and for Google.
My experience suggests that scraper sites are not being hit by smart pricing at all. And why should they? Ads on those sites are by far the most targeted ads you'll find on most any site. Great targeting leads to higher conversion rates. Contrary to most of the posts on the subject in this forum, the advertisers are happy (at least the ones who are primarily concerned with ROI) and Google is happy.
There are serious ethical problems IMO with running scraper sites, but with few exceptions, smart pricing does not affect them any more than it affects any other type of site. In fact, smart pricing seems to be more or less random at hitting sites, and ironically it seems to be missing most scrapers with its scattergun approach.
Not all of us are being "beaten" by scraper sites, and yes, there is a problem when search results are cluttered with garbage.
In my case, it's definitely not about "Me". Just did a Google search on my main keywords, and I note I've moved up to position 2, have another entry at position 19 and not a scraper in sight on the first two pages.
However, I'm also a web user and find myself getting very frustrated when the first two pages contain mostly spam sites, and spam ads.
Contrary to most of the posts on the subject in this forum, the advertisers are happy (at least the ones who are primarily concerned with ROI) and Google is happy.
Really? That isn't what I've been hearing in the AdWords forum.
It's just that I hate searching for stuff and finding a bunch of non-relevant *garbage* listed first--stuff that just ends up wasting my time. Such garbage sites have always annoyed me, way before I signed up with Adsense, and during a time when I was oblivious of my own sites' search engine rankings.
So, there's no "sour grapes" going on with me. I just hate the time-wasting crap quality of scrapers.
Also, your suggestion that "great targeting leads to higher conversion rates" isn't necessarily true. Many scraper pages are disguised as SERPs, with multiple large AdSense rectangles that blend into the page and push the scraped search listings "below the fold." In such cases, it's hard to argue credibly (or even with a straight face) that clicks on the ads are likely to be clicks by qualified prospects.
Scraper spam, like affiliate spam, is bad for Google and other search engines (never mind AdWords/AdSense) because it lowers the quality of the SERPs and therefore weakens the appeal of the SE's core product. Google can afford to tolerate scraper sites only if it can keep them from dominating its SERPs for secondary as well as major search terms. It's possible, of course, that Google sees a competitive advantage in allowing scraper sites as long as they pollute Yahoo and MSN's SERPs instead of Google's. :-)
Makes sense. If I was making as much bread as I wanted to I would be going for another martini by the pool rather than posting messages in a forum.
Scrapers succeed by producing thousands of pages, some of which will appear 'somewhere' in the SERPs for various multi-word, less common search terms. Scrapers do this by optimizing for everything at once and really nothing at all.
Scrapers are the opposite of targeted results. I'd be more likely to pay NOT to appear on one -- hey, there's an idea...
People can justify scraper sites any way they want but in the end they are spam-loaded junk that feeds off the hard work of honest webmasters.
Yep, and Google is the king of the scraper sites, with Yahoo running a close second, no matter how YOU wish to justify THEIR actions. Spinning it won't change that. And the "robots.txt" argument is weak at best. If I leave my car unlocked, does that give john doe the right to take it for a spin? Of course not. Just because you're a billion dollar corporation doesn't mean you are above it all.
100 scraper sites going online this week, thanks for the motivation. :) muahahahahahahahahhah.......
And just so you know, yes, I AM proud to be a bottom feeding scraper scum. If it's good enough for the big G, heck, it's good enough for me! ;)
And since they can seldom be found in a user's search, it reallllllyyyyy makes you wonder about the legitimacy of the clicks that are "earned" by such sites.
I'm sure EFV is on target. Most advertisers probably don't want to appear on scraper sites. For that matter, more advertisers would turn the content network on if there were more quality sites available. I'm certain this plays into the development of the whole site-targeting CPM model.
In the meantime we will keep producing high quality content for our visitors as we approach 2 million unique users a month. Our websites will be around in 5 years, as for your junk... well that will be another story.