“don't get caught up in the hype of "scraper sites are bad"
Google is a search engine (among other things), Yahoo is a search engine, and directory
CNN is a news network, and a "website"
You’re misinformed as to what a scraper site is. A scraper site steals content and offers nothing back in return; Google, Yahoo and CNN all give something back, so your assessment is comical at best.
I get link exchange requests from scraper sites all the time. I wouldn't be surprised if scraper sites now make up the majority of the link exchange requests I receive.
Some of those who scrape long lists of SERPs apparently then spider the sites they link to, harvest an e-mail address and send link exchange requests to every linked site for which they can find an e-mail address. They count on getting links from those who are link-crazy enough to accept every link exchange request they receive.
Search engines add value to the Web by crawling pages, indexing the text, and delivering answers to search queries in the form of ranked search results. If you were to remove the ads from a SERP, the page would still have a reason to exist (though it might not be profitable for the search engine).
A scraper site, on the other hand, adds no value of its own; it merely steals or borrows results from another source and uses them as filler for an ad page (where, more often than not, AdSense ads are disguised as search results with the scraped search results being hidden "below the fold"). If you were to remove the ads from a scraper page, the page would have no reason to exist, because it was conceived and designed solely as a platform for ads.
A scraper site, on the other hand, adds no value of its own
Depends on how you use them :) They're pretty useful to see what Google's SERPs were before Bourbon... whatever your favourite search term.
I'd like to see scrapers out of SERPs but let's be fair - there is very little difference between them and Google
1. Google respects robots.txt (well, sort of, they will read and index blocked pages but just not show them in SERPs), scrapers don't
2. Google eats your bandwidth - scrapers don't. They tend to eat SEs' bandwidth. Yes, those snippets they stole from your site were actually taken without visiting your site. They "stole" them from the copy the SE took earlier.
3. Google attempts to provide the user with relevant results. Scrapers don't. And as long as scrapers appear in Google SERPs then it could be argued that Google isn't really providing relevant results.
4. Google will remove you if you ask, scrapers won't
5. Google sends traffic, scrapers send traffic (most of them anyway). Proportionate to their size scrapers may send you more traffic than Google does.
So far in the analysis Google is slightly ahead. But then scrapers don't leave 40 year cookies, track your movements with a toolbar, add links on your content to take people away to Amazon....
I'm not trolling you EFV. I'd like to see them gone. But, for lasting effect, they need to go for the right reason and via the right tools (via improving the algo ... not the shortcut of closing Adsense accounts).
By that operating definition Google is not a scraper, as nobody has to be listed in Google, and Google will respect anybody's wish not to be indexed or cached.
Oh? My TOS has very specific language about not scraping the content, but Google does daily.
Robots.txt you say? Ok, let me get this straight...I'm supposed to follow the rules of the scraper!?
Just playing devil's advocate a bit, but it makes sense to me. All engines are scrapers, just like the spammy scrapers, only difference between the two is that the engines are generally liked.
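For anyone following along who hasn't looked at how a robots.txt opt-out is actually evaluated, here is a minimal Python sketch using the standard library's urllib.robotparser. The robots.txt contents, the example.com URLs and the "SomeScraperBot" user agent are made up for illustration. Note that the parser only reports what the file asks for; honoring it is voluntary on the crawler's side, which is pretty much what the disagreement above is about.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: Googlebot may crawl everything except /private/,
# every other crawler is asked to stay out entirely.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow: /
"""


def allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

With this file, allowed("Googlebot", "https://example.com/articles/1") is True, while the same URL for "SomeScraperBot" is False. Nothing stops a badly behaved bot from skipping the check altogether.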
Show 100 average Internet users a choice of using Google or a scraper site to find useful information. The ones who aren't drunk or asleep will choose Google.
So, play semantic games all you want. Find clever ways to demonstrate that Google is a scraper. Go right ahead.
But in the end what matters is that there are real differences, ones which even average users are aware of, on some level.
Secondly, Scraper sites do not compete with me on SERPs...
As I stated earlier... I can't think of a scraper site that is in the top 10 for any keyword that is worth anything...other than the ones I've stated...btw...those sites I stated do NOT create all the content that you see on their pages....
Thirdly, I feel (my personal OPINION) SOME of those scrapers are a lot more useful than the results you get when searching for a less searched key phrase...
Lastly, instead of making blanket statements and broad-stroke judgements we should try to debate intelligently.
Reading some of the posts I get a sense of great anger...I appreciate that you may have a reason to be angry, but if you would explain why these scraper sites have affected you so negatively it would add to the debate...
Have a wonderful day! ;)
If you define what constitutes a scraper site in terms of technology used (namely, using a spider to gather information from other sites and repackage it on their own site), then, yes, BY THAT DEFINITION Google and all the other search engines are scraper sites.
If you define a scraper site in terms of builder's intent (namely, repackaging information gathered from other sites for the purpose of convincing searchers that they will find relevant content there when in fact the site contains nothing but the same ads they saw in the search engine results they came from), then, no, BY THAT DEFINITION Google and all other major search engines are not scraper sites.
One definition is strictly technological, the other is predominantly ethical. And the definition you choose to use is a matter of preference.
play semantic games all you want
hunderdown, all I can really say to that is *you* seem to be playing semantic games if you think search engines are anything but large, well financed, highly polished, scrapers. and the engines are just the beginning, many other major sites scrape too.
the OP's question was about what a scraper site is, and I think it's important to call out that there are a variety of scraper sites, both good and bad. That's where we get to the meat of it -- which is which when the process, output, and ultimate goal are essentially identical?
But I noticed you DIDN'T respond to my assertion that the average user sees a difference, even if you can define it away, as you just did once more.
This is a pointless discussion, really, because the two camps here just don't agree on the value or lack thereof of scraper sites, and so can't even agree on the definition of a scraper site.
The thing is there seems to be a feeling that anybody associating Google with scrapers has got to be operating a scraper site i.e. nobody would equate Google with a scraper unless they were running a scraper themselves. The logic is flawed.
I'm quite happy to say that I own no scrapers and would be happy to see them out of the SERPs. I'm also satisfied in my mind that Google is a scraper (albeit a welcome one in most cases).
>> Google doesn't pretend to be a browser
Not overtly like the scrapers do. No. It's done covertly ... like via the toolbar ... to collect information that the bot didn't get.
The latest is that they use randomization and thesauri in combination with scraped content to produce pages that look quite normal. (check out articlebot)
Having seen some of this, it's hard to see how Google's bots could counter it. These scraper guys are one step ahead. And I'm not certain they have consciences.
It won't be by content, but by the quality of backlinks that you're known.
Probably, content quality will always have to be judged by humans.
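As a rough illustration of why synonym swapping is hard to catch by comparing text alone, here is a minimal Python sketch of one common near-duplicate measure, Jaccard similarity over word shingles. This is an assumption chosen for illustration, not a description of what Google's bots actually do; the function names and sample sentences are invented.

```python
def shingles(text: str, k: int = 3) -> set:
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = text.lower().split()
    if len(words) < k:
        # Text shorter than one window: treat the whole text as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: str, b: str, k: int = 3) -> float:
    """Jaccard similarity of the two shingle sets: |A & B| / |A | B|."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 1.0  # two empty texts are trivially identical
    return len(sa & sb) / len(sa | sb)


# Hypothetical sample texts: an original, a verbatim copy, and a "spun" copy
# with three words replaced by synonyms.
original = "scraper sites copy search results and wrap them in ads"
verbatim = "scraper sites copy search results and wrap them in ads"
spun = "scraper sites duplicate search listings and wrap them in adverts"
```

Here jaccard(original, verbatim) is 1.0, but jaccard(original, spun) drops to 2/14 (about 0.14) after only three word swaps, so a simple overlap threshold would probably miss the spun page.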
Reading some of the posts I get a sense of great anger... I appreciate that you may have a reason to be angry, but if you would explain why these scraper sites have affected you so negatively it would add to the debate...
First of all, I cannot =prove= that these sites have affected me negatively, at least not with figures ("I lost XXX due to scrapers").
But after having seen countless scrapers as a web user and also as an alert webmaster, I came to the following conclusion:
1) Scrapers do =not= provide any useful service (which is the difference to real search engines). They are so utterly ugly/bad/useless that any user who sees one either hits the back button or is tricked into clicking one of the numerous ads. This reveals the real intention of the scrapermasters: Earn money (which in principle is fine, BTW).
2) They do it off stolen content, i.e. without adding any content or service of their own. Software creates thousands of useless pages containing just gibberish, fine-tuned to show up in SERPs, fine-tuned to trick users into clicking the ads.
3) So we know they do it for the money, and they use stolen content. But, you see, advertisers have limited budgets. Every single click for a scraper pulls money out of the advertiser's pocket. Money that otherwise would be spent for ads on sites belonging to the real content owners. So I assume that the community of honest AS publishers is being hurt =collectively= by the community of AS scrapermasters with their zillions of useless pages. And that's why every honest webmaster gets emotional when it comes to scrapers...
We can not prove it, but we know that they hurt us.
Of course scraper spam sites are injurious and steal money from us.
They also degrade the quality of the internet, which gets in the way of our customers reaching us, and decreases their trust in the internet as an information source.
Let's move on to how to respond to them, please.
“A scraper site, on the other hand, adds no value of its own; it merely steals or borrows results from another source and uses them as filler for an ad page”
It's funny you said that, I was just searching for “keyword + location” and your site was number 1.
Did it have the content I was looking for? No. It had links to other people's content with Adsense ads on top.
If I didn't know it was you, I would have thought it was a scraper site.
I don't have any scrapers btw, BUT I think EFV's statement here does not hold much weight. If you were to remove all ads from Yahoo & Google, do you think they would be around much longer? Isn't the goal of all search engines to profit from their scraped listings? i.e. "conceived and designed solely as a platform for ads". Tell me if I am wrong, but no search engine was founded to make the internet a better place, now was it? hehe. Also take into account that 5/6 surfers don't know when they are clicking on an advertisement or an organic listing.
The definition of scraper sites that is useful if your goal is to make money online is:
"a site that steals content from other sites without adding any utility etc."
the normal rules of plagiarism and fair usage should apply.
the pirate scrapers we're concerned with are the ones who don't want to do any real work, just want to steal content and make money with it.
now, what do you do about them?