Forum Library, Charter, Moderators: incrediBILL & jatar k & martinibuster

Google AdSense Forum

This 223-message thread spans 8 pages; this is page 4.
What is a scraper site?
sunzfan




msg:1378418
 4:11 pm on Jun 2, 2005 (gmt 0)

Okay - people keep referring to scraper sites and I'm not sure exactly what that is - could someone quickly give me a definition?

It's different than spam pages?

 

vincevincevince




msg:1378508
 5:26 pm on Jun 4, 2005 (gmt 0)

The difference between Google and a scraper is this:


User-agent: *
Allow: /searchhistory/
Disallow: /search
Disallow: /groups
Disallow: /images
Disallow: /catalogs
Disallow: /catalog_list
Disallow: /news
Disallow: /nwshp
Disallow: /?
Disallow: /addurl/image?
Disallow: /pagead/
Disallow: /relpage/
Disallow: /sorry/
Disallow: /imgres
Disallow: /keyword/
Disallow: /u/
Disallow: /univ/
Disallow: /cobrand
Disallow: /custom
Disallow: /advanced_group_search
Disallow: /advanced_search
Disallow: /googlesite
Disallow: /preferences
Disallow: /setprefs
Disallow: /swr
Disallow: /url
Disallow: /wml?
Disallow: /xhtml?
Disallow: /imode?
Disallow: /jsky?
Disallow: /pda?
Disallow: /sprint_xhtml
Disallow: /sprint_wml
Disallow: /pqa
Disallow: /palm
Disallow: /hws
Disallow: /bsd?
Disallow: /linux?
Disallow: /mac?
Disallow: /microsoft?
Disallow: /unclesam?
Disallow: /answers/search?q=
Disallow: /local?
Disallow: /local_url
Disallow: /froogle?
Disallow: /froogle_
Disallow: /print?
Disallow: /scholar?
Disallow: /complete
Disallow: /sponsoredlinks
Disallow: /videosearch?
Disallow: /videopreview?
Disallow: /videoprograminfo?
Disallow: /maps?
Disallow: /translate?
Disallow: /ie?
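
For anyone who wants to see what rules like these actually permit, here is a minimal sketch using Python's standard `urllib.robotparser`. Only a handful of the rules above are copied in, and the helper name `allowed` and the example URLs are assumptions made up for illustration, not anything Google publishes.

```python
from urllib.robotparser import RobotFileParser

# A short excerpt of the rules quoted above (not the full file).
ROBOTS_EXCERPT = """\
User-agent: *
Allow: /searchhistory/
Disallow: /search
Disallow: /groups
Disallow: /images
Disallow: /news
"""


def allowed(url, user_agent="*"):
    """Hypothetical helper: True if the excerpt lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(ROBOTS_EXCERPT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Example URLs are illustrative only.
    for url in (
        "http://www.google.com/search",
        "http://www.google.com/searchhistory/",
        "http://www.google.com/about",
    ):
        print(url, "->", "allowed" if allowed(url) else "disallowed")
```

A crawler that respects robots.txt would run a check like this before each request and skip anything disallowed, which is presumably the point of posting the file: Google tells bots to stay out of its result pages.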

kwasher




msg:1378509
 5:29 pm on Jun 4, 2005 (gmt 0)

I've created a blacklist for 'viagra'

also for

'spammers'
'people you don't like'
'whiners'
'your competitors'

So just pm me some links.

Long live McCarthyism.

Sheesh.

Oh, while we are at it, let's make the CEO of Google Dictator of the World.

Some of you people frighten me.

jim_w




msg:1378510
 5:37 pm on Jun 4, 2005 (gmt 0)

this is just an attempt to stop the 'email notification of replies' from this thread

andrea99




msg:1378511
 6:01 pm on Jun 4, 2005 (gmt 0)

The content as per our TOU says that you cannot download and save ANY information for financial gain from the server w/o express written consent.
Which makes a mockery of terms in general. If your terms are unrealistic and unenforceable it does seem to indicate a detachment from reality.

A.

Broadway




msg:1378512
 6:07 pm on Jun 4, 2005 (gmt 0)

I got email from a site that was disappointed that I had dropped them from our links page, evidently some months ago. (Before dropping a site from my links page I do a Google "site:www.widgets.com mysite.com" search. In most cases, if I can't find my site listed then I don't consider the link to be truly reciprocal and I drop them.)

Anyway this person was writing and telling me how they have been linking to us and therefore expected a link back. They also complained that one site found on our links page was a "copyright infringer", having taken information from their site.

So here's the kicker. I did a new Google search to see if I had made a mistake in dropping them. Since the last time I evaluated their site they have joined the AdSense program and yes they do link to my site, maybe up to 10 pages, all AdSense/Scraper type linking.

What I find so amazing is an AdSense/Scraper site:

1) Considers their linking to me to be a plus and a reason I should link to them.

2) Will complain about copy being stolen from their site by another but considers their Scraper pages to be totally white hat.

birdstuff




msg:1378513
 6:27 pm on Jun 4, 2005 (gmt 0)

The only people who can't tell the difference between a SE and a scraper are scraper publishers who prevaricate on this board and say "I don't publish scrapers, but I love them!"

If you believe this you either haven't read all of my posts or you cannot understand them.

The only difference between the search engines (of which there are a lot more than just the big three) and other scraper sites is the scraper sites are a lot more discriminating about the sites they choose to scrape (most notably Google).

Yahoo, Google, Alta Vista et al have rightly earned their legitimacy by being first, being massive, and receiving (and sending) tons of traffic, but trying to claim they are any less of a scraper site than the "scrapers" described on WebmasterWorld is simply out of sync with reality.

Do I hate scrapers in general? Yes. With a passion. Do I hate all scrapers? No. I happen to love Google, Yahoo and MSN, for purely selfish reasons. If you're honest with yourself and others you'll admit to feeling the same way. We're all motivated by "what's in it for me?". That's what makes a capitalist system work so well.

kwngian




msg:1378514
 6:33 pm on Jun 4, 2005 (gmt 0)


Now the scrapers are comparing themselves to the likes of SEs. Give them more time and they will obscure the definitions until they become better than the search engines. Shameless.

Admit you publish garbage and stop trying to tell me that I should love it cuz it smells and tastes like chicken.

Well said.

security56




msg:1378515
 6:44 pm on Jun 4, 2005 (gmt 0)

Umm, I wonder: when I go to a scraper site, do I get links to Google? NOTTTTT. Why not? Because it's a piece of crap site that is there to take other people's stuff. On the other hand, when I go to Google I get scraper sites.

See the difference?
Scrapers have no Google links.
Google has a bunch of crap scrapers.

I might not sound sophisticated in the way I explain things, but at least I am smart enough to see the big differences. But you people keep on comparing them. They're not the same. Wake up.

deano6410




msg:1378516
 6:52 pm on Jun 4, 2005 (gmt 0)

oh and one more point, think how many people read this thread per day. Instead of eliminating scraper sites you are merely advertising the fact that money can be made from such sites.

My 2nd sad fact of the day = Every day you whinge about these sites there will probably be 100+ people who read the thread and then go and find out how to make such sites.

I've said it b4 and I will say it again, focus on your own sites, keep them whitehat and work as hard as you can, but all this thread has done is promote scraper sites.

birdstuff




msg:1378517
 6:54 pm on Jun 4, 2005 (gmt 0)

Now the scrapers are comparing themselves to the likes of SEs. Give them more time and they will obscure the definitions until they become better than the search engines. Shameless.

The scraper webmasters can speak for themselves. I certainly don't. I just refuse to bury my head in the sand and try to twist the facts to fit my opinions. The facts speak for themselves.

Every description of a scraper site that we have read in these forums applies equally to the "search engines". Can you name one that doesn't? And no, the robots.txt exclusion doesn't apply.

The scrapers typically don't even crawl individual websites. They crawl the Google SERPs, therefore the robots.txt exclusion isn't even an issue. By allowing Google to crawl your sites and extract snippets, you're making a decision to allow the use of those snippets by anyone or any bot that uses Google. You're in effect making snippets from your site open to the online world.

If it walks like a duck and quacks like a duck, it's a duck. Google, Yahoo and MSN do the walking and the quacking, therefore they are ducks (scraper sites). But they're scraper sites that I love because they help me make lots of money. The other scrapers can be banished to the pits of hell as far as I'm concerned because they do not help me make money. But they are no more scraper sites than the big three. Reality, plain and simple.

birdstuff




msg:1378518
 6:56 pm on Jun 4, 2005 (gmt 0)

I've said it b4 and i will say it again, focus on your own sites, keep them whitehat and work as hard as you can, but all this thread has done is promote scraper sites.

On this we agree 100%. But allowing false information to be perpetuated unchallenged does no one any good. It just helps undermine the immense credibility of WebmasterWorld.

Are scrapers in general good for the searchers? No. They make it a lot more difficult to efficiently find what you're looking for in the search engines.

Are they good for the AdSense program overall? Only Google can ultimately decide that based on their own goals, not the goals of advertisers and publishers.

Do scraper sites help me in any way? Yes, three of them do. Google, Yahoo and MSN send me tons of well-converting traffic that I don't have to pay for. The rest are of no help to me at all however, and from a purely selfish perspective, if they all get banned from Google tomorrow I'll be pleased as punch.

But saying that the big three search engines are any less scraper sites than the others is simply intellectually dishonest and a sign of selective morality at its extreme.

Craig_F




msg:1378519
 7:16 pm on Jun 4, 2005 (gmt 0)

birdstuff, your last couple of posts are right on and are what others and I were saying pages ago. I'm astonished that ANY WebmasterWorld members would see it any other way. I'd expect this elsewhere, but not here. Blinded by the hatred of scrapers, I guess?

Atticus




msg:1378520
 7:41 pm on Jun 4, 2005 (gmt 0)

bs,

"But saying that the big three search engines are any less scraper sites than the others is simply intellectually dishonest and a sign of selective morality at its extreme."

Why aren't there any search boxes on scraper sites?

Am I morally blind to the search box?

A scraper may be a half-assed, useless "directory," but until they put a search box on it (and NO, not a third party search box) it ain't no search engine.

I find it difficult to understand why you insist that a site which doesn't have its own searchable database is a "search engine."

birdstuff




msg:1378521
 8:18 pm on Jun 4, 2005 (gmt 0)

A scraper may be a half-assed, useless "directory," but until they put a search box on it (and NO, not a third party search box) it ain't no search engine.

I find it difficult to understand why you insist that a site which doesn't have its own searchable database is a "search engine."

I don't insist any such thing. A scraper site is a scraper site, whether it be a search engine, a directory, or just a useless list of links with AdSense stuck all over the pages.

By definition, a scraper site extracts snippets of text (and possibly URLs) from one or more websites and displays those results to visitors as a means to an end: making money. The use (or not) of a search box has absolutely no bearing on whether a site is a scraper or not. Whether a site is a scraper is determined entirely by how it gets its content, not how it delivers it to the visitor.

Apparently you're failing to comprehend my posts.

Qur1uS




msg:1378522
 8:20 pm on Jun 4, 2005 (gmt 0)

man o man....i'm done reading this thread....

I don't like scraper sites because they make me talk/read/worry/have nightmares etc. all day all night....all I can think about are scrapers scraping my site and plunging my sites in to the pits of hell...

ahhhhhhhhh the scrapers are coming.....

Please somebody give me the scraper anonymous hotline number. I can't scrape scrapers out of my mind!

Ok, so seriously... scrapers are bad....now lets all go back to making money making activities.

it has been a blast

Atticus




msg:1378523
 8:38 pm on Jun 4, 2005 (gmt 0)

bs,

You DO insist that scrapers are "search engines."

Do you remember saying this? "The only difference between the search engines (of which there are a lot more than just the big three) and other scraper sites is the scraper sites are a lot more discriminating about the sites they choose to scrape (most notably Google)."

You have unequivocally stated that a scraper site is exactly the same as a "search engine," but with "only [one] difference" in that they are "more discriminating about the sites they choose to scrape."

So I am still wondering where that search box is! Cuz you told me a scraper is exactly like a search engine, just more discriminating.

Wanna point out that searchable database that makes them a more discriminating search engine?

birdstuff




msg:1378524
 8:45 pm on Jun 4, 2005 (gmt 0)

Nowhere in that post do I even imply that all (or even many) scraper sites are search engines. I simply say that all scraper sites scrape their content from other sites. We're talking about how the scrapers get their content, not how they deliver it.

When you try to read something into a post that isn't there it makes us both look a bit silly.

Atticus




msg:1378525
 8:54 pm on Jun 4, 2005 (gmt 0)

bs,

I've shown you your own post.

You say that the "only difference" between a search engine and a scraper is that the scraper is "more discriminating."

That's what you said. We can all read English, so why deny it?

I can search a search engine. A search engine has a database organized by a particular algo to deliver results to user queries. You DO imply that a scraper can do this too. Because you said they are the same, with only one difference, and that had to do with how "discriminating" scrapers are, not their indexing, cataloging and search capabilities.

So where is that search box? I still can't find it.

Or are you willing to admit that there is more of a difference between a search engine and a scraper than that the scraper is "more discriminating?"

birdstuff




msg:1378526
 9:05 pm on Jun 4, 2005 (gmt 0)

Perhaps my post didn't clearly convey what I meant to you. I'll try again, just for you:

The only difference between the way the search engines and the other scraper sites get their content is that the other scraper sites are a lot more discriminating in the site(s) they choose to scrape (most choose to scrape only Google while Google scrapes most any site it can find via links). Most scraper sites don't have search boxes because they're set up as pseudo-directories, not search engines.

I hope that clears it up to you. I never meant to imply that all scraper sites are search engines, and if you took it that way hopefully this clarification will clear everything up. I hope so anyway because I just don't know how to simplify it any further.

Atticus




msg:1378527
 9:24 pm on Jun 4, 2005 (gmt 0)

bs,

Oh, I see.

All of a sudden, we are only discussing scrapers in regard to "the way the search engines and the other scraper sites get their content?"

I thought this thread was titled, "What is a scraper site?" The poster asked, "I'm not sure exactly what that is - could someone quickly give me a definition?"

During the course of this discussion, I said, "The only people who can't tell the difference between a SE and a scraper are scraper publishers who prevaricate on this board and say 'I don't publish scrapers, but I love them!'"

At no time did you, I or anyone else limit this conversation to "the way the search engines and the other scraper sites get their content." That's a fiction you created later to try to cover over your contradictions.

You answered by saying that a scraper is exactly the same as a "search engine," the "only difference" being that scrapers are "more discriminating."

I saw no disclaimer from you at that time that you were only discussing "the way the search engines and the other scraper sites get their content."

You can try to change the supposed subject of a conversation halfway through in order to make a futile attempt to cover up your own contradictions, but it's a more effective strategy to tell the truth in the first place.

birdstuff




msg:1378528
 9:27 pm on Jun 4, 2005 (gmt 0)

I saw no disclaimer from you at that time that you were only discussing "the way the search engines and the other scraper sites get their content."

I wasn't aware that a disclaimer would be necessary because I thought the meaning of my posts would be obvious. Apparently, to you they were not. I apologize profusely.

[edited by: birdstuff at 9:31 pm (utc) on June 4, 2005]

Atticus




msg:1378529
 9:28 pm on Jun 4, 2005 (gmt 0)

Oh, yeah. I think I understand you pretty well at this point.

Atticus




msg:1378530
 9:54 pm on Jun 4, 2005 (gmt 0)

bs,

In reply to your edit:

So just to get the ground rules straight, YOU reserve the right to change the subject of a discussion in a significant way midstream, without informing other participants in the discussion? And you are further saying that those of us who don't realize that you have changed the subject are guilty of "intellectual dishonesty and a sign of selective morality at its extreme?"

Remember when you said, "Every description of a scraper site that we have read in these forums applies equally to the "search engines". Can you name one that doesn't? And no, the robots.txt exclusion doesn't apply."

What part of that did I misunderstand? Again you say that the definition of a scraper equally fits a search engine. You say nothing about limiting this to "the way the search engines and the other scraper sites get their content." What you do try to limit, is an example of real differences between scrapers and search engines (robots.txt) when others provide them.

That's a neat trick. Using your method of discourse, you can prove that a penguin is exactly the same as the space shuttle as long as we bar any evidence to the contrary and claim later that by "exactly the same" you just mean they both have wings.

Brilliant!

appicat




msg:1378531
 12:37 am on Jun 5, 2005 (gmt 0)

Any minute someone's gonna mention Hitler.

At least now I know what a scrap/per site is/isn't. I think.

europeforvisitors




msg:1378532
 1:18 am on Jun 5, 2005 (gmt 0)

Speaking of scraper sites, Marcus007 has an interesting comment in message #37 of the AdWords Forum thread at:

[webmasterworld.com...]

He states that, according to his detailed tracking statistics, the conversion rate for scraper sites is five percent of the conversion rate for content sites.

Yes, that's 5%, as in 1/20th.

That number goes a long way to explain why "smart pricing" is necessary, why bids for some keywords may be falling, and why some AdSense publishers complain of major declines in earnings.
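
To put that 1/20th figure in advertiser terms, here is a rough worked example in Python. Every number in it is a hypothetical assumption chosen for illustration (the 2% content-site conversion rate and the $0.50 click price are not from Marcus007's statistics); only the 5% ratio comes from the post above.

```python
def cost_per_conversion(cost_per_click, conversion_rate):
    """Average ad spend needed to get one conversion."""
    return cost_per_click / conversion_rate


# Hypothetical inputs, for illustration only.
CONTENT_RATE = 0.02                # assumed: 2% of content-site clicks convert
SCRAPER_RATIO = 0.05               # from the post: scrapers convert at 5% of that
SCRAPER_RATE = CONTENT_RATE * SCRAPER_RATIO
CPC = 0.50                         # assumed: same price paid per click on both

content_cost = cost_per_conversion(CPC, CONTENT_RATE)
scraper_cost = cost_per_conversion(CPC, SCRAPER_RATE)

# Click price at which a scraper click costs the same per conversion
# as a content-site click.
break_even_cpc = CPC * SCRAPER_RATIO

if __name__ == "__main__":
    print(f"content site: ${content_cost:.2f} per conversion")
    print(f"scraper site: ${scraper_cost:.2f} per conversion")
    print(f"scraper clicks cost {scraper_cost / content_cost:.0f}x more per conversion")
    print(f"break-even scraper click price: ${break_even_cpc:.3f}")
```

Whatever the real rates and prices are, the ratio is what matters: at the same click price, a conversion from a scraper click costs twenty times as much, so the click would have to be priced at a twentieth to be worth the same to the advertiser.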

birdstuff




msg:1378533
 1:32 am on Jun 5, 2005 (gmt 0)

I didn't change anything at any point. I simply clarified what I meant all along since you alone apparently didn't "get it". You seem to be missing the entire intent of my posts, which is simply to say that yes, scraper sites are bad. I don't like them for reasons stated earlier in several posts and I'm certainly not here to defend them.

All I'm saying is the search engines, while very useful and much appreciated because of the traffic they bring me, are still just as much scraper sites as the scraper sites so (rightly) vilified on WebmasterWorld. Nothing more, nothing less. Why was that so hard for you to understand after reading all of my posts in their entirety?

Since you're the only one who didn't "get it", I'm guessing that everyone else understood what I was saying even if they disagree with it (which is certainly their right).

[edited by: birdstuff at 1:38 am (utc) on June 5, 2005]

cagey1




msg:1378534
 1:36 am on Jun 5, 2005 (gmt 0)

I thought one of the original gripes about scraper sites was that they displayed others' content but not the links to that content's page/site of origin (thus leaving visitors with little option but to click on AdSense ads).

If there are no links other than ads, then these sites are definitely different than Google, Yahoo! and MSN, and much less useful (to the point of useless).

If there are valid links to the sites containing the scraped content, then they are exactly like Google, Yahoo! and MSN. They are not just scrapers, they are search engines/directories (however inadequate their results).

Perhaps Google seems to tolerate them because removing them means removing all pages with few or zero outbound links. I would think this would remove many innocent (and useful) pages in addition to guilty scrapers.

Atticus




msg:1378535
 3:17 am on Jun 5, 2005 (gmt 0)

bs,

Search engines are not scraper sites. As I said way back when, the only people who claim otherwise are scraper publishers who say they aren't scraper publishers so that they can pretend to be honest, thought provoking voices of reason in popular fora.

Who in their right mind would claim that search engines are scraper sites?

Even -- da, da, dum -- Hitler wouldn't do that!

spaceylacie




msg:1378536
 5:41 am on Jun 5, 2005 (gmt 0)

But saying that the big three search engines are any less scraper sites than the others is simply intellectually dishonest and a sign of selective morality at its extreme.

I thought we were talking about sites that read like an Ad-Lib, no human invention... these are the scrapers, right?

bbcarter




msg:1378537
 6:03 am on Jun 5, 2005 (gmt 0)

you know, i like getting notices of replies for a bit, but is there a way to tell it to stop sending them on a particular discussion if you aren't interested in it anymore?

B

jim_w




msg:1378538
 7:15 am on Jun 5, 2005 (gmt 0)

I couldn't, so I set a spam filter in OE ;-))
