Forum Library, Charter, Moderators: incrediBILL & jatar k & martinibuster

Google AdSense Forum

This 223 message thread spans 8 pages: < < 223 ( 1 2 [3] 4 5 6 7 8 > >     
What is a scraper site?
sunzfan




msg:1378418
 4:11 pm on Jun 2, 2005 (gmt 0)

Okay - people keep referring to scraper sites and I'm not sure exactly what that is - could someone quickly give me a definition?

It's different than spam pages?

 

spaceylacie




msg:1378478
 9:28 pm on Jun 3, 2005 (gmt 0)

We start a black list for one thing. Submit it to Google once a month? Anyone got connections?

bbcarter




msg:1378479
 9:33 pm on Jun 3, 2005 (gmt 0)

interesting-

you could also create a 'black list' website
it would list but not link to the violator websites
as well as list the domain name owners etc

php-mysql
add to database-
but make sure there's an appeal process for people who feel they've been listed in error

if we create this and link to it, it will get seen-

make it easy for google to grab the data, and we shouldn't need too many connections
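A minimal sketch of the blacklist idea described above, written in Python with SQLite standing in for the php-mysql setup the post mentions. The table layout, the function names, and the `listed`/`appealed` status values are all assumptions made for illustration; nothing like this was actually built in the thread.

```python
import sqlite3

def open_blacklist(path=":memory:"):
    """Create the (hypothetical) blacklist table if it does not exist yet."""
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS blacklist (
               domain TEXT PRIMARY KEY,
               owner  TEXT,
               reason TEXT,
               status TEXT NOT NULL DEFAULT 'listed'
           )"""
    )
    return db

def report(db, domain, owner, reason):
    """Add a domain; reporting the same domain a second time is ignored."""
    db.execute(
        "INSERT OR IGNORE INTO blacklist (domain, owner, reason) VALUES (?, ?, ?)",
        (domain.lower(), owner, reason),
    )
    db.commit()

def appeal(db, domain):
    """Mark a listing as appealed so it drops out of the export until reviewed."""
    db.execute(
        "UPDATE blacklist SET status = 'appealed' "
        "WHERE domain = ? AND status = 'listed'",
        (domain.lower(),),
    )
    db.commit()

def export_plain(db):
    """Return listed domains as plain strings: listed, but never hyperlinked."""
    rows = db.execute(
        "SELECT domain FROM blacklist WHERE status = 'listed' ORDER BY domain"
    )
    return [row[0] for row in rows]
```

Exporting bare domain names rather than links matches the "list but not link" point, and a one-domain-per-line text export would probably be the easiest format for anyone else to grab.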

ogletree




msg:1378480
 9:33 pm on Jun 3, 2005 (gmt 0)

They exist because they make the owner a lot of money. Google's website makes the owner a lot of money. I know people who know very little about the Internet who bought some programs off the Internet, signed up for an AdSense account, and make over $10K a month. As long as that can happen, "scrapers" will make money. I personally don't have one. Those things outrank me. As oilman said in Vegas, Google really likes their own results. In my opinion G has allowed it to prosper long enough for me to think that they approve of it. The reason they exist is because you can not distinguish the legit ones from the non legit ones. The only thing that distinguishes a legit site from a "scraper" site is the amount of traffic you get. I could show you a large legitimate site that is not considered spam by Google that, if I took off the branding, anybody including efv would consider a spam site.

spaceylacie




msg:1378481
 9:39 pm on Jun 3, 2005 (gmt 0)

Check out my profile, what do you think?

Craig_F




msg:1378482
 9:58 pm on Jun 3, 2005 (gmt 0)

I think it's a novel idea, but ultimately a waste of time spacey. No offense intended at all, but as ogletree just posted:

The reason they exist is because you can not distinguish the legit ones from the non legit ones

And then...what does legit really mean?

spaceylacie




msg:1378483
 10:10 pm on Jun 3, 2005 (gmt 0)

Yes, a waste of time. But, it would be fun. Make one wrong move and you are on the list! LOL.

Juan_G




msg:1378484
 10:17 pm on Jun 3, 2005 (gmt 0)

Loki99, recently I've used the EFV's topic-specific website that you mention, and it was useful to save time finding good, selected information for a trip to Madrid.

Clearly, that's not a cheater/scraper site, that's a helpful hub/webguide.

And it's very simple to distinguish: If it does not help users, and just cheats them and search engines, it's a scraper. If, on the contrary, it does help users to find the good stuff in the information jungle, it's a useful hub.

spaceylacie




msg:1378485
 10:19 pm on Jun 3, 2005 (gmt 0)

Oh, we're still working on the description? Oh!

Good summary.

ogletree




msg:1378486
 10:22 pm on Jun 3, 2005 (gmt 0)

There are very legit sites that look exactly like scraper sites and do everything that is considered wrong about scraper sites. They don't have to scrape; they have a deal with Google to access the db directly. Their site is SEO'ed like any good scraper site. They have high PR and rank very well in Google. You type in a term in Google, you get results from this site; you click on that site and you get the same results, but they now have their own ads, which are much worse than Google's. It is the same thing, it is just sanctioned by G. G has no problem with scraper sites. Why do you think they still exist and prosper so well in AdSense?

jeffb




msg:1378487
 10:26 pm on Jun 3, 2005 (gmt 0)

Hey, if you set up a site to gather names of scraper sites to have Google blacklist, will everybody debate endlessly whether to put Google on the list? ;)

base64




msg:1378488
 10:33 pm on Jun 3, 2005 (gmt 0)

What about a website based on Amazon data feeds (REST/SOAP, AWS)? This data is not stolen and not against copyright rules. And it will be rendered in a customized way (I mean, I will not use some "ready-to-run" scripts; all the code is written from scratch). As far as I can understand, this kind of website offers useful info to a web surfer who is searching for specific technical data (or other kinds of data) about hardware/software/music/etc. Is this kind of site a pure scraper? Is it against the AdSense TOS?

Loki99




msg:1378489
 10:35 pm on Jun 3, 2005 (gmt 0)

Useful is subjective.

Is a well seod links page (designed to rank in SEs) useful when you come across it in the organic serps?

Depends.

spaceylacie




msg:1378490
 10:41 pm on Jun 3, 2005 (gmt 0)

Maybe this is why Google hasn't figured it out yet... how to get rid of scraper sites... maybe they are busy chasing their own tail too.

Qur1uS




msg:1378491
 11:56 pm on Jun 3, 2005 (gmt 0)

Do you think scraper sites get quality inbound links?
Do they have high PR?

Frankly, I don't even think they can make that much from AdSense... You need a lot of traffic to make $10,000/month.

If they want to fill the gaps till "real useful quality" sites compete for the keywords....better than nothing?

That could be another reason why google doesn't care...

Or does google care? Has anyone asked?

Is there anyone with a "real" site getting killed by a scraper site... I would LOVE to see it... Please Please Please...

qbert




msg:1378492
 12:15 am on Jun 4, 2005 (gmt 0)

I never really paid attention to what scraper sites were, but you know the funny thing? I was looking in Google to see who else linked to my site, and my site came up on another site. Funny thing is, my site is a fairly new site with very low traffic (depending on what updates I put up, anywhere from 50 - 200 hits a day).

The best (worst) part is, the scraper site is not even in English.

security56




msg:1378493
 12:25 am on Jun 4, 2005 (gmt 0)

Jesuuuuuuuuusssssss, stop comparing Google to a scraperrrrrrrrrrrr. I as a user go to Google knowingly: I type a keyword and get the links that will take me to what I am looking for.

And I quote myself so you guys can get it through your heads.
I as a user go to Google knowingly

I don't want to type a keyword on Google and be taken to another search engine.

Why can't you people see the difference?

spaceylacie, are you for real? I've got one site that should take first place, at least in my niche, lol

Qur1uS, can I PM you the site? Mine is legit, but this scraper site is on top for all my keywords lol, so yeah, there are legit sites getting affected. One thing: I think this mainly affects small sites like mine, which usually gets about 700 uniques a day, although lately I am getting only 400, mainly due to this site, which I emailed Google about, but nothing happened.

spaceylacie




msg:1378494
 12:54 am on Jun 4, 2005 (gmt 0)

Sticky me, I'll blacklist them! ;-)

Qur1uS




msg:1378495
 2:19 am on Jun 4, 2005 (gmt 0)

PM away baby.... :0

jim_w




msg:1378496
 3:38 am on Jun 4, 2005 (gmt 0)

>>Why can't you people see the difference
Because they are trying to justify what they are doing.

Wonder what (G) will do if these scraper sites get enough data to sell lists of the high-paying keywords, and when they make enough money on the lists they won't care if they get their account canceled or not.

markus007




msg:1378497
 6:14 am on Jun 4, 2005 (gmt 0)

In one market I follow, Yahoo has just approached all the major publishers within the past week. It looks like they will be paying several times what Google pays. At any rate, I advertise in this market heavily, and if those publishers go and there are only scraper sites left, I'll be turning off AdSense campaigns, that's for sure.

spaceylacie




msg:1378498
 12:02 pm on Jun 4, 2005 (gmt 0)

Huh? Where? Yahoo is recruiting publishers? I find that hard to believe.

deano6410




msg:1378499
 1:15 pm on Jun 4, 2005 (gmt 0)

The sad fact is you will never get rid of such sites; all you can do is make it harder for them.

If you somehow manage to ban them from AdSense then they will just move to another PPC company. There are companies out there who pay about half of what AdSense pays, but they have no morals, and therefore you can put ads on any type of site.

The blacklist won't do anything either.

I have reported sites and nothing gets done, and don't forget that these sites' owners probably have numerous servers, numerous AdSense accounts, numerous IPs, etc...

I have thought about this a lot and I think the best thing I can do is spend my time improving my own sites and spend less time worrying about scraper sites.

Believe it or not, you will make more money from working on your own sites instead of whinging about others... sad fact, but true.

mzanzig




msg:1378500
 1:41 pm on Jun 4, 2005 (gmt 0)

deano6410:

The sad fact is you will never get rid of such sites; all you can do is make it harder for them.
If you somehow manage to ban them from AdSense then they will just move to another PPC company. There are companies out there who pay about half of what AdSense pays, but they have no morals, and therefore you can put ads on any type of site.

Fine, banning them from AS would be a first step, and a big one. If scrapers have to move to another PPC company, they will at least stop draining AdSense advertisers' budgets - i.e. this money will then be spent for valid high-quality sites that provide a service (and presumably also high-quality clicks).

Advertisers will not follow the scrapers to the new PPC programs. In fact they would be happy to see the scrapers gone! Just watch the related threads over at the AdWords Forum. You will see that the advertisers demand this from G as well.

Truth is that AS is the most popular PPC program just BECAUSE it generates real money fast. Other PPC programs are less reliable and less attractive.

-- M.

birdstuff




msg:1378501
 3:41 pm on Jun 4, 2005 (gmt 0)

How webmasters define "scraper sites" is irrelevant. It's how the major search engines define them that counts, and it's clear (at least to me) that they love them. Or perhaps, like us webmasters, their execs can't even agree among themselves how to define a scraper site without painting themselves with the same brush.

IMO the only difference between "scraper sites" and Google, Yahoo and MSN is the search engines are welcome, even encouraged to come by and scrape our sites.

Let's look at a few characteristics of scrapers as defined in this thread:

1 - Scrapers rely almost entirely on automation to generate their pages. Google is the king of relying almost entirely on automation to generate their pages.

2 - Scrapers come by and extract snippets of content without first asking permission. Google does this on a daily basis.

3 - Scraper sites exist solely to make money. Let's face it, Google exists solely to make money. Remove Google's ability to earn revenue from advertising and it will disappear very quickly.

That being said, with the exception of Google, Yahoo and MSN, I don't like scrapers either for many of the reasons stated in this thread... but most of all because they clutter up the pages of my preferred scrapers: Google, Yahoo and MSN.

Let's be honest here - the criteria we use to form our own definitions of "scraper sites" are purely selfish ones. If a site helps us make money we like to say it isn't a scraper. If it helps someone else make money (but not us) it should be labeled a scraper. It's really that simple.

Google, Yahoo and MSN are scrapers in every rational sense of the term, but I love it when they crawl my sites and index my pages. Why? Because they help me make money.

The lower tier scraper sites that make my pages harder to find in the top tier scrapers get my goat. Why? Because they don't help me make any money.

To say I hate all scrapers would be intellectually dishonest because I happen to love Google, Yahoo and MSN. I simply hate the scrapers that don't benefit me and my family.

Qur1uS




msg:1378502
 3:45 pm on Jun 4, 2005 (gmt 0)

Well said birdstuff!

I wish I could articulate as well as you...

jim_w




msg:1378503
 3:49 pm on Jun 4, 2005 (gmt 0)

I invited Google, Yahoo and MSN to my site. I have NOT invited one scraper site. As a matter of fact, I have turned at least one down for distributing my freeware. Plus, at any time I can uninvite them with robots.txt; I cannot uninvite scrapers.

birdstuff




msg:1378504
 3:51 pm on Jun 4, 2005 (gmt 0)

I invited Google, Yahoo and MSN to my site. I have NOT invited one scraper site. As a matter of fact, I have turned at least one down for distributing my freeware. Plus, at any time I can uninvite them with robots.txt; I cannot uninvite scrapers.

Thanks for agreeing with my post.

andrea99




msg:1378505
 4:08 pm on Jun 4, 2005 (gmt 0)

I have NOT invited one scraper site.
And this betrays a fundamental misunderstanding of how the internet works. When you post a page on the internet that is not password protected you are inviting all comers. If you don't want them, require password access--that will stop them.

Every day dozens of unwanted bots visit my site for unknown reasons, using my bandwidth. Not all are scrapers, but they simply can't be stopped except by blocking IPs or user-agents. Fact of life, get over it.

A.
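As a rough illustration of "blocking IPs or user-agents", here is a small Python check a server could run before answering a request. The block lists are placeholders (a documentation-only IP range and made-up bot names), and `is_blocked` is a hypothetical helper, not part of any particular web server.

```python
import ipaddress

# Hypothetical block lists; the entries are placeholders, not real offenders.
BLOCKED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]
BLOCKED_AGENT_SUBSTRINGS = ["examplescraper", "badbot"]

def is_blocked(remote_ip, user_agent):
    """Return True if the request should be refused (for example with HTTP 403)."""
    addr = ipaddress.ip_address(remote_ip)
    # Refuse anything coming from a blocked network range.
    if any(addr in net for net in BLOCKED_NETWORKS):
        return True
    # Otherwise refuse if the User-Agent contains a blocked substring.
    agent = (user_agent or "").lower()
    return any(token in agent for token in BLOCKED_AGENT_SUBSTRINGS)
```

Keep in mind that the User-Agent header is whatever the client chooses to send, so a bot that lies about it only gets caught by the IP check.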

jim_w




msg:1378506
 4:19 pm on Jun 4, 2005 (gmt 0)

>>If you don't want them, require password access

Wrong. Then the bots I do invite cannot get to the content. Our TOU says that you cannot download and save ANY information from the server for financial gain w/o express written consent.

Atticus




msg:1378507
 4:58 pm on Jun 4, 2005 (gmt 0)

Search engines are not scraper sites.

A SE creates an index through spidering the web, applying an algo and delivering results from the index based on user queries. A SE attempts to organize the entire web, regardless of how much Adsense is paying per click on a given topic.
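To make the "index, then answer queries from the index" idea concrete, here is a toy inverted index in Python. It is only a sketch: the page URLs and texts are invented, there is no spidering, and no ranking algo is applied (results simply come back in sorted order), so it shows the shape of the process rather than how any real engine works.

```python
from collections import defaultdict

def build_index(pages):
    """Map each lowercase word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Return the URLs that contain every query word, sorted for stable output."""
    words = query.lower().split()
    if not words:
        return []
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        # Keep only pages that also contain this word.
        results &= index.get(word, set())
    return sorted(results)
```

For example, indexing three made-up pages and searching for "madrid travel" returns only the page whose text contains both words.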

A scraper copies information from search engines; they do not collect their own info and attempt to organize it. A scraper is only interested in money keywords. That's why scrapers post garbage pages about things like "carribean-travel-ringtones-texas-hold-em."

The only people who can't tell the difference between a SE and a scraper are scraper publishers who prevaricate on this board and say "I don't publish scrapers, but I love them!"

Little tip guys -- scrapers are a big pain in the rear, but they are not illegal. If you are going to publish garbage, I say, be proud! Admit you publish garbage and stop trying to tell me that I should love it cuz it smells and tastes like chicken.

vincevincevince




msg:1378508
 5:26 pm on Jun 4, 2005 (gmt 0)

The difference between Google and a scraper is this:


User-agent: *
Allow: /searchhistory/
Disallow: /search
Disallow: /groups
Disallow: /images
Disallow: /catalogs
Disallow: /catalog_list
Disallow: /news
Disallow: /nwshp
Disallow: /?
Disallow: /addurl/image?
Disallow: /pagead/
Disallow: /relpage/
Disallow: /sorry/
Disallow: /imgres
Disallow: /keyword/
Disallow: /u/
Disallow: /univ/
Disallow: /cobrand
Disallow: /custom
Disallow: /advanced_group_search
Disallow: /advanced_search
Disallow: /googlesite
Disallow: /preferences
Disallow: /setprefs
Disallow: /swr
Disallow: /url
Disallow: /wml?
Disallow: /xhtml?
Disallow: /imode?
Disallow: /jsky?
Disallow: /pda?
Disallow: /sprint_xhtml
Disallow: /sprint_wml
Disallow: /pqa
Disallow: /palm
Disallow: /hws
Disallow: /bsd?
Disallow: /linux?
Disallow: /mac?
Disallow: /microsoft?
Disallow: /unclesam?
Disallow: /answers/search?q=
Disallow: /local?
Disallow: /local_url
Disallow: /froogle?
Disallow: /froogle_
Disallow: /print?
Disallow: /scholar?
Disallow: /complete
Disallow: /sponsoredlinks
Disallow: /videosearch?
Disallow: /videopreview?
Disallow: /videoprograminfo?
Disallow: /maps?
Disallow: /translate?
Disallow: /ie?
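
For anyone who wants to see what those rules mean in practice, here is a short Python sketch using the standard library's `urllib.robotparser`. It parses only a few lines excerpted from the list above, and the `ExampleBot` agent name and the `allowed` helper are made up for the illustration.

```python
from urllib.robotparser import RobotFileParser

# A short excerpt of the rules quoted above; lines ending in "?" are left out
# to keep the sketch simple.
EXCERPT = """\
User-agent: *
Allow: /searchhistory/
Disallow: /search
Disallow: /groups
Disallow: /images
Disallow: /news
"""

def allowed(path, agent="ExampleBot"):
    """Return True if the excerpt lets the (hypothetical) agent fetch path."""
    parser = RobotFileParser()
    parser.parse(EXCERPT.splitlines())
    return parser.can_fetch(agent, path)
```

With this excerpt, a results URL such as `/search?q=scraper+site` is off limits to well-behaved crawlers, while `/searchhistory/` and paths not named in the list remain fetchable. In other words, Google asks other bots not to index its result pages.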


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
Terms of Service ¦ Privacy Policy ¦ Report Problem ¦ About
© Webmaster World 1996-2014 all rights reserved