Forum Moderators: incrediBILL & martinibuster

If you can't beat them...

join them

   
12:32 pm on Aug 10, 2005 (gmt 0)



I am thinking of putting together a few scraper sites to see how they perform against my content sites.

Can anyone give me an idea how you create scraper sites? What tools are used to get the content?

12:37 pm on Aug 10, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You're asking for trouble posting this in the AdSense forum. ;)

Good luck with the scraper sites. But if you need help making one then you've missed the boat. Basic scrapers are getting picked up by the algo quite easily. You're going to have to build a "next generation" scraper and use the newer types of machine-generated content. Seeding from DMOZ or Wikipedia is very last year. This is not easy if you haven't already dabbled in scrapers.

I'm getting tempted myself :) if only I can find the time.

1:56 pm on Aug 10, 2005 (gmt 0)



LOL... lotta little sneaky eyes will be watching this one.

2:03 pm on Aug 10, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Sure, but the rewards seem astronomical. I've built content sites, I've bought content sites, I've spent lots of money on hiring writers/commissioning content. And my sites do well.

But the scraper boys can throw up a site in a few hours that makes more in a month than each individual content site of mine makes all year. If I do throw a scraper together and it makes tons, I hereby promise that every cent will go to a major UK charity. I'm serious. It's a challenge and may take me on an interesting journey.

2:04 pm on Aug 10, 2005 (gmt 0)

10+ Year Member



Hey,

Yup, you're asking for trouble when you want to earn more income :)

Come on, just post 100% unique content, and the money will come! Err, sometime :)

C.K.

2:39 pm on Aug 10, 2005 (gmt 0)

10+ Year Member



I would stay away from scrapers... IMO they clearly go against the "made for AdSense" clause in the TOS. If Google decides to go after these sites, can you imagine if your entire AdSense account was revoked?

Another thing to be concerned about: don't forget that advertisers can now turn off publishers that they don't want their ads appearing on. Your scrapers will start to show up on lots of these lists. Add the possibility that Google will adjust your account-wide smart pricing (there have been several threads on this), and, aside from all the ethical issues that future posters to this thread will mention, this seems too risky.

2:57 pm on Aug 10, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>> if your entire adsense account was revoked

Create a new account. In fact, start a company just for this and open a bank account in the company name. To be on the safe side, buy a new PC, get a second ISP (IP) for this PC's internet connection, and use a web hosting company you've never used before (and a dedicated IP). Be paranoid about what you do on that PC/internet connection/host. Don't download any toolbars. Never use Google (because of the 38-year cookie). Don't visit any of your other sites.

>> don't forget that advertisers can now turn off publishers
There are millions of advertisers. And millions of scrapers. I wouldn't worry about this making a big dent.

Is this worth all the trouble? I don't know. Is $5,000 a day attractive? Run it for two months ($300K), move to a new domain and start again.

3:06 pm on Aug 10, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Oddsod, you've got me convinced. This white hat stuff is just a marketing ploy anyway! ;)

3:12 pm on Aug 10, 2005 (gmt 0)

10+ Year Member



>> I hereby promise that every cent will go to a major UK charity

>> Is $5,000 a day attractive? Run it for two months ($300K), move to a new domain and start again.

Where do I register to become a UK charity? :)

3:17 pm on Aug 10, 2005 (gmt 0)

WebmasterWorld Senior Member jomaxx is a WebmasterWorld Top Contributor of All Time 10+ Year Member



I think you are wildly exaggerating how much a scraper site will make. There are scads of them out there, but the only time I ever see them in Google's results is when I am doing an extremely specific search (combining 2 or even 3 exact-string matches) or searching for a unique phrase from my own website.

3:19 pm on Aug 10, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Novice, you missed the "major" part. But, seriously, the only reason more people don't do it is because they don't know how to. Yeah, yeah and some have principles.

jomaxx, not all make that kind of money but I've seen the stats of some that do (they tend to put the site up for sale after they've been sussed ;))

3:36 pm on Aug 10, 2005 (gmt 0)

WebmasterWorld Senior Member digitalghost is a WebmasterWorld Top Contributor of All Time 10+ Year Member



The short scoop:

1. You need a way to gather content. There are a few choices. One nifty piece of coding allows you to choose your keyphrases, then specify the number of words to collect that come before and after your phrases. Paste in a KW list, set the word counts and you're off.

2. Rewrite the content. Again, search the web a bit, and you'll find a few options. Or you just translate the content into another language, then back into the language of your choice. Have to defeat those dupe filters, eh? If you like, you can opt for the multi-translating method: translate the content into, say, French, then Spanish, then back to the original language. Use different translators. Change the order of the paragraphs.

3. Order the content. Helps to have a CMS you can hack because you need to turn that text into HTML and dynamically create all those new pages and the navigation. Look for dynamic site creation tools, fake blogs, etc. They're out there.

4. Add your AdSense code.

Done.

Of course, some of those systems leave footprints, so you'll want to edit those from the final code. Add RSS feeds that are relevant, set up a few fake blogs that point to your new creation, etc.

Now you have a collection of semi-literate text that is totally worthless to users, but appears to be perfect spider food. Currently, scrapers work quite well. 2 months from now? Who can say.

Or you can spend your time working on something you can be proud of. Cash versus warm fuzzies. If you make the correct choice, you can have both. I'm willing to bet that the search engine engineers are working fast and furious to develop ways to hinder or eliminate scraper sites. Which means that if you opt for the fast cash, you're constantly at war. Combat can make you weary. Enough cash can make you wealthy.

Don't sticky me for tool links. They're out there, they can be found, and if you can't find them, you probably don't have the skills to use them. ;) And personally, I hate scrapers. They clutter the SERPs with useless garbage and annoy me no end.

3:36 pm on Aug 10, 2005 (gmt 0)

10+ Year Member



I would imagine that scraper sites have a very high CTR, because people will generally want to leave the rubbish site they've just arrived at ASAP, but I wonder whether they convert well?

5:06 pm on Aug 10, 2005 (gmt 0)



>> ...but I wonder whether they convert well?

I would think yes. Visitors were probably searching for specific content, wound up fooled on a scraper site, and now look for a way out. They're still looking for specific content, only now they're looking in another direction.

5:42 pm on Aug 10, 2005 (gmt 0)

5+ Year Member



>> I wonder whether they convert well?

According to many AdWords advertisers, they tend to convert better in some cases.

The AdWords advertisers are the whole reason that scrapers can continue to exist. If the advertisers started to throw a major fit about the scraper sites because they never converted, I'd imagine Google would start revoking publishers' AdSense accounts in a hurry. Start killing scraper publishers en masse and I'm willing to bet many, many other scrapers would simply disappear.

This method requires no algo changes, etc. The Google quality team merely has to report the scraper to the AdSense team, they whack the account, and BAM, now Mr. Scraper has 40 or so websites that are doing nothing more than costing him money, all from one of them being found.

6:44 pm on Aug 10, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>> The Google quality team merely has to report the scraper to the AdSense team, they whack the account, and BAM

You hit the nail on the head. If they're unwilling to take this simple step to rid the world of (most) scrapers then I say: go ahead and join the party. Is it p*ssing you off that scrapers are taking money that could otherwise go to content sites? Take that money back by starting your own scraper. Not happy that some of your AdWords money is going to scrapers (you can't block them all!)? Get some of your money back by running your own scraper.

When there are enough scrapers, and they get more and more difficult for an algo to detect (and take too much computing power), maybe they'll start closing the AdSense accounts of scrapers. Which is something they should have done when the problem first started.

10:11 pm on Aug 10, 2005 (gmt 0)

5+ Year Member



I still don't understand how these sites get rated well in the SERPs. Who would link to them?

10:33 pm on Aug 10, 2005 (gmt 0)

10+ Year Member



>> I still don't understand how these sites get rated well in the SERPs. Who would link to them?

Themselves...and link sellers?

11:34 pm on Aug 10, 2005 (gmt 0)

10+ Year Member



>> I still don't understand how these sites get rated well in the SERPs.

It's been a long time since I've seen any in the Google SERPs, both as a user and for the keywords I try to rank for. Google really did an admirable job in the last updates.

I've played with the idea of building scraper sites, as it's really not that hard and the AdSense department does not seem to have problems with them, but I'm now convinced building good quality sites makes more for the same effort :)