Forum Moderators: martinibuster
joined:May 1, 2004
Can anyone give me an idea how you create scraper sites? What tools are used to get the content?
Good luck with the scraper sites. But if you need help making one then you've missed the boat. Basic scrapers are getting picked up by the algo quite easily. You're going to have to build a "next generation" scraper and use the newer types of machine-generated content. Seeding from DMOZ or Wikipedia is very last year. This is not easy if you haven't already dabbled in scrapers.
I'm getting tempted myself :) if only I can find the time.
joined:Aug 12, 2004
But the scraper boys can throw up a site in a few hours that makes more in a month than each individual content site of mine makes all year. If I do throw a scraper together and it makes tons I hereby promise that every cent will go to a major UK charity. I'm serious. It's a challenge and may take me on an interesting journey.
Another thing to be concerned about: don't forget that advertisers can now turn off publishers that they don't want their ads appearing on. Your scrapers will start to show up on lots of these lists. Add the possibility that Google will adjust your account-wide smart pricing (there have been several threads on this) and, aside from all the ethical issues that future posters to this thread will mention, this seems too risky.
Create a new account. In fact, start a company just for this and open a bank account in the company name. To be on the safe side buy a new PC, get a second ISP (IP) for this PC's internet connection, and use a web hosting company you've never used before (and a dedicated IP). Be paranoid about what you do on that PC/internet connection/host. Don't download any toolbars. Never use Google (because of the 38 year cookie). Don't visit any of your other sites.
>> don't forget that advertisers can now turn off publishers
There are millions of advertisers. And millions of scrapers. I wouldn't worry about this making a big dent.
Is this worth all the trouble? I don't know. Is $5,000 a day attractive? Run it for two months ($300K), move to a new domain and start again.
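A quick check of that arithmetic, as a minimal sketch. It assumes a "month" is 30 days and that the $5,000 a day holds flat for the whole run, neither of which the post guarantees; the function name is made up for illustration.

```python
def projected_revenue(daily_usd: int, days: int) -> int:
    """Total revenue if the daily figure stays constant (an assumption)."""
    return daily_usd * days

# Two 30-day months at the quoted $5,000 a day.
total = projected_revenue(5_000, 2 * 30)
print(f"${total:,}")
```

That matches the $300K figure quoted above, before hosting, domain, and any other costs.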
jomaxx, not all make that kind of money but I've seen the stats of some that do (they tend to put the site up for sale after they've been sussed ;))
1. You need a way to gather content. There are a few choices. One nifty piece of coding allows you to choose your keyphrases, then specify the number of words to collect that come before and after your phrases. Paste in a KW list, set the word counts and you're off.
2. Rewrite the content. Again, search the web a bit, and you'll find a few options. Or you just translate the content into another language, then back into the language of your choice. Have to defeat those dupe filters eh? If you like, you can opt for the multi-translating method, translate the content into say, French, then Spanish, then back to the original language. Use different translators. Change the order of the paragraphs.
3. Order the content. Helps to have a CMS you can hack because you need to turn that text into html and dynamically create all those new pages and the navigation. Look for dynamic site creation tools, fake blogs, etc. They're out there.
4. Add your Adsense code.
Of course, some of those systems leave footprints, so you'll want to edit those from the final code. Add RSS feeds that are relevant, set up a few fake blogs that point to your new creation, etc.
Now you have a collection of semi-literate text that is totally worthless to users, but appears to be perfect spider food. Currently, scrapers work quite well. 2 months from now? Who can say.
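For anyone wondering what the "dupe filters" mentioned in step 2 might be doing in principle, here is a minimal sketch. It assumes the filter compares overlapping three-word "shingles" with Jaccard similarity; nothing in this thread says which method any search engine actually uses, and the function names are made up for illustration.

```python
def shingles(text: str, k: int = 3) -> set:
    """Return the set of overlapping k-word sequences in the text."""
    words = text.lower().split()
    if len(words) < k:
        # Text shorter than k words: treat it as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a: str, b: str) -> float:
    """Share of shingles the two texts have in common (0.0 to 1.0)."""
    sa, sb = shingles(a), shingles(b)
    if not sa and not sb:
        return 1.0  # two empty texts count as identical
    return len(sa & sb) / len(sa | sb)

original = "the quick brown fox jumps"
lightly_edited = "the quick brown fox sleeps"
print(jaccard(original, lightly_edited))
```

A page scoring close to 1.0 against an already indexed page would be a candidate for filtering. The threshold is a tuning choice, not something stated in this thread.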
Or you can spend your time working on something you can be proud of. Cash versus warm fuzzies. If you make the correct choice, you can have both. I'm willing to bet that the search engine engineers are working fast and furious to develop ways to hinder or eliminate scraper sites. Which means that if you opt for the fast cash, you're constantly at war. Combat can make you weary. Enough cash can make you wealthy.
Don't sticky me for tool links, they're out there, they can be found and if you can't find them, you probably don't have the skills to use them. ;) And personally, I hate scrapers. They clutter the SERPs with useless garbage and annoy me no end.
joined:Aug 12, 2004
...but I wonder whether they convert well?
I would think yes. Visitors were searching for specific content, probably, wound up fooled on a scraper site and now look for a way out. They're still looking for specific content, only now they're looking in another direction.
I wonder whether they convert well?
According to many AdWords Advertisers, they tend to convert better in some cases.
The AdWords advertisers are the whole reason that scrapers can continue to exist. If the advertisers started to throw a major fit about the scraper sites because they never converted, I'd imagine Google would start revoking publishers' AdSense accounts in a hurry. Start killing scraper publishers en masse and I'm willing to bet many, many other scrapers would simply disappear.
This method requires no algo changes, etc. The Google quality team merely has to report the scraper to the AdSense team, they whack the account, and BAM now Mr. Scraper has 40 or so websites that are doing nothing more than costing him money all from one of them being found.
You hit the nail on the head. If they're unwilling to take this simple step to rid the world of (most) scrapers then I say: go ahead and join the party. Is it p*ssing you off that scrapers are taking money that could otherwise go to content sites? Take that money back by starting your own scraper. Not happy that some of your adwords money is going to scrapers (you can't block them all!)? Get some of your money back by running your own scraper.
When there are enough scrapers and the game is getting more and more difficult for an algo to detect (and taking too much computing power) maybe they'll start closing Adsense accounts of scrapers. Which is something they should have done when the problem first started.
I still don't understand how these sites manage to rank well in the SERPs.
It's been a long time since I've seen any in the Google SERPS, both as a user and in the keywords I try to rank on. Google really did an admirable job in the last updates.
I've played with the idea of building scraper sites as it's really not that hard and the Adsense department does not seem to have problems with them, but I'm now convinced building good quality sites makes more with the same effort :)