Forum Moderators: martinibuster
It involves TWO websites (2 different domain names). One of them is an auto generated website made with several of the available "made for Adsense" page makers (this website should be 1000+ pages).
The other is a website of about 10 webpages, each page with original articles of about 200 words each. You could put some Adsense blocks on the pages with your original content for some extra cash.
Then put a link on EACH page of your original website to the index page of your auto-generated website. Pay more attention to your original 10 page website, concentrating on getting a higher page rank for that site (links, maybe even more original article pages, etc). With higher page rank, spiders will crawl it well and follow it right to your auto-generated 1000+ page Adsense website.
Does this sound like a good method? It's worked pretty well for me. I figure I'm doing work, getting visitors to Google advertisers' websites, so it's a win/win situation...beats writing 1000+ pages of original content, I just have to write 10 pages of original content. I'm in this Adsense game to make money and since this is an Adsense forum, I assume readers of this post are in the game to make money, too.
>> They do not provide enough value
>> Let's look at it from a visitor's point of view
Yeah, yeah, yeah, yeah. Keep trotting out that same old rubbish. Nobody is disputing that autogenerated pages are largely clutter, nobody is disputing that scrapers should be completely removed from SERPs. So, shouting yourself hoarse that someone should shut the stable door doesn't add anything to the discussion.
The argument isn't about whether scrapers are bad, they are. The topic of discussion here is autogenerated pages, autogenerated anything - from collecting flight price feeds and autogenerating comparisons to autogenerated log tables. The chances are that at some time in your life you used an autogenerated page and found it - horror of horrors - useful!
>> When a visitor (the logical focus of this or any kind of site effort when it comes to adsense) arrives at a site, and it offers only nonsense auto generated rubbish, it could be said that that site has crossed the line.
Translated into Spanish by a popular online translator (and back to English): "When a visitor ... arrives in a place, and in the car of tributes only engendered of the foolishness of trash, would be capable of the said that that the places there are crusaders the line".
So, translated stuff is OK? The inane mumblings of semi-literate teenager IMs are OK material for pages? Machine transcribed copies of audio speeches recorded in Speakers Corner of Hyde Park are OK? What about machine translated copies of speeches in foreign languages? I could go on. There's a lot of garbage on the net. Some people put theirs on convinced they are making a worthwhile contribution to the net, some put theirs on knowing it's complete rubbish. Strangely, the latter is sometimes more useful than the former (or rather "less worthless").
The whole net revolves around the basic principle of people being able to put up any legal material they want. It can be complete and utter rubbish but they have a right to post it online. It could be the ravings of a lunatic, the babblings of a toddler, or someone predicting the end of the universe. Or it could be autogenerated rubbish. Why is it taking the SEs so long to get to grips with excluding rubbish?
<editing spelling>
[edited by: oddsod at 6:28 pm (utc) on Sep. 17, 2005]
The topic of discussion here is autogenerated pages, autogenerated anything - from collecting flight price feeds and autogenerating comparisons to autogenerated log tables.
The topic of discussion here -- at least with those of us who are saying that the autogenerated pages in question are wrong -- is clearly pages that are being called "autogenerated" by unscrupulous individuals who wish not to divulge the true nature of their property. Make no mistake, these are scrapers.
If an autogenerated site needs help staying in the SERPS, its data has no relevant use. It's both an autogenerated and a scraper site. The OP is not organizing data in a useful way -- that has never been the topic of this thread.
The topic of discussion here is autogenerated pages, autogenerated anything - from collecting flight price feeds and autogenerating comparisons to autogenerated log tables.
I don't think that was the implication at all - my site is dynamic so technically all my pages are "autogenerated" but it's from original content.
The OP said:
One of them is an auto generated website made with several of the available "made for Adsense" page makers
So your argument in defense of "autogenerated pages" in this case doesn't hold water, as the OP specifically said he used crap site generation tools specifically designed to build sites to drain AdSense money. Now he's patting himself on the back that he's created a link farm scheme to make money by getting search engines to clutter the internet with this junk by following from a theoretically legitimate site.
LOL. It really amazes me that, contrary to all common sense, people still confuse legal activities with illegal ones just because they happen to fall foul of one particular search engine's "guidelines".
incrediBILL, my defence is not for autogenerated rubbish ... or for scrapers. I abhor both. But, to borrow a phrase, I'll defend to the death your right to put rubbish on the net.
oddsod is right, these autogenerated sites do fall under the fair use act and are not illegal.
The internet will never make everybody happy. Just as webmasters hate autogenerated sites, Christians hate porn sites and parents hate sites with violent online games for their kids.
Is the internet perfect? No, but it certainly protects our freedom of expression.
These autogenerated sites are a huge problem that Google created by letting any site publish AdSense without a review. I am confident they will correct it. After all, it is their own search engine that is being polluted by these sites.
/edit/removed something i'm not certain of/
Anyone that has been studying the WW Google threads knows full well that one of the "targets" of the last purge was auto-gen sites.
It is also predictable that auto-gen scraper sites {not the useful DATABASE DRIVEN types with original content, your type incrediBILL (there is a BIG difference)} will continue to be on TOP of the list of "content" that Google is intent on eliminating. So if you are a producer of such "content" it is only a matter of time before you are removed from Google's SERPs....
Consequently, you will be left to earn your revenue from dramatically lower visitor traffic from the other search engines, which tend to provide lower click through rates & lower earnings per click (see all the associated, pertinent "smart pricing" whining threads)... If you choose to work toward ultimate oblivion... that's your choice.
The Internet allows people to be self-destructive... what could be more democratic than that! Have at it all you Scraper-Backers...
I always chose to try to better my lot in life - but the Internet provides you with the freedom to conduct your "business" the way you want to... until "smart pricing" becomes even more sophisticated and you earn $0.00 per click because advertisers refuse to pay for clicks emanating from your poor quality sites.
Then we will have to suffer from seeing even MORE of your moaning about "earnings dropped to 1 cent a click because of smart pricing... How to get rid of smart pricing..."
Been said many times before... the way to earn decent money with Adsense is to honestly publish good quality content...
The chance for sustainable success can't be more simple than that...
...these autogenerated sites do fall under the fair use act and are not illegal.
Fair Use has to be the most abused provision to ever exist. Fair use came to exist because there was no set of rules for reproducing copyrighted work, but there are guidelines. It's not the umbrella of protection that some people hide behind.
I have a piece of original, copyrighted work that has been partially scraped from one of my sites. A scraper displays this along with other pieces from other works that make a "page." Go through the factors described in fair use.
1. Purpose of the use. My work is being displayed in an effort to exploit the holes in software (which is against the terms of use of that software, but I'm sure that doesn't matter to you either) to generate revenue. There is simply a listing of data. There is no comparison, review, criticism, or any educational use. No fair use.
2. Classification of the material. An article is a perfect example of material that should be (and usually is) copyrighted. No fair use.
3. Amount of the work taken. An article or research paper is often concise, but scrapers usually only take a couple of sentences. This could be more or less condemning depending on what they happen to take. Most likely fair.
4. Effect on value of the work. Many of us pay for our content, or earn a living off of our own work. Sometimes it can be difficult to tell the nature of the work, but I know exactly how much an article will earn me, give or take 10%. If somebody searches for my article and can only find scraper references to it, that's taking money out of my pocket. I can reasonably prove the value of the content, and the damage of the infringement. No fair use.
Taking pieces of articles on the web and displaying them to generate revenue simply is not Fair Use. Try stealing a paragraph from Time or Newsweek or the AP -- they will hunt you down. Just because Joe Webmaster does not have the resources to bring it to light does not mean that he's not the victim of a crime.
If I were to put the same content on paper and leave it on my front porch, and the NY Times came by every day to pick it up and put it in their paper, I would keep putting it out there every day. That is what they are doing. It is not our fault that Google is too stupid to tell the difference between that and a good website.
So if I told you I would give you an autogenerated website that made $1000 a day in adsense you would not take it?
If a dealer offered me dope, which I could resell for $1000 a day, I sure would not take it.
Too much danger of ruining a good, running, legal business.
A drug dealer can go to prison for a lifetime.
A scraper can be banned for a lifetime by Google.
As is typical with some of the less sophisticated members in this thread you confuse an impartial, unemotional, level-headed view of the reality as support for an odious practise. Oh, well, as I said, this internet is open to rubbish, whatever its origin ;)
ncreegan, it is indeed the case that the scrapers branch of the autogeneration business takes snippets from sites without their permission. Rant and rally against that and you know what you achieve? They'll still steal your snippets, pump them through a text mangler/translator and then publish them. The modified text now bears no resemblance to your copyrighted text. What'll you do then? Trust me, I've got a lot of money invested in content. Some like Brett, who know who I am, may have an idea of what I've got in content. I'm in the same boat as you and I'd like to see a better internet. However, when otherwise sensible webmasters seem to think this is a webmaster problem rather than an SE problem, I despair. There will always be webmasters who take shortcuts, who toy with the illegal. But it's the SEs' responsibility to keep them out of SERPs and out of their programs like Adsense.
Don't forget they decide what "made for adsense" means, and they decide the TOS; they can decide to take back their money and then pay back the advertisers. It would even be a good PR move for them, and a really bad hit for mister 50 sites a month.
Imagine you have to give back all that money... mfff, I would find it hard to sleep at night after that!
I agree, it would certainly be a good PR move to show that they are trying to police the system by publicising removal of some sites made by these programs.
I mean I'm sure they spend all the money they make right away on clothing or stuff.... they couldn't pay it back and they would really piss their pants...
(It's not that I like the idea of seeing people go bankrupt or something, but these actions should have real consequences, because it's real money in the real world...)
Maybe Google does not cancel the accounts and claim the money because they just don't care about where the clicks come from?
If they're serious about eliminating this junk from the Adsense program then they should be doing some more clawing back and some better policing, in addition to what now seems some (long overdue) effort to get these sites out of SERPs.
As is typical with some of the less sophisticated members in this thread you confuse an impartial, unemotional, level-headed view of the reality as support for an odious practise. Oh, well, as I said, this internet is open to rubbish, whatever its origin ;)
Oh please.. mincing words doesn't make you one up on others. What is wrong is wrong. You need a reality check and to cut back on that ego of yours.
So, everybody takes the auto-generated page route. Now it's not just one guy creating 50 websites a week, but we have 10,000 people doing exactly what he is doing. Next thing you know the web has 100 billion auto-generated websites. People start getting frustrated wasting their time with so much useless garbage. A new search engine comes out that ONLY includes sites with original content. People find a place that actually gives them the results they are looking for. This search engine becomes the next Google.
Conclusion, people that spend all their time creating original content now have large content filled websites listed in the new search engine that 80% of people are using. The people spending all their time creating auto-generated crap are scrambling to put together some original content of their own to catch up to those who have been spending their time creating original content instead of wasting time filling up the web with useless spam.
Simple question: if there was a search engine today that only returned results from sites with original content, who wouldn't use it? Nobody wants to spend their time sorting through spam/junk websites, NOBODY.
I have found, much to my amazement, that Google's algos are pretty darn good in detecting scraper sites and eliminating them from the SERPs. For instance, I found over 600 references to one of my domain names in Google, using a query that will return all hits including "Supplemental Results." I went through the entire list, one by one. I didn't bother visiting ones that were obviously DMOZ duplicates by the URL construct.
Most of them were scrapers and DMOZ uses, and the VAST majority of those came back as "Supplemental Results". There was one legitimate site with a legitimate backlink (very small 3 or 4 page personal site, but I happen to know the webmaster, as a member of my community) that was designated as Supplemental.
All the rest were legitimate backlinks or references, and they were not designated as "Supplemental". It was actually pretty amazing.
Also to tell me that having my sites is the same thing as being a drug dealer is wrong. Whoever said that has some very screwed up logic.
When I spoke about a WebmasterWorld member doing what the OP does I was talking about senior members that even speak at conferences. My point is that WebmasterWorld is made up of a lot of people, many of whom do things that you may not approve of. This is a place to talk about being a webmaster. Not about what is right or wrong in your opinion. We share ideas and learn from each other. You guys need to quit talking bad about your fellow members. I promise you most of the sites you don't like are made by somebody here. If they rank well, probably by a well respected member.
Adsense has no problem with these sites. It is not against the law or their tos.
Then why do they have this in their policies:
No Google ad may be placed on pages published specifically for the purpose of showing ads, whether or not the page content is relevant.
I'm pretty sure auto-generated sites fit this clause.
Explain why they don't violate that policy and you get a gold star on your forehead.
>> that Google's algos are pretty darn good in detecting scraper sites
They are getting better, I'll admit. After a lot of poncing around and years of ignoring the problem in SERPs and the problem in Adsense. Why do you think more and more people got into rubbish sites? Ans: Because Google was turning a blind eye. Now all Google needs to do in SERPs is work out the difference between worthwhile and rubbish autogens, list the former and ban the latter, and get their finger out on the issue of weeding Adsense of the undesirables.
I agree that all things in business are shades of grey/grey.
I agree that there is a huge difference between "scraper" and "autogenerated", else Amazon would be a scraper too, for example. My main site is DB-driven and thus "autogenerated" and has been since about 1998 when it got too big to manage by hand any more. It takes real effort to give away lots of original material in such a way that people can find it! B^>
I do not agree with the assertion that G is a scraper, though I see where you are coming from. G adds value by the comprehensiveness of its coverage and services, its immense effort to filter and present in an appropriate manner, and so on.
Rgds
Damon
This thread is proving useful in making the distinctions between all the various practices clearer, to me at least.
There also seems to be, mostly, agreement that Google is not a scraper, but a different type of autogen site.
What, from a definition point of view, is the essence of the difference between an SERP and a scraper? Is it "permission" - by submitting a site to a search engine, one effectively gives licence to use the site text, and that licence can also be withdrawn (eg: through commands in the robots.txt, using sitemaps, or in the html itself). With scrapers, no permission has been given, nor can use of the text be withdrawn.
If I'm correct about that distinction, then Scraper sites infringe the Adsense terms because they do not have ownership, or licence to use, the text copyright.
There was a discussion about copyright and adsense terms a year ago, in this thread: [webmasterworld.com...] - what that thread leaves unanswered, in my mind, is the boundary between "fair use" and copyright infringement. Eg: if I write a blog page, quote a snippet from a news site (that provides RSS feeds) and then link to the news site, is that fair use, or is it copyright infringement and therefore a breach of the Adsense terms?
This is a place to talk about being a webmaster. Not about what is right or wrong in your opinion.
so we are not allowed to talk about what's good or evil in this case? in the adsense game, this discussion is absolutely necessary. and i'm happy to notice that there is still an overwhelming majority of webmasters here with common sense.
You guys need to quit talking bad about your fellow members.
i don't care if these members have a few thousand posts or speak at conferences. if they lost their professional ethics on their way up, no respect to them.
>> Is it "permission" - by submitting a site to a search engine, one effectively gives licence to use the site text
What if you don't submit your site? Can permission be assumed if you don't block a specific bot? If so, what's to prevent scrapers using that argument to justify what they do?
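For anyone wondering what the robots.txt side of that question looks like in practice, here is a rough sketch in Python using the standard library's `urllib.robotparser`. The robots.txt content, the bot names (`ExampleScraperBot`, `SomeOtherBot`) and the URL are all made up for illustration. It only shows how the exclusion convention treats a bot you have not named; it says nothing about whether that counts as permission in any legal sense.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named bot is blocked, nothing is said about any other.
ROBOTS_TXT = """\
User-agent: ExampleScraperBot
Disallow: /
"""


def allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt does not forbid user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/articles/widgets.html"  # made-up URL
    for bot in ("ExampleScraperBot", "SomeOtherBot"):
        print(bot, "allowed" if allowed(bot, page) else "blocked")
```

The named bot is refused and the unnamed one is let through, because a bot that matches no rule is not excluded by default. A well-behaved crawler honours this check voluntarily; a scraper can simply ignore the file.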
Often on WW the comparison between scrapers and Google is made because of the similarity in the way data is collected and the contents and layout of the pages both present to end users. For the first time I'll propose a different reason why SEs are like scrapers. Not all will agree with me but those who've been in this business for a while may see my point.
Are you familiar with the rel=nofollow issue? It is now your responsibility to use that disclaimer if the site you're linking to is not a site you would like an algo to associate you with. To take it to an extreme - your criticism of the DailyNewsArticleOnGeorgeBush can't just be a normal link, it's got to be a rel=nofollow for the SEs' benefit... to help them see better.
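In case the markup side of that isn't obvious, a minimal sketch in Python of a link helper that adds the attribute only when you don't want to vouch for the target. The function name `link`, its `endorse` flag and the URL are my own inventions for illustration, not anything a search engine specifies.

```python
from html import escape


def link(url: str, text: str, endorse: bool = True) -> str:
    """Build an anchor tag; add rel="nofollow" when the link is not an endorsement."""
    rel = "" if endorse else ' rel="nofollow"'
    return f'<a href="{escape(url, quote=True)}"{rel}>{escape(text)}</a>'


if __name__ == "__main__":
    # A link you criticise rather than recommend (made-up URL).
    print(link("https://example.com/daily-news-article", "this article", endorse=False))
```

The point of a helper like this is that the decision has to be made per link, by the webmaster, which is exactly the extra burden being complained about.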
Are you familiar with autolink? If you're discussing a book and have an ISBN number, Google can make that into a link and take your visitor away to a bookseller of their choice without any compensation to you. Google may send you zero visitors but take all your one million visitors away for free. Your get out is this: Google won't change a link if you already have your ISBNs linked somewhere. So, it becomes your responsibility to make all your ISBN numbers into links ...or risk Google's toolbar stealing your traffic. Tomorrow they could extend the autolinks to words other than ISBNs and you'll have to jump through whatever hoops they place (if you want to protect your traffic from being stolen).
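Taking that description of autolink at face value, the "link your own ISBNs first" chore could be scripted roughly like this in Python. The bookseller URL pattern and the function name `link_isbns` are hypothetical, the regular expression is deliberately loose (no checksum validation), and it assumes plain text that doesn't already contain links.

```python
import re

# Hypothetical bookseller URL pattern; swap in whatever destination you prefer.
BOOKSELLER_URL = "https://books.example.com/isbn/{isbn}"

# Loose pattern: "ISBN", then 10 to 13 digits with optional hyphens or spaces,
# where the last character may be X (ISBN-10 check digit). No checksum validation.
ISBN_RE = re.compile(r"\bISBN(?:-1[03])?:?\s*((?:\d[\s-]?){9,12}[\dXx])\b")


def link_isbns(text: str) -> str:
    """Wrap each ISBN mention in plain text with a link of the webmaster's choosing."""

    def repl(match: re.Match) -> str:
        digits = re.sub(r"[\s-]", "", match.group(1))
        if len(digits) not in (10, 13):
            return match.group(0)  # not a plausible ISBN length, leave it alone
        url = BOOKSELLER_URL.format(isbn=digits)
        return f'<a href="{url}">{match.group(0)}</a>'

    return ISBN_RE.sub(repl, text)


if __name__ == "__main__":
    print(link_isbns("See ISBN 0-306-40615-2 for details."))
```

Running it over existing HTML would need a real parser so that already-linked ISBNs are skipped; this sketch doesn't attempt that.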
There are other examples.
Gone are the days when SEs just did SERPS. They're getting into everything now and they're getting bolder. To protect our interests we, as webmasters, are having to run harder, be more informed about issues, and keep taking steps to prevent others taking traffic and money away from us, even SEs. If you don't the SEs will help themselves - in a variety of innovative but "legal" ways - kinda like what scrapers do at the moment. What the scrapers do isn't nice, isn't pretty, isn't what you'd call ethical, and it's borderline illegal. Just like autolink.
Yes, the other similarity with scrapers is that they are willing to get down in the mud and play dirty if they earn a few bob out of it. Make no mistake about that. Exercise the caution now, later it may be too late.
You guys need to quit talking bad about your fellow members.
i don't care if these members have a few thousand posts or speak at conferences. if they lost their professional ethics on their way up, no respect to them.
But, given the relative silence on the issue, it would appear that most of the spammers here already know that. I doubt that they're the tortured souls, cringing from our slings and arrows, as ogilvie's comment would suggest. Laughing all the way to the bank, perhaps....