Forum Moderators: Robert Charlton & goodroi
If we are doing something wrong we would obviously appreciate any clue as to what that thing is. How big a deal could it be to tell us?
It gets worse. Fixing the problem and requesting reinclusion does not result in immediate reinstatement. I understand that reinstatement occurs only after the expiration of some "punishment period". The length of the punishment period is a secret and apparently varies depending on the alleged "offense". A "fix something and see if that worked" approach could therefore take years, so legitimate web site owners are forced into a "shotgun" approach in which they make multiple, expensive changes in the often futile hope that one of them will eventually get them reinstated.
Spammers have none of these problems. Domain names are cheap. A spammer can serve up the same (or nearly the same) information under 20 different domain names and get 20 times the exposure of a legitimate site. If Google eventually finds and bans some of these domains, more are easily added. Spammers don't have to worry about registered trademarks, brand recognition, business cards, print ads, etc. Google policy harasses legitimate site owners and has no effect on hard core spammers who are laughing all the way to the bank. Google should disclose specific reasons why a site has been delisted and discontinue the childish punishment period. Google's policy of treating legitimate web site owners as the enemy is creating spammers, not helping with the spam problem.
I have noticed that the quality of Google searches has been declining and there are ever more garbage sites popping up in high ranking results. Google seems to be gradually losing the spam war. I suspect they are putting most of their effort into diversifying into email, maps, video, etc.
People use Google because they DON'T want their data filtered and are willing to put up with some level of garbage to get it. It is dishonest for Google to pretend that they have no editorial filter and that their search engine is a mechanical, unbiased device if it is not.
Every search algorithm is based on certain assumptions, parameters, and objectives, because users aren't looking for a raw data dump of pages that include a certain keyword or keyphrase. Users want search results that are arranged in the order of relevancy or quality.
What's more, a search engine may choose to filter pages, sites, or specific SEO techniques (such as artificial linking patterns) that appear to be gaming its algorithm. Why? To increase the likelihood of delivering relevant search results.
This doesn't mean the search engine is "biased" or that it's using what you call an "editorial filter." And if you aren't doing well in Google, it doesn't mean Google is out to get you.
All Google needs to do is introduce some kind of 'premium membership' in return for a fee.
So the spammer that made a million with his phishing scams can afford to get a top listing, while content-rich non-commercial academic sites (like mine) will disappear from the first dozen or so pages of results. This hardly seems a useful result -- for the good sites, anyway.
Tiebreaker said:
[Spammers'] sites would not stand up to a manual review.
Tiebreaker said:
...premium members would have their sites added to the index immediately.
Eliz.
My personal favourite from BT and on-topic for this thread.
Msg #1: I thought you knew [webmasterworld.com]
As of a few days ago, your directory became obsolete.
Honestly, I'd rather use [local.google.com...] . This way I avoid wasting bandwidth.
If you feel there are directories that shouldn't be in Google's results, just fill in a few of the forms available at [google.com...] .
In my opinion, most of Google's spam problems are caused by their obsession with automating everything in their algo - if you do that, you will always get some good sites thrown out with the bad.
and...
To be fair to Google, they're facing an exponential growth in spam
Personally I think the problem has arisen from policies of Google's (and others') own making, and I am sure they realise that. I think AdSense/AdWords took off more than even Google expected, and now the "fix" leaves them in a catch-22 situation. It probably looked great on paper and to investors and stock markets, but will it eventually prove to be the straw that breaks the camel's back?
AdSense is basically spam repackaged. It's a program to display ads to search engine surfers via the search engines themselves and the websites in the SERPS, plain and simple. Easy to see the problem. It would seem the only possible way towards spam-free SERPS is therefore to remove the spam, i.e. to remove/delist/relegate largely AdSense-dominated pages. Simple, but not something that will ever happen. It's gone way too far for that.
Sure, they are trying to fix it, hence threads like this that appear to also penalise real players. And like it says above, automation can only achieve so much.
Solution? Only one that i can see. Lobby ICANN to make all domains too expensive for your average spammer. It'll happen and it's the only solution if Google want to retain relevant SERPS and thus, market dominance IMHO. Plus it helps ICANN to work towards resolving other issues regarding brand rights and ownership.
>>Solution? Only one that i can see. Lobby ICANN to make all domains too expensive for your average spammer. It'll happen and it's the only solution if Google want to retain relevant SERPS and thus, market dominance IMHO. Plus it helps ICANN to work towards resolving other issues regarding brand rights and ownership.<<
But what you mention would be a self-defeating obstruction for ICANN:
"As a private-public partnership, ICANN is dedicated to preserving the operational stability of the Internet; to promoting competition; to achieving broad representation of global Internet communities; and to developing policy appropriate to its mission through bottom-up, consensus-based processes."
Furthermore making all domains too expensive will serve spammers who can afford it, while for example the average publisher and students will be penalized.
[edited by: reseller at 9:20 pm (utc) on Oct. 8, 2005]
AdSense is basically spam repackaged.
AdSense is a perfectly legitimate form of advertising. The problem isn't with AdSense per se, but with who's using it, how they're using it, and how Google enforces its own rules against "made for AdSense" sites.
Easy to see the problem. It would seem the only possible way towards spam-free SERPS is therefore to remove the spam, i.e. to remove/delist/relegate largely AdSense-dominated pages.
Sure, but why stop there? Why not remove all pages with affiliate or e-commerce links? That would get rid of even more spam. Better yet, why not purge all pages that contain any form of advertising? And every corporate domain, too? Restore the Web to what it was before 1995, and there won't be any spam problems at all.
Furthermore making all domains too expensive will serve spammers who can afford it, while for example the average publisher and students will be penalized.
True. But two points: a) it's not about totally ending spam, it's about reducing it and returning to sensible SERPS and a search engine/directory that gives users a search experience that meets their expectations; and b) said students and publishers would have to think more about content rather than duplicating sites. If domains were (for example) $500 a pop, they would still potentially work but might require more thought and demand more dedication.
And a spammer who can "afford it" at the moment can afford to have 100 domains all saying the same thing. This should at least reduce the number of domains in that portfolio. For example, Ultsearch have a ton of domains, but how many of them would fail to justify a (for example) $500 threshold and thus become unviable?
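To make that threshold idea concrete, here's a rough sketch in Python. All the figures (the $8 and $500 fees, the per-domain earnings, the domain names) are invented for illustration; they are not real data about Ultsearch, any registrar, or any actual portfolio.

# Rough sketch: which domains in a spam portfolio stay viable as the fee rises?
def viable_domains(portfolio, annual_fee):
    """Return the domains whose estimated yearly earnings at least cover the fee."""
    return [name for name, yearly_earnings in portfolio.items()
            if yearly_earnings >= annual_fee]

# A toy portfolio: 95 near-duplicate domains earning a trickle, 5 genuinely strong sites.
portfolio = {f"cheap-widgets-{i}.example": 40 for i in range(95)}
portfolio.update({f"strong-site-{i}.example": 2000 for i in range(5)})

for fee in (8, 500):
    survivors = viable_domains(portfolio, fee)
    print(f"At ${fee}/year, {len(survivors)} of {len(portfolio)} domains remain viable")

At a few dollars a year every near-duplicate pays for itself; at $500 only the handful of strong sites survive, which is the whole argument in miniature.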
AdSense is a perfectly legitimate form of advertising. The problem isn't with AdSense per se, but with who's using it, how they're using it, and how Google enforces its own rules against "made for AdSense" sites.
True.
Sure, but why stop there? Why not remove all pages with affiliate or e-commerce links? That would get rid of even more spam. Better yet, why not purge all pages that contain any form of advertising? And every corporate domain, too? Restore the Web to what it was before 1995, and there won't be any spam problems at all.
Sarcasm duly noted - lol ;) As per above, it's not about reducing spam by eliminating *all* forms of advertising, but about presenting a search user with results that are valuable. Not many care too much if a site contains advertising as long as a) it's not overly intrusive and b) the info around it is useful and/or what you wanted, and it's not page after page of the same old same old in the SERPS.
As per above, it's not about reducing spam by eliminating *all* forms of advertising, but about presenting a search user with results that are valuable. Not many care too much if a site contains advertising as long as a) it's not overly intrusive and b) the info around it is useful and/or what you wanted, and it's not page after page of the same old same old in the SERPS.
All the more reason for not filtering sites with AdSense, because AdSense is one of the few ways that quality mom-and-pop publishers, nonprofit sites, etc. can monetize their content or earn back their hosting costs.
All the more reason for not filtering sites with AdSense, because AdSense is one of the few ways that quality mom-and-pop publishers, nonprofit sites, etc. can monetize their content or earn back their hosting costs.
I'm all for equality, don't get me wrong. But what you're basically saying with "not filtering sites with AdSense" is: let everyone do it, in any way they like, and make it accessible to all.
What I'm saying is, sure, give everyone the option, but price it to avoid one million people registering 10 domains each, filling them with AdSense and hoping they "earn back their hosting costs". Price it to encourage people to work harder and give more thought to fewer sites.
After all, marketing one site against one million is, in theory, no harder than marketing ten sites against ten million, and it makes finding what you want on the web an easier task in the process.
It'll reduce squatting, free up domains for stronger branding opportunities and give the same "mom-and-pop" more choice on their domain name.
Kaled - I'm sure Google could employ 1,000 people in India tomorrow, no problem - come to that, 5,000 or 10,000 if necessary. If sites remained banned following a manual review, it would be for a good reason - they are junk. Google can hand-review my site any time they like.
>>> That might be a fine idea if Google were only a commercial index. But I don't think Google is ready to favor e-commerce and affiliate sites over academic, reference, hobby, and other information sites that wouldn't be able to justify paying extra for Business Class perks.
EFV - that's a fair point - but there's a simple answer to this I think - Google could price it so that they could afford to offer the service for free to non profit sites with genuine quality content.
Effectively, the money guys would be subsidising the college guys - but I wouldn't have a problem with that.
I would pay just about any fee Google cared to name if it got my site out of the sandbox!
>>>>> So the spammer that made a million with his phishing scams can afford to get a top listing, while content-rich non-commercial academic sites (like mine) will disappear from the first dozen or so pages of results. This hardly seems a useful result -- for the good sites, anyway.
Spammers won't get their sites listed no matter how much money they have made - their sites will be junk - non commercial, academic sites will get listed for free, subsidised by the money sites.
>>>>>>...premium members would have their sites added to the index immediately.
So then Google would not be doing a manual check...? I'm sorry, but I don't see how letting people pay for immediate listing and top ranking is going to help retard the growth of spam-laden results.
When I say added immediately, I mean immediately after a manual check - assuming it passes of course!
Also, paying a fee would not be a way of guaranteeing top ranking - just a way of guaranteeing inclusion in the index.
>>>> Assuming such a time-intensive manual review process were even feasible, why would the spammer not get himself listed with a sample of "good" content, and then, upon listing, quickly replace this with phishing scams and viruses? "Bait and switch" is hardly unknown amongst that crowd.
This is a fair point - bait and switch would be a problem that would need to be addressed - but I would suggest that a team of about 100 people working solely on checking sites that have previously been passed as clean could probably keep this issue under control - wages are cheap in India.
But even if the bait and switch issue couldn't be fixed totally, this system would produce vastly improved SERPS - don't reject an idea because it's only 99% perfect, when the current system is only 50% perfect!
I'm all for equality, don't get me wrong. But what you're basically saying with "not filtering sites with AdSense" is: let everyone do it, in any way they like, and make it accessible to all.
I'm not saying any such thing. What I'm saying is the equivalent of "Don't throw all Palestinians in jail because some Palestinians are suicide bombers."
By the way, in my sector (a huge sector, travel), made-for-AdSense sites don't produce nearly as much spam as affiliate and booking sites do. So filtering sites with AdSense wouldn't clean up the spam problem. Google would need to filter sites with affiliate or e-commerce links as well.
What Google search could do (and should do) is be more aggressive in filtering sites that misuse AdSense. For example, the AdSense TOS allow publishers to have multiple ad units, and those ad units can all be displayed "above the fold" in a way that makes them look like directory content. This is a typical ploy of scraper sites, and it shouldn't be that hard to identify and whack pages that use AdSense ad units in such a way. Similarly, automatically generated "user review" sites that display several ad units but have no content (except a line that says "Add a review" or "No review for this product") could and should be filtered by Google search.
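For what it's worth, the kind of check described above doesn't need anything exotic. Here is a minimal sketch in Python (standard library only); the ad-script pattern, the "three ad units" and "500 characters" cutoffs, and the boilerplate phrases are my own guesses for illustration, not Google's actual rules.

import re

AD_SCRIPT = re.compile(r"googlesyndication\.com", re.IGNORECASE)
TAGS = re.compile(r"<[^>]+>")
BOILERPLATE = ("add a review", "no review for this product")

def looks_like_made_for_adsense(html: str) -> bool:
    """Flag pages that stack several ad units but carry almost no visible text."""
    ad_units = len(AD_SCRIPT.findall(html))
    # Crude text extraction: strip tags and collapse whitespace.
    # A real implementation would also drop script/style contents.
    visible_text = re.sub(r"\s+", " ", TAGS.sub(" ", html)).strip().lower()
    thin = len(visible_text) < 500 or any(p in visible_text for p in BOILERPLATE)
    return ad_units >= 3 and thin

sample = ("<html><body>"
          + 3 * '<script src="https://pagead2.googlesyndication.com/x.js"></script>'
          + "<p>No review for this product. Add a review.</p></body></html>")
print(looks_like_made_for_adsense(sample))  # prints True for this toy page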
Of course, I generally agree. But we are not talking about the algorithm. We are talking about admitted manual blacklisting of individual sites by domain name. Blacklisted sites are excluded from even being indexed and may not even be subsequently crawled. The blacklisting criteria are secret, which is an engraved invitation for abuse.
I also agree that actual abuse-prevention criteria that are applied equally to all sites are not "bias". Blacklisting some sites but not others for a particular characteristic is definitely "bias". Blacklisting directories that are not on a secret "good directories" list would be an example of bias.
Search engines are known to rank sites partly based on site (domain) traffic as might be measured by Nielsen or Mediametrix. I don't consider that bias.
However, "quality" leads to very subjective criteria. What happens if sites about Democrats are by definition "low quality" according to the viewpoint of some Google employee? Would you consider that bias? Would we ever know?
Google does not admit to blacklisting sites based on "quality" but pretty much insists that they are blacklisting only for "abuse".
The problem with a premium membership system is that Google would not be able to keep up with demand. In addition, those whose sites remained banned would be very unhappy. Nevertheless, there is merit and logic in this idea and I would not be surprised if Google implemented something along these lines.
I think they would be well able to cope because the system could be self-funding. Let's say that they charge a non-refundable $250 for a site review. How long would it take to review a site? Five minutes, ten minutes, fifteen minutes? Reviewers could pull in $1K to $2K per hour. They could afford to employ as many as they liked at this rate. They could also outsource it to countries where they could pay reviewers less than $10 per hour. A supervisor could be employed for perhaps every forty or fifty reviewers. The supervisor would be responsible for quality control.
Think how many people they could employ doing this. If premises were a problem they could use home workers. Doesn't this sound like a pretty good business model?
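Just to work the arithmetic through - the $250 fee, the 5-15 minute reviews, and the sub-$10/hour reviewer pay are the figures from the post above, not real Google numbers:

# Back-of-the-envelope economics of paid manual review, using the poster's own figures.
fee = 250   # non-refundable charge per site review, USD
wage = 10   # assumed reviewer cost per hour, USD

for minutes_per_review in (5, 10, 15):
    reviews_per_hour = 60 / minutes_per_review
    revenue_per_hour = reviews_per_hour * fee
    margin_per_hour = revenue_per_hour - wage
    print(f"{minutes_per_review:2d} min/review: "
          f"{reviews_per_hour:.0f} reviews/hr, "
          f"${revenue_per_hour:,.0f}/hr revenue, "
          f"${margin_per_hour:,.0f}/hr after reviewer pay")

Even the slowest case (15 minutes per review) works out to roughly $1,000 per reviewer-hour against about $10 of labour cost, which is the point being made.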
That might be a fine idea if Google were only a commercial index. But I don't think Google is ready to favor e-commerce and affiliate sites over academic, reference, hobby, and other information sites that wouldn't be able to justify paying extra for Business Class perks.
They wouldn't have to. They could allow non-commercial sites in for free. The ratio of commercial versus non commercial sites would ensure that this would not be a problem.
I am on my soap box again here because I firmly believe that the only way forward is through manual review. Spammers are very clever and there are millions of them around. Whatever the search engines do to combat spam the spammers will continue to find ways round it. This is a fact of life. No algorithm will ever be capable of effectively preventing this.
If Google want to be the "organisers of the world's information" then they must bow to the inevitable, i.e. manual screening. We have had a few years of this now and it is clear that the spam situation is only getting worse.
Come on Google! Do yourself and the rest of the world a favour. Kick them out. If you like I'll manage it for you.
Good to hear you are thinking exactly the same as me - introducing a manual review element into the mix is the way forward - I can't understand why Google can't see it
Nobody does the job better than humans!
I think people also vastly overestimate the amount of work that would be involved manually policing the index.
I can't remember the actual figures - but isn't 99% of email spam in the world sent out by about 100 individuals - I expect web spam in the Google index would follow a similar pattern.
Anyway - it doesn't matter how many people it takes to keep the index clean - there is no limit to the number of people that Google can employ.
Employ as many as necessary and charge us accordingly.
introducing a manual review element into the mix is the way forward - I can't understand why Google can't see it
They have introduced a manual review element:
[webmasterworld.com...]
I think people also vastly overestimate the amount of work that would be involved manually policing the index.
Certainly it wouldn't be that hard to keep tabs on the biggest and most blatant offenders, such as autogenerated "user review" sites with millions of pages that often contain no content. (I can think of a couple of brand-name technology publishers that fall into this category.) But that would be a short-term solution, because it's so easy for spammers to take a MIRV approach (MIRV = "multiple independently targetable reentry vehicles," for those who grew up after the ICBM era). In my opinion, it makes more sense for human evaluators to feed positive and negative examples to a "black box" that does the heavy lifting. See Ronburk's post #175 at:
[webmasterworld.com...]
(Mind you, that doesn't mean that penalties couldn't be applied manually when sins were brought to Google's attention or when a message needed to be sent to high-profile offenders.)
The suggestion that Google could hire unlimited numbers of evaluators doesn't sound very practical to me. The more people you have doing the evaluations, the harder it is to be consistent (DMOZ is a case in point).
They have introduced a manual review element
Not according to Google Guy who said, "The system that was up at eval.google.com was a console to evaluate quality passively, not to tweak our results actively. But when Henk van Ess submitted his own blog to Slashdot, he asserted "Real people, from all over the world, are paid to finetune the index of Google," and that made it sound like people were reaching in via this console to tweak results directly, which just isn't true at all."
It was not exactly news that they check the quality of their results, and the only way to do that is manually.
The suggestion that Google could hire unlimited numbers of evaluators doesn't sound very practical to me. The more people you have doing the evaluations, the harder it is to be consistent (DMOZ is a case in point).
EFV I think it is completely practical, for the reasons I detailed in message 45 above. Regarding DMOZ, volunteers and paid staff are two different animals so it's not fair to use them as an example. Believe me ... it's only a matter of time before the penny drops. Manual screening must happen, if not with Google then with some new kid on the block.
The system that was up at eval.google.com was a console to evaluate quality passively, not to tweak our results actively.
I think that could be interpreted as gathering positive and negative examples to "train" a black box. In any case, whether it is or isn't being used that way, it could be used that way, and it would be more practical than relying on a massive corps of quality checkers to fine-tune the results manually day in and day out.
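To illustrate what "training a black box on human-labelled examples" might look like in the simplest possible terms, here is a toy sketch. It assumes the scikit-learn Python library and four made-up example pages; it is in no way a description of Google's actual evaluation pipeline.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hand-labelled examples a human evaluator might feed the system (entirely made up).
pages = [
    "In-depth guide to choosing a digital camera, with sample photos and tests",
    "Original travel diary with photos, maps and practical booking advice",
    "cheap viagra cheap viagra buy now best price click here click here",
    "No review for this product. Add a review. Sponsored links. More offers.",
]
labels = [1, 1, 0, 0]  # 1 = quality page, 0 = spam/thin page

# The "black box": a text classifier that generalises from the raters' labels.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(pages, labels)

# Unseen pages are then scored automatically, without a human touching each one.
print(model.predict(["Independent hotel reviews written by recent guests"]))
print(model.predict(["click here best price sponsored links buy now"]))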
Google should manually review every site that displays Adsense. Other ad companies I work with require each and every domain on which ads appear to be submitted and reviewed.
This would not only undercut the vast majority of scraper spam, it would also help salvage Google's reputation as both an advertising aggregator and search engine.
Cuz, ya know, paying folks to screw up the SERPs just makes Google look cheesy and stoooopid.
Google should manually review every site that displays Adsense. Other ad companies I work with require each and every domain on which ads appear to be submitted and reviewed.
I agree 100%.
This would not only undercut the vast majority of scraper spam...
Would it? I'm not so sure. Instead of submitting a site on the life of Madame Curie, getting approval, and then slapping the AdSense code on a dozen unrelated "made for AdSense" sites, couldn't a spammer just as easily publish the made-for-AdSense pages within the approved domain? (Think life-of-madame-curie.org/viagra/ and life-of-madame-curie.org/mesothelioma/.)
I think Google is trying to give good results, but...
Collateral damage hurts the index. More specific guidelines would help (e.g. when does an affiliate booking system hotel page become a bad thing?).
I'm sick of hearing how specific guidelines give spammers what they need. Sure that's an issue, but the downside is swamped by the advantages of a more transparent system. In fact I agree with those who suggest that it's the black hat SEOs who benefit the most from the current mysterious filterings and penalties since they can spend their time figuring out the algo while the rest of us are creating content.
A tour of the forums indicates how much confusion there is by legitimate sites. Confusion hurts users by forcing webmasters to do exactly what nobody likes - try to second guess the algos rather than focus on quality content.
I think that could be interpreted as gathering positive and negative examples to "train" a black box. In any case, whether it is or isn't being used that way, it could be used that way, and it would be more practical than relying on a massive corps of quality checkers to fine-tune the results manually day in and day out.
EFV, black boxes are what they already have, and it is their inefficiencies we are discussing in this thread. In my opinion it would be neither practical nor any more effective than what they are doing now. People continue to press the point that manual checking would not be practical, but this is just defeatism.
Manual would be very easy to implement. I could do it myself. What's the problem? It really could make a massive difference to the quality of the results in a very short period of time. Perhaps the real problem is that Google just don't want to reduce the number of websites in their indices. Clearly doing so is not in their best interests because spammers contribute a massive amount to their profits so perhaps their policy really does favour spammers?
Perhaps the real problem is that Google just don't want to reduce the number of websites in their indices. Clearly doing so is not in their best interests because spammers contribute a massive amount to their profits so perhaps their policy really does favour spammers?
Questioning their motives is just trash talk; it doesn't contribute to the discussion. And expecting them to manually check millions of sites just isn't realistic: Google is a spidered search engine that was built on algorithms, not a Yahoo Directory or a DMOZ. Trying to impose a new search model on a company that is committed to scalable algorithms isn't likely to be a productive use of anyone's time. (If the leopard doesn't want to change its spots, maybe there's an opportunity for a new breed of cat!)
Manual would be very easy to implement
Right. Scalable solutions are needed and manual does not scale well, but they don't need to review every site - only those that ask/pay for it until the filtering systems are working better.
How much would people object if Google ran a registry of site owners who supply complete contact info and agree to comply with the TOS?
Most junk sites have bogus or no contact info. Failing to register would not get you banned, it would just mean you'd be subjected to a higher scrutiny level. I'm sure they do some things along these lines using WHOIS data but again transparency is preferable to secrecy.
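Sketching the registry idea in code, purely as an illustration: the record fields, the scoring weights, and the tier names below are all hypothetical, and a real system would read the data from a registration form or WHOIS rather than hard-coded dictionaries.

def scrutiny_level(record: dict) -> str:
    """Return a coarse scrutiny tier based on how complete the contact info is."""
    score = 0
    if record.get("owner_name"):
        score += 1
    if record.get("email") and not record["email"].endswith("@privacyproxy.example"):
        score += 1
    if record.get("postal_address"):
        score += 1
    if record.get("agreed_to_tos"):
        score += 1
    # Nobody gets banned for a low score - they just get looked at more closely.
    return {4: "low", 3: "normal", 2: "elevated"}.get(score, "high")

registered = {"owner_name": "Example Press", "email": "editor@example.org",
              "postal_address": "1 Main St, Springfield", "agreed_to_tos": True}
anonymous = {"email": "x9f2@privacyproxy.example"}

print(scrutiny_level(registered))  # low
print(scrutiny_level(anonymous))   # high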
Google should manually review every site that displays Adsense. Other ad companies I work with require each and every domain on which ads appear to be submitted and reviewed.
I agree 100%.
And me. Though I still think domain pricing is a major factor. Interestingly, I saw today that the new .travel domains are $240 each. Good move IMHO.