|What does Google consider breaking rules?|
everyone seems to have a different definition.
I have seen a lot of heated discussion about this topic, but it seems as if no one has a clear definition of 'spam'.
I looked around WW for a good definition but didn't find one. I'm new to the whole SEO thing and as far as I know I could be spamming myself.
Technically, wouldn't any alteration to a web page purely for the purpose of increasing its rank in the SERPs be spam?
On the other hand, wouldn't it be nice if an SE said 'this is exactly what we want in a web page' so that webmasters could build sites that they feel good about, make the SEs happy and actually have useful content?
I know that different search engines probably have different standards for detecting spam, and I am specifically interested in Google.
This could be a start..
"Search Engine Optimizers"
Reynard I think you're on to something here...I've been thinking along the same lines.
Yes, Google provides generic, top-level sorts of direction, but as the existence of this forum proves, there remain thousands of questions.
We don't need the details of the algos...that's proprietary of course. But many of the questions that have been asked here come up over and over again, with varying opinions, and yet could be answered by G in ways that would help everybody.
I've seen questions on cross linking, keyword repetition, multiple and similar URLs, and a litany of other topics. I personally have been trying to sort out questions on redirects that, despite all the threads here, I still don't have an answer to.
We believe that in one case we had a site penalized. It was consumer friendly, never intended as spam (in fact was designed before we ever thought about Google algo's or SEO), and fits G's guideline of: "Does this help my users - Would I do this if search engines didn't exist?" But it had a component that we now think may have gotten us penalized (we still don't know for sure).
So GoogleGuy, how about it? What are proper redirects and improper ones? How can we manage multiple but similar domain names? When is cross linking bad and OK? Give us more details in what is spam and what is not. It will help stop hand to hand fighting. It will help serious Webmasters who want to do well in search (a practical reality for many of us) but do not want to spam. It will allow us to focus on content - as everyone says we should and as we would like to do - and stop worrying so much about what we think we might know that could in certain circumstances possibly get us in trouble... ;-)
>Technically, wouldn't any alteration to a web page purely for the purpose of increasing its rank in the SERPs be spam?
Google even recommends this to some degree in their webmaster guidelines:
"Think about the words users would type to find your pages, and make sure that your site actually includes those words within it."
This is another way of saying to use appropriate keywords, with reasonable density. Crossing over the line involves things like hidden text, or pages created to draw people to your site who wouldn't be interested in the content. For example, if I put up a page specifically to do well on a search for "purple penguins", then there had better be content on that site that people searching for purple penguins would likely want to see.
Another problem with Google is the issue of crosslinking sites. Google has never given specific guidelines. This matters because the issue is peculiar to Google, thanks to PageRank. Search engines not based on link popularity don't consider cross linking spam.
Is there anything in WW that would come close to being a single, definitive and comprehensive list of dos and don'ts regarding Google's definitions of spam? Sorry if it's here and I haven't found it...
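To make "reasonable density" concrete, here is a minimal sketch of how keyword density could be measured for a phrase. The function name `keyword_density` and the word-splitting rule are assumptions for illustration only; nothing here reflects how Google actually measures or judges density.

```python
import re

def keyword_density(text, phrase):
    """Share of the words in `text` that belong to occurrences of `phrase`.

    Hypothetical helper: a simple word-count ratio, not any search
    engine's real metric.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    target = phrase.lower().split()
    if not words or not target:
        return 0.0
    n = len(target)
    # Count positions where the phrase appears as consecutive words.
    hits = sum(1 for i in range(len(words) - n + 1) if words[i:i + n] == target)
    return hits * n / len(words)

page = ("Purple penguins are rare. Our guide to purple penguins "
        "covers habitat and diet.")
print(round(keyword_density(page, "purple penguins"), 3))
```

A page where this ratio climbs very high is mostly keywords and little content, which is the kind of thing posters here describe as stuffing; where the line sits is exactly what nobody outside Google knows.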
|We don't need the details of the algo's...that's proprietary of course. But many of the questions that have been asked here come up over and over again, with varying opinions, and yet could be answered by G in ways that would help everybody. |
But the answers to those questions are no less proprietary, and no less risky for any search engine to reveal.
Some number of webmasters and optimizers, if they knew a precise definition of what is and isn't considered "spam," would take their techniques right to the limit. As it is now, some number of people err on the side of discretion and avoid anything that might be considered "spam."
Revealing a precise definition would probably mean more spam -- even if it might mean fewer people finding out too late that they've made "innocent" mistakes.
<But the answers to those questions are no less proprietary, and no less risky for any search engine to reveal. >
JayC, with respect, not sure I buy it. Google's superiority lies in the brilliance of their algorithms and the relevance of their results. If I can build sites that follow their rules, I am by definition also creating relevant sites for the users that find me.
Competitors, including Webmasters, push to the edge all the time...and rules (i.e., laws) exist to keep them from going too far. That's the rule of law.
The problem so many of us have is not knowing the law according to Google, thus we play conservatively for both ethical and practical reasons...and get hammered by those who go much further.
Example. I've been trying to get an answer on a situation for redirecting that would keep more PR on my site, and also substantially reduce our workload when we change links. Less work, better PR...sounds OK yes? If I knew what I wanted to do was OK, I'd do it. If not I wouldn't...I'd find some other answer. Right now I think it's OK, but I might do it only to find that I suffer as a result. How is that good for Google or for anyone?
|Some number of webmasters and optimizers, if they knew a precise definition of what is and isn't considered "spam," would take their techniques right to the limit. |
It seems to me that this is already happening.
A certain level of spam is good for a search engine. The goal of a search engine is to provide web pages that are the most relevant to the search terms entered, right? But 'relevant' is a tricky word. It is probably easier to define what is not relevant. Spam makes it easier for SE designers to say 'ok, this isn't useful' and tweak their algo to exclude it. Then someone creates a web page that exploits the holes in the new algo. So it's tweaked again. Rinse. Repeat.
Ideally at some point down the road the SE would have a 'perfect' algo that only delivers pages with relevant content. Of course, I do not think this will ever happen.
Wouldn't it be easier for Google to say specifically what it wanted? What exactly is considered relevant content? The guidelines listed on their site are rather vague; you can get a PR0 ban for doing things not listed there. If there were better guidelines, then anything that fell outside the guidelines would be irrelevant or spam.
The reason I asked the question initially is because I see so many complaints on this board from people being knocked down in the SERPS because of 'spammed' web pages. The definitions of spam given seem rather odd. Such as using keywords in your domain. Is this spam? Or using keywords in your alt tags? Is that spam?
Let me ask this of everyone who has ever filed a spam complaint on Google: Have you ever filed a complaint about a site that ranked below yours?
|Let me ask this of everyone who has ever filed a spam complaint on Google: Have you ever filed a complaint about a site that ranked below yours? |
I have a few search combinations I track during/after the dance each month. Most of those I'll check (at least at a glance) the top 100 entries.
I strive to check at least the same number above and below my position, within reason. (I have an oddball ranking for a page of mine at 183 for a very unusual keyword pattern that happens to catch a ton of spammers -- I stop checking at 200.)
Thing is, unless there's a decent number of problem sites in the SERP I don't bother even reporting it -- it's not worth the time. And in a lot of those cases I can find a different search term combination that will tend to catch the spammers more as a group. When I get about a 10%+ density of spam sites I'll report. (Note I'll report any site that's really way off the mark, full of hidden text, cloaking (except sessionid cloaking) etc.)
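The reporting rule of thumb described above can be sketched as a simple ratio check. The function name `should_report` and the list-of-booleans input are assumptions made for this sketch; the 10% figure is the poster's personal threshold, not a Google rule.

```python
def should_report(serp_flags, threshold=0.10):
    """Decide whether a results page is spammy enough to be worth reporting.

    serp_flags: one boolean per result checked, True where the result
    looked like spam to the person reviewing it (a manual judgment).
    """
    if not serp_flags:
        return False
    density = sum(serp_flags) / len(serp_flags)
    return density >= threshold

# 12 spammy results out of the top 100 clears the poster's 10% bar.
top_100 = [True] * 12 + [False] * 88
print(should_report(top_100))
```

The egregious cases mentioned in the post (hidden text, cloaking) are reported regardless of density, so they would bypass a check like this entirely.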
I also don't always direct the reports to GG / include my nick here either. When I find a real bad site -- then I'll call it out. (Case in point - a site with about a 60K page, 10K of content, 50K keyword spam + hidden links + duplicate pages across 4-5 domains. Sure as hell mentioned GG on that one.)
Here's a question for you. (And GG too if he would like to comment).
I've seen a rise in 'almost-but-not-quite hidden links/text'. I think we can all say that when the text color is identical to the background color it's a hidden text/link. What about off-white on white? Dark dark grey on black? (90% grey)
Personally, I consider those hidden, only because they are obviously that color in an attempt to conceal them. I've reported some sites that use this, but they also use other 'tricks' so I'm not sure if Google wants to consider that a valid reportable item.
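One way to put a number on "almost hidden" is the contrast ratio defined in the W3C's WCAG accessibility guidelines. This is a sketch only: WCAG is an accessibility measure swapped in here for illustration, nothing in this thread indicates Google uses it, and the function names are assumptions.

```python
def _linear(channel):
    """Convert one 0-255 sRGB channel to linear light (WCAG 2.x formula)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

def relative_luminance(rgb):
    r, g, b = (_linear(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(foreground, background):
    """WCAG contrast ratio: 1.0 means identical colors, 21.0 is black on white."""
    l1 = relative_luminance(foreground)
    l2 = relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)

white = (255, 255, 255)
off_white = (250, 250, 250)
print(contrast_ratio(white, white))      # identical colors
print(contrast_ratio(off_white, white))  # barely above the identical case
```

Off-white on white scores only slightly above the identical-color case, which matches the intuition that such text is hidden in practice even though the two colors are technically different.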
I doubt you will ever see Google, or any SE, offer a fixed, black-and-white definition of spam. As long as Joe Lackofethics continues to roll out new tricks and stunts to beat the system, I suspect Google will refuse to roll out a strict definition of what constitutes spam.
In other words -- as long as the industry continues to push the limits of SEO, I think Google and other SEs will reserve the right to push the limits of how they define spam.
>>If there were better guidelines, then anything that fell outside the guidelines would be irrelevant or spam.
If there were better guidelines, Google would lose the battle with spam. Good spammers dream of the day when Google publishes detailed guidelines because those guidelines would be a road map of Google exploits.
The only thing that gives Google a fighting chance is the fact that spammers have to invest a decent amount of time to figure out the holes in the algo.
Perhaps they shouldn't publish the anti-monopoly laws (hmmm...wonder what Microsoft would do then).
Also...baseball...who needs rules for baseball? Driving? Taxes?
Of course people push the limits when there are rules...that's no argument for not having them. Without clear rules, how does one keep the game fair? Google knows this on some level - they publish some rules.
I just wish they'd go a bit further. Not details; really just more like answering more of the questions that repeatedly come up in here:
Can a network of sites cross link?
Which kinds of redirects are OK and which are not?
What are the best ways to handle domains (similar domains) for a single site? Does all this have to be the matter of speculation and consensus among those who don't really know for sure?
It's like Google is a government that can arrest people without telling them why.
|Does all this have to be the matter of speculation and consensus among those who don't really know for sure? |
Random Thought: Perhaps it is in fact this consensus that forms, and reforms, Google's policy on a topic.
Hypothetically speaking, perhaps if Google sees the webmaster community reach an agreement that almost-but-not-quite hidden text is taboo, they would then act more strongly on it.
Maybe they're just trying to figure it out as they go -- same as us.
>Hypothetically speaking, perhaps if Google sees the webmaster community reach an agreement that almost-but-not-quite hidden text is taboo, they would then act more strongly on it.
Now define what almost-but-not-quite hidden text is exactly?
|Can a network of sites cross link? |
Try reading this discussion: it's not definitive but you should get a general idea of what is frowned upon by Google.
I really cannot see why it is beyond the wit and guile of Google to come up with a comprehensive but fluid guideline, reserving the right to change or add to their own rules at will or at a moment's notice.
Many authorities, including countries, have rules/laws that change as prevailing conditions change, or when someone discovers a loophole or takes unfair advantage. So why can Google not come up with a list of exactly what they do or do not consider spam, and add to or subtract from it as they see fit on a day-to-day basis?
The current mayhem and anarchy help absolutely no one especially Google whose reluctance to properly guide encourages the use of questionable tactics. This course would also validate the spam reports if the person reporting could point to a specific breach of the rules.
I have had some success in emailing competitors threatening to report them before actually doing so, but those threats would carry much more weight if I could refer to a specific section of Google's TOS. Most webmasters would be concerned about such a threat, particularly as Google is the only game in town, but they may not remain so if they cannot conquer spam.
Perhaps Google should consider more seriously the massive free resource that they have here in these forums as unpaid policemen to help regulate and make the serps more equitable.
What would be great is to have "example pages" of sites that actually got banned and why. Remove the site name of course but keep the basic layout. Have it in the webmaster section of Google. That way we can actually "see" the types of pages that are no-nos.
|I really cannot see why it is beyond the wit and guile of Google to come up with a comprehensive but fluid guideline reserving the right to change or add to their own rules at will or at a moment's notice. |
It isn't a matter of it being beyond their wit.
It's about them maintaining the overall quality of their SERPs. The fact that the guidelines are somewhat vague and extremely subjective helps keep the majority of Webmasters in line.
Publishing what amounts to a "how to spam Google" handbook does just the opposite. The diehard spammers continue on as they always have, and everyone else runs out and adds a paragraph of "almost hidden text" that is one shade darker than Google's published hidden text guideline.
That's how Stalin ran Russia for a long time.
Change the rules, don't tell anyone, penalise them for breaches, keeps them on their toes.
Not exactly fair or reasonable or a recipe for a successful future.
|Now define what almost-but-not-quite hidden text is exactly? |
See msg 9 in this thread.
<It's about them maintaining the overall quality of their SERPs. The fact that the guidelines are somewhat vague and extremely subjective helps keep the majority of Webmasters in line.>
Huh? Can you imagine running a company this way? Nothing that is effective operates without some clear understanding.
I'm not advocating excessive details. But how exactly does it hurt Google's quality to publish guidelines on what is acceptable redirection and what isn't? What is acceptable domain name management and what isn't?
Does it occur to people how much time is wasted in this forum? Don't get me wrong...I LOVE this forum... but here's my point. I've spent several hours researching how to handle a redirection question we have. Two people gave fairly definitive answers. Two more people took opposite positions later. Nothing in Google's guidelines makes it clear. I don't want to spam...but what I want to do MIGHT be considered spam or MIGHT be fine... That's two or three hours of researching and I still don't know. What a WASTE of time! It kills me when I think I could have used that time to actually build more useful pages, etc. Google - give us simple rules here on the FAQs, and I will follow them...I WANT to follow them.
And if that rule, once published, is pushed or broken by someone else - throw a PR0 at them! Why is this difficult?
BTW: <That's how Stalin ran Russia for a long time.>
|And if that rule, once published, is pushed or broken by someone else - throw a PR0 at them! Why is this difficult? |
But what if the 'rule' is subjective by nature? Spam in the SERPs is pretty much subjective as it is, but let's take a specific example.
(I recently got a sticky about my almost-but-not-quite hidden text so I'll use the example they gave, thanks 'you know who you are')
Take two websites. Let's say, for the purpose of discussion, they are equally squeaky-clean (guideline-wise) sites. They both put in almost-but-not-quite hidden text: 10% grey on a white background.
Site B uses it to link to Google and to a few unrelated domains.
Pragmatically they are both breaking the guidelines with the link color they've chosen. Site A is using the color to maintain the layout and design of his site, while addressing a shortcoming in GoogleBot. (Remember Google tells us to make pages for people - not spiders.) The net result of his 'guideline-breaking' activity is the same as removing his JS menus and using A HREF tags. (Which he doesn't want to do.)
Site B is using it to exchange PageRank among his domains and attempting to increase his position in the SERPs.
With a flexible, intentionally vague, guideline Google can PR0 site B and not have to deal with site B saying "site A is doing it too!" and having to PR0 them as well.
With a slightly different spin, and agreeing with WebGuerrilla: If there has to be more than a 10% color change between text and background color, all that will happen is every site with this kind of hidden text will simply switch from 10% grey to 11% grey and be completely 'legal', so to speak. Meanwhile everyone else, in order to be equal, has to pull similar stunts to increase their rank in step with the spammer. It's a vicious circle.
While I agree with the statement "Power corrupts; absolute power corrupts absolutely," I have yet to see an example of a PR0'ed site for anything other than deserving action (or government/legal involvement -- see also Google Germany)*
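The 10%-versus-11% point can be shown with a toy rule. This is purely hypothetical: the function `is_hidden_by_rule`, the grey-percentage inputs, and the fixed 10-point cutoff are all assumptions invented for the sketch, not a published Google rule.

```python
def is_hidden_by_rule(text_grey_pct, background_grey_pct, min_difference_pct=10):
    """Toy bright-line rule: text counts as hidden unless it differs from
    the background by MORE than `min_difference_pct` points of grey.
    """
    return abs(text_grey_pct - background_grey_pct) <= min_difference_pct

# 10% grey text on a white (0% grey) background is caught by the rule...
print(is_hidden_by_rule(10, 0))
# ...but nudging the text to 11% grey makes the same trick 'legal'.
print(is_hidden_by_rule(11, 0))
```

Any published cutoff behaves this way: the pages sitting just past it are indistinguishable to a human visitor from the ones just inside it.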
(* - Exceptions for transferred domains, or new owners inheriting a PR0ed domain -- but I think GG has hinted they're working on that)
One last thing to note about this discussion. I suggest everyone take a trip over to the IETF and read RFC 3514 -- it deals with similar subjective issues and is quite appropriate. [ietf.org...]
Plain and simple.
Google won't publish "rules" of what is and isn't spam because those rules would constantly change as people find new ways to spam.
Google won't give examples of this is how to build a "Google Optimized Web Site" because then the value of all those web pages would be essentially the same.
Google probably won't give any more details as to what is spam because it would allow people to better optimize for Google thus making Google a place of mediocre SERPs. Do remember Google has trademarks for the way it does things so telling people how it does what it does so well would basically negate those trademarks.
No one is expecting Google to disclose their algo but merely to disclose in detail what they consider to be unacceptable.
The two are completely different, and such disclosure would not lead to any banality in the SERPs but would at least level the playing field and also assist those who inadvertently transgress. It would also put some depth and meaning into the spam reports, which in the main seem to be ignored.
|Make pages for users, not for search engines. Don't deceive your users |
That takes care of a lot of stuff right there. Hidden links? Buh bye. Keyword stuffing (like a 6:1 ratio of keywords:content) - cya.
|and such disclosure would not lead to any banality in the serps but would at least level the playing field and also assist those who inadvertently transgress |
For the purposes of discussion let's say there are 4 levels of 'spamality' of a page. White on one end - a pristine, clean page. Black on the other - cloaking, spam, hidden text and links, etc. And throw in spam-lite and spam-heavy in the middle....
Google's gonna whack the black pages no matter if they publish a detailed, specific guideline. The other 3 are where this gets interesting.
If, for the sake of argument, we presume webmasters are, as you allude to, hesitant to leave the white area for fear of bans due to 'inadvertent transgressions', we can say there is a very significant (possibly majority?) number of pages that fall into that category now.
If Google publishes a spec that legitimizes the spam-lite category (or most of that category with slight modifications), how does this help those "white" pages that have followed the old guidelines, "make pages for users, not search engines"? It doesn't, as those pages will most likely fall in the SERPs as more pages become spam-lite types of pages.
Then, as webmasters update their pages and convergence is reached, you've got an SE full of spammy-ish pages. Why? All because they opened their mouth.
Then what if they have to make a change? How much notice do they give? And what about the (bank on it) bitching from those who 'didn't have long enough'?
Sometimes not saying anything is the best response.
I think you will see what is unacceptable on here faster than in any published guideline, judging by all the "why has my site disappeared" threads.
Well said Daroz.
--Not exactly fair or reasonable or a recipe for a successful future. --
Geee! It seems to be pretty successful so far!
The only people griping about it are webmasters!
The users get what they search for and that is what it is all about.
Sure, there are problems and Google is working to address them. I am confident that they will.
Publishing a road map is just like posting a speed limit: how much over it can you go without getting a ticket? I like the approach they use now. If you make a clean site, using the basics, you will not have a problem. If you try funny stuff, then you do so at your own risk and people know it.