Forum Moderators: open
I'm a programmer first and a reluctant webmaster second. I'm used to having to find workarounds for bugs in Windows. Complaining to Microsoft is truly a waste of time.
It seems to me that there has been much discussion lately about penalties arising from hidden text - even to the point of pages being removed from the Google index. Since I don't use hidden text, I cannot comment from my own experience, but I can look at this issue objectively.
There are three possible policies a search engine can take when encountering hidden text/links, etc.
1: Take no special actions.
2: Ignore all such text, etc. Quite literally, pretend it isn't there.
3: Apply some sort of penalty.
Now let's look at how and why hidden text might be created.
1: As a deliberate attempt to spam search engines.
2: By accident. (Probably applies mostly to zero-length links).
3: Deliberately but without any intention to spam. For instance, a Webmaster might leave himself a todo list in white text in an empty table cell.
Now let's look at how a spammer might fool a search engine.
1: Use very light gray text against white, etc. (Presumably search engines that worry about hidden text implement thresholds).
2: Use JavaScript to change background/text colours. (I imagine this can be done with tables but I've never tried it.)
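To make the threshold idea concrete, here's a rough Python sketch of a contrast check (assumptions: the WCAG-style luminance formula and the 1.2 cutoff are mine, purely for illustration; nobody outside Google knows what thresholds, if any, a real engine uses):

```python
def relative_luminance(rgb):
    # sRGB relative luminance, per the WCAG definition.
    def channel(c):
        c /= 255.0
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = (channel(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(fg, bg):
    # Ratio ranges from 1 (identical colours) to 21 (black on white).
    l1, l2 = sorted((relative_luminance(fg), relative_luminance(bg)), reverse=True)
    return (l1 + 0.05) / (l2 + 0.05)

def looks_hidden(fg, bg, threshold=1.2):
    # Text whose contrast with its background falls below the threshold
    # is effectively invisible; the 1.2 cutoff is purely illustrative.
    return contrast_ratio(fg, bg) < threshold

print(looks_hidden((255, 255, 255), (255, 255, 255)))  # white on white -> True
print(looks_hidden((250, 250, 250), (255, 255, 255)))  # near-white gray -> True
print(looks_hidden((0, 0, 0), (255, 255, 255)))        # black on white -> False
```

Note that near-white gray slips past a naive exact-colour-match test but still fails a ratio check like this - which is presumably why any engine bothering with this at all would use a threshold rather than equality.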
Let's be realistic about this. There is not a snowball's chance in hell of a search engine being able to analyse all the JavaScript on a page to see whether it is being used to hide text. It therefore follows that a spammer will be able to fool search engines with hidden text with no difficulty at all.
It therefore follows that applying penalties where hidden text is found is a total waste of time (if the intention is to combat spam). This just leaves two sensible policies for hidden text.
1: Take no special actions.
2: Ignore all such text, etc. Quite literally, pretend it isn't there.
Now, none of this is rocket science. Indeed, if the boffins at Google can't work this out for themselves, their combined IQ can't be much above a chimpanzee's.
So, let's see what GG has to say. Do Google boffins shuffle around on their knuckles? Judging from the mess Google is in right now, it's easy to laugh and just say yes, but in reality the answer is probably no. Nevertheless, given that it is child's play to use hidden text in a manner that can't be detected by search engines, I think it behoves GoogleGuy to say definitively what the policy is. After all, as I have explained, penalties are more likely to catch the innocent than the spammer.
I had coded style="color=silver" instead of style="color:silver".
It had been there for ages - simply because I couldn't see it myself and forgot it was there. Over the weekend the site went completely. I checked it for problems and there it was, clear as daylight, on the index page.
What a sickener.
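For what it's worth, even a crude lint pass would catch this particular class of slip. A hypothetical sketch in Python (the regexes here are illustrative and far weaker than a real CSS parser; they only flag `=` used where `:` belongs inside an inline style attribute):

```python
import re

# Matches an inline style attribute and captures its contents.
STYLE_ATTR = re.compile(r'style\s*=\s*"([^"]*)"', re.IGNORECASE)
# A property name followed by '=' inside the style body is a typo:
# CSS declarations use ':' (e.g. color:silver, never color=silver).
BAD_DECLARATION = re.compile(r'[a-zA-Z-]+\s*=')

def find_style_typos(html):
    problems = []
    for match in STYLE_ATTR.finditer(html):
        body = match.group(1)
        if BAD_DECLARATION.search(body):
            problems.append(body)
    return problems

print(find_style_typos('<td style="color=silver">todo list</td>'))
# -> ['color=silver']
print(find_style_typos('<td style="color:silver">todo list</td>'))
# -> []
```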
No - I don't accept these penalties at all. Yes, one reason is that innocent people get hanged. However, another is that the concept of banning sites unnecessarily is just not right.
Why not just ignore the links, so that no benefit is obtained? Wouldn't that be better? In that scenario the spammer just wouldn't know what was working and what was not.
To dish out 'punishment' like this, to the innocent as well as the guilty, is surely not the role of a search engine, especially one in such a dominant position.
I still feel absolutely sick about the whole thing. I have no real idea whether I will be back in 30 days or not. In the meantime, it's cut back time, as our income has been badly shaved.
This is definitely an area where I think Google is dumb to penalize instead of just to ignore. I think it's probably impossible to get it 100% right, and they just create a ton of email to answer for themselves by removing people from the index.
It seems like if they just approached text and link visibility holistically, like grade it on a scale from huge <h1> black text links on white (very visible) down to low-contrast 1x1 graphic links (invisible) it'd take care of itself. And it seems like that'd be more in line with the random surfer model, they'd be more likely to go through more visible links.
It doesn't seem like they'd have to spend as much time answering emails or "recertifying" sites this way, I doubt many people would email in to complain that Google seems to ignore their invisible text and links.
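The sliding-scale idea could be as simple as multiplying a link's value by a visibility score. A toy Python sketch (every constant here is an assumption of mine, just to show the shape of the idea):

```python
def visibility_score(contrast_ratio, font_px):
    # Illustrative only: scale contrast (1..21) and font size (0..32px+)
    # into [0, 1] and combine them. Invisible text scores ~0, a big
    # high-contrast heading scores ~1.
    contrast_part = min(max((contrast_ratio - 1.0) / 20.0, 0.0), 1.0)
    size_part = min(font_px / 32.0, 1.0)
    return contrast_part * size_part

def weighted_link_value(base_value, contrast_ratio, font_px):
    # A "random surfer" weighting: less visible links pass less value,
    # and truly invisible ones pass none -- no penalty needed.
    return base_value * visibility_score(contrast_ratio, font_px)

print(weighted_link_value(1.0, 21.0, 24))  # big black-on-white link
print(weighted_link_value(1.0, 1.0, 12))   # white-on-white: passes 0.0
```

Under a scheme like this, hidden links simply stop working, which removes the incentive without anyone needing to be banned.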
If I design a server-side application, for instance, it's not rare to use some hidden text as a way of passing variables. This is certainly functionality, not spamming. But what if a form contains a hidden field full of keywords?
In the case of hidden fields, the reasonable thing to do is just not index them. By the way, hidden fields are incredibly easy for an algorithm to find! BigDave's blue text over tropical birds in the jungle is another matter entirely!
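Indeed, pulling out hidden field values so an indexer can skip them takes only a few lines. A sketch using Python's standard html.parser (the class name and example form are mine, purely illustrative):

```python
from html.parser import HTMLParser

class HiddenFieldScanner(HTMLParser):
    # Collects <input type="hidden"> values so an indexer can simply
    # skip them -- the "just don't index them" policy suggested above.
    def __init__(self):
        super().__init__()
        self.hidden_values = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input" and attrs.get("type", "").lower() == "hidden":
            self.hidden_values.append(attrs.get("value", ""))

page = '''<form>
<input type="hidden" name="session" value="abc123">
<input type="hidden" name="kw" value="widgets cheap widgets best widgets">
<input type="text" name="q">
</form>'''

scanner = HiddenFieldScanner()
scanner.feed(page)
print(scanner.hidden_values)
# -> ['abc123', 'widgets cheap widgets best widgets']
```

Whether a value is a session token or a keyword list doesn't matter if the indexer never counts either one.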
Now, how far should they go with it?
Bringing a JavaScripted page with nothing but hidden text into the Google index isn't a big achievement. As you said, it's child's play. It's not even worth betting on (erm... as long as you're not under the age of, say, 12).
But if they one day penalized the use of particular JavaScript syntax, I'd *make a bet with you* that people would complain about all the collateral damage and all the innocent webmasters.
I agree with you that there are only a few options for how Google can decide. However, *give* them an option.
So what is a good decision?
Well, one of my best sites got zapped over the weekend for hidden text. A complete error.
-snip-
To dish out 'punishment' like this, to the innocent as well as the guilty, is surely not the role of a search engine, especially one in such a dominant position.
I still feel absolutely sick about the whole thing. I have no real idea whether I will be back in 30 days or not. In the meantime, it's cut back time, as our income has been badly shaved.
Napoleon,
To be fair to Google, people here (and elsewhere) have been warning for weeks about the introduction of a hidden-text algorithm; it's not as though they've introduced it out of the blue.
Webdoctor
I'm surprised Google can't detect this automatically. This type of linking cannot be for the benefit of the user.
I'm sure that if they could, google would choose this option. They already do it with comments and meta keywords.
Yeah, people make mistakes. And sometimes you have to pay for your mistakes. And other times you have to pay for someone else's mistakes. A 30-day penalty sure seems stiff when it is applied to you, but it isn't so bad compared to what used to happen when you were caught by a manual check.
Just wondering, Napoleon: did you run an HTML or CSS validator on your page? I would be interested to hear whether it was caught by some of the common tools. If you did validate it and the error was not caught, I think that would be cause for serious concern.
We also should not assume that google will penalize legitimate usage of hidden text in DHTML or javascript until we find cases where that happened. I expect that there will be corner cases that they will mess up on, but I would be willing to bet that those cases will not be very common if they implemented it how I think they did.
incywincy...
I have DOZENS of sites linked to 1 of my sites through a single invisible 'period.' This transparent GIF goes back to my dedicated ad/tracking site.
... I would certainly assume that Google has and will continue to allow for this, or there will be TONS of sites hitting the dust.
But what about blue text on a picture of macaws in the forest? You might have one letter over the blue tail feathers, and the rest over the green leaves. Is that enough to get you banned?
I dunno. But it'd be enough to make your site look really amateurish at anything other than the monitor resolution, installed fonts, font sizes, gamma correction, window size and possibly browser version that you used. Many people are likely to see the text overlaying the image -- unless it is really tiny: but that would trigger another check.
Napoleon: that's a bummer. But, hey, there's a business opportunity here.
How much would you pay for a site validation program that checked for all known issues that might get you googledumped? If you write it, they might pay you for it :)
BigDave...
Maybe you've hit on something new for Google Labs. An online (anonymous) service that you paste your code in, and Google uses a portion of its algo to determine a 'thumbs up' or 'thumbs down'...
It would probably get overworked though... People stuffing in everything under the sun to see what they could get away with. ;)
Too right... and the worst thing is, like everyone else, I knew it was coming (no Webdoctor, I'm not an idiot). I also knew I had no hidden links... except I didn't know I was an incompetent coder!
But this really supports the point. If I was spamming I would accept the penalty with good grace. But I wasn't: meaning that this can and will be happening to plenty of other innocent people as well.
In my 'spammer history' thread ( [webmasterworld.com...] ) I mentioned that I don't think 'collateral damage' is acceptable when it is unnecessary. In the case of hidden links it is certainly unnecessary.
PS: I like the positive outlook Victor... made me smile when I haven't done a lot of that today!
I'm sure they will try hard not to penalize sites inappropriately, because being as accurate as possible is in the best interest of their SERPs. But this is hardly a court of law, so just because a given site may be "innocent" of intentional spamming doesn't mean it's unjust for Google to omit it.
Why not just ignore the links, so that no benefit is obtained? Wouldn't that be better? In that scenario the spammer just wouldn't know what was working and what was not.
Without penalties, Webmasters would have no incentive to walk the straight and narrow. They'd just keep lobbing their latest tricks at Google to see what worked.
If you *see* the text, then an algorithm based on the "visual rendering" of the page (just as any web browser does) will be able to "see" it too, and it could easily find almost any hidden text.
Then, a person could decide if the invisible text is used for SE spamming purposes or not.
As for those that are using hidden text "innocently", Google has stated for a long time that having hidden text is a reason to be removed from the index. Therefore the "innocence" of the use has nothing to do with it. Use hidden text, risk not showing up in google. It's that simple.
I disagree - it is NOT that simple. There are ways of using DHTML and CSS that you might call hidden text, but which are a valid part of a site's design. There are many large and high-profile sites which use these features effectively and without any intention to manipulate the SERPs.
Google is a business and they live or die by their relevance. Their priority is to give relevant results - they are not going to throw out massive numbers of relevant sites just because they use a particular design feature (say a layer based menu system, or a scrolling layer of information). Their algo will likely detect and ban some types of 'hidden text' which can clearly be identified as 'manipulation' whilst allowing (to a greater or lesser degree) some other techniques that may have valid uses.
That means the 'innocence' or intent of the usage IS a factor, and it is the main challenge they face in filtering 'hidden text'.
Google are not really bad guys, and their motives for introducing penalties for hidden text may be laudable, but the simple fact is this: some good sites with good content are going to suffer. If those good sites suffer, then so do the users who fail to find them. So, by penalising the wrong sites, both users and webmasters are unfairly punished. If Google is our friend and not our enemy, the metaphor of friendly fire works, because they will hit good sites as well as bad.
The government cannot abridge your freedom of speech. That does not mean that I cannot kick you out of my house for saying that you think disco is the music of the gods. In fact, freedom of speech is one of the best protections for Google's right to exclude any sites that they wish for any reason that they wish. They are expressing their opinions about which sites are good.
What you say is partially true but utterly irrelevant. As a private individual, I have the right to throw a party in my own house and only invite white people. If I ran a nightclub on the same basis I would rightly be prosecuted. In other words, companies are bound by a different set of standards. I am not a lawyer, but I am quite certain that Google are on thin ice here. The argument that you are not obliged to use their services is also wrong. In practice, if you run an internet site you are obliged to use Google. EVERY webmaster will agree on that. How many expert witnesses would testify otherwise in a court case? ANSWER - not many.
When I previously described how I use keyword lists in my site I was also criticised for bad style and accused of spamming. Well here's something to chew on. On my most visited page, 50% of visitors proceed to download the software thereon. Given that some of these visitors will be people looking to see if I've released an upgrade, that is an extraordinarily high approval rating for my site. (Of course it says nothing about the quality of the software).
If you are selling goods or services, it is often the case that different people use different words and spellings to describe the same thing. Given the international nature of the internet, this is more true now than ever before. Now, you can either try to fit all these words and spellings into your body text, or you can use consistent, simple phrasing and place the synonyms in a keyword list. THIS IS NOT SPAM; it is simply good, tidy practice. Some webmasters who've spent days or even weeks working on the visual design of their site might reasonably say, "Let's put the keywords in hidden text." Again, THIS IS NOT SPAM.
Personally, I think site functionality is important and I'm not worried about the visuals, so I'm happy to keep my keyword lists in plain view where everyone can see them. However, some webmasters may have a different idea as to what constitutes the best look and want to hide their keywords (without devious intent).
In banning the use of hidden text, without a doubt, Google are restricting freedom of expression. Whether they are breaking US law, I am not certain, but they might be.
So, should Google ban hidden text? Well the argument boils down to this.
Will spammers be adversely affected? Answer: in the short term, maybe, but in the long term, no, because Google's best algos will never be good enough. I am 99.9% certain that I could defeat anything Google could come up with, and I don't even consider myself to be an expert in web design.
Will legitimate websites be adversely affected? Answer: yes in the short-term and possibly yes in the medium to long term. There are many websites out there that are not frequently maintained, and many of these will have been optimised under the old rules that allowed hidden text. Presumably, many of these websites will vanish without trace. This cannot be good for users.
If Google continue to ban hidden text, users will suffer. Spammers will just move on to modified strategies. Google are not sorting out a problem by banning hidden text, they are creating one.
There are many better strategies that could be used to counter spamming. Look at the problem from scratch. There are basically two forms of website spam: over-use of keywords and over-use of links. Links are tricky because you need to perform recursive searches that could get quite deep. Keywords, in comparison, are trivial to deal with. You can set limits on the number of repeats per paragraph, per sentence, per page, etc. Once the limit is reached, you simply ignore the rest. Alternatively, go too far over the limit and the relevance counter could be decremented for each keyword occurrence instead of incremented. Dealing with keyword spam is not the big deal that Google would have us believe. Links are definitely tricky little b*****s, but keywords are not.
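The cap-then-decrement scheme described above might look something like this in Python (the limit of 10 and the 2x cutoff are made-up numbers, just to show the mechanics):

```python
def keyword_score(words, keyword, per_page_limit=10):
    # Count occurrences but stop crediting past the limit; go far
    # enough over and each extra occurrence *subtracts* instead.
    # The limit and the 2x "spam" cutoff are illustrative numbers,
    # not anything a real engine is known to use.
    count = sum(1 for w in words if w.lower() == keyword)
    if count <= per_page_limit:
        return count
    if count <= 2 * per_page_limit:
        return per_page_limit          # extra repeats simply ignored
    return per_page_limit - (count - 2 * per_page_limit)  # penalised

print(keyword_score(["widget"] * 5, "widget"))    # 5  (normal use)
print(keyword_score(["widget"] * 15, "widget"))   # 10 (capped)
print(keyword_score(["widget"] * 25, "widget"))   # 5  (decremented)
```

Under a scoring curve like this, keyword stuffing buys nothing past the cap and eventually costs you - without any site ever being removed from the index.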
Once again, I don't know how search engines work but I am a programmer and I solve far more difficult problems than overcoming keyword spam on a regular basis. It isn't rocket science, it's pretty simple stuff - far more simple than working out how to implement a search engine index from scratch. So why have Google gone down this road? I am at a loss to answer that one. I suppose there are two possibilities. Either an idiot has reached a high position in Google or else they are just too close to the problem and need to stand back to see the big picture.
That's it. I've said my piece. Unless I get penalized by Google for something or other I'm not going to add further to this discussion. Hmm, I think I can already hear the Yippees.
You make some good points as to why Google would want to be careful about how they go about it, and I suspect that they have been very careful about what they have done. Only time and penalties will tell.
But I stand by my statement that google has long had a policy against hidden text, innocent or not. They are not going on a witch hunt. They would rather have your innocent site in their index.
They have been running the hidden text software for over a month now, and I have yet to hear about any sites using hidden layers for DHTML complaining. Shouldn't there be some complaints if google programmers didn't take all these common cases into account? Does the text that you are talking about *ever* become visible? That is the test.
Unlike someone else here, I suspect that there are very few idiots working at Google. I would bet there is at least 3 man-years in this project, with at least that much more in process reviews and testing.
I don't. So, seeing as this is an integral part of your argument, my "case of one" has just demolished it! There are thousands of webmasters out there who don't "use Google". You are talking only of those who use Google in some way for promotion. Many of them read WebmasterWorld. There is a whole world of the web out there with very little interest in its exposure in Google.
I am not a lawyer...
Precisely.
We've got automatic algorithms that detect hidden text, despite a number of tough cases. - GoogleGuy
Dropped in the SERPs; I reported a website using hidden text via the form yesterday... I guess the automatic algorithms that detect hidden text don't work any more... It's a case of white font colour used on a white background.
But I stand by my statement that google has long had a policy against hidden text, innocent or not.
Their guidelines on 'intent' are as clear as any other part of the quality guidelines [google.com] - in short, they suggest you ask yourself whether you would feel comfortable explaining why you did something to one of your competitors.
... and I have yet to hear about any sites using hidden layers for DHTML complaining. Shouldn't there be some complaints if google programmers didn't take all these common cases into account?
Which seems to support my argument - they are not penalising 'hidden text' just because it is hidden, but because of its intent ie. is it hidden with the intent of being revealed during the viewing process.
If they are merely testing whether the text becomes visible at some stage of the viewing process, then it is fairly easy to 'fake' that with visibility being decided by an event that is unlikely to occur - say clicking on a fairly obscure area of the site. Easy enough for a hand inspection to spot, very difficult for an algo. Dreamweaver 'timelines' muddy the water even more, and this is just scratching the surface, there are many other ways this can be obfuscated. Imagine the processing involved in checking that kind of thing on every site in the index.
FWIW I seem to recall that at the last pubcon, Google stated that they would be using the hidden text filters on sites that had been spam reported. I guess this is so that spam reported sites can have a more aggressive "hidden text' filter thrown at them.
The reason I would not use these techniques is the danger of a hand inspection or 'aggressive filter' brought on by a spam report.
re: Google Statements
We all welcome Googleguy's participation here, but in amongst the undoubtably useful advice there is still an element of 'corporate information'. A more cynical mind than mine might think some of it was good old FUD, personally I think they tread a fine line of 'being popular' and 'serving Mammon', and are to be congratulated for doing it so well.
But Google IS a business, and it would be wise for us to read all GoogleGuy's postings with the same critical eye as any other company press release. It also pays to read between the lines - sometimes there are gems of information, other times you can 'see the strings'.
It's a bit like doing a cryptic crossword puzzle ;)
Google is a leader among SEs, and they surely consider the responsibility of their position.
Be realistic: if a site is out of the Google index, it is a ghost site in 99% of cases. This is not a bold statement, especially for small or relatively small companies with local businesses and without great resources (or, to be more precise, politics) for building their net popularity.
OK, OK - if you are a non-English company you can bypass Google and direct your attention to local SEs with the goal of achieving a certain web presence.
But this is again not true (at least in my country), since the most popular SE, after checking its own little index for a result, falls back to the Google index.
After those considerations, take a look at the situation now: we are all debating what is "safe" hidden text and what can be damaging.
I do not agree with the sentence: "simply don't use hidden text and you are ok".
This is quite simplistic, and a forced choice not to use certain technological possibilities.
Such a radical, dark "middle ages" position taken by Google is not realistic.
Of course they can do what they want with their index, but why not give some simple information at the same time as the penalties?
I think a simple "facts & fiction" corner would be very appreciated.
Without giving away algo secrets, simple information like:
"the background of the page has the same colour as some text positioned in a table row whose colour is modified by an external .CSS file"
Fiction: it causes an automatic 30-day penalty of exclusion from the index.
Fact: it does not affect the ranking, and the text is fully indexed.
And so on...
(Obviously the above "facts & fiction" statement is just a simple example; I have no evidence either way.)
Bye
Now I'm curious to see how long it takes or if the sites will be penalized.
From one of my earlier postings :-
There are many better strategies that could be used to counter spamming. Look at the problem from scratch. There are basically two forms of website spam: over-use of keywords and over-use of links. Links are tricky because you need to perform recursive searches that could get quite deep. Keywords, in comparison, are trivial to deal with. You can set limits on the number of repeats per paragraph, per sentence, per page, etc. Once the limit is reached, you simply ignore the rest. Alternatively, go too far over the limit and the relevance counter could be decremented for each keyword occurrence instead of incremented. Dealing with keyword spam is not the big deal that Google would have us believe. Links are definitely tricky little b*****s, but keywords are not.
Some people like mathematics and statistics. If that is the case at Google, why not apply Gaussian distribution theory to keyword page analysis? (Or perhaps one of many other similar-looking curves.) With this sort of approach, you have plenty of parameters to tweak and no need to adopt strategies that could penalise the innocent.
See [mathworld.wolfram.com ]
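A Gaussian weighting of keyword density could be sketched like this (the mean and standard-deviation rates here are invented placeholders; a real engine would have to estimate them from its corpus):

```python
import math

def keyword_z_score(count, doc_length, mean_rate=0.02, std_rate=0.01):
    # How unusual is this page's keyword density compared to a typical
    # page? The rates are made-up placeholders for illustration.
    rate = count / doc_length
    return (rate - mean_rate) / std_rate

def gaussian_weight(z):
    # Weight relevance by a Gaussian bell: pages near the expected
    # density get full credit, extreme outliers get almost none.
    return math.exp(-0.5 * z * z)

# A 1000-word page mentioning the keyword 20 times sits at the mean;
# a page stuffed with 200 mentions lands far out in the tail.
print(gaussian_weight(keyword_z_score(20, 1000)))
print(gaussian_weight(keyword_z_score(200, 1000)))
```

The appeal of a curve like this is exactly what the post above says: it degrades gracefully, with tunable parameters, instead of drawing a hard ban line that innocent pages can trip over.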
PS: This page came up first in Google when I typed in 'Gaussian distribution'. For all my criticisms, I do recognise that Google is a good search engine. I find it especially good for non-commercial stuff.