Forum Moderators: open
I'm a programmer first and a reluctant webmaster second. I'm used to having to find workarounds for bugs in Windows. Complaining to Microsoft is truly a waste of time.
It seems to me that there has been much discussion lately about penalties arising from hidden text - even to the point of pages being removed from the Google index. Since I don't use hidden text, I cannot comment from my own experience, but I can look at this issue objectively.
There are three possible policies a search engine can take when encountering hidden text/links, etc.
1: Take no special actions.
2: Ignore all such text, etc. Quite literally, pretend it isn't there.
3: Apply some sort of penalty.
Now let's look at how/why hidden text might be created.
1: As a deliberate attempt to spam search engines.
2: By accident. (Probably applies mostly to zero-length links).
3: Deliberately but without any intention to spam. For instance, a Webmaster might leave himself a todo list in white text in an empty table cell.
Now let's look at how a spammer might fool a search engine.
1: Use very light gray text against white, etc. (Presumably search engines that worry about hidden text implement thresholds).
2: Use javascript to change background/text colors. (I imagine this can be done with tables but I've never tried it.)
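For what it's worth, the threshold test I'm imagining in point 1 is easy enough to sketch. This is purely my own illustration - the function names and the threshold value are invented, and nothing here reflects any published search engine rule:

```javascript
// Hypothetical "near-invisible text" test: parse two 6-digit hex
// colours and compare their Euclidean distance in RGB space. Below
// some threshold, the text is flagged as effectively hidden.
// The default threshold of 32 is made up for illustration.
function hexToRgb(hex) {
  var n = parseInt(hex.replace('#', ''), 16);
  return [(n >> 16) & 255, (n >> 8) & 255, n & 255];
}

function looksHidden(textColour, bgColour, threshold) {
  var a = hexToRgb(textColour);
  var b = hexToRgb(bgColour);
  var dist = Math.sqrt(
    Math.pow(a[0] - b[0], 2) +
    Math.pow(a[1] - b[1], 2) +
    Math.pow(a[2] - b[2], 2)
  );
  return dist < (threshold || 32);
}

console.log(looksHidden('#fefefe', '#ffffff')); // near-white on white: flagged
console.log(looksHidden('#000000', '#ffffff')); // black on white: fine
```

Note this only works when the colours are declared statically - which is exactly the loophole point 2 describes.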
Let's be realistic about this. There is not a snowball's chance in hell of a search engine being able to analyse all the javascript on a page to see if it is being used to hide text. It therefore follows that a spammer will be able to fool search engines using hidden text with no difficulty at all.
It therefore follows that applying penalties where hidden text is found is a total waste of time (if the intention is to combat spam). This just leaves two sensible policies for hidden text.
1: Take no special actions.
2: Ignore all such text, etc. Quite literally, pretend it isn't there.
Now none of this is rocket science. Indeed, if the boffins at Google can't work this out for themselves, they must be suffering from a combined IQ of a retarded chimpanzee.
So, let's see what GG has to say. Do Google boffins shuffle around on their knuckles? Judging from the mess Google is in right now, it's easy to laugh and just say yes, but the answer in reality is probably no. Nevertheless, given that it is child's play to use hidden text in a manner that can't be detected by search engines, I think it behoves GoogleGuy to say definitively what the policy is. After all, as I have explained, penalties are more likely to catch the innocent than the spammer.
ARIES
Unless you're asked, don't give your opinions, as others can't take your brutal honesty. Just try to be kind and more considerate.
It would have been prophetic but the week was wrong!
Also (not that I believe in such things) I don't think ARIANS do "kind and considerate", except me, of course.
Kaled.
PS
Though some people have claimed otherwise, I have not called GoogleGuy an idiot, but I do think the hidden text policy is idiotic. If GG is responsible for the policy then he must accept the flak; if not, there is no need for him to feel insulted. In any case, he is free to take my arguments apart point by point and make me look like the idiot. I have made my case logically from several different viewpoints. In turn, I have been maligned and accused, amongst other things, of publishing a private reply. Such things will not cause me to lose so much as a wink of sleep. I have the feeling that GG won't be losing any sleep either.
People who use hidden text by mistake would not be punished, and (most?) people who spam via hidden text would no longer be as effective.
Why would anyone not be satisfied with that?
This "let's ban as many people as possible" attitude is not a good one.
Reading between the lines of what he has said, it appears that Google itself is not 100% confident about the automatic penalties, which is why it's only for 30 days. And maybe this is only a test run.
Also, with so many webmasters complaining... they probably felt it necessary to *do something.*
I just don't think they're going about it the right way.
Yes that would be a sensible policy - I mentioned it as an alternative in my first post.
The fact remains that if Google's page content analysis were as good as they claim, keyword spamming would be impossible and the need for such penalties simply would not arise. I have outlined strategies for this, but absolutely no one, including GG, has commented on them.
Of course, it is entirely possible that this policy is more to do with establishing a (legal?) precedent than combatting spam. After all, the people at Google cannot possibly be so stupid as to imagine that search results will magically become more relevant as a result. So, logically, if they are not stupid, there must be another motive. Only time will tell.
Kaled.
Are layers with negative coordinates considered hidden text?
This is part of the problem. Google can't say what is and is not considered "legitimate" hidden text, because to do so would be to make the entire policy meaningless.
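For what it's worth, the mechanical half of the negative-coordinates check is trivial to write down, which is exactly why it proves nothing about intent. A made-up sketch (the function and numbers are mine, not any real Google rule):

```javascript
// Hypothetical check: does an absolutely-positioned layer sit entirely
// off the visible canvas (the classic left:-9999px trick)? Note that
// this flags perfectly innocent off-screen layers too, e.g. panels
// that a script slides into view later - the check can see position,
// but it cannot see motive.
function isOffCanvas(left, top, width, height) {
  return (left + width <= 0) || (top + height <= 0);
}

console.log(isOffCanvas(-9999, 0, 400, 100)); // classic hiding trick
console.log(isOffCanvas(10, 10, 400, 100));   // ordinary visible layer
```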
In another thread, someone asked whether text within <NOSCRIPT> </NOSCRIPT> tags would be penalised. In this case, the only sensible policy for a search engine implementing hidden text penalties is to ignore the contents of the tag pair, otherwise a large portion of the index would vanish. However, that may not be Google's policy.
Kaled.
Signing off : time for some shut-eye.
I -don't- think that the hidden text should be given a value. Just ignore the text like it's not there.
In an ideal world, that is exactly what Google would do. In fact, I am certain that is what they wish they could do. They do that already for meta keywords and comments.
The problem here is that hidden text, as has been pointed out by those of us on both sides of the argument, is very processor-intensive to spot *correctly*. Google is totally unable to apply this test to every page in the index. Right now it is a penalty because it *has* to be a penalty.
So what they can do for now is to apply the software to the sites that are the biggest problem and make them pay a little bit for their infractions.
The penalty has actually dropped drastically in its severity. Under the hand check rule you would receive an actual ban instead of 30 days.
If computing power advances enough that they can actually implement this in googlebot, what should they do about all the sites that block their images, css and js files in robots.txt? Would it then be unfair to not allow any of those sites in google?
However, some forms of "hidden text" are justifiable, such as placing it within NOSCRIPT tags, provided the effect of doing this is to make the actual words that would be seen by someone with JavaScript enabled visible to someone who hasn't, and provided the robots see, as far as is practicable, more or less the same words as the visitor. Another justifiable reason (IMO) would be to allow robots to read the same words that are contained in a Flash file, which otherwise they wouldn't be able to read and index. There are still millions of people who don't have JavaScript enabled or the required version of the Flash player... why should these millions of people be denied the content they're searching for? It could even be argued that in these circumstances, using a stylesheet to hide an optimised text-only version of a Flash site is entirely justifiable.
What counts is how and why text is hidden and not the mere fact that it is.
Patrick
[edited by: Patrick_Taylor at 11:16 am (utc) on June 1, 2003]
Exactly - and at this point one could think that everybody should agree that it's impossible to detect "real cheating" hidden text by algo - that it's simply impossible to differentiate webmasters' motivations by any algo. I'm surprised how long and how far this discussion goes, although it's been said again and again that there's no general solution.
a site that did get banned, but 10 days later they had it back with all the hidden text and links intact.
Unfortunately, I'm seeing a site that went the full 30-day ban, but is back in all its glory without so much as one less word of hidden text.
I've learned not to whine much about cheesy techniques, but this one uses our and our competitors' company names, and actual employee names, in their hidden text. Works like a charm!
We give hidden text an automatic penalty that expires after a few weeks, so if the site owner cleans up his site, he can get back into the index
So does this mean NO HIDDEN TEXT AT ALL?
There is a very cool technique for making pages that look great in all browsers, for blind users, and for bots, and look even better in a browser that supports CSS. Now I know of a famous design site that uses this technique that seems to have been penalized. It goes like this:
It involves using CSS to replace an <h1> with an image of the text inside that h1, but in a nice font, maybe with a picture. This means that the site works in all browsers, but can use nice typography in browsers that support CSS. And it makes sense to search engines too.
An example for the Google home page:
<h1><span class="h">Google</span></h1>
with css :
span.h {display:none;}
h1{ width: 400px; height: 100px; background: url(logo.gif); }
So text-only browsers & NN4 see the heading Google, and decent browsers see Google, but as a GIF in Google's font. Why? Because the font Google uses for its logo doesn't come with my browser, so they have to use an image. They have alt text of "Google" on that logo, but that really should be a <h1>Google</h1>, because Google is the page heading.
To penalize this is crazy, as everyone will be doing it soon (due to this new technique being heavily publicized), unaware that they'll drop off the index.
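If a search engine wanted to allow this image-replacement trick while still catching abuse, one crude heuristic would be to accept display:none text only when it matches the alt text of the image that replaces it. This is purely my speculation - the function and its rule are invented, not anything Google has described:

```javascript
// Speculative heuristic: hidden heading text is acceptable if, after
// normalising case and whitespace, it equals the alt text of the
// replacing image; anything else gets flagged for review. A spammer
// stuffing extra keywords into the hidden span would fail the match.
function normalise(s) {
  return s.trim().toLowerCase().replace(/\s+/g, ' ');
}

function hiddenTextAcceptable(hiddenText, imageAltText) {
  return normalise(hiddenText) === normalise(imageAltText);
}

console.log(hiddenTextAcceptable('Google', 'Google'));                // legitimate replacement
console.log(hiddenTextAcceptable('cheap widgets buy now', 'Google')); // keyword stuffing
```

Of course, this just moves the problem: now the spammer stuffs the alt text instead, so it's no silver bullet either.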
Is that considered hidden text? I've seen sites with top 10 placings with that type of optimization.
As GG and others have mentioned in the past, if you can't see it then it's considered spam.