Forum Moderators: open
[webmasterworld.com...]
>>new hidden text algo
Anyone who's got any hidden text really should get rid of it now. It'll hit by surprise, and making it through an update is no indication of safety. Also, it wouldn't be a good idea to still have it in place if the pages get dropped and it's still there for a re-check.
[edited by: Marcia at 10:54 pm (utc) on April 29, 2003]
Do you think hidden divs would be as easy to detect as hidden text?
Probably easier...
div.hidden { position: absolute; top: -800px; left: -800px; }
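As a rough sketch of how a rule like that could be flagged automatically (a hypothetical heuristic, not Google's actual algorithm; the function name and threshold are my own invention):

```python
import re

# Hypothetical heuristic: flag CSS rules that combine absolute positioning
# with a large negative top/left offset -- a common way to push text
# off-screen while keeping it in the HTML.
OFFSCREEN = re.compile(
    r"position\s*:\s*absolute[^}]*(?:top|left)\s*:\s*-(\d+)px", re.S
)

def flags_offscreen(css: str, threshold: int = 500) -> bool:
    """Return True if any rule moves an element more than `threshold`
    pixels off-screen, a hidden-text candidate."""
    m = OFFSCREEN.search(css)
    return bool(m) and int(m.group(1)) >= threshold

print(flags_offscreen("div.hidden{position:absolute;top:-800px;left:-800px;}"))
print(flags_offscreen("div.menu{position:absolute;top:10px;left:10px;}"))
```

Of course, a regex pass over stylesheets would only catch the crude cases; it says nothing about stylesheets blocked by robots.txt, styles set via JS, or legitimate off-screen positioning used by DHTML menus, which is exactly the crossfire problem discussed below.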
It is a tough call with dynamic menus that use a hidden div to hide content until a user activates that content and then the div becomes visible. This same process is used to hide text and that is what I'm thinking Google is referring to. There may be many people who get caught in the crossfire if this is something automated.
I have purposely stayed away from using those types of navigation menus just because of the "hidden" aspects of the <div>s. I have many clients asking me why I won't set up menus like that. Well, maybe I can show them after a few updates. I'll take them back to the same sites they showed me as an example and we'll see if they were affected.
I must say, I am extremely curious to see how this filter is applied... there are going to be lots of newbie threads in this forum asking what happened to their site.
Hide the div with CSS and voilà!
Actually I think this is what Google is referring to. I don't see much hidden text anymore, you know, white on white or whatever colors are used.
Some of the more advanced on the edge marketers might be employing the above hidden <div> strategy. There are different ways to go about this, all with the same results, hidden text.
I've reviewed quite a few sites over the past year and have seen this in use and it was working then. Sounds like G has received too many complaints and is now going to take action. The problem is, there are situations where hidden <div>s are used from a design standpoint such as the dhtml menus that I refer to above.
Matt (apparently) stated that just because you make it through an update, it doesn't mean that you are safe. This tells me that the filter is not applied to all the sites crawled at the time of the crawl. I suspected that this would be the case, as that would be very computationally intensive.
If they are not going to be hitting every site at the time of the crawl, they could very easily be checking sites constantly from systems with user agents and IPs other than those used by googlebot.
They do not necessarily have to honour robots.txt exclusion protocol if they do not use a bot. It could very well be a "browser" that renders the entire page, CSS, JS and images. This browser could be monitored by a person and fed the URLs to check out automatically. One person could "check" thousands of pages an hour this way. Google never claims that there will not be a person involved, they just say they are implementing a filter that checks for hidden text algorithmically.
If the hidden text filter is fetching new copies of the page instead of working from the cache, they could easily add an additional check against what is in the cache to look for cloaking.
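A toy illustration of that cache-versus-fresh-copy idea (purely an assumption about how such a cross-check might work; `fingerprint` and `looks_cloaked` are invented names):

```python
import hashlib
import re

def fingerprint(html: str) -> str:
    """Crude content fingerprint: strip markup, normalise whitespace
    and case, then hash. Equivalent copies hash identically."""
    text = re.sub(r"<[^>]+>", " ", html)
    text = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(text.encode()).hexdigest()

def looks_cloaked(cached_html: str, fresh_html: str) -> bool:
    """Flag a page whose freshly fetched copy differs from the copy
    the crawler cached -- a classic cloaking signal."""
    return fingerprint(cached_html) != fingerprint(fresh_html)
```

A real system would need fuzzier matching (legitimate pages change between crawls), but the basic comparison is cheap once you're re-fetching the page anyway.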
The point I was trying to make is that you should not consider robots.txt exclusions, CSS or JS tricks to keep your site safe.
To my thinking it would be relatively easy to detect hidden divs automatically but other techniques might be harder unless they are actually going to use a human to review sites that have been reported.
[fireflyfans.net...]
Is a classic example.
Also - there are tons of ways webmasters make sites. Some use Dreamweaver and FrontPage.
You can have cells inside columns inside other tables with different backgrounds and the like. Any change to one of these can make or break whether a word is visible (in some browsers).
They don't work the same all the time. Imagine a website where the webmaster has black text in a table. He links some of the text and now it is blue [because it is linked].
He then changes the background to black - he now has black-on-black text - but doesn't realize it because his links are blue. He then unlinks one of the words as he is no longer interested in it, but forgets to delete the word - as it is now invisible to him.
Sure - people can say this is unlikely - I see stuff like this all the time - people forget things they can't see. Google will take thousands of innocent sites - all so the people that whine and complain will shut up.
Soon these people will fix their sites - and the whiners and complainers will be back, but with nothing to whine about....
And of course - it doesn't take a genius to put black text on top of a black gif - no filter is going to be able to catch that. This is just a tremendous waste of time to appease those that think invisible text is important to begin with.
Text on top of an image is easy. You just render the page and compare the color of the text bits as they are placed in relation to the background that they are placed on.
This is the same way that you look at text with CSS inside tables inside divs ......
Putting checks on a rendering engine is a lot of work, but it would be the most accurate way to do this.
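The color-comparison idea can be sketched like this (a hypothetical check on already-resolved colors; a real rendering engine would first have to resolve the full CSS cascade, background images and z-order, which is the hard part):

```python
def contrast_ratio(fg, bg):
    """WCAG-style relative-luminance contrast between two RGB colors."""
    def lum(rgb):
        def chan(c):
            c /= 255.0
            return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
        r, g, b = (chan(c) for c in rgb)
        return 0.2126 * r + 0.7152 * g + 0.0722 * b
    l1, l2 = sorted((lum(fg), lum(bg)), reverse=True)
    return (l1 + 0.05) / (l2 + 0.05)

def is_invisible(fg, bg, threshold=1.05):
    """Text whose color barely contrasts with its effective background
    is a hidden-text candidate (white-on-white, black-on-black, etc.)."""
    return contrast_ratio(fg, bg) < threshold
```

Note this only works once you know the *effective* background at the text's position - which is exactly why text over a same-colored .gif defeats any check that doesn't actually render the image.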
As people have mentioned, outside a hand check, who can tell the difference between a background .gif and text?
How does your bot read a CSS file called by JS? (common DHTML)
How does your bot deal with alternate style sheets? (common DHTML)
How does your bot deal with @import? (very common DHTML)
How does your bot deal with sliding menus? (common DHTML)
IMO it's far too much effort for far too little effect. Far more effective to follow the usual Google methodology and blow smoke...
It makes sense to suggest to a robot that it should not index your CSS, CGI, admin or other directories; that should be seen as helpful.
But robots.txt-ing out CSS directories is useless if you are only trying to "hide" CSS invisible-text tricks from search engines.
[edited by: chiyo at 5:57 am (utc) on April 30, 2003]
js can also be used to "hide links" if not the text itself. Was any comment made on js as a "hidden text" or "hidden links" method?
[edited by: chiyo at 6:00 am (utc) on April 30, 2003]
All the text I want Google to see, absolutely positioned.
An image, table, anything solid placed over the top using z-index and absolute positioning.
It would never be able to be detected. Even a hand check would be hard pressed to see the problem.
Much ado about nothing if you ask me. Google would never do anything for a few hundred (thousand) people who complain that would create several thousand (million) complaints that perfectly innocent sites were banned.
Spam in one form or another is here to stay. You just have to beat it. And remember the common SEO saying that it's spam if your competitor is ahead of you; it's great SEO if you're ahead of your competitor.
This is what I tell the guys who work for me. Stop crying, start optimizing. Good sites rise to the top... eventually.
Yes, it's common, and a good way to motivate staff and get them concentrating on positives and negatives - but wrong.
It could be that the site is actually better in non-SEO areas like content, usability and value.
we all have a responsibility as people who make a living out of the web to ensure that what the actual searchers see in SERPS are useful and as devoid of spam as possible. Otherwise people will just stop visiting and using the Web and nobody will read your stuff anyway.
Trying to out-spam or out-SEO your competitors may work for a short time, but not for the long term. The logical long-term result is more spam, or more sites that are there because of good SEO rather than good utility, value, quality or content.
That's why I report what I see as spam - but leave it up to the SE to decide, for their purposes, whether it really is. Yes, I'm a white hat, but for good reason other than being a complainer. We have to approach it both ways.