We used CSS to change the colour of the text on one site. It started off the same colour as the background, inside a layer; the style sheet would then throw the layer to the front and change the text colour to make it visible. It was part of the design. The site has over 200 pages indexed. Will this get burnt?
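A rough sketch of the setup being described, assuming a hypothetical `#intro` id and illustrative colours (not the actual site's code):

```html
<!-- Hypothetical sketch: the layer starts with text the same colour
     as the background, sitting behind the page -->
<div id="intro" style="position: absolute; z-index: -1; color: #fff; background: #fff;">
  Welcome copy
</div>
<style>
  /* the style sheet then throws the layer to the front
     and changes the text colour so it becomes visible */
  #intro { z-index: 1; color: #000; }
</style>
```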
So far so good.
But what about the text I am hiding from modern browsers? Something like:
"This site's design is only visible in graphical browsers that support web standards...."
I do hide text like that in browsers that can handle @import, and it will show up in browsers that don't. So I have a linked CSS file with something like:
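A minimal sketch, with a hypothetical `.css-note` class standing in for the actual selector:

```css
/* linked.css -- loaded via <link>, so every CSS-capable browser reads it.
   The notice text is visible by default. (.css-note is a hypothetical class.) */
.css-note {
  display: block;
}
```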
and an imported CSS file with something like:
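A matching sketch for the imported file (same hypothetical class name):

```css
/* imported.css -- fetched only by browsers that understand @import,
   so only they hide the notice; older browsers never see this rule */
.css-note {
  display: none;
}
```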
There it is: "display:none;". Well great, there must be a thousand other legitimate reasons for "display:none;".
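For instance, one everyday legitimate use is hiding screen-only navigation in a print style sheet (selector illustrative):

```css
/* Hide the navigation bar when the page is printed --
   a perfectly legitimate use of display:none */
@media print {
  .site-nav {
    display: none;
  }
}
```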
I saw suspicious activity on my counter stats last week.
On a few pages I had this:
<DIV style="LEFT: 0px; VISIBILITY: hidden; WIDTH: 650px; POSITION: absolute; TOP: 0px; HEIGHT: 1px">some copy</DIV>
When checking counter stats, I saw the pages that carried this code (about 5, for 5 different keywords) being hit with 4,000 clicks in one day. Usually, hits max out at 200/300.
On a closer look, there was only one referral URL for the hits. It was the URL for the search result page for each specific keyword (all page one results).
Counter company killed the referrals. After I complained, they came back as 'bookmarks'. The URL in question was nowhere around.
I have never had 4000 hits from just one single referring URL. Could have been a hitbot, but too much of a coincidence (time of occurrence and pages using Div:hidden).
Needless to say, the style is gone.
He said things would already be starting to happen as of a week ago. (would that be immediately after the deep crawl?).
I am sure they will start out with the obvious stuff and see how the web world reacts.
Basically, it was proposed that the penalty will be a 30-day thing, until correction.
PageRank and crawling will not be affected by the hidden text penalty (as I see it, they have to crawl to see your corrections, and once corrected, normal PR is back).
That means that rather than dedicating a bunch of humans to manually look at all the sites that are using hidden text, they will dispatch a separate bot (more than likely posing as a browser) to crawl the sites in question. A separate algo will be used to parse the pages (including external style sheets).
If a violation is found, the site will automatically be dropped. Once it is dropped, the bot will revisit on occasion to check and see if the offending content has been removed. If it has, the site will be reincluded automatically.
I have nothing to hide on my clients' sites after reworking them to my criteria. If a potential client has these "attributes" on their site and insists on keeping them, then I decline the contract.
I welcome the new algo (as rumoured) providing Google safeguards the innocent better than they have when introducing previous innovative changes.
Presumably, WebGuerilla, this would only be used on sites which have been reported for spam?
Yes, that is how it was explained to me. (Although I do seem to remember the words "for now" being thrown in).
It would not be used on sites unless they are reported. Is this correct?
I wouldn't go quite that far. There are certain high-crime neighborhoods on the web that could certainly see some random patrols. But for the most part, I think the goal is to combine the power of human spam reports and automated technology.
If this is true then User Agent & IP cloaking goes out the window :(
I'm not sure I follow what you mean. But I think the thing to keep in mind is that the goal really isn't to completely prevent any pages using shady techniques from entering the database. The goal is to lessen the negative impact those pages may have on the user experience.
The biggest single thing that contributes to meeting that goal is to reduce the average shelf-life of content that doesn't meet your standards.
So even if the automation is only applied to content listed in a spam report, it will still have a huge impact, because the turnaround time between spam being reported and spam being removed will be dramatically shortened.
Does this mean there'll be no manual spam checks by Google?
No. It just means that humans who are checking will have some powerful tools at their disposal to speed up the process.
It would be great if Googlebot could parse JS.
According to Matt at the conference, you shouldn't assume that they cannot parse it, at least with respect to links anyway.
From what I can recall, he mentioned that Gbot will read through it all as if it were plain text. Obviously, any time you see "http://" you know that, if it's not a link, it's at least a reference to something.
Should be interesting how all of this plays out though. Online time will, and should, tell.
Who talked to matt about guestbooks?
Could be true, but I can't see any changes to the positions of the linked pages yet. Because of my observation that some sites are continuously doing amazingly well with guestbook-only links and hidden text/links to the max, I'm really curious to see the spam detection plans become reality. After the last update I observed a big drop in guestbook backlinks for the above-mentioned sites, but no drop in positions at all (even the opposite)... This could mean that many guestbooks are now PR0, but the full impact on the positions of the linked pages needs some more update(s)...
vitaplease and WebGuerilla said that some filters (hidden text / CSS / layers) are already active, if I understand correctly, right? Is this confirmed by others? And, again, did Matt or any other Google rep say something about guestbooks? vitaplease, did Matt explain what the changes were in January?
No, he only noted that it seemingly came about unnoticed.
I can still see sites with 900-plus guestbookish backlinks doing well (even after the last update with its general PageRank drop), but then there could always be other normal links in play as well.
>>vitaplease and WebGuerilla said that some filters (hidden text / CSS / layers) are already active
Only relaying what I thought Matt said, something to the effect of:
"And you will see these effects taking place as of approx. a week ago".
I saw a few sites drop from the serps last week, real spammers. They also got greybarred.
As I have mentioned in other threads, I have been seeing different results on www-sj serps for my keywords than all of the other datacenters since the last dance (interestingly, these serps only appear on Yahoo! and Alexa).
The main difference that I can see between these serps and the normal google.com serps is that the blatant spammers are nowhere to be seen. I have a feeling that they may have unleashed the spambot on these serps (at least the ones for my keyphrases!), or at least that something quite remarkable was done to them with regard to spam.
I've found in the past that Google has been slack at updating sites, and has cached the old versions of websites when they moved from a fixed IP to a virtually hosted server. In this respect Google could simply check for the existence of the URL in its database (thereby filtering out all the spammers who would no doubt see this as a quick route into the Google index) and, if it is present, reset whatever data it has in its database to look for the new DNS data.
If I have made a mistake in the terms I've used above, you'll note that it is because it is getting near server/DNS maintenance time and I'm no pro at that.
I did not discuss hidden css layers with Matt.
During his panel comments to the entire audience he said that the filters would start hitting a week to week and a half after the conference.
A note about punctuation links: I don't believe Google has dealt with these previously as Matt thought. I know of a few well-ranked sites that are still using this old technique. Frankly I think the technique offers no value these days due to the importance of anchor text in the link. Nonetheless, some people haven't gotten around to changing it.
My conversation with Matt was very brief. WebGuerilla appears to have had a much more in depth discussion (see his comments earlier in this thread).
I think the spam report page is going to see an increase in traffic. Instead of someone knowing of a real abuse and reporting it, I can imagine some people submitting all of their competitors' websites just to have the Google spambot visit them and maybe get lucky and find something.
I have had people submit sites for my directory that have index pages over 800 lines long, jammed full of this type of stuff. Technically, the stuff isn't HIDDEN, just placed not to display.... sort of like stuffing the <noframes> section of a frames page.
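For reference, the `<noframes>` trick mentioned above looks roughly like this (content illustrative):

```html
<frameset cols="100%">
  <frame src="main.html">
  <noframes>
    <!-- Only non-frames browsers and crawlers render this block;
         the stuffing is never shown to a normal visitor -->
    keyword keyword keyword ...
  </noframes>
</frameset>
```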
In the end, it is an attempt to make googlebot see things that the public doesn't see, and should be punished.
There really isn't any way to know at this point. However, I think this thread shows how difficult building such a tool actually is. There are many different ways to hide stuff. And there are also many different legitimate design elements that could easily get mistaken for an attempt to spam.
We won't really know if it will be a problem or not until we have some examples of sites that have been evaluated by the SpamBot.
Now if you'll excuse me, I'm off to build a site with every hidden text trick I know of so that I can report it and see what happens!
Seriously, this idea isn't so bad. Better than switching completely over to the dark hats and changing all your sites to beat (what I call) spammers by using their tactics. I have said multiple times that I'd start to spam if spam seems to win over no-spam. Although I really don't mean this 100% seriously, I'd try to build a test site instead (something like sj, lol) and see if this really is the solution. Sadly, that's one possible way... I guess that's how today's #1 (in my field, not in general!) started. Build a site, see how far you can go with Google and how long you can survive with it. Then go for it or don't. Sigh.
FYI: what really frustrates me, after more than 12 months of observing the results within my field and putting a terribly huge effort into building quality sites, is the fact that building a site that gets to #1 without the guidelines in mind takes 0.1% of the time and 0.1% of the work, and obviously doesn't bring you that much trouble... I don't speak about the sj results, and no, I'm not a loser; sometimes my sites are listed above spammers. But it took me much, much more work than just putting my knowledge into a dark hat and building a #1 with a few clicks and a nice guestbook-signing script.
Despite all the discussion about the definition of spam, I see people who are really skilled in SEO building quite "intelligent" artificial web site structures that are pure spam, stay around for a very long time, and beat quality sites... *and* receive traffic that should go to others. My further fear is that if Google can't successfully fight spam, they'll someday change to PFI (pay-for-inclusion) and cut all free traffic. You can see these changes at all of the other (formerly major) search engines.
So I'm asking myself again and again (even after a year of WebmasterWorld'ing): should I or shouldn't I? Will Google nevertheless stop free traffic one day, and so should I take all I can get now? I'll bite myself in the a** if that becomes true...
Please, veterans, tell me to cool down and give me some of your patience.