I'm hoping that a "false positives" period after any particular change isn't too extreme. At any rate, if I do see any dramatic ranking drops this coming year, high on my checklist will be "Google suspects cloaking".
Headers & redirects? Sounds like it might be cloaking for the sake of link juice. If that is the case, will Google penalize the site with the 301 redirect or the site that it points to?
Maybe it's also related to this post [webmasterworld.com] about the hacked site messages in SERPs. When I tried the example search in the post (the site:.edu example), it looked like several of the resulting sites had a referrer-based header redirect exploit in place, since you could type the URL into your browser and it was fine. It's obvious Googlebot sees something else. I thought that kinda stuff was already well addressed by Google years ago.
All I can ask as a Webmaster is for better clarification of what Google defines as malicious cloaking and redirecting.
- What if I send a mobile user to a mobile site?
- Is cached content vs. live content treated differently? (Random visitors see a cached copy, logged-in users never see cached content; does the Google toolbar report the size difference, etc.?)
- Geo based ads?
- What if an advertisement link is nofollowed and redirected through a tracking tool that uses a header redirect?
etc etc... just a bunch of ways a Webmaster could inadvertently pee in Google's Cheerios without knowing it! I know it's really judged on the spirit of intent.
|I thought that kinda stuff was already well addressed by Google years ago. |
Me too, but the increasing amount of spam in the SERPs may be a sign that their historical solution needs a tune-up, something more appropriate for Caffeine. Spammers seem to have found some kind of loophole.
First thing in my head when I saw this was the WordPress pharma hack.
That sounds like a very worthwhile target, netmeg. That hack in its many variations has been a plague for most of 2010. Here's a good resource for webmasters: How to Diagnose and Remove the WordPress Pharma Hack [pearsonified.com].
The devious cloaking of both the title element and the links in the content area is worth some of Google's resources.
|What if I send a mobile user to a mobile site? |
Here's a related discussion on mobile sitemaps - and the difference between a new smartphone page and a legacy mobile website: [seroundtable.com...]
This does look like an area of potential chaos and false positives. Even the BBC's UK webmaster has questions.
|What if I send a mobile user to a mobile site? |
Just hope they have a process in place to reinclude a site when their algo does a false positive and drops the site. I foresee a problem for those that have mobile versions based on the fact that Google is quick to release buggy code and deal with the fallout after the fact.
|...Google is quick to release buggy code and deal with the fallout after the fact. |
I've noticed that a lot with almost everything Google does these days. Reminds me of the old Windows 95/98 days when it seemed like MS used the public to test their beta updates. At least back then, the worst-case scenario was maybe a blue screen or restoring a backup after an update went bad.
But now with Google's irresponsible business methods our websites are at risk when we follow their recommendations that weren't well thought-out beforehand (by them). It can almost put someone out of business if they rely heavily on Google traffic, as some folks attest to here in the forum from time to time.
Google is the one who should be more heads-up.
I do appreciate this note from Matt Cutts - but Twitter might not have been the ideal place for it. Maybe he'll write a blog post.
Any algorithmic treatment of a complex area is bound to have some false positives out of the box. So in some sense, it always requires using the general population as a beta test bed.
I think one of the worst algorithm efforts in recent times for generating false positives was the heuristic Google tried for automatically identifying link buyers and sellers. I hope that cloaking should be an easier goal, and the pitfalls should be easier to catch during R&D.
One that concerns me is the practice of adding a welcome message to a user based on the referrer - "welcome Google user", "welcome Twitter user" and the like. An even heartier example would be the way WebmasterWorld captures the referring keyword on Google and adds highlighting to the page as a visitor aid.
I'm going over any site I work with that somehow customizes the content - by referrer, by visitor, and especially by user-agent or IP address - to make sure that googlebot and googlebot-mobile get straightforward handling, the same as the visitor with a common browser.
I wonder if Gatekeeper IP blocks in the htaccess will have any adverse effect as well. We block a few countries (ones we feel are questionable) from one site that is designed to cater only to North American traffic anyhow (US/CAN shipping only).
Googlebot has free rein... but I can understand how it would be seen as "One thing for Googlebot, another thing for some set of users"
mhansen, that is another type of common situation that's on my radar. What response do you give to those banned IP addresses?
If googlebot doesn't specifically come from a banned IP, how will it even know it's banned?
We ban a lot of IPs due to hack attempts and more often than not, they are foreign. The server does it automatically with BFD at the server level.
If I've got to open up my servers to possible hackers just to make Google happy, I'll just go forward without any Google traffic.
|If I've got to open up my servers to possible hackers just to make Google happy... |
Good news, you don't. Just treat the googlebot user-agents the same as any regular visitor. And to help protect your server further, you can take the steps described in this thread: How to Verify Googlebot and Avoid Rogue Spiders [webmasterworld.com]
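For anyone who wants to see roughly what that kind of check looks like, here is a minimal sketch in Python of the usual reverse-then-forward DNS test. The function name, the injectable lookup parameters, and the suffix list are my own assumptions for illustration, not something taken from the linked thread, and it only handles IPv4.

```python
import socket

# Assumed suffixes for genuine Googlebot reverse-DNS names.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_verified_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Return True if ip reverse-resolves to a Google host that resolves back to ip.

    The two lookup parameters are hypothetical hooks so the logic can be
    exercised without touching the network; by default real DNS is used.
    """
    if reverse_lookup is None:
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]

    # Step 1: reverse DNS on the connecting IP.
    try:
        host = reverse_lookup(ip)
    except OSError:
        return False

    # Step 2: the name must sit under a Google crawler domain.
    if not host.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
        return False

    # Step 3: forward DNS on that name must lead back to the same IP,
    # otherwise anyone controlling their own PTR record could fake it.
    try:
        return ip in forward_lookup(host)
    except OSError:
        return False
```

A request that merely claims a Googlebot user-agent but fails this check can then be handled like any other unknown client, while real visitors and the real crawler keep getting identical treatment.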
I often read between the lines when Google gives us a heads up. It always comes across as a warning to me, but why give any form of warning if the technology exists to actually do something about it.
The warning in itself, might be enough to eliminate at least part of the problem.
|It always comes across as a warning to me, but why give any form of warning if the technology exists to actually do something about it. |
If this is about what I think it's about, the warning is being given because there's a gray area that's existed, most likely with Google's knowledge, and Matt's being nice about it now that Google has decided to address the situation. I think it is about distribution of link juice.
Whoops, sorry guys, I no longer look to you for any kind of traffic, so your heads-up falls on deaf ears!
Parsing Matt's tweet (short as it is) it looks like 4 issues g will fiddle with. I'd be happy with ONE TRANSPARENT fiddly at a time. :)
So it can be dealt with.
That said, I just run a website (mine... my clients are a different kettle of fish and most of the time they don't listen to me anyway) with no perks for smartphones, pads, humongous screen res or other twaddle. K.I.S.S. Then again, it is not a bread and butter site so ignore me for all above except: It would be nice if google revealed which hammer and chisel is about to be applied. We know why they don't... give too much info and the bad boyz will figure out a way to scam it.
|What response do you give to those banned IP addresses? |
We return a 403.
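As a rough illustration of that setup (not our actual config, which lives in htaccess), here is a small Python sketch. The banned ranges are documentation-only placeholder networks and the function name is made up; the point is that the decision looks only at the IP, never at the user-agent, so Googlebot gets the same rule as everyone else.

```python
import ipaddress

# Hypothetical banned ranges (reserved documentation addresses, not real offenders).
BANNED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def status_for(ip):
    """Return the HTTP status for a request from ip: 403 if banned, else 200."""
    addr = ipaddress.ip_address(ip)
    if any(addr in net for net in BANNED_NETWORKS):
        return 403
    return 200
```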
|If googlebot doesn't specifically come from a banned IP, how will it even know it's banned? |
I'm not sure on this, but the way I assume many of these are found is through the toolbar.
- Googlebot sees a 100k page, 26 links, 6 images, etc
- Visitor A Toolbar, Sees a 100k page, with 26 links, and 6 images, etc
- Visitor B Toolbar, Sees a 2k page, no images, no links, no content.
Just a guess though...
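To make that guess concrete, here is a sketch in Python of how two views of the same URL could be compared by size, link count, and image count. This is purely my illustration of the idea; nothing here is known to be how Google or the toolbar works, and the names and the 50% tolerance are arbitrary assumptions.

```python
from html.parser import HTMLParser


class _TagCounter(HTMLParser):
    """Counts <a> and <img> start tags in a page."""

    def __init__(self):
        super().__init__()
        self.links = 0
        self.images = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.links += 1
        elif tag == "img":
            self.images += 1


def fingerprint(html):
    """Summarize a page as byte size, link count, and image count."""
    parser = _TagCounter()
    parser.feed(html)
    return {
        "bytes": len(html.encode("utf-8")),
        "links": parser.links,
        "images": parser.images,
    }


def looks_different(a, b, tolerance=0.5):
    """True if any metric differs by more than tolerance, relative to the larger value."""
    for key in a:
        largest = max(a[key], b[key])
        if largest and abs(a[key] - b[key]) / largest > tolerance:
            return True
    return False
```

With something like this, the Googlebot view and Visitor A's view would match, while Visitor B's near-empty 403 page would stand out.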
One issue that Google needs to address is cloaking penalties for websites that give different layouts for mobile users versus regular browsers.
Will that result in a cloaking penalty?
Many forums are now offering 'Mobile' skins that automatically recognize mobile user agents.
The mobile skin is a stripped down version of the default skin without extraneous links to forum functions.
Will this cause trouble?
Can you elaborate a bit on what you mean by 'distribution of link juice'?
|Will that result in a cloaking penalty? |
No, I don't think the problem is mobile versus regular browser versions of a site. It's more about serving up totally different content.
I believe we may also see more crawling activity by Google where they don't identify themselves as Googlebot. That's probably the only way they're going to catch the folks that are really cloaking.
|I'm hoping that a "false positives" period after any particular change isn't too extreme. |
It will be. It always is. I'll have to avoid webmaster forums for a few months once this kicks in as there will be the usual deluge of people screaming and crying to no avail.
P.S. Google: I currently serve different content to Googlebot on several sites. That's because Googlebot tends to come from a US IP address and my content is altered depending on what country the visitor is coming from. The ads change, some of the links change too. It's called "providing a better user experience" ... you will undoubtedly call it "cloaking" if the IP addresses you use to check my sites conflict with my database of geo-ips ...
|One issue that Google needs to address is cloaking penalties for websites that give different layouts for mobile users versus regular browsers. |
I think 'everyone's doing it because they have to' implementations are probably safe...
As a search engine, they really have to get that piece right to return a large number of high-profile (and 'everyday') sites in the results, so I would guess they have taken that into account.
I think one of the problems people have with understanding these types of statements is not thinking 'What do they have to "let through" to have a chance at getting the results right?' or 'What eliminates too many sites (pages) from the results to be advantageous?'
If people ask themselves those (or similar) questions more often I think they'll have a better understanding of what is acceptable and what is not acceptable to do on (or with) a site.
frontpage: Not trying to pick on you at all; just share an example of how I try to look at things a bit differently to 'keep it reasonable' because IMO that's what Google has to do... There will obviously always be 'oopses' from any change, but eventually, they have to 'get it right' in some aspects or they won't have a search engine any more and I think not penalizing a site for having a mobile version is one of those areas they have to 'get right'.
|No, I don't think the problem is mobile versus regular browser versions of a site. It's more about serving up totally different content. |
My question is technical.
Is the Google algorithm sophisticated enough to realize that a stripped down "Mobile" index.php page with minimal internal links at 4kb versus the normal browser version of index.php with multiple links, images, and 12kb in size is not cloaking? You can't say the content is similar.
If it sees two different page sizes and a different linking structure, is it smart enough not to penalize the great unwashed masses?
Well the only thing I have to worry about is if Google's "test" is correct. We don't cloak.
What I was saying is it pretty much has to be...
And you're really not showing two different versions of content to visitors and Google, IMO, you're showing two different versions to both Google and your visitors, which, again IMO, is a HUGE difference... You're doing it the right way.
I think you would get into way more of an issue if you showed Googlebot and Googlebot-Mobile the 12k page and redirected mobile phone visitors to the 4k mobile version of the page... Then you would be showing Google one version of the content and visitors another. You're not doing that if you show the mobile version to the mobile bot and the regular version to the regular bot, you're showing Google both versions of the content visitors can see.
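Here is a minimal Python sketch of what I mean by doing it the right way. The token list and function name are assumptions for illustration, and real device detection needs a much fuller list. The thing to notice is that there is no special case for the crawler: Googlebot-Mobile falls into the mobile branch by the same rule a phone does, and regular Googlebot gets the desktop page like any desktop browser.

```python
# Hypothetical substrings used to spot mobile clients.
MOBILE_TOKENS = ("mobile", "iphone", "android", "blackberry")


def choose_variant(user_agent):
    """Pick 'mobile' or 'desktop' from the user-agent alone, bots included."""
    ua = user_agent.lower()
    if any(token in ua for token in MOBILE_TOKENS):
        return "mobile"
    return "desktop"
```

The risky setup would be an extra branch that checks for "googlebot" first and always returns the desktop page, because then the mobile crawler would see something mobile visitors never get.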
|"No, I don't think the problem is mobile versus regular browser versions of a site. It's more about serving up totally different content. " |
To be honest I thought Google were reasonably adept at handling this - given some smallish tests I have run on this - certainly not any industrial strength testing.
Can't help thinking the focus should be diverted elsewhere - certainly greater analysis of link profiles wouldn't go amiss.