|Would we get dinged for this?|
| 3:56 pm on Aug 22, 2009 (gmt 0)|
We require login for some pages but want those pages crawled. We were thinking of detecting the user agent and allowing search engine bots to view the full page, while requiring real users to log in to see the content.
Not sure if that falls into the whole cloaking realm or not.
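The gating idea described above could be sketched roughly like this. This is a minimal illustration only; the function name, and the short substring list (taken from the bot names mentioned later in this thread) are my own assumptions, not a vetted or complete list:

```python
# Minimal sketch of user-agent-based gating (illustrative only;
# the substrings below are assumptions, not an authoritative list).
KNOWN_BOT_SUBSTRINGS = ("googlebot", "slurp", "msnbot", "teoma")

def serve_full_page(user_agent, logged_in):
    """Serve full content to known crawlers or logged-in users;
    everyone else would get the login prompt instead."""
    ua = (user_agent or "").lower()
    is_bot = any(token in ua for token in KNOWN_BOT_SUBSTRINGS)
    return is_bot or logged_in

# Example checks:
print(serve_full_page("Mozilla/5.0 (compatible; Googlebot/2.1)", False))  # True
print(serve_full_page("Mozilla/5.0 (Windows NT 6.1)", False))             # False
```

Note that this matches on the user-agent string only, which anyone can spoof; it is a sketch of the idea in the post, not a robust bot-verification scheme.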
| 5:07 pm on Aug 22, 2009 (gmt 0)|
It is cloaking in the broadest sense, and many people tend to over-simplify when talking and posting about cloaking online. But if it's not an attempt to mislead the search engines, search engine users, or visitors to your site, then it's not against the search engines' rules -- as long as you're up-front about requiring a log-in or pay-to-play.
User-agent-based content delivery is very common on sites which serve separate content to mobile devices and desktops -- Sites such as Google, Yahoo, MSN/Bing, Ask, etc. :)
Requiring an account in order to "view the full article" is very common on news sites, although less so now than in the recent past. Users -- including me -- tend to get mad when everything they find in the search results leads to sites that require an account (whether free or not) to view anything, and tend to go right back to Google and search again or select a different search result. This behavior will show up in your pages' bounce rates, which Google tracks.
So be very careful not to put too much content 'behind the wall,' or to try to get too much protected content indexed in search. Bear in mind that if half of your visitors from Google refuse to create an account, then Google will see a 50% bounce rate, and therefore decide that your site must be pretty useless -- even if your content is great and highly-relevant to the search terms used to find your page. If not done very judiciously, requiring an account can have a bad effect on your pages' rankings.
So caution is indicated here, but not because of the user-agent-based content delivery (cloaking) per se.
| 5:39 pm on Aug 22, 2009 (gmt 0)|
OK, so I should be safe then. Thank you.
To address your concern: our entire site is pretty much open for free. We have several sections, and anything that's really pertinent requires no login. We only keep one forum locked down like that, as a teaser, and it's our general discussion / off-topic forum. If someone is looking for something specific like an event or a how-to, then more than likely it'll be available. If someone is talking crap (excuse the language) about someone else, then they're required to log in to see the juicy drama. That's not a bad deal, right?
Follow-up question: I will Google this when I have more time, but if you happen to know where I can find a quality list of good search engine agents, that would be great.
thanks again for your reply!
| 7:59 pm on Aug 22, 2009 (gmt 0)|
> to address your concern
Not really my concern, since the answer wouldn't change regardless of the site... :) Someone else reading this thread in the future may have a site that differs markedly from yours (or mine).
I tried to answer in depth because it's important to be able to take a step back and evaluate your site, content, and links objectively. Sadly, this is the ability most lacking among Webmasters taken as a group, and the reason we see a lot of complaints directed at search engines in posts like, "Google banned my (spammy, copied-misspelled-stolen-content, stuffed-with-keywords, and plastered-with-PPC-advertising) site for no reason! - No reason at all!" A degree of detachment and pragmatism is needed.
> a quality list of good search engine agents
Well, it would start with googlebot, Slurp, msnbot, Teoma, etc.
If you want to get more specific, then a bit of time spent mining your server access logs for the past year will likely give you a fairly good list of these 'bots and their many incarnations.
That's really the problem: It's possible to use just the single, general names as in my short list there, or to find dozens of variations of many of those 'bots; MS and Yahoo especially favor the 'bot du jour approach, apparently exercise little control over naming conventions used by their various dev groups, and therefore tend to unleash multiple, inconsistently-named 'bots at a fairly high rate. (In fact, I had a laugh just yesterday when I got some logged requests from a 'kitchen sink' 'bot from Yahoo, incorporating just about every 'bot name they (or their predecessor Inktomi) had ever used.)
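Mining the access logs as suggested above might look roughly like this. This sketch assumes the common Apache/nginx "combined" log format, where the user agent is the last quoted field on each line; the hint substrings are rough heuristics of my own, not an exhaustive filter:

```python
import re
from collections import Counter

# Assumes the "combined" log format: the user agent is the last
# double-quoted field on each line.
UA_RE = re.compile(r'"([^"]*)"\s*$')
BOT_HINTS = ("bot", "slurp", "crawl", "spider", "teoma")  # rough heuristics

def bot_agents(log_lines):
    """Count user-agent strings that look like search engine crawlers."""
    counts = Counter()
    for line in log_lines:
        m = UA_RE.search(line)
        if not m:
            continue
        ua = m.group(1)
        if any(hint in ua.lower() for hint in BOT_HINTS):
            counts[ua] += 1
    return counts

sample = [
    '1.2.3.4 - - [22/Aug/2009:15:56:01 +0000] "GET / HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '5.6.7.8 - - [22/Aug/2009:15:57:10 +0000] "GET /a HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (Windows NT 6.1)"',
]
print(bot_agents(sample))
```

Run against a year of real logs, the resulting counts would surface the many inconsistently named 'bot variants described above, which you can then review by hand.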
| 8:42 pm on Aug 22, 2009 (gmt 0)|
Thanks, Jim. That's some quality info; it gave me a lot to think about.
| 5:21 am on Aug 23, 2009 (gmt 0)|
If you're letting search engines see your "login" content, you're also setting yourself up for caching.
"I have to log in for this? Forget that, I'll just go back to the search results and view the page cache."
| 2:17 pm on Aug 23, 2009 (gmt 0)|
Yeah, I don't really care if people view the content. If they use the cache, more power to 'em. We don't have anything to hide; it's just a minor hurdle to entice someone to join the party. You know, for the everyday lurkers.
| 1:21 pm on Aug 24, 2009 (gmt 0)|
I don't suppose there's a maintained, up-to-date list of agents somewhere that we can query or something?
| 11:28 am on Sep 24, 2009 (gmt 0)|
You might start here:
Search Engine Spider and User Agent Identification forum Charter [webmasterworld.com]