It's well documented that the New York Times does exactly this... and Google allows it.
However, I'm not sure whether Google is making a special exception for the NYT, or whether Googlebot perhaps has a "subscription" to the NYT...
You could always program your cloaking software to give "free subscriptions" to search engine spiders ;)
Well yes, I could :) But I don't want to run the risk of a ban or penalty for our clients.
WebmasterWorld posts are indexed in Google. Click on the search result and you end up at the WebmasterWorld login page (unless of course you are already logged in).
You can test this, just copy some text from an older post and paste it into Google.
Interestingly (and this is something you would want to do) the search results do not offer a cached version of the page.
How do I do this? Is there any special code I should insert in my site? Is there an online tutorial on how to do this?
cowcool, check out the Updated cloaking primer [webmasterworld.com] for information on cloaking techniques. There are several commercial applications available to help you cloak, or you could write your own in PHP, Perl, or some other scripting language.
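To make the basic idea concrete, here is a minimal sketch (in Python, though you could do the same in PHP or Perl) of what such a script decides: serve the full page to a recognized spider, and the login page to everyone else. The bot tokens and page names here are my own illustrative assumptions, and as noted later in this thread, matching on the User-Agent string alone is easily forged.

```python
# Illustrative sketch of User-Agent based cloaking (not production code).
# ASSUMPTION: these crawler tokens and page names are examples only.
KNOWN_BOT_TOKENS = ("googlebot", "slurp", "msnbot")

def is_spider(user_agent):
    """Return True if the User-Agent string looks like a known crawler."""
    ua = user_agent.lower()
    return any(token in ua for token in KNOWN_BOT_TOKENS)

def choose_response(user_agent):
    """Serve full content to spiders, the login page to everyone else."""
    return "full_article" if is_spider(user_agent) else "login_page"
```

Note that this check alone is the weakest form of cloaking; a more robust approach verifies the visitor's IP address, as discussed below.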
I believe it is quite clear from Google's terms and conditions that this is not an accepted white hat optimization. From a user's point of view it can only be seen as spam: they would never have clicked the link had they known the content had to be paid for. The customer is surely right that it is possible, and maybe also right that everyone is doing it, but nothing in Google's terms and conditions would stop them from banning the site if they felt like it. Show them Google's T&C and ask if they still insist on having a cloaked site. There is no way you can guarantee that their site will not be banned, and they will have to take this risk into consideration.
I have a similar situation to that of WebmasterWorld, where the data I would like the search engines to spider is password protected. However, you do not have to pay to access the data; you just set up an account.
Has anybody had any bad experiences with using cloaking for this purpose? Or are there any 'official' guidelines for this from any of the SEs, etc.?
Do WebmasterWorld and the NYT get 'special' treatment, or do the SEs really not find this a problem?
I am not trying to fool the search engines in any way: I have a mass of data I would like to be indexed because it is of use to people; they just have to sign up to access it.
IMO, the best way to accomplish what you are trying to do is to get your hands on a good spider IP list. A great free list is updated at www.iplists.com. Have your members script first check whether the visitor's IP is on the spider list. If it is, pass the visitor through, creating the session or doing whatever your script currently does. If you want to protect your members' area, then I recommend not just checking the visitor's User-Agent string, as that is easily forged.
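The IP check described above might look something like this sketch. The list format is an assumption on my part (one IP or prefix per line, with `#` for comments), so adjust the parsing to match whatever list you actually download.

```python
# Sketch of checking a visitor's IP against a downloaded spider IP list.
# ASSUMPTION: the list format is one IP or prefix per line, '#' comments.

def parse_spider_list(lines):
    """Strip comments and blank lines from a spider IP list."""
    entries = []
    for line in lines:
        line = line.strip()
        if line and not line.startswith("#"):
            entries.append(line)
    return entries

def is_spider_ip(ip, prefixes):
    """Match exact IPs, or prefix entries such as '66.249.'."""
    return any(ip == p or ip.startswith(p) for p in prefixes)
```

In your members script you would load the list once, then call `is_spider_ip()` before deciding whether to demand a login. Matching on the IP is harder to forge than the User-Agent, though keeping the list current is then your maintenance burden.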
Another thing to think about is the fact that almost all engines cache the pages they visit. This would make it "easy" for human visitors to see your protected content simply by clicking the "Cache" link for your page. To protect against this, make sure to use the robots NOARCHIVE meta tag on these pages.
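For reference, the tag that tells Google not to keep a cached copy (and so removes the "Cached" link from your result) is the `noarchive` robots directive, placed in the page's `<head>`:

```html
<!-- Allow indexing, but tell engines not to show a cached copy -->
<meta name="robots" content="noarchive">
```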