Hello. First time post. I hope this is the most appropriate forum.
I have read that Google Site Search is not able to index documents behind a login.
I have also read that, for the general Google index, Googlebot can be allowed to index content behind a login, and that Google requires a First Click Free policy to be in place.
I have read that, for the general index, Googlebot is let through by checking the user-agent of the request.
My question is three-fold: Can the Google Site Search crawl be allowed behind a login by the same method? If so, can the crawled content be added to the Site Search catalog and not the general Google index? Finally, since it is site search, is First Click Free still required if the crawl can be made to work?
If so, can the crawled content be added to the Site Search catalog and not the general Google index?
Heh, that's funny, I was just speculating about the same thing myself yesterday. The sad conclusion was that if I want a site search to include content that isn't available in the general Google search, I'd have to code my own. Site search is basically a more elegant form of the "site:" operator. It looks different to the user, but under the hood it's just ordinary Google search results, constrained to material on the present site (or material from a hand-picked list of sites, if you want to get fancy).
You can include login-required pages in a search engine's index. Details depend on your server, but the underlying concept is "Satisfy Any": visitors have to either log in or be the Googlebot.
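To make the "either log in or be the Googlebot" idea concrete, here is a rough sketch of the decision in Python. It is not actual server configuration: the function names `claims_to_be_googlebot` and `may_view` are made up for illustration, and the check trusts the User-Agent header, which anyone can fake.

```python
def claims_to_be_googlebot(user_agent: str) -> bool:
    """Naive check: does the User-Agent header mention Googlebot?

    Assumption for this sketch: the header is trusted as-is. It can be
    spoofed, so a real setup would want a stronger check on top.
    """
    return "googlebot" in user_agent.lower()


def may_view(user_agent: str, logged_in: bool) -> bool:
    """'Satisfy Any' style rule: passing either condition is enough."""
    return logged_in or claims_to_be_googlebot(user_agent)
```

With something like this, a logged-in visitor gets through, a request whose user-agent mentions Googlebot gets through, and everybody else gets the login page. Since the user-agent is easy to spoof, Google's documented way to confirm a real Googlebot is a reverse DNS lookup on the requesting IP.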
You probably don't want to do it, though. Making login-required material visible in search results to non-logged-in humans is a good recipe for creating annoyed users.