Forum Moderators: anallawalla & bakedjake

Google Site Search behind Login

     
9:32 pm on May 23, 2014 (gmt 0)

New User

joined:May 23, 2014
posts: 1
votes: 0


Hello. First time post. I hope this is the most appropriate forum.

I have read that Google Site Search is not able to index documents behind a login.

I have also read that, for the general Google index, Googlebot can be allowed to index content behind a login, and that Google requires a First Click Free policy to be in place.

I have read that, for the general index, the site lets Googlebot through by examining the user-agent of the request.

My question is three-fold: Can the Google Site Search crawl be allowed behind a login by the same method? If so, can the crawled content be added to the Site Search catalog and not the general Google index? Finally, since it is site search, is First Click Free still required if the crawl can be made to work?

Thanks!
1:57 am on May 24, 2014 (gmt 0)

Administrator from US 

WebmasterWorld Administrator incredibill is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Jan 25, 2005
posts:14624
votes: 88


If you want Google to index behind a login, remember that everyone will still be able to view your documents via the Google cache. Use the meta robots NOARCHIVE on all your pages to avoid this issue.

Search is search: the spider needs the same access for a site search as for the general index, or it won't be able to crawl. And without NOARCHIVE, the world can still see the content via the cache without logging in.
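For anyone who wants to see the NOARCHIVE tip spelled out, here is a minimal Python sketch. The meta tag and the X-Robots-Tag response header are the two standard ways to send the directive; the function names add_noarchive_meta and noarchive_headers are made up for illustration, and the regex-based insertion assumes reasonably ordinary HTML with an opening head tag.

```python
import re

# The robots meta tag that asks search engines not to keep a cached copy.
NOARCHIVE_META = '<meta name="robots" content="noarchive">'

# Matches an opening <head> tag (but not <header>).
_HEAD_OPEN = re.compile(r"(<head(?:\s[^>]*)?>)", re.IGNORECASE)

# Detects a robots meta tag that already carries noarchive.
_HAS_NOARCHIVE = re.compile(
    r"<meta[^>]+name=[\"']robots[\"'][^>]*noarchive", re.IGNORECASE
)


def add_noarchive_meta(html: str) -> str:
    """Insert the noarchive meta tag right after the opening <head> tag.

    Hypothetical helper: returns the page unchanged if the tag is
    already there or if no <head> tag is found.
    """
    if _HAS_NOARCHIVE.search(html):
        return html
    return _HEAD_OPEN.sub(
        lambda m: m.group(1) + "\n" + NOARCHIVE_META, html, count=1
    )


def noarchive_headers(headers: dict) -> dict:
    """Return a copy of the response headers with X-Robots-Tag: noarchive.

    Useful for non-HTML documents (PDFs and so on) that cannot carry
    a meta tag.
    """
    out = dict(headers)
    out["X-Robots-Tag"] = "noarchive"
    return out
```

The header route is probably the easier one if the protected documents are a mix of HTML and other file types, since it can be set in one place on the server.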
3:31 am on May 24, 2014 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month

joined:Apr 9, 2011
posts:12717
votes: 244


"If so, can the crawled content be added to the Site Search catalog and not the general Google index?"

Heh, that's funny, I was just speculating about the same thing myself yesterday. The sad conclusion was that if I want a site search to include content that isn't available in the general Google search, I'd have to code my own. Site search is basically a more elegant form of the "site:" operator. It looks different to the user, but under the hood it's just ordinary Google search results, constrained to material on the present site (or material from a hand-picked list of sites, if you want to get fancy).

You can include login-required pages in a search engine's index. Details depend on your server, but the underlying concept is "Satisfy Any": visitors have to either log in or be the Googlebot.
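To make the "either log in or be the Googlebot" idea concrete, here is a rough application-level sketch in Python. It is not the server directive itself, just the same either/or logic. Everything named here (may_view, is_verified_googlebot, the lookup helpers) is hypothetical. One assumption goes beyond what was said above: since a user-agent string is trivial to fake, the sketch also confirms the crawler with a reverse DNS lookup followed by a forward lookup, which is the check Google documents for verifying Googlebot. The default forward lookup handles IPv4 only.

```python
import socket
from typing import Callable, List

# Hostname suffixes Google documents for its crawlers.
GOOGLE_CRAWLER_DOMAINS = (".googlebot.com", ".google.com")


def _reverse_lookup(ip: str) -> str:
    """IP address -> hostname (PTR record)."""
    return socket.gethostbyaddr(ip)[0]


def _forward_lookup(host: str) -> List[str]:
    """Hostname -> list of IPv4 addresses."""
    return socket.gethostbyname_ex(host)[2]


def is_verified_googlebot(
    ip: str,
    reverse_lookup: Callable[[str], str] = _reverse_lookup,
    forward_lookup: Callable[[str], List[str]] = _forward_lookup,
) -> bool:
    """Reverse-then-forward DNS check for a claimed Googlebot address."""
    try:
        host = reverse_lookup(ip).rstrip(".").lower()
    except OSError:
        return False  # no PTR record, so not verifiable
    if not host.endswith(GOOGLE_CRAWLER_DOMAINS):
        return False
    try:
        # The hostname must resolve back to the same address.
        return ip in forward_lookup(host)
    except OSError:
        return False


def may_view(
    logged_in: bool,
    user_agent: str,
    ip: str,
    verify: Callable[[str], bool] = is_verified_googlebot,
) -> bool:
    """Allow the request if EITHER condition holds:
    the visitor is logged in, or the visitor is a verified Googlebot."""
    if logged_in:
        return True
    return "Googlebot" in user_agent and verify(ip)
```

In a real setup you would call something like may_view(session_ok, request_user_agent, remote_ip) before serving a protected page, and cache the DNS result so you are not doing two lookups per request.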

You probably don't want to do it, though. Making login-required material visible in search results to non-logged-in humans is a good recipe for creating annoyed users.