I have a website made in PHP whose 8 most important pages have names like category.php?CID=1, category.php?CID=2, category.php?CID=3, and so on. In other words, I'm using a single PHP page and displaying the appropriate content based on the query string. These pages cannot be accessed unless the user is logged in or registered on the website; otherwise the user is redirected to the home page and asked to log in. Will Googlebot be able to crawl these pages of my website?
If I had to guess, I would say Google will discover the URLs but will not be able to index them properly.
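To see why, here is a minimal Python sketch (hypothetical, not your actual PHP code) of what a login gate like yours looks like from a crawler's point of view. Googlebot never logs in and carries no session cookie, so it always gets the redirect, never the content:

```python
# Illustrative model of a login-gated page such as category.php?CID=1.
# The cookie name "session_id" and the paths are assumptions for the sketch.

def respond(request_cookies):
    """Return (status_code, location_or_body) for a gated page."""
    if "session_id" not in request_cookies:
        # Anonymous visitor -- and Googlebot is always anonymous:
        # redirect to the home page instead of serving content.
        return 302, "/index.php"
    return 200, "<category page content>"

# A logged-in user gets the page:
assert respond({"session_id": "abc123"}) == (200, "<category page content>")
# Googlebot, which has no session, only ever sees the redirect:
assert respond({}) == (302, "/index.php")
```

So Google can find category.php?CID=1 through links, but the only thing it can fetch there is a 302 to your home page.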
Googlebot is an automated program, not a human. In other words, you need to make things a lot easier for it if you want to ensure Googlebot will crawl your pages. Here are some ways to make it easy for Googlebot to crawl your pages:
1) Get as many links as possible to point to your pages
2) Do not require registration
3) Use simple, straightforward URLs
4) Avoid query strings/dynamic URLs (think mod_rewrite)
5) Submit a sitemap to Google
6) Validate your robots.txt at Google's Webmaster Central [google.com]
7) Keep it simple and avoid redirects, especially 302 redirects
8) Deliver proper status codes to Googlebot (200 for good pages and 400s for errors)
9) Have your hosting be fast and responsive
10) Don't have pages redirect into each other
11) Keep your robots.txt under 100mb (just trust me :))
12) Send bottles of tequila to: Google Central Way Plaza, 720 4th Avenue, Ste 400, Kirkland, WA 98033, Attn: Webmaster Central Team
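For the mod_rewrite point, here's a hypothetical .htaccess sketch (assuming Apache with mod_rewrite enabled) that would expose clean URLs like /category/1 while your PHP script still receives the CID parameter it expects:

```apache
# Sketch only: internal rewrite, the visitor's URL stays /category/1
RewriteEngine On
# /category/1  ->  category.php?CID=1
RewriteRule ^category/([0-9]+)/?$ category.php?CID=$1 [L,QSA]
```

Then link to /category/1 everywhere on the site instead of category.php?CID=1, so Googlebot only ever sees the static-looking form.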
Remember, these are general guidelines to make things easier for Googlebot. You do not need to follow all of them 100% of the time; the key point is to keep it simple. If you can't keep it simple, you can also try IP delivery. IP delivery is when you serve Googlebot different content than your human visitors. Beware, though: most people new to IP delivery do not do it right and end up having problems.
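The part most people get wrong is identifying Googlebot: the User-Agent header is trivially spoofed, so you have to verify the requesting IP with a reverse DNS lookup and then confirm the forward lookup round-trips. A Python sketch of that check (function names are my own; the network calls need DNS access to actually run):

```python
import socket

# Googlebot requests come from hosts under these domains.
CRAWLER_DOMAINS = (".googlebot.com", ".google.com")

def is_google_hostname(hostname):
    """True if hostname sits under one of Google's crawler domains."""
    host = hostname.rstrip(".").lower()
    return any(host.endswith(d) for d in CRAWLER_DOMAINS)

def is_real_googlebot(ip):
    """Reverse + forward DNS verification; requires network access."""
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)            # reverse lookup
        if not is_google_hostname(hostname):
            return False
        forward_ips = socket.gethostbyname_ex(hostname)[2]   # forward lookup
        return ip in forward_ips                             # must round-trip
    except (socket.herror, socket.gaierror):
        return False
```

If you skip the forward-confirmation step, anyone who controls their own reverse DNS can impersonate Googlebot and see whatever special content you serve it.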