
Sitemaps, Meta Data, and robots.txt Forum

    
Will Googlebot be able to crawl these pages of my website?
saqitude
5:59 pm on Feb 8, 2007 (gmt 0)

I have a website built in PHP whose eight most important pages have URLs like category.php?CID=1, category.php?CID=2, category.php?CID=3, and so on. In other words, I am using a single PHP page and displaying the appropriate content based on the query string. These pages cannot be accessed unless the user is registered and logged in; anyone else is redirected to the home page and asked to log in. Will Googlebot be able to crawl these pages of my website?
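For context, the gate described above typically looks something like the sketch below; the session key, redirect target, and file names are assumptions, not taken from the poster's actual code. Because Googlebot never logs in and carries no session cookie, every request for category.php?CID=n takes the redirect branch instead of reaching the content:

<?php
// Minimal sketch of the login gate described above (category.php).
// The "user_id" session key and the redirect target are hypothetical.
session_start();

if (empty($_SESSION['user_id'])) {
    // Googlebot never logs in, so it always ends up here and gets
    // bounced back to the home page instead of the category content.
    header('Location: /index.php?msg=please_login');
    exit;
}

$cid = isset($_GET['CID']) ? (int) $_GET['CID'] : 0;
// ...fetch and display the content for category $cid...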

 

goodroi
6:45 pm on Feb 8, 2007 (gmt 0)

Welcome to WebmasterWorld, saqitude!

If I had to guess, I would say Google will discover the URLs but will not be able to properly index them.

Googlebot is an automated program, not a human. In other words, you need to make things a lot easier for it if you want to ensure googlebot will crawl your pages. Here are some ways to do that:

1) Get as many links as possible pointing to your pages
2) Do not require registration
3) Use simple, straightforward URLs
4) Avoid query strings/dynamic URLs (think mod_rewrite)
5) Submit a sitemap to Google
6) Validate your robots.txt at Google's Webmaster Central [google.com]
7) Keep it simple and avoid redirects, especially 302 redirects
8) Deliver proper status codes to googlebot (200 for good pages and 400s for errors; see the sketch after this list)
9) Have your hosting be fast and responsive
10) Don't have pages redirect into each other
11) Keep your robots.txt under 100mb (just trust me :))
12) Send bottles of tequila to:
Google
Central Way Plaza
720 4th Avenue, Ste 400
Kirkland, WA 98033
Attn: Webmaster Central Team
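On point 8, here is a rough sketch of what delivering proper status codes can look like in PHP; the hard-coded category list is a made-up stand-in for whatever lookup the real site does:

<?php
// Sketch only: pretend these are the site's valid category IDs.
$known_categories = array(1, 2, 3, 4, 5, 6, 7, 8);
$cid = isset($_GET['CID']) ? (int) $_GET['CID'] : 0;

if (!in_array($cid, $known_categories)) {
    // Tell googlebot (and browsers) that this URL has no content.
    header('HTTP/1.0 404 Not Found');
    echo 'Category not found.';
    exit;
}

// PHP sends 200 OK on its own for a page that renders normally,
// so explicit header() calls are only needed for the error cases.
echo 'Category ' . $cid . ' content goes here.';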

Remember, these are general guidelines to make it easier for googlebot. You do not need to follow all of them 100% of the time. The key point is to keep it simple. If you can't keep it simple, you can also try IP delivery. IP delivery is when you serve googlebot different content than your human visitors. Be careful, though: most people new to IP delivery do not do it right and end up having problems.
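If you do experiment with IP delivery, the usual first step is verifying that a request really comes from Googlebot instead of trusting the user-agent string. A common approach is a reverse DNS lookup followed by a forward confirmation; the sketch below covers only that verification step and leaves the delivery logic itself out:

<?php
// Rough sketch: verify googlebot by reverse DNS plus forward confirmation.
function looks_like_googlebot($ip)
{
    $host = gethostbyaddr($ip);  // e.g. crawl-66-249-66-1.googlebot.com
    if (!preg_match('/\.(googlebot|google)\.com$/', $host)) {
        return false;
    }
    // Forward-confirm: the hostname must resolve back to the same IP,
    // otherwise anyone could fake the reverse record.
    return gethostbyname($host) === $ip;
}

if (looks_like_googlebot($_SERVER['REMOTE_ADDR'])) {
    // serve the crawler-friendly version here
}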

ajitkumar
5:42 am on Feb 23, 2007 (gmt 0)

Hello goodroi,

I don't know how to deliver the status codes you mentioned in point 8:

"Deliver proper status codes to googlebot (200 for good pages and 400s for errors)"

Could you please explain?

goodroi
3:40 pm on Feb 25, 2007 (gmt 0)

Your status codes are probably OK. To test them you can use a "header checker".

Improper status codes usually happen when someone tries to do something advanced and does not verify their work. If you have not done anything advanced on your server, you are probably safe.
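For example, PHP's built-in get_headers() works as a quick header checker; the URL below is just a placeholder for one of your own pages. On the command line, curl -I does the same job.

<?php
// Quick header check using PHP's get_headers().
$headers = get_headers('http://www.example.com/category.php?CID=1');

if ($headers !== false) {
    // The first element is the status line, e.g. "HTTP/1.1 200 OK",
    // or "HTTP/1.1 302 Found" if the page redirects.
    echo $headers[0] . "\n";
}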
