---- How does Google determine which pages to crawl?


Web_Savvy - 8:03 am on Dec 27, 2006 (gmt 0)


> How does google determine the number of pages to crawl. Is it based on page rank?

My short answer would be yes, it is largely based on PR (PageRank), but it's not quite as simple as that. Please read on for a longer version:

In my experience, the initial crawl of your site should begin shortly after Googlebot detects an external inbound link pointing to your domain.

> I have submitted to yahoo and some directories

This should start getting your site crawled nicely, IMO.

> it has about 150,000 web pages.
> ...estimate that my page rank will be between 2 and 3.

I doubt that anything close to 150K pages will get crawled with a PR of 2 or 3. You might want to get it up to 4 or 5, or even higher ;-)

Beyond PR, the internal navigation structure of your site will play an important role. This issue is (much) more involved than what I can cover here in brief. There have been some excellent posts and discussions here on this matter, and I'm sure someone will be kind enough to post the links here shortly. (I'm not too good at 'searching' here yet :-)).

To put it in a nutshell, you might want to link to the important sections of your site from your home page and/or make these links a part of your global navigation. Then, link to less important (secondary) sections of your site from those pages which are directly linked from the home page, and so on. A well-structured site map section should also come in handy in terms of crawl maximization.
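
To make the layering idea concrete, here is a rough sketch (in Python) that counts how many clicks each page sits away from the home page. The site layout and URLs are made up for illustration, and this is only my way of checking a link structure, not a description of how Googlebot actually schedules its crawl.

```python
from collections import deque

def click_depths(links, home):
    """Return the minimum number of clicks from `home` to each reachable page.

    `links` maps a page URL to the list of pages it links to.
    """
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # first visit is the shortest path (BFS)
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths

# Hypothetical site: the home page links to the main sections and a site map.
site = {
    "/": ["/widgets/", "/gadgets/", "/sitemap/"],
    "/widgets/": ["/widgets/blue/", "/widgets/red/"],
    "/gadgets/": ["/gadgets/small/"],
    "/widgets/blue/": ["/widgets/blue/item-1/"],
    "/sitemap/": ["/widgets/blue/item-1/"],
}

depths = click_depths(site, "/")
for page, depth in sorted(depths.items(), key=lambda item: item[1]):
    print(depth, page)
```

In this made-up layout, /widgets/blue/item-1/ would be three clicks deep through the normal navigation, but the site map link brings it up to two. Pages that come out very deep, or that don't show up in the result at all, are the ones I'd worry about.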

Assuming you publish pages from a database using a script, it'd pay to keep your query strings as short as possible, i.e. with the minimum possible number of parameters/arguments. Alternatively, you can use mod_rewrite to present the query URLs in a static-looking form. Again, excellent guidance on this is available here too.
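
Just to illustrate what I mean by a static form, here is a small Python sketch of the mapping between a query URL and a static-looking one. The script name, the parameter names (cat, id, sessid) and the resulting URL pattern are all invented for the example. On a real server the translation is done by your rewrite rules, not by a script like this; it only shows which parameters you'd keep and which you'd drop.

```python
from urllib.parse import urlsplit, parse_qsl

def static_path(url, keys=("cat", "id")):
    """Turn a query URL into a static-looking path (illustrative only).

    Only the parameters named in `keys` are kept, in that order; anything
    else (session IDs, tracking arguments, etc.) is dropped.
    """
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query))
    base = parts.path.rsplit(".", 1)[0]  # "/product.php" -> "/product"
    segments = [params[k] for k in keys if k in params]
    if not segments:
        return base + ".html"
    return base + "/" + "/".join(segments) + ".html"

# Hypothetical dynamic URL with an extra session parameter.
print(static_path("/product.php?cat=widgets&id=42&sessid=abc123"))
```

The point is that every page ends up with one short, stable URL, however many arguments the script behind it accepts.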

Finally, some strategically placed external inbound links pointing to deep / internal sections of your site should come in handy too.

HTH


Thread source: http://www.webmasterworld.com/google/3200455.htm