Web_Savvy - 8:03 am on Dec 27, 2006 (gmt 0)
> How does google determine the number of pages to crawl. Is it based on page rank?

My short answer would be yes, it is largely based on PR, but it may not be as simple as that. Please read on for a longer version:

In my experience, the initial crawl of your site will/should begin shortly after Googlebot detects an external inbound link pointing to your domain.

> I have submitted to yahoo and some directories

This should start getting your site crawled nicely, IMO.

> it has about 150,000 web pages.
> ...estimate that my page rank will be between 2 and 3.

I doubt anything close to 150K pages will get crawled with a PR of 2 or 3. You might want to get it up to 4 or 5, or even higher ;-)

After this, the internal navigation structure of your site will play an important role. This issue is (much) more involved than what I can cover here in brief. There have been some excellent posts and discussions here on this matter, and I'm sure someone will be kind enough to post the link(s) here shortly. (I'm not too good at 'searching' here yet :-)).

To put it in a nutshell, you might want to link to the important sections of your site from your home page and/or make these links a part of your global navigation. Then link to the less important (secondary) sections from the pages that are directly linked from the home page, and so on. A well-structured site map section should also come in handy for getting as many pages crawled as possible.

Assuming you publish pages from the database using a script, it'd pay to keep your query strings as short as possible, i.e. with the minimum possible number of parameters/arguments. Alternatively, you can use mod_rewrite to map the query URLs to a static-looking form. Again, excellent guidance on this is available here too.

Finally, some strategically placed external inbound links pointing to deep / internal sections of your site should come in handy too.

HTH
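
If it helps to picture the "home page, then main sections, then secondary sections" idea, here is a rough Python sketch that counts how many clicks each page is from the home page. It says nothing about how Googlebot actually decides what to crawl; it is only a way to audit your own link structure. The `click_depths` function and all the paths in `site` are made up for illustration, and in practice you would build the link map from your own database or a crawl of your site.

```python
from collections import deque


def click_depths(links, home):
    """Return the minimum number of clicks from `home` to each reachable page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # first visit is the shortest path in a BFS
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical site: each key is a page, each value lists the pages it links to.
site = {
    "/": ["/products/", "/articles/", "/sitemap/"],
    "/products/": ["/products/widgets/"],
    "/products/widgets/": ["/products/widgets/item-1"],
    "/sitemap/": ["/products/widgets/", "/articles/archive/"],
    "/orphan/": ["/"],  # links out, but nothing links to it
}

depths = click_depths(site, "/")

# Pages that no chain of internal links from the home page can reach.
unreachable = sorted(set(site) - set(depths))

for page, depth in sorted(depths.items(), key=lambda item: item[1]):
    print(depth, page)
print("unreachable:", unreachable)
```

Pages that end up many clicks deep, or in the unreachable list, are the ones worth linking from the site map, the global navigation, or a section page.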