Page Rank vs. number of deepcrawl listings.

         

Jesse_Smith

5:57 pm on Feb 7, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



For a given PageRank, how many files will Google index?

WebGuerrilla

6:20 pm on Feb 7, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I don't think there is a specific number of pages tied to your PR. It is more about the order of the crawl and how much time gets allocated to each site.

If you have a PR8 site, Google will generally show up earlier and stay longer than it will if you are a PR4.

bobmark

9:22 pm on Feb 7, 2003 (gmt 0)

10+ Year Member



I am assuming you are referring to the 64.xxx freshbot, WebGuerrilla, as I have never noticed any PR-related deep crawl bias by Google's 216.xxx crawl.
I have had sites that were totally crawled from the day they were found by 216.xxx, but I have noticed that over time, the number of pages freshbot will update seems to increase.

ruserious

12:06 am on Feb 8, 2003 (gmt 0)

10+ Year Member



I wrote this in another topic earlier: from my experience it does depend on PR, although I am not sure whether it is directly the number of pages that is affected, or just the crawl depth (number of links followed), which then affects the number of pages indirectly.
As to the reason: at some point our PR was fluctuating quite a bit, and one month it dropped from 6 to 4 and the number of pages in the index dropped considerably (from 6,300 to 4,600 or something). When PR went back up, our number of indexed pages increased again, too.

ciml

1:26 pm on Feb 8, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The depth of crawl for a site is very much related to PR; not only that, but IMO the crawling behaves as if it follows PR fairly closely.

I don't think that it is a PageRank thing as such, just that the order of the crawl as WebGuerrilla describes it isn't much different from the random surfer model.
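
To make the random surfer comparison concrete, here is a minimal Python sketch of PageRank as power iteration over a toy link graph. The graph, damping factor and iteration count are invented example values, not anything Google has published about its crawler; the point is only that pages the random surfer lands on more often end up with higher rank, so a crawl ordered the way WebGuerrilla describes would tend to reach them earlier and revisit them more often.

# A minimal PageRank / random-surfer sketch (toy example, not Google's code).
# graph maps each page to the pages it links to.
graph = {
    "home":   ["cat_a", "cat_b", "about"],
    "cat_a":  ["page_1", "page_2", "home"],
    "cat_b":  ["page_3", "home"],
    "about":  ["home"],
    "page_1": ["home"],
    "page_2": ["home"],
    "page_3": ["home"],
}

damping = 0.85                      # classic damping factor from the PageRank paper
pages = list(graph)
rank = {p: 1.0 / len(pages) for p in pages}

for _ in range(50):                 # power iteration until the ranks settle
    new_rank = {p: (1 - damping) / len(pages) for p in pages}
    for page, outlinks in graph.items():
        share = damping * rank[page] / len(outlinks)
        for target in outlinks:
            new_rank[target] += share
    rank = new_rank

# A crawl that behaves "like the random surfer" would tend to hit the
# high-rank pages earlier and more often.
for page, r in sorted(rank.items(), key=lambda kv: -kv[1]):
    print(f"{page}: {r:.3f}")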

Jesse_Smith

6:21 pm on Feb 8, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



All my new sites are getting a max of about 175 pages crawled. So it's either the PageRank, because they're new, because this is only the second month that they have been partly deepcrawled, or because I made the boo-boo of linking to all of the pages right from the main index, instead of having an index for each category.

My vBulletin board, which has been up for almost two years and is listed in ODP, has had about 320 pages crawled so far, even though this is the first month that it's ever been deepcrawled, because it used to have the session ID in the URL.
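
The session ID point is worth spelling out: every visit, including every bot visit, gets a fresh session token in the URL, so the crawler sees an endless supply of seemingly new URLs for the same threads. Below is a small Python sketch of canonicalising such URLs; the parameter name s and the example URLs are only assumptions for illustration, not taken from the board in question.

# Sketch: why session IDs in URLs hurt deep crawling, and how to canonicalise them.
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def strip_session_id(url, session_params=("s",)):
    """Return the URL with any session parameters removed."""
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query) if k not in session_params]
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(query), parts.fragment))

# Two crawls of the same thread look like two different pages to a bot:
a = "http://www.example.com/forum/showthread.php?threadid=42&s=abc123"
b = "http://www.example.com/forum/showthread.php?threadid=42&s=def456"

print(strip_session_id(a) == strip_session_id(b))  # True: same canonical URL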

bobmark

9:26 pm on Feb 8, 2003 (gmt 0)

10+ Year Member



I wonder if we're talking about two sets of rules here.
My experience with static-page sites is that Google dutifully adds all new pages it finds on each crawl.
I am wondering if there is a different algo for dynamic-page sites, where some limit is placed on the often mind-boggling number of generated pages that get indexed.

rogue

12:24 am on Feb 9, 2003 (gmt 0)

10+ Year Member



I have a site that is almost two years old, and it had only been crawled by the 64.xxx bot until this month, when it was also crawled by the 216.xxx bot for three days, day and night, with most pages crawled more than once. A previous post mentioned these two bots and I would like to know the difference. This is my first post; I have learned a lot by just lurking, but I guess you have to come out of your shell sometime. The site is PR4 on all pages and is a small site.
Thanks

Jesse_Smith

2:03 am on Feb 9, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



freshbot: 64.68.82.* Bah, listed for only a few days; a short dinner date.
deepcrawler: 216.239.46.* Good bot, it likes you; listed until death do you part. So don't make it mad or it will divorce your site.
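
If you want to check which of the two bots is hitting a site, the IP prefixes above are enough to sort the hits in a raw access log. Here is a quick Python sketch under the assumption of an Apache-style log where the first field is the client IP; the file name access.log is just a placeholder.

# Sketch: classifying Googlebot hits in an access log by the IP ranges
# mentioned above (64.68.82.* freshbot, 216.239.46.* deep crawler).
from collections import Counter

def classify(ip):
    if ip.startswith("64.68.82."):
        return "freshbot"
    if ip.startswith("216.239.46."):
        return "deepcrawler"
    return "other"

counts = Counter()
with open("access.log") as log:       # assumed common log format
    for line in log:
        ip = line.split(" ", 1)[0]    # first field is the client IP
        counts[classify(ip)] += 1

print(counts)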

bobmark

2:25 am on Feb 9, 2003 (gmt 0)

10+ Year Member



To put it in a bit more detail (although Jesse certainly expressed it succinctly): results from the crawl by the 216. "deep crawl" bot make up the traditional Google index. Those results stay there until they are refreshed during the once-a-month update. The newer freshbot augments the index by crawling selected pages, which are added to the index for a variable period during the month; PR seems to have an effect on how many.
Pages added by freshbot can have a ridiculously high SERP for the duration of their appearance. They also seem not to be integrated into the update (e.g. if you add pages after the 216. crawl and they appear in the index after a freshbot crawl, they never seem to still be there after the update).
Welcome to WebmasterWorld.

rogue

4:26 am on Feb 9, 2003 (gmt 0)

10+ Year Member



Thanks bobmark. Even so, freshbot has always been the only one to visit my site, all pages have always been crawled, and most every page has always been indexed on the first page of the SERPs for most every keyword on each page of the site. Does this mean I have been lucky until now?

globay

5:17 am on Feb 9, 2003 (gmt 0)

10+ Year Member



> or it's because I did the boo-boo of linking to all of the pages
> right in the main index, instead of having an index for each
> category.

Google will only follow a certain number of links on one page. Maybe it just stopped following the URLs and thus does not know about the other pages. Try to split it up into categories!

--
globay
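
globay's suggestion is easy to sketch: instead of linking every content page from the main index, chunk the links into category index pages so that no single page carries more links than the crawler is assumed to follow. The 100-link budget below is the commonly quoted rule of thumb from Google's webmaster guidelines of the time, used here purely as an example figure, and the page names are made up.

# Sketch: splitting a flat "link to everything from the index" layout into
# category index pages, assuming a limited per-page link budget.
LINK_BUDGET = 100

all_pages = [f"/articles/page-{i}.html" for i in range(1, 501)]   # 500 content pages

# Chunk the links into category pages so no single page exceeds the budget.
category_pages = {}
for i in range(0, len(all_pages), LINK_BUDGET):
    name = f"/category-{i // LINK_BUDGET + 1}.html"
    category_pages[name] = all_pages[i:i + LINK_BUDGET]

# The main index now only needs to link to the handful of category pages.
main_index_links = list(category_pages)

print(len(main_index_links), "links on the main index")                       # 5
print(max(len(v) for v in category_pages.values()), "links max per category")  # 100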