Forum Moderators: open
Based on the following example site structure -
Index page - homepage (1 page)
Root/top level content - product/category index page, etc - (8 pages)
Misc content - about us, contact us, privacy, etc (5 pages)
Section A - Products - (20 pages)
Section B - Solutions - (10 pages) (I love seeing these 2 categories used to navigate ;))
Section C - Media, Press - (5 pages)
The page counts above are those included in Google's index; only those pages count towards the calculations.
Based on a theory that has gained some acceptance here, a log scale with base 6 is used for the next part (the exact base does not really matter for these purposes):
PR 3 - 1
PR 4 - 6
PR 5 - 36
PR 6 - 216
PR 7 - 1296
PR 8 - 7776
PR 9 - 46656
PR 10 - 279936
I've started at PR 3 to keep the numbers manageable. These numbers are just an arbitrary scale and are not an estimate of raw PR.
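The scale above is just successive powers of 6, anchored so that PR 3 maps to 1. A minimal sketch of that mapping (the function name and the `anchor` parameter are my own, not from the post):

```python
# Map a toolbar PR value onto the arbitrary log-6 scale used above,
# anchored so that PR 3 maps to 1. Base and anchor are assumptions
# mirroring the table in the post, not known Google values.
def pulling_power(pr, base=6, anchor=3):
    return base ** (pr - anchor)

for pr in range(3, 11):
    print(f"PR {pr} - {pulling_power(pr)}")
```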
For each section of the site, I then check the PR for one of its pages. If the link structure is similar across the other content within that section, we can assume the other pages have the same PR, which saves checking the PR for every page. From that, we get the following (it doesn't display as well as Excel):
Index page - 1 page @ PR 7 = 1x1296 = 1296
Root content - 8 pages @ PR 6 = 8x216 = 1728
Misc content - 5 pages @ PR 6 = 5x216 = 1080
Section A - 20 pages @ PR 5 = 20x36 = 720
Section B - 10 pages @ PR 5 = 10x36 = 360
Section C - 5 pages @ PR 4 = 5x6 = 30
Total "Google Pulling Power" number = 1296+1728+1080+720+360+30 = 5214.
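The whole tally is just pages multiplied by the scale value for each section, summed. A sketch using the example figures above (section names and the dict layout are mine):

```python
# Each section of the example site as (page count, toolbar PR),
# taken from the figures in the post.
sections = {
    "Index page":   (1, 7),
    "Root content": (8, 6),
    "Misc content": (5, 6),
    "Section A":    (20, 5),
    "Section B":    (10, 5),
    "Section C":    (5, 4),
}

def pulling_power(pr, base=6, anchor=3):
    return base ** (pr - anchor)

# Total "Google Pulling Power": pages x scale value, summed over sections.
total = sum(pages * pulling_power(pr) for pages, pr in sections.values())
print(total)  # 5214
```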
I've been plotting this number for one site over the last 3 updates, and the increase in traffic has mirrored the increase in this number. I thought that was interesting.
It's easy to see the percentage of PR each section has:
Index page = 1296/5214 = 24.8%
Root content = 1728/5214 = 33.1%
Misc content = 1080/5214 = 20.7%
Section A = 720/5214 = 13.8%
Section B = 360/5214 = 6.9%
Section C = 30/5214 = 0.6%
In the example above, the three content sections (A, B and C) combined only hold about 21% of the PR in the entire site. Armed with this info, you could alter the site linking/navigation to ensure PR exists where you need it most.
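Each section's share of the total can be computed as a simple sketch (the `shares` dict is my own naming):

```python
# Per-section pulling power from the worked example above.
section_power = {
    "Index page": 1296, "Root content": 1728, "Misc content": 1080,
    "Section A": 720, "Section B": 360, "Section C": 30,
}
total = sum(section_power.values())  # 5214

# Fraction of site-wide "pulling power" held by each section.
shares = {name: power / total for name, power in section_power.items()}
for name, share in shares.items():
    print(f"{name} = {share:.1%}")
```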
The "Google Pulling Power" metric is of no value if the content is not suited to the Google algo, i.e. you don't get traffic from Google. If the content is Google friendly, then I think it has some value as a measure of a site's overall effectiveness at Google (although the best measure is still raw traffic ;)).
Another way it could be used is for competitors' sites. Obviously you'll never view their log files, so it's hard to estimate their traffic levels. But by performing this analysis on their site, you can get a better feel for how much "pull" they have in Google. Again, the quality of their on-page optimization needs to be considered too. This method is a bit more thorough (and time consuming) than judging a competitor's effectiveness from a handful of SERPs.
Working back from the total, we get log6(5214) = 4.777.
Add the 3 back in (because we started at PR 3) and we get 7.777.
The total PR is roughly 3/4 of a notch higher than the PR 7 home page. For a site with no external links, this is the kind of estimate I tend to get when simulating PageRank with a plausible log base and damping factor (d).
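The working-back step is just the inverse of the scale: take the base-6 log of the total and re-add the PR 3 anchor.

```python
import math

total = 5214
# Invert the log-6 scale: add 3 back because the scale starts at PR 3.
effective_pr = math.log(total, 6) + 3
print(round(effective_pr, 3))  # 7.777
```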
Another thing to consider is whether you want most of the external incoming links pointing to the home page or to a site map (not a pretty thing, but it makes a lot of difference).
Those are all known things, but it was nice to actually test a particular site.
Starting every page at an absolute PR of 1, with about 1k internal pages and a generally accepted linking structure, the home page ends up with an absolute PR of 20 to 40, depending on the case. That's without accounting for any external links.
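That kind of simulation can be sketched with plain power iteration over the un-normalized PageRank formula (where values sum to the page count). The toy graph below is a small three-level hierarchy of my own invention, not the 1k-page site described above; the home page's final value depends heavily on the link structure you choose, and a pure hub-and-spoke layout concentrates far more PR in the home page than a deep hierarchy does.

```python
D = 0.85  # damping factor, the usual assumed value

def pagerank(links, iters=50):
    """links: dict mapping each page to the pages it links to (no danglers).
    Un-normalized formula: PR(p) = (1-d) + d * sum(PR(q)/outdeg(q))."""
    pr = {p: 1.0 for p in links}          # start every page at absolute PR 1
    for _ in range(iters):
        new = {p: 1 - D for p in links}
        for p, outs in links.items():
            share = D * pr[p] / len(outs)  # p's PR split across its outlinks
            for q in outs:
                new[q] += share
        pr = new
    return pr

# Toy hierarchy (hypothetical): home links to 2 sections, each section
# links to home plus 3 leaf pages, and every leaf links back up.
links = {"home": ["s1", "s2"]}
for s in ("s1", "s2"):
    leaves = [f"{s}p{i}" for i in range(3)]
    links[s] = ["home"] + leaves
    for leaf in leaves:
        links[leaf] = ["home", s]

pr = pagerank(links)
print(round(pr["home"], 3))
```

With no dangling pages, total PR stays equal to the page count at every iteration, which is a handy sanity check on the simulation.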
>>Obviously you'll never view their log files so it's hard to estimate their traffic levels
Rather than wait for people to type in stats, try searching Google for IP numbers (pull the IPs from your own logs); you'd be surprised at the number of log files that are open... webalizer, awstats, etc. :)
I like larger samples for data crunching.