Identifying Pagerank allocation within a site

         

dantheman

6:08 am on Oct 2, 2002 (gmt 0)

10+ Year Member



For the last few updates I've been calculating some site info in Excel; an example is below. It's been quite a useful way for me to assess how Pagerank is allocated throughout a site. I've also used it to plot a rough (and arbitrary) number that indicates "Google Pulling Power". Let me explain by example.

Based on the following example site structure -

Index page - homepage (1 page)

Root/top level content - product/category index page, etc - (8 pages)

Misc content - about us, contact us, privacy, etc (5 pages)

Section A - Products - (20 pages)

Section B - Solutions - (10 pages) (I love seeing these 2 categories used to navigate ;))

Section C - Media, Press - (5 pages)

The page counts above include only pages that are in Google's index. Only those pages count towards the calculations.

Based on a theory that has gained some acceptance here, a log base of 6 is used for the next part (the exact number does not really matter for these purposes):

PR 3 - 1
PR 4 - 6
PR 5 - 36
PR 6 - 216
PR 7 - 1296
PR 8 - 7776
PR 9 - 46656
PR 10 - 279936

I've started at PR 3 to keep the numbers manageable. These numbers are just an arbitrary scale and are not an estimate of raw PR.
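The scale above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic in the post; the function name `pr_weight` and the constants are made up for illustration, and the base of 6 is the assumed log base, not a known Google value.

```python
# Arbitrary weight for a toolbar PR value.
# Assumptions from the post: log base 6, with PR 3 treated as the unit (weight 1).
BASE = 6
FLOOR_PR = 3

def pr_weight(toolbar_pr, base=BASE, floor_pr=FLOOR_PR):
    """Return the arbitrary scale value for a given toolbar PR."""
    return base ** (toolbar_pr - floor_pr)

# Build the PR 3 to PR 10 table shown in the post.
scale = {pr: pr_weight(pr) for pr in range(3, 11)}
for pr, weight in scale.items():
    print(f"PR {pr} - {weight}")
```

Changing `BASE` shows how little the choice matters for comparing sections of the same site: the absolute numbers move, but the ordering does not.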

For each section of the site, I then view the PR for one of the pages. If the link structure is similar to the other content within that section, we can assume the other pages will have the same PR. This saves checking the PR for each page. From that, we get the following: (does not display as well as Excel)

Index page - 1 page @ PR7 = 1x1296 = 1296

Root content - 8 pages @ PR 6 = 8x216 = 1728

Misc content - 5 pages @ PR 6 = 5x216 = 1080

Section A - 20 pages @ PR 5 = 20x36 = 720

Section B - 10 pages @ PR 5 = 10x36 = 360

Section C - 5 pages @ PR 4 = 5x6 = 30

Total "Google Pulling Power" number = 1296+1728+1080+720+360+30 = 5214.
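For anyone who would rather not use Excel, the same total can be worked out in Python. This is a rough sketch: the `sections` dictionary just restates the example site, and `pulling_power` is a hypothetical helper name, not part of any tool.

```python
# (pages in Google's index, toolbar PR of a representative page) per section,
# taken from the example site in the post.
sections = {
    "Index page": (1, 7),
    "Root content": (8, 6),
    "Misc content": (5, 6),
    "Section A": (20, 5),
    "Section B": (10, 5),
    "Section C": (5, 4),
}

def pulling_power(sections, base=6, floor_pr=3):
    """Sum pages x scale value over all sections (base 6, PR 3 = 1 assumed)."""
    return sum(pages * base ** (pr - floor_pr) for pages, pr in sections.values())

total = pulling_power(sections)
print(total)
```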

I've been plotting this number for one site for the last 3 updates and the increase in traffic is mirroring the increase in this number. I thought that was interesting.

It's easy to see the percentage of PR each section has:

Index page = 1296/5214 = 24.9%
Root content = 1728/5214 = 33.1%
Misc content = 1080/5214 = 20.7%
Section A = 720/5214 = 13.8%
Section B = 360/5214 = 6.9%
Section C = 30/5214 = 0.6%

In the example above, the product and solution pages (Sections A and B) together have only about 21% of the PR in the entire site. Armed with this info, you could alter the site linking/navigation to ensure PR exists where you need it most.
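The percentage split can be sketched the same way. The per-section scores are copied from the example; the variable names are just for illustration.

```python
# Per-section scores from the example (pages x scale value).
scores = {
    "Index page": 1296,
    "Root content": 1728,
    "Misc content": 1080,
    "Section A": 720,
    "Section B": 360,
    "Section C": 30,
}

total = sum(scores.values())

# Share of the site total held by each section, as a percentage to one decimal.
shares = {name: round(100 * score / total, 1) for name, score in scores.items()}

for name, pct in shares.items():
    print(f"{name} = {scores[name]}/{total} = {pct}%")

# Combined share of the pages that actually sell something (Sections A and B).
money_pages = shares["Section A"] + shares["Section B"]
print(f"Sections A + B = {money_pages:.1f}%")
```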

The "Google Pulling Power" metric is of no value if the content is not suited to the Google algo, i.e. you don't get traffic from Google. If the content is Google friendly, then I think it has some value as a measure of overall site effectiveness at Google. (although the best is still raw traffic ;))

Brett_Tabke

5:36 am on Oct 3, 2002 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Very interesting and worthwhile task. Just goes to show you how important quality inbound links are.

dantheman

9:00 am on Oct 3, 2002 (gmt 0)

10+ Year Member



I just hope I've explained it clearly. There's been a lot of talk of buying PR, various cross-linking issues and other nefarious techniques recently. One factor that can never offend Google and is totally up to the webmaster is how he/she links internally. I think it's critical that a site distributes its PR to best effect, and site/link structure is the way to do this. If I knew a bit more programming, I'd post a form that people could use to enter stats for their own sites.

Another way it could be used is for competitors' sites. Obviously you'll never view their log files so it's hard to estimate their traffic levels. But by performing this analysis on their site, one can get a better feel for how much "pull" they have in Google. Again, the quality of their on-page optimization needs to be considered also. This method is a bit more thorough (and time consuming) than judging a competitor's effectiveness based on a handful of SERPs.

ciml

3:51 pm on Oct 3, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



IMO many sites give too much weight to "Misc content", compared with products or solutions.

The interesting thing, for me, is that working backwards you get a total PR of about 7.78. This tallies well with my estimates.

dantheman

12:12 am on Oct 4, 2002 (gmt 0)

10+ Year Member



ciml - what do you mean by this?

"working backwards you get a total PR of about 7.78. This tallies well with my estimates"

ciml

12:18 pm on Oct 5, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Our starting point: The home page has PR7, there are no external links, the effective log base is six and the total PR is 5214.

Working back from the total we get log6(5214) = 4.777

Add the 3 back in (because you started at PR3) and we get 7.777

The total PR is roughly 3/4 of a notch higher than the PR7 home page. For a site with no external links this is the kind of estimate I tend to get when simulating PageRank with plausible log base and decay factor (d).
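The working-backwards step can be checked in a couple of lines. This is only the arithmetic from this post, assuming log base 6 and a scale that starts at PR 3; the variable names are illustrative.

```python
import math

total = 5214      # site total on the arbitrary scale
base = 6          # assumed log base
floor_pr = 3      # the scale starts at PR 3 = 1

# Invert weight = base ** (pr - floor_pr) to turn the total back into a PR figure.
notches_above_floor = math.log(total, base)
total_pr = notches_above_floor + floor_pr

print(f"log{base}({total}) = {notches_above_floor:.3f}")
print(f"total PR = {total_pr:.3f}")
print(f"notches above the PR7 home page = {total_pr - 7:.2f}")
```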

bcc1234

12:46 pm on Oct 5, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



A while back, I wrote a small app that calculated PR distribution for a site. Got some pretty interesting results for a 2k page site with different linking structures.
Things like "contact us", "about us", etc linked from all pages have a huge impact. Also, linking from a product page to the parent category page only, versus linking to all categories in the path, changes the distribution a lot.

Another thing to consider is whether you want most of the external incoming links to point to the home page or a site map (not a pretty thing, but makes a lot of difference).

Those are all known things, but it was nice to actually test a particular site.

With a starting absolute PR value of 1, with about 1k internal pages and a generally accepted linking structure, the home page gets an absolute PR of 20 to 40, depending on the case. All without accounting for any external links.
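For anyone who wants to try this sort of test, here is a minimal sketch. It is not the app described above: it assumes the published PageRank formula PR(p) = (1 - d) + d * sum(PR(q) / outlinks(q)) with every page starting at 1, a damping factor of 0.85, and a tiny made-up site (the page names and link layout are hypothetical). It will not reproduce the 20 to 40 figure, which came from a roughly 1k page site.

```python
def pagerank(links, d=0.85, iterations=100):
    """Iterate PR(p) = (1 - d) + d * sum(PR(q) / outlinks(q)).

    links maps each page to the list of pages it links to.
    Every page must have at least one outgoing link.
    """
    pr = {page: 1.0 for page in links}          # starting absolute PR of 1
    for _ in range(iterations):
        new = {page: 1.0 - d for page in links}
        for page, targets in links.items():
            share = d * pr[page] / len(targets)  # PR passed along each link
            for target in targets:
                new[target] += share
        pr = new
    return pr

# Hypothetical site: home -> 2 categories, each category -> home + 3 products,
# each product -> home + its parent category. No external links.
links = {"home": ["cat1", "cat2"]}
for cat in ("cat1", "cat2"):
    products = [f"{cat}-p{i}" for i in range(1, 4)]
    links[cat] = ["home"] + products
    for product in products:
        links[product] = ["home", cat]

ranks = pagerank(links)
for page, value in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {value:.2f}")
```

Editing the `links` dictionary (for example, adding an "about us" page linked from every page) and re-running is a quick way to see how a structural change shifts PR between sections.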

digitalghost

12:50 pm on Oct 5, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>>If I knew a bit more programming, I'd post a form that people could use to enter stats for their own sites.

>>Obviously you'll never view their log files so it's hard to estimate their traffic levels

Rather than wait for people to type in stats, try searching for IP numbers on Google (pull the IPs from your own logs). You'd be surprised at the number of log files that are open... webalizer, awstats, etc. :)

I like larger samples for data crunching.