There is much talk of PR being divided equally among the links off a page. More links on a page = less PR each.
There is also visual evidence that PR simply drops a point for each level of depth on a site.
Sooooo.... Let us assume that I have a page on my site (let's call it page "A") that has a PR of 8. There are three links off that page: two go to "content directory" areas of my site (let's call them pages "B" & "C") and the third goes to what we could (but won't) call a links page (let's call it page "D"). Visual evidence would suggest that each of these three pages would be a PR 7.
If page "D" contained a list of categories and sub categories within my relevant niche area (all very focused and to the point), clicking on any of those links (categories or sub categories) would take you to a page which is now one more level down and would be a PR 6 (let's call this page "E").
Page "E" obviously contains a list of relevant, high quality, interesting, relevant (just to stress the point) sites.... but the link for each site takes you to a page with a full description (and possibly a critique) of just that one site (let's call this page "F".... there would be lots of page "F"s). This page "F" would have a PR of 5, which would go only to the one site described and linked to on the page.
Is this right? or am I just talking scribble?
The reason this often coincides with the URLs is that many pages have approximately 20 links on (with a wide margin), and this is approximately the number of links that causes PR to reduce by one notch.
The numbers are affected by breadcrumbs, sibling and cousin links, general links (eg. contact or copyright), PR being injected from outside the site, 404s, /robots.txt, etc.
Many people would disagree with the actual numbers used, but hopefully you get the idea.
> There is also visual evidence that PR simply drops a point for each level of depth on a site.
As already mentioned by ciml, it depends on the number of links on the page. If a page has few links, ToolbarPR will drop by one or stay unchanged. (It depends on the exact number of links and on whether the page has a high or low Toolbar PRx. Of course, the real PR is decreased in any case, even if the ToolbarPR is unchanged.)
On the other hand, if there are numerous links on the page, PR will drop at least by one (but can also drop by 2 or even more).
> have approximately 20 links on
It seems that this is not the latest value. After the new PR update, it seems that this number was decreased while the damping factor was slightly increased. (Although I still have to analyze this in detail.)
"PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))
Note that the PageRanks form a probability distribution over web pages, so the sum of all web pages' PageRanks will be one.
PageRank or PR(A) can be calculated using a simple iterative algorithm, and corresponds to the principal eigenvector of the normalized link matrix of the web."
(Sergey Brin-Lawrence Page, The Anatomy of a Large-Scale Hypertextual Web Search Engine [www-db.stanford.edu]. This is the original paper that started Google at Stanford University).
So,
"1. PR(Tn) - Each page has a notion of its own self-importance. That’s “PR(T1)” for the first page in the web all the way up to “PR(Tn)” for the last page
2. C(Tn) - Each page spreads its vote out evenly amongst all of its outgoing links. The count, or number, of outgoing links for page 1 is “C(T1)”, “C(Tn)” for page n, and so on for all pages.
3. PR(Tn)/C(Tn) - so if our page (page A) has a backlink from page “n” the share of the vote page A will get is “PR(Tn)/C(Tn)”
4. d(... - All these fractions of votes are added together but, to stop the other pages having too much influence, this total vote is “damped down” by multiplying it by 0.85 (the factor “d”)
5. (1 - d) - The (1 - d) bit at the beginning is a bit of probability math magic so the “sum of all web pages' PageRanks will be one”: it adds in the bit lost by the d(.... It also means that if a page has no links to it (no backlinks) even then it will still get a small PR of 0.15 (i.e. 1 - 0.85). (Aside: the Google paper says “the sum of all pages” but they mean “the normalised sum”, otherwise known as “the average” to you and me.)"
(Ian Rogers, The Google Pagerank Algorithm and How It Works [iprcom.com]).
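For anyone who wants to see the iteration in numbers, here is a minimal Python sketch of the formula quoted above. The function name pagerank and the three-page site are made up for illustration, and this is only the originally published formula, not necessarily what Google runs today.

```python
def pagerank(links, d=0.85, iterations=100):
    """Iterate PR(A) = (1-d) + d * (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))."""
    pages = list(links)
    pr = {page: 1.0 for page in pages}  # arbitrary starting guess
    for _ in range(iterations):
        new_pr = {}
        for page in pages:
            # Each page T that links here passes on PR(T) / C(T).
            incoming = sum(pr[src] / len(links[src])
                           for src in pages if page in links[src])
            new_pr[page] = (1 - d) + d * incoming
        pr = new_pr
    return pr

# Hypothetical site: A links to B and C, and both link back to A.
site = {"A": ["B", "C"], "B": ["A"], "C": ["A"]}
ranks = pagerank(site)
for page, value in ranks.items():
    print(page, round(value, 3))
```

With these made-up links, A ends up with more PR than B or C because it collects two full votes while B and C each get half of A's vote, and the three values average out to 1, which is the "normalised sum" Ian Rogers mentions.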
This is of course speaking of the original PR algo, but it is prolly close enough to get a basic idea. :)
Jordan
> Note that the PageRanks form a probability distribution over web pages, so the sum of all web pages' PageRanks will be one.
This statement (as well as seeing the PR calculation as the determination of eigenvectors) is only valid if there are no dead ends.
Also, for completeness you have to mention the relation between real PR and ToolbarPR (which shows an integer on a logarithmic scale).
The logarithmic base was what was discussed above.
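The dead-end caveat can be checked with a small Python sketch. It uses the published formula with d = 0.85 and two hypothetical three-page sites (the page names and the pagerank helper are assumptions for illustration only): one where every page links somewhere, and one where page C links nowhere.

```python
def pagerank(links, d=0.85, iterations=100):
    """Published formula: PR(A) = (1-d) + d * sum of PR(T)/C(T) over backlinks T."""
    pages = list(links)
    pr = {page: 1.0 for page in pages}
    for _ in range(iterations):
        pr = {
            page: (1 - d) + d * sum(pr[src] / len(links[src])
                                    for src in pages if page in links[src])
            for page in pages
        }
    return pr

closed_site = {"A": ["B", "C"], "B": ["A"], "C": ["A"]}
dead_end_site = {"A": ["B", "C"], "B": ["A"], "C": []}  # C is a dead end

closed_total = sum(pagerank(closed_site).values())
leaky_total = sum(pagerank(dead_end_site).values())
print(round(closed_total, 3), round(leaky_total, 3))
```

In the closed site the total stays at the number of pages (an average of 1 per page). In the second site the vote that reaches C is never passed on, so the total falls well below that, which is why the "sum" statement only holds without dead ends.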
I'm not a math wiz by any means (I could only barely follow the page I linked to), but I think that, granted the theoretical possibility of eigendecompositions for all matrices (based on the possibility of singular value decompositions in the case of non-square matrices), the calculated PR(A) is still valid even with dead links. Of course, I'm probably wrong...again, I'm no math expert...but that is how I was understanding the paper. Also, if Google* filtered out dead links before calculating PR(A), then the assumed convergence would always be valid.
Ps. I'm aware that the toolbar is based on a log. of actual PR, but I was posting the info in regards to Perplexed's original question about the distribution of PR to hypothetical page "F" (i.e., points 1-2 in the citation from Mr. Rogers (Mr. Rogers?!? Hey, that is cool! 'Won't you be my neighbor?' ;D ), or point 3 if it is backlinked to "E").
Jordan
* The original Brin-Page Google.
Jordan, the logarithmic nature of the scale is important only when we consider the flow of PR (i.e. the PageRank equations you posted) along with the Toolbar PR values. In that context, it becomes very important (as with quantitative research, where we need to know what scale we use in our analysis).
One of the early papers discusses lack of convergence with dangling links, but I get the impression that they're not a problem now. Dangling links include /robots.txt excluded URLs and 404s (although last month 404s were ignored and didn't suck PageRank).
Marcia and others have said repeatedly that PR is divided equally among the links it is passed to. I have no doubt that all of you could provide wonderful formulas like MonkeeSage's above to prove this is true. But this is not what I am seeing on the toolbar, and when it comes down to it, it is only the visible PR on the toolbar that matters. Let me give an example.
One of my sites ( which is entirely done in very basic HTML, apart from a phpbb forum there are no fancy dynamically driven pages, databases, javascripts or anything else ) has a homepage PR of 5. This homepage has seven links on it to other pages in the site, each of which retains the PR5 ( ergo... you can have at least 7 links without losing any PR)
One of those 7 pages is a "contents" page with links to 40 other pages, each of which is a PR4 ( ergo... you can have at least 40 links and only drop 1PR point )
Now then... if page PR is stand alone ( ie does not inherit link depreciation from previous pages ) and that "contents" page only had links to 7 other pages instead of 40 then each of those would be a PR5 as well... If each of those 7 pages had links to 7 others etc etc then the whole site could be PR5.
Now the point of this is that one of those pages could be a links page with a PR5 and if you then kept dividing in the same way every link going off your site could carry the full measure of PR5 with it making your link much more attractive.
As I said, this is probably much too simplistic, but I cannot understand where it falls down.
Speaking as a newbie, I think that:
1) There's a "damping factor". Not all of a page's PR is redistributed.
2) The process is iterative. Your index page passes some PR to your topic page. If that links back to your home page then some PR is passed back. And round and round it goes...
3) The "PR" on the toolbar ain't real. It's just a visualisation tool and is logarithmic. Thus a "PR5" site could have a real pagerank of 100000, a "PR4" site a real pagerank of 10000, a "PR3" site a real pagerank of 1000 (they're random numbers to make the point).
I'd second MonkeeSage's recommendation of the Ian Rogers article, worth a read.
I've spent most of my time in the last year working on open source software. I've checked a number of the OS projects. They have a PR 6 thru 8 without any effort at optimizing page content or focusing on getting links only from certain level PR sites or those only on pages with X amount of links. Their active users are also doing okay in PR without giving any special attention in this area either. I've seen some given a PR 5 with links from as few as 10 other sites.
I guess what I'm saying is some of you have gone way, way down into the weeds. Try focusing on areas to improve your ranking in the search engine and PR will take care of itself.
The reason for my interest is in building reciprocal links. A huge percentage of webmasters only want to link with high PR sites, or at least want as high a PR return as they can get. Thus the interest in how PR distributes around the site. PR is a bit like paying tax. I don't believe in cheating but there is nothing wrong in arranging your affairs to achieve the most beneficial result.
Due to the very low resolution on the Toolbar, this analysis probably wouldn't tell you anything interesting.
> Now the point of this is that one of those pages could be a links page with a PR5 and if you then kept dividing in the same way every link going off your site could carry the full measure of PR5 with it making your link much nore attractive.
I'm afraid not. If your home page is PR5.99 then, divided among seven links, each linked page might be somewhere between PR5.2 and PR5.4 (by my numbers; other people may reach other results!). The Toolbar still shows 5, but PR was lost.
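One rough way to arrive at a figure in that range is sketched below in Python. It rests on several assumptions that are not confirmed anywhere: the Toolbar is a pure logarithmic scale, about 20 links cost exactly one Toolbar point (the approximate figure mentioned earlier in the thread), the damping factor is 0.85, and the (1-d) term and any other inbound links are ignored. The function name toolbar_after_split is made up for this example.

```python
import math

d = 0.85               # damping factor from the original paper
links_per_notch = 20   # assumption: roughly 20 links cost one Toolbar point
base = links_per_notch / d  # implied logarithmic base of the Toolbar scale

def toolbar_after_split(parent_toolbar, n_links):
    """Estimate the fractional Toolbar PR a page gets via one of n_links links."""
    # Real PR passed per link is d * PR / n_links; on a log scale that is
    # a drop of log_base(n_links / d) Toolbar points.
    return parent_toolbar - math.log(n_links / d, base)

child = toolbar_after_split(5.99, 7)
print(round(child, 2), "shown on the Toolbar as", math.floor(child))
```

Under these assumptions the linked pages land at roughly 5.3, so the Toolbar keeps showing 5 even though about two thirds of a point was lost. Repeat the split one more level down and the displayed value drops to 4.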
BlueSky, I don't think anyone here is arguing that you can't get good rankings without a deep understanding of Google. If part of your job or hobby is helping Web pages to do better in search engines, then the more you understand the better your chances. True, the average surfer doesn't look at the PR of a site, but if you want to understand how other aspects of Google work, then each aspect that you can isolate helps. For various reasons, isolating PR and understanding the Toolbar scale helps very considerably.