PR levels of depreciation.

         

Perplexed

6:27 am on Aug 11, 2003 (gmt 0)

10+ Year Member



Sort of thinking aloud here, and looking for opinions/confirmation.

There is much talk of PR being divided equally among the links off a page. More links on a page = less PR each.

There is also visual evidence that PR simply drops a point for each level of depth on a site.

Sooooo.... Let us assume that I have a page on my site (let's call it page "A") that has a PR of 8. There are three links off that page: two go to "content directory" areas of my site (let's call them pages "B" and "C") and the third goes to what we could (but won't) call a links page (let's call it page "D"). Visual evidence would suggest that each of these three pages would be a PR 7.

If page "D" contained a list of categories and sub-categories within my relevant niche area (all very focused and to the point), clicking on any of those links (categories or sub-categories) would take you to a page which is now one more level down and would be a PR 6 (let's call this page "E").

Page "E" obviously contains a list of relevant, high quality, interesting, relevant (just to stress the point) sites.... but the link for each site takes you to a page with a full description (and possibly a critique) of just that one site (let's call this page "F".... there would be lots of page "F"s). This page "F" would have a PR of 5, which would go only to the one site described and linked to on the page.

Is this right? or am I just talking scribble?

Perplexed

12:15 pm on Aug 11, 2003 (gmt 0)

10+ Year Member



Just scribble then :)

ciml

12:24 pm on Aug 11, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



If your incoming links all point at the home page, if none of your 'downward' links link back up in the hierarchy (excepting to the home page) and if each page has about 3 links on it, then you can expect to drop one notch of PR with each two or three levels.

The reason for this often coinciding with the URLs, is that many pages have approximately 20 links on (with a wide margin) and this is approximately the number of links that cause PR to reduce by one notch.

The numbers are affected by breadcrumbs, sibling and cousin links, general links (e.g. contact or copyright), PR being injected from outside the site, 404s, /robots.txt, etc.

Many people would disagree with the actual numbers used, but hopefully you get the idea.
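A rough sketch of the arithmetic behind the "about 20 links per notch" figure. It assumes the original formula with d = 0.85, a Toolbar that shows the logarithm of real PR, and a child page that gets all of its PR from one parent. The function name and the log base are assumptions worked backwards from the 20-link figure, not known Google values.

```python
import math

DAMPING = 0.85  # d from the original Brin-Page paper

# Assumption: 20 links cost exactly one notch, which implies this log base.
ASSUMED_BASE = 20 / DAMPING  # about 23.5


def notch_drop(links_on_page, log_base=ASSUMED_BASE, damping=DAMPING):
    """Toolbar notches lost between a parent page and one of its child pages.

    Assumes the child receives PR only from this parent and ignores the
    small (1 - d) constant term.
    """
    return math.log(links_on_page / damping, log_base)


if __name__ == "__main__":
    for links in (3, 7, 20, 40):
        print(links, "links ->", round(notch_drop(links), 2), "notches per level")
```

With this assumed base, 3 links per page costs roughly 0.4 of a notch per level, which is one notch every two or three levels.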

doc_z

1:07 pm on Aug 11, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



> There is also visual evidence that PR simply drops a point for each level of depth on a site.

As already mentioned by ciml, it depends on the number of links on the page. If a page has few links, ToolbarPR will drop by one or is unchanged. (It depends on the exact number of links and if the page has a high or low Toolbar PRx. Of course, the real PR is decreased in any case even if the ToolbarPR is unchanged.)

On the other hand, if there are numerous links on the page, PR will drop at least by one (but can also drop by 2 or even more).

> have approximately 20 links on

It seems that this is not the latest value. After the new PR update, it seems that this number was decreased while the damping factor was slightly increased. (Although, I have to analyze this in detail.)

MonkeeSage

1:30 pm on Aug 11, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"We assume page A has pages T1...Tn which point to it (i.e., are citations). The parameter d is a damping factor which can be set between 0 and 1. We usually set d to 0.85. There are more details about d in the next section. Also C(A) is defined as the number of links going out of page A. The PageRank of a page A is given as follows:

PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))

Note that the PageRanks form a probability distribution over web pages, so the sum of all web pages' PageRanks will be one.

PageRank or PR(A) can be calculated using a simple iterative algorithm, and corresponds to the principal eigenvector of the normalized link matrix of the web."
(Sergey Brin-Lawrence Page, The Anatomy of a Large-Scale Hypertextual Web Search Engine [www-db.stanford.edu]. This is the original paper that started Google at Stanford University).
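For anyone who prefers code to formulas, here is a minimal sketch of the "simple iterative algorithm" the quote mentions, applied to the formula exactly as quoted. The function name, the iteration count and the little four-page site are made up for illustration; this is not how Google actually runs it.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Iterate PR(A) = (1 - d) + d * sum(PR(T) / C(T)) over all pages.

    `links` maps each page to the list of pages it links to.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    pr = {page: 1.0 for page in pages}  # arbitrary starting guess
    for _ in range(iterations):
        new_pr = {}
        for page in pages:
            # Each page T linking here contributes PR(T) / C(T).
            incoming = sum(
                pr[src] / len(targets)
                for src, targets in links.items()
                if page in targets
            )
            new_pr[page] = (1 - damping) + damping * incoming
        pr = new_pr
    return pr


# Hypothetical site: A links to B, C and D, and each of those links back to A.
site = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
ranks = pagerank(site)

if __name__ == "__main__":
    for page in sorted(ranks):
        print(page, round(ranks[page], 3))
```

With the formula in this form the values on a site with no dead ends add up to the number of pages (4 here) rather than to one.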

So,

"1. PR(Tn) - Each page has a notion of its own self-importance. That’s “PR(T1)” for the first page in the web all the way up to “PR(Tn)” for the last page

2. C(Tn) - Each page spreads its vote out evenly amongst all of its outgoing links. The count, or number, of outgoing links for page 1 is “C(T1)”, “C(Tn)” for page n, and so on for all pages.

3. PR(Tn)/C(Tn) - so if our page (page A) has a backlink from page “n” the share of the vote page A will get is “PR(Tn)/C(Tn)”

4. d(... - All these fractions of votes are added together but, to stop the other pages having too much influence, this total vote is “damped down” by multiplying it by 0.85 (the factor “d”)

5. (1 - d) - The (1 – d) bit at the beginning is a bit of probability math magic so the “sum of all web pages' PageRanks will be one”: it adds in the bit lost by the d(.... It also means that if a page has no links to it (no backlinks) even then it will still get a small PR of 0.15 (i.e. 1 – 0.85). (Aside: the Google paper says “the sum of all pages” but they mean “the normalised sum” – otherwise known as “the average” to you and me.)"
(Ian Rogers, The Google Pagerank Algorithm and How It Works [iprcom.com]).
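A quick worked instance of points 3 to 5, using d = 0.85. The page T and its numbers (a PR of 1 and 4 outgoing links) are invented purely for illustration.

```latex
\begin{align*}
\text{No backlinks:}\quad
PR(A) &= (1 - d) = 1 - 0.85 = 0.15 \\[4pt]
\text{One backlink from } T,\ PR(T) = 1,\ C(T) = 4:\quad
PR(A) &= (1 - d) + d\,\frac{PR(T)}{C(T)}
       = 0.15 + 0.85 \times \frac{1}{4}
       = 0.3625
\end{align*}
```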

This is of course speaking of the original PR algo, but it is prolly close enough to get a basic idea. :)

Jordan

doc_z

2:54 pm on Aug 11, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



> Note that the PageRanks form a probability distribution over web pages, so the sum of all web pages' PageRanks will be one.

This statement (as well as seeing the PR calculation as the determination of eigenvectors) is only valid if there are no dead ends.

Also, for completeness you have to mention the relation between real PR and ToolbarPR (which is showing an integer on a logarithmic scale).

The logarithmic base was what was discussed above.
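A small sketch of the dead-end point, assuming the formula exactly as quoted from the paper. The two three-page sites are hypothetical and identical except that page C links nowhere in the second one.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Iterate PR(A) = (1 - d) + d * sum(PR(T) / C(T)) over all pages."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    pr = dict.fromkeys(pages, 1.0)
    for _ in range(iterations):
        pr = {
            page: (1 - damping) + damping * sum(
                pr[src] / len(targets)
                for src, targets in links.items()
                if page in targets
            )
            for page in pages
        }
    return pr


# Hypothetical sites: a closed loop, and the same chain ending in a dead end.
closed_loop = {"A": ["B"], "B": ["C"], "C": ["A"]}
dead_end = {"A": ["B"], "B": ["C"], "C": []}

average_closed = sum(pagerank(closed_loop).values()) / 3
average_dead_end = sum(pagerank(dead_end).values()) / 3

if __name__ == "__main__":
    print("closed loop average:", round(average_closed, 4))
    print("dead end average:   ", round(average_dead_end, 4))
```

In the closed loop the average PR stays at 1. With the dead end, whatever reaches C is never passed on, so the average falls well below 1 and the values no longer behave like a probability distribution.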

MonkeeSage

3:35 pm on Aug 11, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



doc_z:

I'm not a math whiz by any means (I could only barely follow the page I linked to), but I think that, granted the theoretical possibility of eigendecompositions for all matrices (based on the possibility of singular value decompositions in the case of non-square matrices), then even with dead links, calculated PR(A) is still valid. Of course, I'm probably wrong...again, I'm no math expert...but that is how I was understanding the paper. Also, if Google* filtered out dead links before calculating PR(A), then the assumed convergence would always be valid.

Ps. I'm aware that the toolbar is based on a log. of actual PR, but I was posting the info in regards to Perplexed's original question about the distribution of PR to hypothetical page "F" (i.e., points 1-2 in the citation from Mr. Rogers (Mr. Rogers?!? Hey, that is cool! 'Won't you be my neighbor?' ;D ), or point 3 if it is backlinked to "E").

Jordan

* The original Brin-Page Google.

ciml

5:02 pm on Aug 11, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



doc_z, I saw a difference in either PR flow or the Toolbar scale with Esmerelda, but things are very much back to normal for me so I assume that incomplete spidering was to blame before.

Jordan, regarding the logarithmic nature of the scale it's important only when we consider the flow of PR (i.e. the PageRank equations you posted) along with the Toolbar PR values. In that context, it becomes very important (as with quantitative research where we need to know what scale we use in our analysis).

One of the early papers discusses lack of convergence with dangling links, but I get the impression that they're not a problem now. Dangling links include /robots.txt excluded URLs and 404s (although 404s were ignored and didn't suck PageRank last month).

F_Ali

5:49 pm on Aug 11, 2003 (gmt 0)

10+ Year Member



Hi,

I think the new Google update is underway; the links seem to have changed (well, for my site anyway).

Perplexed

6:56 pm on Aug 11, 2003 (gmt 0)

10+ Year Member



Well thanks guys.
I can see that I have a lot of homework to do coz most of that went right over my head.

I will try to digest it all and maybe come back for some clarification.

Perplexed

7:19 pm on Aug 14, 2003 (gmt 0)

10+ Year Member



Ummm.... still trying to digest that fellas.

Simple question.... Which is better for the person being linked to.
A... to be one of 40 links on a PR6 page
B... to be the only site linked to on a PR 5 page. ( or even a PR 4 page )

ciml

7:49 pm on Aug 14, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



In my opinion, though I think that people would disagree...

> A... to be one of 40 links on a PR6 page
> B... to be the only site linked to on a PR 5 page.

B.

> ( or even a PR 4 page )

A.

(assuming that the pages all inhabit the same position within their respective notches)
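A back-of-envelope sketch of that comparison. It assumes d = 0.85, that every page sits at the bottom of its notch, and a Toolbar log base of 20; that base is a pure guess for illustration, and the function names are made up.

```python
DAMPING = 0.85


def pr_passed(toolbar_pr, links_on_page, log_base, damping=DAMPING):
    """Real PR handed to each link from a page at the bottom of its notch."""
    return damping * (log_base ** toolbar_pr) / links_on_page


def better_option(log_base, sole_link_pr):
    """Return "A" (one of 40 links on a PR6 page) or "B" (the only link)."""
    one_of_40_on_pr6 = pr_passed(6, 40, log_base)
    only_link = pr_passed(sole_link_pr, 1, log_base)
    return "A" if one_of_40_on_pr6 > only_link else "B"


if __name__ == "__main__":
    print("versus PR5 page:", better_option(20, 5))
    print("versus PR4 page:", better_option(20, 4))
```

Under these assumptions the answers flip at a base of 40 for the PR5 case and at about 6.3 (the square root of 40) for the PR4 case, so "B, then A" holds for any base between those two values.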

Perplexed

6:26 am on Aug 15, 2003 (gmt 0)

10+ Year Member



OK, now maybe I am just too old, or too thick, to understand this, but I am heading in a direction that seems very obvious to me but is probably too simplistic to be true.

Marcia and others have said repeatedly that PR is divided equally among the number of links it is passing to. I have no doubt that all of you could provide wonderful formulas like MonkeeSage's above to prove this is true. But this is not what I am seeing on the toolbar and when it comes down to it, it is only the visible PR on the toolbar that matters. Let me give an example.

One of my sites ( which is entirely done in very basic HTML, apart from a phpbb forum there are no fancy dynamically driven pages, databases, javascripts or anything else ) has a homepage PR of 5. This homepage has seven links on it to other pages in the site, each of which retains the PR5 ( ergo... you can have at least 7 links without losing any PR)

One of those 7 pages is a "contents" page with links to 40 other pages, each of which is a PR4 ( ergo... you can have at least 40 links and only drop 1PR point )

Now then... if page PR is stand alone ( ie does not inherit link depreciation from previous pages ) and that "contents" page only had links to 7 other pages instead of 40 then each of those would be a PR5 as well... If each of those 7 pages had links to 7 others etc etc then the whole site could be PR5.

Now the point of this is that one of those pages could be a links page with a PR5 and if you then kept dividing in the same way every link going off your site could carry the full measure of PR5 with it, making your link much more attractive.

As I said, this is probably much too simplistic, but I cannot understand where it falls down.

valeyard

8:46 am on Aug 15, 2003 (gmt 0)

10+ Year Member



Perplexed:

Speaking as a newbie, I think that:

1) There's a "damping factor". Not all of a page's PR is redistributed.
2) The process is iterative. Your index page passes some PR to your topic page. If that links back to your home page then some PR is passed back. And round and round it goes...
3) The "PR" on the toolbar ain't real. It's just a visualisation tool and is logarithmic. Thus a "PR5" site could have a real pagerank of 100000, a "PR4" site a real pagerank of 10000, a "PR3" site a real pagerank of 1000 (they're random numbers to make the point)

I'd second MonkeeSage's recommendation of the Ian Rogers article, worth a read.
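Point 3 can be sketched in a few lines. The base of 10 simply matches the made-up numbers above; nobody outside Google knows the real base, and the function name is invented for the example.

```python
def toolbar_pr(real_pr, log_base=10.0):
    """Count how many whole powers of log_base fit into real_pr.

    This is an integer-only way of taking floor(log(real_pr, log_base)),
    which avoids floating-point surprises at exact powers of the base.
    """
    notch = 0
    threshold = log_base
    while real_pr >= threshold:
        notch += 1
        threshold *= log_base
    return notch


if __name__ == "__main__":
    for real in (1000, 10000, 50000, 99999, 100000):
        print(real, "-> PR", toolbar_pr(real))
```

Note how 10000 and 99999 both show as PR4: a page can gain or lose most of its real PR without the green bar moving.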

BlueSky

9:37 am on Aug 15, 2003 (gmt 0)

10+ Year Member



I really don't understand why many here want to analyze to the nth degree on how PR is obtained and what percentage is passed on to other pages. The average surfer doesn't even look at the PR of a site. Until I built a site very recently and read this forum, the little green bars meant absolutely nothing to me either. They still mean nothing to me. Being brand new, mine has a PR0 but appears on the first page of the search results I've targeted so far. To me, that is what is important -- getting high enough to have people find my site.

I've spent most of my time in the last year working on open source software. I've checked a number of the OS projects. They have a PR 6 thru 8 without any effort at optimizing page content or focusing on getting links only from certain level PR sites or those only on pages with X amount of links. Their active users are also doing okay in PR without giving any special attention in this area either. I've seen some given a PR 5 with links from as few as 10 other sites.

I guess what I'm saying is some of you have gone way, way down into the weeds. Try focusing on areas to improve your ranking in the search engine and PR will take care of itself.

Perplexed

9:46 am on Aug 15, 2003 (gmt 0)

10+ Year Member



PR will not take care of itself. Serps will if you build good pages and chase the PR.

The reason for my interest is in building reciprocal links. A huge percentage of webmasters only want to link with high PR sites, or at least want as high a PR return as they can get. Thus the interest in how PR distributes around the site. PR is a bit like paying tax. I don't believe in cheating but there is nothing wrong in arranging your affairs to achieve the most beneficial result.

ciml

10:50 am on Aug 16, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Perplexed, as valeyard points out, the PR graph on the Toolbar uses a logarithmic scale. If your pages only get PR via the home page, if all the links you describe are counted, and if they don't link to each other (apart from to the home page, which won't matter), then you have just discovered something about the Toolbar's scale. If MonkeeSage posted the correct equation (which he did), the effective log base of the Toolbar with a coefficient of 1 is x, where log_x(x^n (1-d) / 7) < 1 (in your case n=5, but with a log scale it could be otherwise) and 1 < log_x(x^n (1-d) / 40) < 2.

Due to the very low resolution on the Toolbar, this analysis probably wouldn't tell you anything interesting.

> Now the point of this is that one of those pages could be a links page with a PR5 and if you then kept dividing in the same way every link going off your site could carry the full measure of PR5 with it, making your link much more attractive.

I'm afraid not. If your home page is PR5.99 then, divided by seven links it might be somewhere between PR5.2 and PR5.4 (by my numbers, other people may reach other results!). The Toolbar still shows 5, but PR was lost.
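To see where the "whole site stays PR5" idea falls down, here is a sketch that follows a PR5.99 home page down a site with 7 links on every page. It assumes d = 0.85, that each page gets PR only from its parent, and a Toolbar log base of 20; the base and the function name are guesses for illustration only.

```python
import math


def fractional_toolbar_by_depth(start, links_per_page, log_base, depth,
                                damping=0.85):
    """Fractional Toolbar values for the home page and each level below it.

    Assumes every page gets PR only from its parent and ignores the small
    (1 - d) constant term.
    """
    drop = math.log(links_per_page / damping, log_base)
    return [start - level * drop for level in range(depth + 1)]


# Hypothetical log base of 20; home page at PR5.99 with 7 links per page.
levels = fractional_toolbar_by_depth(5.99, 7, 20, 3)

if __name__ == "__main__":
    for depth, value in enumerate(levels):
        print("level", depth, "fractional PR", round(value, 2),
              "toolbar shows", int(value))
```

With these assumed numbers the first level down lands near 5.3, so the Toolbar still shows 5, but the second level shows 4. The loss happens at every step even when the green bar hides it.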

BlueSky, I don't think anyone here is arguing that you can't get good rankings without a deep understanding of Google. If part of your job or hobby is helping Web pages to do better in search engines, then the more you understand, the better your chances. True, the average surfer doesn't look at the PR of a site, but if you want to understand how other aspects of Google work, then each aspect that you can isolate helps. For various reasons, isolating PR and understanding the Toolbar scale helps very considerably.