Also, the anchor text they use is probably going to be different from the anchor text you use, and what your site is about is probably a bit different, or completely different, from what their site is about. So the link relevance is going to be different as well.
But all things being equal, I still think two identical sites linking to each other in an identical manner would still benefit each other. And to take it a step further, even if they did cancel each other out then you should take into consideration whether or not it will benefit your visitors, which is what you should be considering in the first place.
A reciprocal link means traffic. Sometimes we all forget that good old traffic stuff. Before there were any search engines we just linked to show our visitors other useful stuff, and that is how we sent traffic to each other.
I'm assuming two pages, A and B, linking to each other and not to any other page. Both have a PR of 4.
As far as the PR algo goes (I won't cite it :), page A linking to B definitely leaks PR. Because that link is reciprocated, B also leaks PR to A, so in the end they remain at the same level.
If both have different PR, the PR flows a little bit from the higher PR to the lower, very simple. If each page also links to other pages, it gets more complex, because they leak more or less to other pages as well. Reciprocal linking is nothing special; the simple basic PR formula covers it without exception.
So in this simplistic example the PR will stay the same. But the main benefit is traffic AND the message in the anchor text, signalling to Google that the page at the other site is relevant for "widget".
Of course, there are other benefits apart from PR.
Statements like "Sites absolutely cannot lose PR by linking out." are wrong.
Statements like "Sites absolutely cannot lose PR by linking out." are wrong.
This may be correct if you define losing PR as not distributing it as effectively as possible to other internal pages. But I think the statement "A page absolutely cannot lose PR by linking out." is certainly correct.
Please correct me if I am wrong.
Every page that has a certain PR before linking to another site, then links to another site, will lose some PR (all other things remaining equal).
So the sentence:
"A page absolutely cannot lose PR by linking out."
is wrong.
It will always lose some. But of course, if the linked page links to another page, which links to another page, which eventually links back, and so on, this leak can be very minimal.
I don't want to bring in the PR algo, I promised it :)
A page absolutely cannot lose PR by linking out
This statement isn't correct either. In any normal case(*) not only the PR of the other pages of the site but also the PR of the linking page is decreased.
(*) normal case: the page has also some internal links (i.e. it isn't a dead end when all external links are removed)
Just so I get this straight:
If I have two "dead end" pages with identical inbound links from identical external pages, they will have the same PR. Then if on one of my two "dead end" pages I link to 20 external sites, that page will now have a lower PR than my other page?
I thought Google considered a link as a vote. I don't recall ever voting and having to give up something for the privilege (insert your comment here about wars to protect freedom / forefathers / etc.). Doesn't this discourage voting?
Note that the PageRanks form a probability distribution over web pages, so the sum of all web pages' PageRanks will be one.
So now if you give a link to another site, someone somewhere loses a tiny bit (or probably more) of his Pagerank. Now it depends on your web graph to finally determine if you lose some PR or not. These are just my views.
Having said that, and as pointed out by numerous others in this thread, give more importance to anchor text than to PR. If Google increases its web index, you would generally see a drop in PR across the board.
[edited by: mil2k at 6:49 pm (utc) on Aug. 27, 2003]
I'm always referring to 'normal' (realistic) situations (as mentioned in msg#15), i.e. link pages that also have internal links. Of course, in the case you mentioned, PR is not decreased. However, this is only true because you are comparing with a situation where PR is wasted (the dead end scenario).
doc_z, what do you mean by "pages with internal links"?
Now I throw it in, the algo :)
PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))
(search for page rank explained on google)
A is the page for which to calculate the PR, T1...Tn are the pages linking to A. C(Ti) is the number of outgoing links on the given page Ti, and d is the damping factor.
The algo is very simple, we can for the moment omit the unnecessary stuff:
PR(A) = PR(T1)/C(T1) + ... + PR(Tn)/C(Tn)
That's it! So the PR of every page in the system is calculated from the PR of the linking pages, very simple. The sum of all PR is always the same. If the number of pages grows and you don't get more inbound links (you just stay the same with your page), you lose PR, because your relative share gets smaller.
From the formula above and the fact that the sum is always the same, you can deduce that every time you link to another page, this page's PR gets higher, right? (look at the formula). And if any page in the system grows in PR, other pages' PR will decrease, because the sum stays the same. Does that make sense?
So if you link out, you will always enable other sites to increase PR, and the logical consequence is you lose some.
I'm not saying that you should not link! In fact, you know that you win far more than the little PR you lose. You get traffic, and eventually even the lost PR comes back.
In order to grasp this, it's a good idea to just build a simplified net. Think of two or more pages that link to each other in different ways, and for the moment assume that their PR is fixed at first (e.g., it's 1 or whatever). And then you can calculate the PR of one page through the PR of the linking one, and so on. Google has to do this about 50 times I guess, until all PRs don't move anymore :)
Hope that helps (as I said, search for "page rank explained").
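The "simplified net" exercise described above can also be tried in a few lines of Python. This is only a sketch of the published formula, not Google's actual implementation: the function name `pagerank`, the three-page net and the damping value d = 0.85 are assumptions picked for illustration.

```python
def pagerank(links, d=0.85, iterations=100):
    """Iterate PR(A) = (1-d) + d * (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn)).

    links maps every page to the list of pages it links to.
    d = 0.85 is an assumed damping factor.
    """
    pr = {page: 1.0 for page in links}  # start every page at 1
    for _ in range(iterations):
        # Build the new values from the previous round's values only.
        pr = {
            page: (1 - d) + d * sum(
                pr[src] / len(targets)
                for src, targets in links.items()
                if page in targets
            )
            for page in links
        }
    return pr


# Hypothetical net: A links to B and C, B links to C, C links back to A.
net = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(net)
for page, value in sorted(ranks.items()):
    print(page, round(value, 3))
print("total:", round(sum(ranks.values()), 3))
```

In this net no page is a dead end, and the total stays at 3, the number of pages. C ends up highest because it collects links from both A and B.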
nick420, what exactly do you mean by dead-end pages? Pages with no outgoing links, only inbound ones?
dirkz... yes that is what I meant. Not that I would ever have any "dead end" pages. I was really just trying to simplify my question. Apparently I just made it more complicated with an unrealistic scenario.
Thanks for all the info on the algo.
"dead-end pages" are pages with no link on it; "pages with internal links" are pages which link to other pages of the site.
The sum of all PR is always the same.
The total PR is N=the number of pages (if there are no dead end pages).
If the number of pages grows and you don't get more inbound links (you just stay the same with your page), you lose PR, because your relative share gets smaller.
That's not correct. The real PR is unaffected by the total number of pages. However, the ToolbarPR is probably related to the page with the highest real PR, and therefore the ToolbarPR normally (in realistic scenarios) depends on the total number of pages.
In order to grasp this, it's a good idea to just build a simplified net. ... Google has to do this about 50 times I guess, until all PRs don't move anymore :)
The number of iterations strongly depends on the algorithm used. For the originally mentioned Jacobi algorithm, this is a valid guess. However, state-of-the-art algorithms are much faster.
Imagine a site with 1 index page and let's say 10 content pages. The index page has a PR of 4, for example. There are 10 links on the index page pointing to the content pages, and all 10 content pages link back to the index page.
Now you add an external link on the index page pointing to a completely different site. The PR of the index page itself is not affected at that moment, but there are now 11 links on it, so each content page gets a smaller portion of the PR and its PR decreases a bit. The content pages in turn pass less PR back to the index page, which decreases the PR of the index page a little. So if you look at it this way, the PR of the index page is decreased by placing a link on it.
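The index-page scenario can be checked numerically. The sketch below uses the published formula with an assumed damping factor of 0.85; the page names are made up, and the external page is modelled as having no links of its own, which is a simplification (a real external page would have its own links). The numbers are raw formula values, not toolbar values.

```python
def pagerank(links, d=0.85, iterations=200):
    """Plain iteration of the published PR formula (assumed d = 0.85)."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        pr = {
            page: (1 - d) + d * sum(
                pr[src] / len(targets)
                for src, targets in links.items()
                if page in targets
            )
            for page in links
        }
    return pr


content = [f"content{i}" for i in range(1, 11)]

# Before: the index links to 10 content pages, each links back to the index.
before = {"index": list(content)}
before.update({page: ["index"] for page in content})

# After: the same site, plus one external link on the index page.
after = {"index": content + ["external"], "external": []}
after.update({page: ["index"] for page in content})

pr_before = pagerank(before)
pr_after = pagerank(after)
print("index before:  ", round(pr_before["index"], 3))
print("index after:   ", round(pr_after["index"], 3))
print("content before:", round(pr_before["content1"], 3))
print("content after: ", round(pr_after["content1"], 3))
```

Under these assumptions both the index page and the content pages come out lower once the eleventh link is added, which matches the reasoning in the post.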
On the other hand, if you gained a new reciprocal link at the same time, your PR might actually increase if the link is good enough to make up for the loss.
doc_z, you are probably right about the iterations.
By total number of pages, I mean the total number of pages in the WWW. I think there is a misunderstanding here, you mean the pages of a site. So you are right when you say that your PR in fact increases when you have more pages (on your site).
From the Brin and Page paper, the average Actual PR of all pages in the index is 1.0!
By "paper", Brin and Page's original publication is meant. So the PR of any given page of your site will decrease as the web grows bigger (assumed that the number of backlinks etc. for your page stay the same).
The google toolbar always gives you the PR of the page (URL) you are viewing, NOT the average of your site (except that at the moment it behaves very weirdly :))
Dead-End pages don't complicate the computation, they are just a valid part on the Web. In fact, they make calculation easier (that is, for human beings) :)
Btw, the more pages your site has, the more PR altogether you can reach (because your potential share of the total PR of the Web is much bigger). It's up to you to distribute it from your index.html or whatever you have as main page. Pages of your own site (and links from them to a page of your site) count in the calculation just like any other page on the Web, it doesn't make any difference. So the more pages link to your main page, the more PR it will reach.
Of course, the average Actual PR of all pages in the index is 1.0 (assuming that there are no dead ends). And therefore the total PR is N=the number of indexed pages of the WWW.
So the PR of any given page of your site will decrease as the web grows bigger (assumed that the number of backlinks etc. for your page stay the same).
As already said, this isn't correct. The real PR (i.e. the PR of the formula you gave) of the page is unaffected (i.e. constant). The fraction of PR of the page to the total PR is only decreased (i.e. PR(A) is unaffected, PR(A) / total PR is decreased).
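The difference between PR(A) and PR(A) / total PR can be seen in a tiny simulation. This is a sketch using the published formula with an assumed damping factor of 0.85; the two hypothetical "webs" differ only by an unrelated pair of pages C and D.

```python
def pagerank(links, d=0.85, iterations=200):
    """Plain iteration of the published PR formula (assumed d = 0.85)."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        pr = {
            page: (1 - d) + d * sum(
                pr[src] / len(targets)
                for src, targets in links.items()
                if page in targets
            )
            for page in links
        }
    return pr


# A and B link to each other; the bigger web adds an unrelated pair C and D.
small_web = {"A": ["B"], "B": ["A"]}
bigger_web = {"A": ["B"], "B": ["A"], "C": ["D"], "D": ["C"]}

pr_small = pagerank(small_web)
pr_big = pagerank(bigger_web)
share_small = pr_small["A"] / sum(pr_small.values())
share_big = pr_big["A"] / sum(pr_big.values())

print("PR(A):", round(pr_small["A"], 3), "->", round(pr_big["A"], 3))
print("share:", round(share_small, 3), "->", round(share_big, 3))
```

PR(A) is the same in both webs, while the share of the total drops when pages that do not link to A are added.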
The google toolbar always gives you the PR of the page (URL) you are viewing, NOT the average of your site
Of course, the ToolbarPR is page related - no need to mention obvious facts. However, the toolbar is not showing the real PR. The toolbar shows an integer on a logarithmic scale. This scale is probably (not necessarily) related to the page with the highest real PR (e.g. by fixing this real PR with ToolbarPR 11.0). Therefore, a change in the total number of pages probably (i.e. in realistic scenarios) leads to a change in the ToolbarPR because it is rescaled (i.e. the same real PR corresponds now to a different ToolbarPR).
Dead-End pages don't complicate the computation, they are just a valid part on the Web. In fact, they make calculation easier (that is, for human beings) :)
I never said that dead end pages complicate the computation - I just said that dead ends lead to a decrease of the average PR (i.e. <PR> < 1 ). Also, in case of (originally) dead end pages you can link out without a decrease of PR, because PR was already wasted.
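Both points about dead ends can be tried out on the earlier question of two dead-end pages with identical inbound links. The sketch below uses the published formula with an assumed damping factor of 0.85; the page names (`source`, `p1`, `p2`, `ext1` to `ext20`) are hypothetical.

```python
def pagerank(links, d=0.85, iterations=200):
    """Plain iteration of the published PR formula (assumed d = 0.85)."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        pr = {
            page: (1 - d) + d * sum(
                pr[src] / len(targets)
                for src, targets in links.items()
                if page in targets
            )
            for page in links
        }
    return pr


externals = [f"ext{i}" for i in range(1, 21)]

# One source page links to p1 and p2. p1 stays a dead end,
# p2 links out to 20 external pages (modelled as dead ends themselves).
web = {"source": ["p1", "p2"], "p1": [], "p2": list(externals)}
web.update({page: [] for page in externals})

pr = pagerank(web)
average = sum(pr.values()) / len(pr)
print("p1:", round(pr["p1"], 5))
print("p2:", round(pr["p2"], 5))
print("average PR:", round(average, 3))
```

In this model p1 and p2 end up with the same value, since a page's PR in the formula depends only on its inbound links, and the average over all pages stays below 1 because the dead ends pass nothing on.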
However, dead end pages can indeed lead to a complication, at least when the calculation is done as a computation of eigenvectors instead of an inversion of the transition matrix. For example, most of those Stanford papers (e.g. by Kamvar et al.) treat the calculation as the determination of eigenvectors. In this case dead end pages have to be treated differently (i.e. they must be taken out of the calculation in the first step) and therefore are indeed more complicated.
Btw, the more pages your site has, the more PR altogether you can reach
Yes, according to the original PR algorithm you are 'producing' a PR of one when creating a new page. However, if you are studying the current implementation you'll find that Google has modified this algorithm significantly.
As already said, this isn't correct. The real PR (i.e. the PR of the formula you gave) of the page is unaffected (i.e. constant). The fraction of PR of the page to the total PR is only decreased (i.e. PR(A) is unaffected, PR(A) / total PR is decreased).
Ok, this is valid. But if your fraction of PR is decreased, then your "overall PR" decreases, doesn't it?
Maybe we have to keep real PR, as you call it, separate from "PR share", which is then expressed on a scale by the Toolbar. I think we talked about two different things.
Thanks for the discussion (what was the initial question?) :)
2. It depends on whether off-site links are considered to have "infinite sucking power" as I like to call it. This appears to still be an argument in some circles. Anyone who has not studied the original Brin and Page thesis and actually run spreadsheet PR simulations of their own will probably not grasp this. Basically the difference, when running iterations, is whether the PR5 coming into the site REMAINS a FULL PR5 (does not lose its PR) with each successive iteration. Which of course is what this thread is all about.
A. If it doesn't, then definitely there is an obvious LOSS of rank to it with respect to the rest of the index, as the rank of that site is "transferred" to the destination site instead of its own site, and basically becomes an extension of the destination site.
B. If it DOES remain a full PR5 at each iteration, that means it possesses this "infinite sucking power" (ISP?) and thus DOES NOT EVER lose PR. On each iteration it is still PR5 no matter how many iterations are run and EACH TIME TRANSFERS (PR5-dampening)/number-links to the other site! If it is true that the total PR of the web is equal to the number of pages, as the thesis claims, then it must be "sucking" more PR from itself and all sites linking to it. This leads to simulations I've run, with sites constructed in a precise manner, which result in some pretty interesting PR MULTIPLICATION phenomena with only a couple of incoming links as the PR is imported from the rest of the net, which I don't think Google would permit.
This whole difference is discussed and argued in detail by a guy named Ian Rogers on a very nice website with examples so you can decide for yourself. His documented conclusion is that the calling SITE DOES lose PR by linking.
Since we cannot post URLs, those who are looking for a "lower IQ" version for explaining Pagerank which was requested earlier in this thread, might find a very good site by googling for Pagerank Explained Correctly with Examples.
Mike