PR affected by robots.txt / noindex tags?

         

Umbra

2:47 pm on Aug 22, 2003 (gmt 0)

10+ Year Member



If you block a spider from indexing "Widget.html" (with robots.txt and Noindex meta tags) and yet there are numerous links pointing to "Widget.html", what happens to the PR flow? Is PR unaffected (as if the links didn't exist) or is it uselessly dumped into the page (like lemmings running off a cliff)?

Any ideas?

takagi

11:16 am on Aug 24, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



In the original PageRank document (The Anatomy of a Large-Scale Hypertextual Web Search Engine [www-db.stanford.edu]) you can see that the PR of page-A depends on the PR of the pages linking to it (page-B, page-C, etc.) and on the number of links on each of those pages. So there should be no problem as long as you don't add 'nofollow'. A 'nofollow' reduces the PR of the pages linked from page-A, and that could in turn reduce the PR of page-A if there are some (indirect) reciprocal links.

However, please do realize that a lot of small changes have been made to the PR calculation since that document was written.

Turning to the technology developments that Google has planned for the future, Sullivan asked Brin to elaborate on the work that goes into the constant development of the famed PageRank system. Brin said that it was still very much an important part of Google's ranking system and that more than half a dozen new ranking technologies are tested each month, with roughly half of these being integrated into Google's PageRank algorithm.
Inside Search Engine Strategies, San Jose - Day Three [seotoday.com] Aug 21, 2003

BTW, why should you worry about PR if the contents of the page cannot be indexed?

Umbra

7:12 am on Aug 25, 2003 (gmt 0)

10+ Year Member



Takagi, allow me to clarify the question to make sure we're on the same page. Consider these 2 diagrams, representing the flow of PageRank:

1) Page A ---> Page B ---> Page C
2) Page A ---> Page B

If Page A has a link to Page B, which in turn links to Page C, that is obviously the top diagram. If nothing links to Page C, that is obviously the bottom diagram.

But what happens if Page B has a link to Page C AND Page B has a NOFOLLOW metatag AND Page C has a NOINDEX,NOFOLLOW metatag? Robots would not follow the link between Page B and C, but the link is still there in the HTML. So in this case, does the PR flow according to robot crawling behavior (=bottom diagram) or pure HTML links (=top diagram)?

takagi

9:23 am on Aug 25, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



1. Let's presume Page A is linked from outside; otherwise Google doesn't index these pages at all and consequently there is no PageRank flow.

2. If Page C has no links to it other than the one from Page B, then it doesn't matter whether Page C has a NOINDEX,NOFOLLOW metatag or not, because the page will not be spidered by bots like Googlebot that obey the NOFOLLOW metatag on Page B. As far as Google is concerned, Page C doesn't exist (there are no known links to it) and therefore Page C has no PageRank.

3. Page B is linked only from Page A, and Page A has no NOFOLLOW metatag, so Page B receives a PageRank that is related to the PageRank of Page A. There is no transfer of PageRank in the sense that the PageRank of Page B is incremented at the expense of the PageRank of Page A.

4. Page B has no NOINDEX metatag, so it will be indexed and can be found in the SERPs (Search Engine Results Pages) for a relevant but not too competitive query.

5. Because there is a NOFOLLOW on Page B, no other page will get any PageRank from Page B. So even if a link to Page A were added on Page B, Page A would not benefit from such a link.

6. Page A has a PageRank due to the inbound links (see point 1), but could have had a higher PageRank if Page B and Page C both linked to Page A AND there were no NOFOLLOW and/or NOINDEX on Page B + Page C.

My two yen.
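
For what it's worth, the two diagrams can be compared numerically with a rough sketch. It assumes only the formula from the original paper, PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn)) with d = 0.85, and says nothing about what Google actually runs today. The function name `pagerank` and the external page `X` (standing in for the outside link from point 1) are made up for illustration.

```python
def pagerank(links, d=0.85, iterations=50):
    """Repeatedly apply PR(A) = (1-d) + d * sum(PR(T)/C(T)) over pages T linking to A.

    links maps each page to the list of pages it links to (hypothetical input format).
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    pr = {p: 1.0 for p in pages}  # arbitrary starting values
    for _ in range(iterations):
        new_pr = {}
        for page in pages:
            inbound = sum(
                pr[src] / len(targets)
                for src, targets in links.items()
                if page in targets
            )
            new_pr[page] = (1 - d) + d * inbound
        pr = new_pr
    return pr


# Diagram 1: the B -> C link is counted.
top = pagerank({"X": ["A"], "A": ["B"], "B": ["C"]})
# Diagram 2: the B -> C link is ignored (NOFOLLOW obeyed), so C is unknown.
bottom = pagerank({"X": ["A"], "A": ["B"]})

for name in ("A", "B", "C"):
    print(name, top.get(name), bottom.get(name))
```

Under that formula, Page A and Page B come out with the same values in both cases; the only difference is whether Page C gets a value at all.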

Valeriy

12:05 pm on Aug 25, 2003 (gmt 0)

10+ Year Member



This is what the Google algo developers say here [www-db.stanford.edu]:
We assume page A has pages T1...Tn which point to it (i.e., are citations). The parameter d is a damping factor which can be set between 0 and 1. We usually set d to 0.85. Also C(A) is defined as the number of links going out of page A. The PageRank of a page A is given as follows:
PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))

Looking at the formula, to get a higher PR you've got to get links from pages with the highest possible PR and the lowest possible number of outgoing links.

Val
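
A quick worked example of that arithmetic, as a sketch only. It evaluates the quoted formula once for a single page A. The helper name `page_rank` and the PR values and link counts are invented numbers, not measurements of any real site.

```python
def page_rank(inbound, d=0.85):
    """Evaluate PR(A) = (1-d) + d * (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn)) once.

    inbound is a list of (PR(T), C(T)) pairs, one per page T linking to A.
    """
    return (1 - d) + d * sum(pr / c for pr, c in inbound)


# Same linking page PR (4.0), different numbers of outgoing links on that page.
few_outlinks = page_rank([(4.0, 2)])    # the link passes 4.0 / 2
many_outlinks = page_rank([(4.0, 40)])  # the link passes 4.0 / 40

print(few_outlinks, many_outlinks)
```

The same PR on the linking page is worth far less to page A when it is split across 40 outgoing links instead of 2.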