robots.txt vs meta robots noindex, and passing of PageRank flow

         

chaosas

3:11 pm on Jun 22, 2010 (gmt 0)

10+ Year Member



Hello,
I have a bit of a problem here: I've been searching the web for days and keep finding conflicting information about how passing of PageRank works for pages blocked in robots.txt.

Perhaps you could enlighten me as to which of the following is true:

1) Pages blocked by robots.txt can accrue PageRank, but do NOT pass PageRank to the pages they link to.

2) There's not much difference between a page with a noindex meta tag and a page blocked in robots.txt: both will accrue PageRank AND pass it on to the pages they link to.

I've read various opinions online; some say it's the first case, others the second. At the very least, I know that pages blocked by robots.txt can accrue PageRank, because I found an old interview with Matt Cutts where he said the following:

Now, robots.txt says you are not allowed to crawl a page, and Google therefore does not crawl pages that are forbidden in robots.txt. However, they can accrue PageRank, and they can be returned in our search results.

Whether such a page can pass PageRank is still an open question for me (as far as I understand, Googlebot can't crawl the page, so it won't know what links are on it).

What are your thoughts on this?

aristotle

5:33 pm on Jun 22, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The sure answer is to allow crawling but put a noindex meta tag in the page's <head>. That way the page can accrue PageRank and pass it on, even though it won't be indexed.
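
To illustrate the point about noindex, here is a minimal Python sketch of what a crawler can read from a page it is allowed to fetch. The sample HTML, its URLs, and the RobotsMetaParser class are hypothetical and only for illustration; this is not how Googlebot is implemented. It shows that a crawlable page carrying a noindex meta tag still exposes its links.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the robots meta directive and the links found in a page."""

    def __init__(self):
        super().__init__()
        self.noindex = False
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            directives = [d.strip().lower() for d in content.split(",")]
            if "noindex" in directives:
                self.noindex = True
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])


# Hypothetical page: crawling is allowed, indexing is not.
PAGE = """<html><head>
<meta name="robots" content="noindex">
</head><body>
<a href="/2009/05/old-post.html">Old post</a>
<a href="/2009/06/another-post.html">Another post</a>
</body></html>"""

parser = RobotsMetaParser()
parser.feed(PAGE)
print(parser.noindex)  # True
print(parser.links)    # ['/2009/05/old-post.html', '/2009/06/another-post.html']
```

Because the page is fetched, both the directive and the outgoing links are visible to the crawler.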

chaosas

5:44 pm on Jun 22, 2010 (gmt 0)

10+ Year Member



I suppose so. To be honest, one of the main reasons I'm asking is that I have several blogs on Blogger. As you might know, Blogger uses robots.txt to prevent label and archive pages from being crawled and indexed. So if it's true that pages blocked by robots.txt don't pass PageRank, there's no way for older posts that are no longer on the homepage to get any "google juice", besides external links, of course.

rainborick

6:11 pm on Jun 22, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



A page that's blocked by robots.txt can accrue PageRank, but unless it was indexed before it was blocked, it won't pass PageRank because Google hasn't seen any of the links that the page might contain. And as long as the block in robots.txt remains, Google will not crawl those pages and will never see the links, <meta> tags, or any other content in that page.

In essence, it would be a rare circumstance for blocked pages to be able to pass PageRank, and your situation with Blogger pretty much precludes it.
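
As a rough sketch of the blocking behaviour described above, Python's standard urllib.robotparser can show which URLs a rule-abiding crawler would skip. The Disallow rule and the example.blogspot.com URLs below are assumptions made up for illustration, modelled on the kind of block mentioned in this thread; they are not copied from any real robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Assumed rule for illustration: block everything under /search,
# which is where the hypothetical label pages live.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
"""

rp = RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())


def may_crawl(url, agent="Googlebot"):
    """Return True if the robots.txt rules allow this agent to fetch url."""
    return rp.can_fetch(agent, url)


label_page = "http://example.blogspot.com/search/label/seo"
old_post = "http://example.blogspot.com/2009/05/old-post.html"

print(may_crawl(label_page))  # False: never fetched, so its links stay unseen
print(may_crawl(old_post))    # True: fetched normally
```

The label page can still be linked to from elsewhere, but since it is never fetched, the links it contains are never discovered through it.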

tedster

6:56 pm on Jun 22, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"it won't pass PageRank because Google hasn't seen any of the links that the page might contain"


Exactly - it's technically impossible for a page to pass PR if it hasn't been crawled. Pass it to where?
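
The "pass it to where?" argument can be sketched with a toy PageRank calculation in Python. This is a simplified textbook-style iteration, not Google's actual algorithm, and the four-page site (home, hub, post_new, post_old) is hypothetical. One simplifying assumption is stated loudly: a page with no known outlinks simply passes nothing on, rather than having its rank redistributed.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy PageRank over a dict mapping each page to its known outlinks."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if not outlinks:
                continue  # assumption: no known links, nothing to pass on
            share = damping * rank[page] / len(outlinks)
            for target in outlinks:
                new_rank[target] += share
        rank = new_rank
    return rank


# "hub" stands in for a label/archive page that links to an older post.
crawled_site = {
    "home": ["hub", "post_new"],
    "hub": ["post_old"],
    "post_new": ["home"],
    "post_old": ["home"],
}

# Same site, but "hub" is blocked by robots.txt: its outlinks are unknown.
blocked_site = dict(crawled_site, hub=[])

crawled = pagerank(crawled_site)
blocked = pagerank(blocked_site)

# The blocked hub still accrues rank from the link on "home" ...
print(blocked["hub"] > blocked["post_old"])      # True
# ... but the old post only benefits when the hub has been crawled.
print(crawled["post_old"] > blocked["post_old"])  # True
```

In this toy model the blocked hub keeps collecting rank from the homepage link, while the old post it would have linked to falls back to the baseline value.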