To use rel=nofollow or robots.txt for View cart links

nikhilrajr

12:09 pm on Jun 25, 2015 (gmt 0)

10+ Year Member Top Contributors Of The Month



We currently use robots.txt to block our add-to-cart and other user-only links (6 links in total), which are placed in our header menu. As far as I understand, this means the 6 links accumulate a lot of PageRank, never get crawled, and do not pass PageRank. BTW, we have around 80 million webpages!
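For anyone who wants to check which URLs a block like this actually covers, here is a minimal sketch using Python's standard `urllib.robotparser`. The `Disallow` paths and the `www.example.com` host are made up for illustration; the real rules are not given in this thread.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules; the actual paths are assumptions.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart/
Disallow: /account/
"""

def is_crawlable(url, user_agent="Googlebot", robots_txt=ROBOTS_TXT):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

print(is_crawlable("https://www.example.com/cart/view"))
print(is_crawlable("https://www.example.com/products/123"))
```

This only answers "may the crawler request this URL"; it says nothing about indexing or PageRank.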

If we unblock the links, Googlebot will get 302 redirected to a login page on an HTTPS subdomain, which again is not going to pass PageRank.

We cannot use a meta noindex, nofollow, as a request for any of the 6 links gets 302 redirected to the login page.

I think it's best for us to use link rel=nofollow and remove the robots.txt block. That way Googlebot will not index the 6 pages, and the pages will not accumulate PageRank. We are not worried about external links pointing to these pages.

We are more worried about Googlebot crawl activity. My belief is that link rel=nofollow is an indexer directive and doesn't prevent Googlebot from crawling the 6 links. If I am right, we might see a ton of 302s coming from the header, which is present on all 80 million pages.

So to use rel=nofollow or to continue with the robots.txt block?

lucy24

4:39 pm on Jun 25, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If you remove the robots.txt block, search engines will request the URL, but if the request leads to a redirect, they will only request the new URL if this, too, is permitted by robots.txt.
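The hop-by-hop check described above can be sketched roughly like this. It is a simulation with no network access, not how any particular search engine is implemented; the hostnames, the redirect map, and the `crawl_path` helper are all hypothetical.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

def build_parser(robots_txt):
    """Parse robots.txt text into a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser

def crawl_path(start_url, redirects, robots_by_host,
               user_agent="Googlebot", max_hops=5):
    """Follow a simulated redirect chain, checking robots.txt at every hop.

    redirects maps a URL to its redirect target (hypothetical data).
    robots_by_host maps a hostname to its parsed robots.txt; a host with
    no entry is treated as unrestricted.
    Returns a list of (url, outcome) tuples.
    """
    visited = []
    url = start_url
    for _ in range(max_hops):
        parser = robots_by_host.get(urlsplit(url).netloc)
        if parser is not None and not parser.can_fetch(user_agent, url):
            visited.append((url, "blocked by robots.txt"))
            return visited
        target = redirects.get(url)
        if target is None:
            visited.append((url, "fetched"))
            return visited
        visited.append((url, "redirected"))
        url = target
    return visited
```

With the main host unblocked and the login subdomain disallowed, the first URL is requested and redirected, and the chain stops at the login URL without it being fetched.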

nofollow doesn't mean "Pretend you haven't seen this link". It only means "Don't tell them I sent you"; its most common use is for links inserted by other people that you can't personally vouch for.
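To make the point concrete: rel=nofollow is just an attribute on the link, so anything parsing the page still sees the URL. A minimal sketch with Python's standard `html.parser` follows; the sample markup and the `collect_links` helper are invented for illustration and say nothing about how Googlebot itself parses pages.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect (href, is_nofollow) for every <a href=...> in a document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if href is None:
            return
        # rel is a space-separated token list, e.g. rel="nofollow noopener"
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        self.links.append((href, "nofollow" in rel_tokens))

def collect_links(html):
    """Return every link found in html, flagged with its nofollow status."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.links

sample = '<a href="/cart" rel="nofollow">View cart</a> <a href="/products">Products</a>'
print(collect_links(sample))
```

The nofollow link is discovered like any other; what a crawler then does with that flag is up to the crawler.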

You are right that it's no use attaching "noindex" to pages the googlebot will never see. (And, to forestall the alternative approach: It's infuriating to users when you allow search engines to crawl and index pages that require a human login.) Meta nofollow is irrelevant, since it will only be seen if the visitor is already on the page. But you'd want "follow" (i.e. nothing, default) here wouldn't you?

Unless they've changed things recently, the search-engine difference between 301 and 302 is that 301=new URL gets indexed, while 302=old URL gets indexed. (In times past, 302 also meant that the search engine would keep requesting the page forever, in hopes that one day the Temporary Redirect would be gone. This is probably no longer the case.) Only you can decide whether those six pages have so many juicy links pointing to them that you risk having your login page show up all over the place in search-engine indexes. Sure you can force it to show up using search terms that only you know about; the question is whether it will show up a lot in human searches.

Why is it a 302 redirect instead of an immediate 401?
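For comparison, here is a rough sketch of the two behaviours side by side, as a plain function rather than real server code. The protected path prefixes, the login URL, and the `WWW-Authenticate` challenge value are placeholders, not details from this site.

```python
# Hypothetical path prefixes for the user-only pages.
PROTECTED_PREFIXES = ("/cart", "/account")

# Hypothetical login URL on the HTTPS subdomain.
LOGIN_URL = "https://secure.example.com/login"

def respond(path, logged_in, use_401=True):
    """Return (status, headers) for a request to path.

    Anonymous requests to protected paths get either an immediate 401
    or a 302 to the login page, depending on use_401.
    """
    if logged_in or not path.startswith(PROTECTED_PREFIXES):
        return 200, {}
    if use_401:
        # A 401 response carries a WWW-Authenticate challenge;
        # the scheme and realm here are placeholders.
        return 401, {"WWW-Authenticate": 'Bearer realm="members"'}
    return 302, {"Location": LOGIN_URL}
```

With the 401 variant the crawler gets a final answer at the original URL, with no second URL to request.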

nikhilrajr

5:28 am on Jun 26, 2015 (gmt 0)

10+ Year Member Top Contributors Of The Month



Thanks. As you said, the 302 goes to a robots.txt-blocked subdomain, so it's OK to remove the disallow rules for the login pages from the main domain. I would have gone with meta noindex, follow, but that's not possible because of the 302.

"Crawl prioritization: Search engine robots can't sign in or register as a member on your forum, so there's no reason to invite Googlebot to follow "register here" or "sign in" links. Using nofollow on these links enables Googlebot to crawl other pages you'd prefer to see in Google's index." Taken from Google's help page for rel=nofollow.

401 is a good option, but it will take time to implement.

These pages have around 5 backlinks in total according to ahrefs data. I am sure they will not show up in user searches.

Does link rel=nofollow prevent crawling?

lucy24

6:04 pm on Jun 26, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



"Does link rel=nofollow prevent crawling?"

No, so I'm mystified about Google's text. I have personally seen the googlebot requesting URLs that they only know about from nofollow links. It dates back to when I mistakenly believed that "nofollow" did mean "pretend you haven't seen this".

:: insert "noidea" emoticon here ::