Forum Moderators: Robert Charlton & goodroi


Noindex vs Nofollow vs Robots.txt vs Canonical

         

danimalSK

8:04 pm on Nov 11, 2010 (gmt 0)

10+ Year Member




Let's say I have a UGC-heavy site (over 1 million pages) with a lot of links to sign up, add content, etc., all of which have a "return to" parameter, e.g.:

bluewidgets.com/widget1

has links to

bluewidgets.com/widget1?action=add_comment&return_to=widget1
bluewidgets.com/signup?return_to=widget1

etc.

What is the best way to handle this (assuming there are 5 of these per page, i.e. 5 million links)? There seems to be a plethora of options:

a) Open up all these pages and canonicalize them (e.g. bluewidgets.com/signup?return_to=widget1 > bluewidgets.com/signup). The downside is that there are millions of these links, and letting Googlebot crawl them seems like a pretty big waste of its time.

b) Block them in robots.txt. On the upside, this should improve Googlebot's crawl efficiency; on the downside, these pages still get indexed and accumulate PageRank.

c) Open them up but noindex them. On the upside, this lets PageRank flow on to other pages (rather than into a dead end, as with robots.txt); the downside, again, is that allowing them to be crawled seems like a pretty big waste of Googlebot's time.

d) Nofollow the links. The downside is that these links are "wasted" from a PageRank perspective (at least based on my understanding); the upside is better crawl efficiency.

e) Turn them into buttons...
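For concreteness, the on-page tags for options a), c), and d) would look something like this (a sketch using the example URLs above):

```html
<!-- a) On bluewidgets.com/signup?return_to=widget1, point search engines at the clean URL -->
<link rel="canonical" href="http://bluewidgets.com/signup" />

<!-- c) Keep the page crawlable but out of the index; "follow" lets PageRank keep flowing -->
<meta name="robots" content="noindex, follow" />

<!-- d) Nofollow an individual action link on bluewidgets.com/widget1 -->
<a href="/widget1?action=add_comment&amp;return_to=widget1" rel="nofollow">Add a comment</a>
```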

So far I have opted to block them in robots.txt, but I suspect this is not the optimal choice. I'm leaning towards removing the "return_to" parameter and opening them up, although that might be pretty hard for the dev team to implement.
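For reference, the robots.txt block for option b) can be scoped to just the parameterized URLs, since Googlebot honors * wildcards in Disallow rules (a Google extension to the original robots.txt standard, so other crawlers may ignore it). A sketch, assuming the URL patterns above:

```
User-agent: *
# Any URL carrying the return_to parameter, e.g. /signup?return_to=widget1
Disallow: /*return_to=
```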

Would appreciate any thoughts.

Cheers,

Dan

goodroi

10:40 am on Nov 12, 2010 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Welcome to WebmasterWorld!

There are many thoughts on this topic, and I doubt you will find one clear "right way", because most of the different approaches have both good points and bad points.

There is another option you might want to consider, option f): place the links in a frame, and host the framed page in a robots.txt-excluded directory.
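A minimal sketch of option f), assuming the framed pages live in a made-up /actions/ directory that robots.txt excludes (directory and file names are illustrative):

```html
<!-- In robots.txt:
       User-agent: *
       Disallow: /actions/
-->
<!-- On bluewidgets.com/widget1, the action links sit inside the framed page -->
<iframe src="/actions/widget1-links" width="200" height="100"></iframe>
```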

Here are my thoughts:

a) This is basically doing nothing and letting Google sort it out. Google typically prefers webmasters to do less manipulation, so this is a safer option, and also easier because you don't need to make changes.

b) robots.txt will lose some link pop, but I wouldn't lose too much sleep over it. I assume you have 100 other links on the page, so you're only talking about a 1-2% loss. Also, not all links are equal: Google places much less value on run-of-site sidebar links compared to higher-value embedded content links.

c) My biggest concern about noindex is that I have seen too many webmasters accidentally overuse it and kick large sections of their content out of Google. Be careful with noindex.

d) Nofollow is like saying you don't trust your own internal pages. As with option b, I wouldn't worry too much about the link pop loss.

e) Buttons and text links are both links, and each gets a share of the outgoing link pop. I do like the idea of buttons, because pretty buttons can help usability. Better usability leads to better quality signals to Google, which leads to better rankings, plus a lot more happy visitors.
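One common way to "buttonize" a call to action so it stops being a crawlable link is a POST form, since search engines generally do not follow form submissions the way they follow <a> tags. A sketch reusing the signup example (the markup and names are illustrative):

```html
<!-- Replaces <a href="/signup?return_to=widget1">Sign Up Now</a> -->
<form method="post" action="/signup">
  <input type="hidden" name="return_to" value="widget1" />
  <button type="submit">Sign Up Now</button>
</form>
```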

Hope that helps.

danimalSK

11:03 am on Nov 12, 2010 (gmt 0)

10+ Year Member




Thanks goodroi! Appreciate your help :)

Yes, as a proportion of the total, there are relatively few of these links. What that doesn't take into account, however, is their prominence: quite a few of them are large "Sign Up Now"-style calls to action. Google seems very keen on crawling these links, so I'm worried we may be wasting more than a couple of percent (GWT is reporting millions of URLs as restricted by robots.txt).

I also like the button solution. Maybe I'll try buttonizing the large call-to-action links, and opening up / canonicalizing the rest.

Thanks again!

Dan