In particular, I am asking in relation to potential duplicate content and links. Even though every page will be indexed, the nofollow will prevent link votes from passing to pages that might be duplicates. Am I right in my thinking?
For instance, Knol uses it, but to what end? To deter spammers, fine, but won't Knol run into duplicate content issues as well? If two or three authors write about the same topic, won't a duplicate issue come up a few years down the road?
Please advise.
The specific combination you are referring to - index, nofollow - is used when the author wants the page indexed but does not want any of its outbound links to count. This means that outbound links from that page pass no ranking weight. If the article links to a page on the author's website where the same content is duplicated, the tag may help the article directory's copy stand a good chance of ranking for that article. However, many other factors go into that assessment, including the article directory's own ranking, its inbound links, direct inbound links to the specific article, and more.
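To make the tag concrete, here is a minimal Python sketch that reads the robots meta tag from a page and reports what it asks crawlers to do. The sample page, the URL in it, and the names RobotsMetaParser and robots_directives are made up for illustration; this only shows how the directives can be read, not how any search engine actually processes them.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute names arrive lowercased; values may be None.
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        # The content attribute is a comma-separated list of directives.
        for token in attr_map.get("content", "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def robots_directives(html):
    """Return the set of robots directives declared in the given HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical article page using the combination discussed above.
SAMPLE_PAGE = """<html><head>
<meta name="robots" content="index, nofollow">
</head><body>
<a href="https://example.com/author-site">Author site</a>
</body></html>"""

directives = robots_directives(SAMPLE_PAGE)
may_index = "noindex" not in directives          # page may be indexed
may_follow_links = "nofollow" not in directives  # links on the page are not to be followed

print(sorted(directives), may_index, may_follow_links)
```

Note that the meta tag applies to every link on the page at once; it says nothing about whether the page's text duplicates another page.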
If the article directory is user-driven, my sense is that, for the webmaster or owner of the website you are referring to, adding nofollow to the robots meta tag prevents the article from passing any weight to the authors' websites, reducing the number of outbound links that count and preserving more PageRank.
Knol uses this combination simply because of spam: if outbound links carry no ranking value, there is far less incentive to submit spammy articles. This is seen quite often on blogs, for the same reason.
Knol may or may not run into duplicate content issues, but that would depend on Google's assessment of the submitted content, not on the attributes of the links on that page. Google will compare the text and characteristics of the Knol page against other pages to determine whether any of them crosses the threshold that triggers duplicate content handling.
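As a rough illustration of comparing text against a threshold, here is a small Python sketch using word shingles and Jaccard similarity. This is a swapped-in textbook technique, not Google's actual method, which is not public; the 0.8 threshold and the function names are assumptions chosen only for the example. The point is that the comparison looks at the page text and never at link attributes.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word n-grams (shingles) in the text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < size:
        # Very short text: treat the whole thing as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard_similarity(text_a, text_b, size=3):
    """Share of shingles the two texts have in common, from 0.0 to 1.0."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical cut-off, for illustration only.
DUPLICATE_THRESHOLD = 0.8


def looks_duplicate(text_a, text_b, threshold=DUPLICATE_THRESHOLD):
    """True when the two texts overlap at or above the chosen threshold."""
    return jaccard_similarity(text_a, text_b) >= threshold


article = "Nofollow tells search engines not to pass link weight from this page."
copy_on_author_site = "Nofollow tells search engines not to pass link weight from this page."
other_article = "Two authors can cover one topic in completely different words."

print(looks_duplicate(article, copy_on_author_site))
print(looks_duplicate(article, other_article))
```

Under this sketch, two authors writing about the same topic in their own words would share few shingles, while a verbatim copy would score at the top of the range.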