http://forum.com/index.php?action=someaction
http://forum.com/index.php?board=some_board_id
http://forum.com/index.php?topic=some_topic_id
These work fine and Googlebot has no problem crawling them. It is also easy to prevent crawling of the action URLs with robots.txt.
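For reference, here is a minimal sketch of what blocking the action URLs could look like, checked with Python's standard `urllib.robotparser`. The robots.txt content, the `is_crawlable` helper, and the numeric topic id are assumptions for illustration, not taken from a real SMF install. This parser only does plain prefix matching, which is enough for this rule.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content: block every index.php?action=... URL by prefix.
ROBOTS_TXT = """\
User-agent: *
Disallow: /index.php?action=
"""


def is_crawlable(url: str, agent: str = "Googlebot") -> bool:
    """Return True if the assumed robots.txt above allows `agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Hypothetical URLs in the same shape as the ones in this post.
    for url in (
        "http://forum.com/index.php?action=someaction",
        "http://forum.com/index.php?topic=123",
    ):
        print(url, "->", "allowed" if is_crawlable(url) else "blocked")
```

With this rule the action URL is blocked while board and topic URLs stay open to the crawler.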
But the topic links can also carry additional parameters, such as:
http://forum.com/index.php?topic=some_topic_id;prev_next=next
http://forum.com/index.php?topic=some_topic_id.msg454
So each topic page can be reached through a number of different URLs. SMF tries to correct the problem by adding
<meta name="robots" content="noindex" />
to the variant URLs. This seems to work: Google only indexes the main topic pages, so duplicate content should be minimized.
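To make the duplicate-URL situation concrete, here is a rough sketch of how the variants collapse onto one main topic URL. It is not SMF's actual logic: `canonical_topic_url` and `needs_noindex` are hypothetical helpers, and the code assumes that only `;`-separated extras and a trailing `.msgNNN` distinguish a variant from the main topic URL.

```python
import re
from typing import Optional
from urllib.parse import urlsplit


def canonical_topic_url(url: str) -> Optional[str]:
    """Return the main topic URL for a topic link, or None for non-topic URLs."""
    parts = urlsplit(url)
    # SMF-style queries separate parameters with ';', e.g. topic=42;prev_next=next
    first_param = parts.query.split(";")[0]
    key, _, value = first_param.partition("=")
    if key != "topic" or not value:
        return None
    # Assumption: a trailing ".msg<digits>" only points at a message in the topic.
    topic_id = re.sub(r"\.msg\d+$", "", value)
    return f"{parts.scheme}://{parts.netloc}{parts.path}?topic={topic_id}"


def needs_noindex(url: str) -> bool:
    """True when `url` is a variant of a topic page rather than the main URL."""
    canonical = canonical_topic_url(url)
    return canonical is not None and canonical != url


if __name__ == "__main__":
    for url in (
        "http://forum.com/index.php?topic=42",
        "http://forum.com/index.php?topic=42;prev_next=next",
        "http://forum.com/index.php?topic=42.msg454",
    ):
        print(url, "noindex" if needs_noindex(url) else "index")
```

Both variant URLs map to the same main topic URL, which is the only one that would be left indexable under this sketch.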
But (and this is my real question ;) ), I still see Googlebot crawling all the different URLs. I know I can stop this either with robots.txt or by including nofollow in the robots meta tag, but I am hesitant to interfere too much with Googlebot's natural crawling.
Would it be OK if most of the links that Googlebot encounters on a page are closed to it, and only the main topic links are open?
Thanks for any feedback