Forum Moderators: goodroi
E.g. mywidgets.com/widget.html is the base URL, and the paginated URLs for it are mywidgets.com/widget.html?page=2, mywidgets.com/widget.html?page=3, etc.
So, to block spiders from these paginated URLs (which all share exactly the same metadata, title, etc.), I am thinking I need to add the following line to my robots.txt:
Disallow: /*?
What do you guys think? Is this the best method?
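To spell out the full file, here is a minimal sketch of what I have in mind (assuming I want to apply the rule to every bot, so I use the `*` user-agent; `/*?` is the Googlebot-style wildcard form that matches any path containing a query string):

```
User-agent: *
Disallow: /*?
```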
Using wildcards in robots.txt is an easy way to deal with the situation, but smaller bots do not support them. My personal preference would be to change the URL structure of your site.
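To illustrate how a wildcard-aware crawler (such as Googlebot) would interpret that rule, here is a rough sketch of the matching logic; the function name is my own, and this is a simplification of real crawler behavior, not any bot's actual implementation:

```python
import re

def wildcard_rule_matches(rule: str, path: str) -> bool:
    """Approximate Googlebot-style matching of a Disallow rule:
    '*' matches any sequence of characters, a trailing '$' anchors
    the end of the path; matching is anchored at the path start."""
    pattern = re.escape(rule).replace(r"\*", ".*")
    if pattern.endswith(r"\$"):
        pattern = pattern[:-2] + "$"
    return re.match(pattern, path) is not None

# The paginated URLs from the question are caught, the base page is not:
print(wildcard_rule_matches("/*?", "/widget.html?page=2"))  # True: blocked
print(wildcard_rule_matches("/*?", "/widget.html"))         # False: still crawlable
```

A bot that does plain prefix matching (no wildcard support) would treat `/*?` as a literal path and block nothing, which is exactly the risk with smaller crawlers.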