pageoneresults - 4:48 pm on Nov 7, 2010 (gmt 0)
How would you deal with paginated urls on your site?
These days I do almost everything via the meta robots element. In the case of pagination, we noindex the paginated pages and allow the bot to follow links, unless of course a paginated page is the end of the click path, in which case it would be available for public indexing.
I prefer to keep all non-essential pages out of the index. Those pages that are paginated are usually just gateways to the final click path. The listings on the paginated pages take you to the final records, and those records are what you want indexed and showing up for search queries, not the paginated pages or the click paths in between start and finish. ;)
Keep in mind that this method does not apply to all taxonomies. There are some sites where the paginated content is of importance from an indexing perspective and those would be allowed to get indexed. I don't see many like that, but they exist.
I like to conserve site equity and only allow the most important pages to get indexed. Everything else is noindex, e.g.:
<meta name="robots" content="noindex">
This allows the bots to crawl those pages and follow links to the final destination documents while keeping those in-between docs out of the index. They are intermediary in the click path and they suck equity from the site.
On a side note, I noarchive everything these days - everything.
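As a rough sketch of how the two directives might sit together (the comments describe hypothetical page types, not any particular site), a paginated listing page and a final record page could carry something like:

```html
<!-- Paginated listing page (e.g. page 2 of a category): keep it out of
     the index and out of the cache. Links are still followed, because
     "follow" is the default when "nofollow" is not specified. -->
<meta name="robots" content="noindex, noarchive">

<!-- Final record page at the end of the click path: indexable,
     but still no cached copy. -->
<meta name="robots" content="noarchive">
```

Multiple directives go in a single content attribute, separated by commas. Each page would carry only one of these elements in its head, depending on where it sits in the click path.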