JS_Harris - 8:24 pm on Mar 15, 2010 (gmt 0)
I love reading juicy tidbits straight from the source. Great article.
The best way to think about it is that the number of pages that we crawl is roughly proportional to your PageRank.
the low PageRank pages on your site are competing against a much larger pool of pages with the same or higher PageRank.
Imagine we crawl three pages from a site, and then we discover that the two other pages were duplicates of the third page. We'll drop two out of the three pages and keep only one, and that's why it looks like it has less good content. So we might tend to not crawl quite as much from that site.
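To make that three-page example concrete, here is a rough Python sketch of dropping duplicates after a crawl. It is only an illustration: the exact-match hashing, the `dedupe_pages` name, and the sample URLs are my own assumptions, and nothing in the interview says how Google actually detects duplicates.

```python
import hashlib


def dedupe_pages(pages):
    """Keep the first URL seen for each distinct body.

    `pages` is a list of (url, body) pairs. Returns (kept, dropped).
    Exact hashing is a stand-in here; a real crawler would likely use
    near-duplicate detection rather than a byte-for-byte match.
    """
    seen = set()
    kept, dropped = [], []
    for url, body in pages:
        # Collapse whitespace so trivially different copies hash the same.
        normalized = " ".join(body.split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if digest in seen:
            dropped.append(url)
        else:
            seen.add(digest)
            kept.append(url)
    return kept, dropped


# Hypothetical crawl of three URLs that all serve the same content.
pages = [
    ("/widgets", "Blue widgets for sale"),
    ("/widgets?sort=asc", "Blue widgets for sale"),
    ("/widgets/print", "Blue  widgets for sale"),
]
kept, dropped = dedupe_pages(pages)
print(len(kept), "kept,", len(dropped), "dropped")
```

Three URLs were fetched but only one survives, which matches the "drop two out of the three" situation described in the quote.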
Eric Enge: Can you talk a little bit about Session IDs?
Matt Cutts: Don't use them.
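For anyone stuck with session IDs in their URLs, one common workaround is to strip them before the links are written out, so every visitor and crawler sees the same URL for the same page. A minimal sketch, assuming the session ID lives in a query parameter; the `SESSION_PARAMS` list and the `strip_session_id` name are my own choices, not anything from the interview.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed parameter names; adjust to whatever your platform actually uses.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid", "jsessionid"}


def strip_session_id(url):
    """Return `url` with any session-ID query parameters removed."""
    parts = urlsplit(url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(query), parts.fragment)
    )


print(strip_session_id("http://example.com/page?id=7&PHPSESSID=abc123"))
```

This only handles session IDs passed as query parameters. IDs embedded in the path (for example `;jsessionid=...`) would need separate handling.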
(on paid affiliate links:) "... we usually would not count those as an endorsement"