derrickwheeler - 7:45 pm on Dec 8, 2010 (gmt 0)
Just FYI... at one point we counted as many as 1.4 billion URLs on *.microsoft.com/*. Most of these were "junk" navigational URLs like the ones I mentioned in the video. Many of these "junk" URLs are pages from various "solution finders" or "compatibility checkers" where you can select attributes via links to get search-result-style pages.
Some search engines are better than others at avoiding them during a crawl and/or filtering them out at indexing.
We do our best to exclude them with robots.txt, but some of them are structured in a way that makes it difficult to exclude the bad stuff without also excluding the good stuff.
I have to convince the "site owner" (if I can find them) to let me block their stuff. This can be difficult, so sometimes I just block things and then wait for someone to send me a nasty email :)
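
To illustrate the robots.txt problem described above, here is a minimal sketch using Python's standard urllib.robotparser. The host, paths, rule, and crawler name are all hypothetical and are not taken from any real microsoft.com robots.txt; the point is only that a prefix-based Disallow rule cannot tell a "junk" attribute combination from a useful page that lives under the same path.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule: block the results path of a "solution finder".
ROBOTS_TXT = """\
User-agent: *
Disallow: /finder/results
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_blocked(path: str) -> bool:
    """Return True if the hypothetical crawler may not fetch this path."""
    return not parser.can_fetch("ExampleBot", "http://www.example.com" + path)


# Hypothetical URLs: one junk navigational page, one page worth indexing,
# and the finder's landing page, which sits outside the blocked prefix.
JUNK_URL = "/finder/results?os=win7&type=printer&sort=name"
GOOD_URL = "/finder/results?product=12345"
LANDING_URL = "/finder/"

for url in (JUNK_URL, GOOD_URL, LANDING_URL):
    status = "blocked" if is_blocked(url) else "allowed"
    print(f"{status:8} {url}")
```

Under these assumptions the rule blocks the junk URL and the good URL alike, because both start with the same path prefix. Separating them would need either a different URL structure or finer-grained rules, which is why the site owner usually has to be involved.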