Hi. Interesting analysis, but I think you’ve over-interpreted what we mean by “excluded”. The “excluded” label is assigned from the HTML source alone, and it’s the first thing done after crawling; nofollow, link quality, and 302 redirects are assessed elsewhere and don’t enter into the label “excluded”.
We’re being more transparent than the usual search engine, but the information we reveal is what we happen to generate internally, and not all of that is easy to explain. This HTML analysis system is one of the oldest parts of our crawler. It is intended to get rid of the “clamp” around actual web content, without having to know anything about the links themselves. When you look at the “sections” for a page, you’re seeing the raw output from this tool.
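
To make the idea concrete, here is a toy sketch in Python. It is not the crawler's actual code: the tag list `EXCLUDED_TAGS`, the class `SectionLabeler`, and the helper `label_sections` are all made up for illustration. It only shows how a label like “excluded” could in principle be assigned from the HTML structure alone, without ever looking at link attributes such as nofollow or at where a link points.

```python
from html.parser import HTMLParser

# Assumption for this sketch only: tags treated as page furniture.
# The real system's rules are not described in this post.
EXCLUDED_TAGS = {"nav", "header", "footer", "aside"}


class SectionLabeler(HTMLParser):
    """Labels each run of text as 'excluded' or 'content' from markup alone."""

    def __init__(self):
        super().__init__()
        self.excluded_depth = 0   # how many excluded containers we are inside
        self.sections = []        # list of (label, text) pairs

    def handle_starttag(self, tag, attrs):
        # Link attributes (href, rel) are deliberately ignored here.
        if tag in EXCLUDED_TAGS:
            self.excluded_depth += 1

    def handle_endtag(self, tag):
        if tag in EXCLUDED_TAGS and self.excluded_depth > 0:
            self.excluded_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        label = "excluded" if self.excluded_depth else "content"
        self.sections.append((label, text))


def label_sections(html):
    """Return the raw (label, text) sections for one page of HTML."""
    parser = SectionLabeler()
    parser.feed(html)
    parser.close()
    return parser.sections


page = (
    '<nav><a href="/x" rel="nofollow">Home</a></nav>'
    '<p>Real <a href="/y">story</a> text</p>'
    '<footer>Copyright</footer>'
)
for label, text in label_sections(page):
    print(label, "|", text)
```

Note that the nofollow link inside the `<nav>` is labeled “excluded” only because of the container it sits in, not because of its `rel` attribute; the followed link inside the paragraph is labeled “content” for the same structural reason.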