I read that site:domain.com will give you an estimate of how many pages are in the index, and that site:domain.com *** -kdkadfd will return only supplemental results. So if you divide the supplemental count by the total count, you get a rough idea of what percentage of your site is supplemental.
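The arithmetic is simple enough to sketch. This assumes you've already read the two estimated counts off the results pages by hand; the function name and the sample numbers are made up for illustration.

```python
def supplemental_percentage(supplemental_count, total_count):
    """Return the share of indexed pages reported as supplemental."""
    if total_count == 0:
        return 0.0
    return 100.0 * supplemental_count / total_count

# e.g. 25,000 supplemental results out of 100,000 reported indexed pages
print(supplemental_percentage(25_000, 100_000))  # 25.0
```

Bear in mind both inputs are estimates, so the output is an estimate of an estimate.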
The problem with large sites is that sometimes the site:domain.com estimate is way off.
Is there a good way to know how much of a site is supplemental for a large site (over 100K pages)?
Given the way Google's reporting works, especially when larger numbers of URLs are involved, I don't know of any truly accurate way to approach this. If your URLs follow a relatively logical directory structure, you can break the analysis into sections -- site:example.com/directory/ -- and get a bit more granular.
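Building those per-section queries is mechanical, so here's a minimal sketch. The directory names are invented for illustration; you'd substitute your site's actual top-level sections.

```python
def section_queries(domain, directories):
    """Build one site: query per top-level directory."""
    return ["site:{}{}".format(domain, d) for d in directories]

# Hypothetical sections -- replace with your own directory structure
for query in section_queries("example.com", ["/products/", "/blog/", "/archive/"]):
    print(query)
```

You'd then run each query (with and without the supplemental filter) and compare counts section by section.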
When I'm working with a large site, I find that knowing the exact number of supplemental URLs wouldn't usually give me any good action points on its own, at any rate. What I need to do is spot patterns in the reported supplemental results, and address the issues those patterns suggest.
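One rough way to surface those patterns, assuming you've collected a sample of supplemental URLs by hand: tally them by top-level directory and by query-string presence, since session IDs and sort parameters are common culprits. The sample URLs below are invented for illustration.

```python
from collections import Counter
from urllib.parse import urlparse

def pattern_counts(urls):
    """Count URLs by top-level directory and by query-string presence."""
    sections = Counter()
    with_query = 0
    for url in urls:
        parsed = urlparse(url)
        top = "/" + parsed.path.strip("/").split("/")[0]
        sections[top] += 1
        if parsed.query:  # sort params, session IDs, etc.
            with_query += 1
    return sections, with_query

# Hypothetical sample of supplemental URLs
sample = [
    "http://example.com/archive/2005/page?sort=asc",
    "http://example.com/archive/2005/page?sort=desc",
    "http://example.com/products/widget?sessionid=abc",
]
sections, with_query = pattern_counts(sample)
print(sections.most_common())  # which sections dominate
print(with_query)              # how many carry query strings
```

If one section or one URL parameter dominates the tallies, that's the pattern to chase down rather than the raw count.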