Reposted censored version of post: WebmasterWorld does not allow any examples.
Try this on gfe-gv.google.com or on another datacentre that has "standard" results, and then try it again on gfe-eh.google.com which is the one that has the updated supplemental results.
First, look for: ZZZZZZZ AAAAAAA BBBBBBB
Notice that there are multiple results: these are duplicate content, and most are marked as supplemental. The URLs differ by just a parameter or two (pf is "print friendly"; tx is "text only"). Notice also the duplication caused by differing capitalisation (darn IIS).
The site should be using a meta robots noindex tag on all but one URL version of the page, but it does not. That would solve the "duplicate content" part of the problem - and would remove all but one URL version from the index.
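For reference, the tag in question is the standard meta robots element. Placed in the head of the non-canonical versions (the pf and tx variants in this example), it keeps them out of the index while the canonical URL stays listed:

```html
<!-- On the print-friendly and text-only variants only,
     never on the canonical version of the page: -->
<meta name="robots" content="noindex">
```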
The spelling mistake "ZZZZZZZ" (instead of "YYYYYYY") was removed from the live pages many months ago (back around April, I think), yet these URLs still appear for that old search query. So supplemental results are not only for URLs that were removed from the main index for being duplicate content; they can also represent a previous version of a page, allowing a page to rank for words that are no longer on that page.
Notice that the cache is newer and does NOT include the incorrect spelling. The pages rank for a word that is not on the live page and not in the Google cache. The word only appears in the snippet, and only appears in the snippet when the word was in the search query.
Next, search for YYYYYYY AAAAAAA BBBBBBB
Now you see the exact same URLs listed again. Again, several are marked as supplemental results because they are duplicate content. Those URLs differ by just a parameter value, but all lead to the same content.
Notice one important thing: these are the exact same URLs as returned for the earlier search. Notice that where some of the URLs were supplemental results for the "ZZZZZZZ" query, now the exact same URL is NOT supplemental for the "YYYYYYY" query.
Look at the canonical URL (the one without extra parameters). The "supplemental result" in this case is for the older "ZZZZZZZ" content, allowing surfers to still find your site long after that particular word was deleted from the page, while the current version of the page is in the normal index, appearing for a "YYYYYYY" search. So, to all intents and purposes, this page is in the normal index. It does not have a problem (ignore the supplemental; it is Google being helpful to searchers, and it will go away after a year).
What you want is one normal result for the canonical URL of the page (the URL without any added parameters is the canonical one); that URL will always show as supplemental when you search for words from the previous content of that page. The duplicate URLs should not be appearing in the index at all. The webmaster should design the site so that the other URL variants cannot be indexed; a quick application of meta robots noindex tags to the other URL versions would fix the problem.
So, this is why I keep saying to not count supplemental results for canonical URLs. They are an artifact from your site, kept by Google to allow people to find your site for search terms that are based on older versions of your content. They clean these up only after holding on to them for a year or more.
This is also why I keep saying that "URLs go Supplemental, not Sites". The Supplemental tag is handed to URLs on a case-by-case basis, not on a site or domain basis.
So, count only normal results. Look to get all current content fully indexed. Make sure that old URLs that 301 redirect or 404 really do return the right status codes, and ignore the fact that those URLs show up as supplemental results for some queries. Google will clean them away in their own time; they are not important. Once URLs that redirect or error are tagged as supplemental, they are not considered to be duplicate content; they are just archived artifacts that will eventually disappear, and in the meantime they are additional ways that surfers can find your site.
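Checking those status codes is easy to script. Here is a minimal Python sketch (the paths and the throwaway local server are hypothetical stand-ins for your own old URLs); the key point is to request each URL without following redirects, so a 301 reports as a 301 rather than as the 200 of its destination:

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Surface 301/302 responses instead of silently following them."""
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None

def fetch_status(url):
    """Return the HTTP status code a URL actually serves."""
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(urllib.request.Request(url, method="HEAD")) as resp:
            return resp.status
    except urllib.error.HTTPError as e:
        return e.code

# A throwaway local server standing in for a real site:
# /old-page was moved (301), anything else is gone (404).
class FakeSite(BaseHTTPRequestHandler):
    def do_HEAD(self):
        if self.path == "/old-page":
            self.send_response(301)
            self.send_header("Location", "/new-page")
        else:
            self.send_response(404)
        self.end_headers()
    def log_message(self, *args):  # keep the demo quiet
        pass

server = HTTPServer(("127.0.0.1", 0), FakeSite)
threading.Thread(target=server.serve_forever, daemon=True).start()
base = "http://127.0.0.1:%d" % server.server_port

moved_status = fetch_status(base + "/old-page")     # should be 301
gone_status = fetch_status(base + "/deleted-page")  # should be 404
print(moved_status, gone_status)
server.shutdown()
```

Point `fetch_status` at your own old URLs and compare what comes back against what you intended to serve.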
Sidenote: I saw this in action for a .com site at #2, one that for the last year had been redirecting both its .co.uk and .net domains to the .com. Both of those had been partially indexed and dumped to supplemental just over a year ago, yet their URLs still showed in search results at #8 and #11, so the site really had three entries in the top 11. The #8 and #11 entries had been nothing but redirects for the whole of the last year; Google cleaned them up in the very recent supplemental update last week, after almost exactly one year of listing them.
If you do have "real" "duplicate content", the same content reachable by multiple URLs, then you must work to get all the duplicates out of the index using meta robots noindex or robots.txt directives.
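As a sketch of the robots.txt route, assuming the duplicates are generated by query-string parameters like the pf and tx examples above (your actual parameter names will differ), wildcard Disallow patterns can keep crawlers off the variants - Googlebot understands the * wildcard, though not every crawler does:

```
# robots.txt -- hypothetical parameter names; adjust to your own site
User-agent: *
Disallow: /*?pf=
Disallow: /*?tx=
Disallow: /*&pf=
Disallow: /*&tx=
```

Bear in mind that robots.txt stops crawling but does not remove URLs that are already indexed; for those, the meta robots noindex route (which requires the variant pages to remain crawlable) is often the surer fix.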
I have just done this with a 40 000-thread forum that was exposing more than 750 000 URLs to Google, and I can confirm all of the effects that I have mentioned in this post. Now the forum (it was a vBulletin forum) has just the 40 000 threads indexed, each with one canonical URL (the 200 000 alternative thread URLs have been deindexed), and just a few thousand pages of thread indexes showing too. All of the other URLs (some 450 000 pages that just say "Error. You are not logged in") are gone from the index. Those were simply URLs that a registered user would use to reply to a post, start a thread, send a PM, and many other actions that only a registered user should be performing. Search engines should not even be accessing those pages, but most forum software makes no attempt to stop them. So the forum has gone from 750 000 indexed URLs - with 680 000 of them being junk, and mostly marked as supplemental - to just 45 000 indexed URLs, all of which are proper content (threads) or index listings (thread lists) of that content.
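For a vBulletin forum like the one described, the member-only action URLs live in well-known scripts, so a robots.txt along these lines keeps crawlers out of them (script names as in a stock vBulletin 3 install - verify against your own version):

```
User-agent: *
Disallow: /newreply.php
Disallow: /newthread.php
Disallow: /private.php
Disallow: /sendmessage.php
Disallow: /login.php
Disallow: /register.php
```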
So, there are several types of supplemental results:
1. Results representing old-content versions of a URL, where the same URL appears in the normal index for other (newer) search terms. These get cleaned up by Google after a year. Ignore them.
2. Results that are "duplicate content". Get the site architecture sorted so that each page of content exposes only one indexable URL. These alternative URLs will be dropped very quickly if they are not supplemental. However, you must wait a year for the alternative URLs to be dropped if they are already reported as supplemental results. In some cases the "normal" URL will be dropped, only to reappear as a supplemental result a few weeks later. Don't worry about those, they are not harming anything. They will be dropped eventually. It takes about a year.
3. Supplemental results for pseudo-duplicate content. These are cases where the page content is different, but not different enough for Google, or you have repeated the title and/or meta description across multiple pages. This is a "special case" of duplicate content. The fix is in your own hands; get more unique content on the page, and make sure that every page has a unique title and a unique meta description. Again the pages will show as supplemental results for some search terms (the older content, after editing) and as normal results for other search terms (the newer content, after editing).
I also see a number of sites that have a large number of Supplemental Results caused by PageRank issues. These are nearly always caused by one of several things.
1. The internal pages link back to /index.html, sending the PR there, but Google chose to list www.domain.com as the canonical URL. Make sure that every page of the site links back to http://www.domain.com/ in exactly that format.
2. Poor site architecture. Google recommends that you join Google Sitemaps. I recommend that you also implement breadcrumb navigation and run Xenu LinkSleuth over your site. Make sure that all internal links are plain HTML links and that the site is easy to navigate. Think of your users as well as the search engines.
3. One other type of duplicate content that I didn't properly mention was of non-www and www URLs for the same "page". Fix that by using a site-wide 301 redirect, one that preserves the originally requested folder and filename in the redirected URL. The redirected URL will show as a supplemental URL for a year; ignore it, as it is NOT causing a problem.
4. Related to that is the case where you own multiple domains. Again, get a site-wide 301 redirect in place to send everything to one domain. Don't serve the same content at multiple domains.
5. No-one linking to the site. If no-one else thinks the site is worth linking to, Google might relegate most of its pages to supplemental.
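The site-wide redirect from points 3 and 4 is a few lines of Apache configuration. A sketch for .htaccess with mod_rewrite (substitute your own domain; the captured path is preserved in the redirect target, and the query string is carried over automatically):

```
RewriteEngine On
RewriteCond %{HTTP_HOST} !^www\.domain\.com$ [NC]
RewriteRule ^(.*)$ http://www.domain.com/$1 [R=301,L]
```

The condition matches any hostname that is not www.domain.com, so the same three lines catch the non-www version and any parked extra domains pointed at the same server.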