lucy24 - 5:01 am on Jan 15, 2012 (gmt 0)
By "part of" I meant the discussion about material from robots.txt showing up in searches. Google also indexes sitemaps and, if you're careless enough to leave them lying around, raw logs. They would undoubtedly index your .htaccess if only they could get to it.
But if the single most common words are "Disallow", "User" and "Agent", it suggests that they haven't got around to counting the rest of the keywords yet. To check, take a few random exact-text phrases from your own pages, search for them in quotes, and make sure they come up in the results. Then you'll know that your other pages are indexed.
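If you'd rather not pick the phrases by hand, here is a rough sketch of one way to pull a few random runs of words out of a page so you can paste them into a search box. Everything in it (the `TextExtractor` class, the `random_phrases` function, the six-word default) is my own invention for illustration, not anything Google provides, and it assumes you already have the page's HTML as a string.

```python
import random
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def random_phrases(html, count=3, length=6, seed=None):
    """Return up to `count` quoted runs of `length` consecutive words.

    Hypothetical helper: text chunks are joined with spaces, so a phrase that
    straddles an inline tag may not match the rendered page exactly.
    """
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    words = " ".join(parser.chunks).split()
    available = len(words) - length + 1
    if available < 1:
        return []  # page too short to yield a phrase of this length
    rng = random.Random(seed)
    starts = rng.sample(range(available), k=min(count, available))
    return ['"' + " ".join(words[s:s + length]) + '"' for s in starts]
```

Each returned string is already wrapped in double quotes, so it can go straight into the search box as an exact-match query.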
Keywords seem to be processed entirely separately from indexing-in-general. And I've got a hunch they don't generate the list at all until you sign up with GWT (Google Webmaster Tools). In my case I'd just gotten used to a list absurdly packed with words like "it's" and, yes, "word"... and then suddenly a whole slew of names crops up. They belong to a rarely-visited page that happens to be fatter (in HTML) than anything else on the site. So as soon as they threw it into the Keywords mix, everything changed.
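To see why one fat page can upend the list, here is a toy tally. It is only a guess at the general idea (a plain site-wide word count), not how Google actually builds its Keywords report, and the function name `keyword_counts` and the sample pages are made up.

```python
import re
from collections import Counter


def keyword_counts(pages):
    """Tally lowercase words across all pages.

    `pages` is a hypothetical mapping of page name to plain text.
    """
    total = Counter()
    for text in pages.values():
        total.update(re.findall(r"[a-z']+", text.lower()))
    return total


# Small pages only: filler words dominate.
small_site = {"a": "it's a word it's", "b": "word word it's"}

# Add one large page full of names and the top of the list changes.
with_fat_page = dict(small_site, names="smith jones " * 50)
```

Comparing `keyword_counts(small_site).most_common(3)` with `keyword_counts(with_fat_page).most_common(3)` shows the names pushing the filler words off the top, simply because that one page contributes so many more words.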
For a while, one of my most common keywords was "thumbnail". I finally forced myself to sit down and make proper alts for all my, ahem, thumbnails ;)
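For anyone facing the same chore, a small script can at least list which images need attention. This is a sketch under my own assumptions: the names `AltAudit` and `audit_alts` are invented, "thumbnail" is the only placeholder it looks for by default, and it deliberately leaves an empty `alt=""` alone, since that is a legitimate choice for purely decorative images.

```python
from html.parser import HTMLParser


class AltAudit(HTMLParser):
    """Record <img> tags whose alt is missing or is a placeholder word."""

    def __init__(self, placeholders=("thumbnail",)):
        super().__init__()
        self.placeholders = {p.lower() for p in placeholders}
        self.problems = []

    def handle_starttag(self, tag, attrs):
        # Also reached for self-closing <img ... /> via handle_startendtag.
        if tag != "img":
            return
        attr = dict(attrs)
        src = attr.get("src") or ""
        alt = attr.get("alt")
        if alt is None:
            self.problems.append((src, "missing alt"))
        elif alt.strip().lower() in self.placeholders:
            self.problems.append((src, "placeholder alt"))


def audit_alts(html, placeholders=("thumbnail",)):
    """Return a list of (src, problem) pairs for one HTML document."""
    parser = AltAudit(placeholders)
    parser.feed(html)
    parser.close()
    return parser.problems
```

Run it over each page's HTML and you get a to-do list of image sources; writing the actual alt text is still up to you.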