1script - 8:47 pm on Apr 26, 2010 (gmt 0)
They may have no sitemap.xml, a sitemap.xml that lists only "important" pages, or a sitemap.xml that lists only higher-level pages and relies on the bots to find deeper pages from there, e.g. an e-com site whose sitemap lists sub-cat pages but not product pages, because the links to all product pages are on those sitemap-listed sub-cat pages.
OK, I see ... In my case I have the exact opposite of such an "incomplete" sitemap: it includes all the content pages, but looking at the sitemap.xml you would not have guessed exactly how you arrive at those pages, because all the intermediate navigation steps (one or two, depending on how old the content is) are missing from the sitemap.xml. The thing that got me worried is that since, as you point out, sitemap.xml is not the only signal used for discovery/ranking, by excluding intermediate navigational URLs from the sitemap I am saying that those navigational URLs are not important, and yet they may well be important from the standpoint of ranking the content pages they link to.
So, the question then becomes: do you make your sitemap.xml smaller by leaving only the essential content pages in it, hoping that a higher percentage of the sitemap URLs will get crawled, or do you cram as many URLs into the sitemap as possible, aiming to have perhaps fewer content URLs crawled but the supporting navigational structure crawled as well?
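For what it's worth, here is a rough sketch (hypothetical URLs, just to illustrate the "cram everything in" option) where the intermediate navigational URLs go into the sitemap too, but get demoted with the optional <priority> tag so the content pages still read as the important ones. Keep in mind <priority> is only a hint and crawlers are free to ignore it:

  <?xml version="1.0" encoding="UTF-8"?>
  <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <!-- intermediate navigational page: included, but marked lower priority -->
    <url>
      <loc>http://www.example.com/category/widgets/</loc>
      <priority>0.3</priority>
    </url>
    <!-- actual content page: the one I really want crawled and ranked -->
    <url>
      <loc>http://www.example.com/category/widgets/blue-widget.html</loc>
      <priority>0.8</priority>
    </url>
  </urlset>

The smaller-sitemap option would simply drop the first <url> block and leave discovery of the navigational pages entirely to normal crawling of on-site links.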