On one of our company's large sites we run a CMS that automatically generates a sitemap.xml file, which we have tied into Google Webmaster Tools.
This is fine for the most part: updated pages and newly added pages are both written to the sitemap.xml file automatically.
The problem is that this is a legacy site with a lot of non-CMS pages on the server (like archived e-newsletters) that I would really like to be crawled and ranked. Because they are not part of the CMS's database (they are just straight HTML), they aren't included automatically in the sitemap.xml file.
Am I hurting my non-CMS pages (there are a lot) by omitting them from sitemap.xml? Does Google pretty much assume that what you have in your sitemap.xml file = your entire site? I know Google can and does crawl pages that are not listed in sitemap.xml, but I am worried about the crawl frequency and subsequent weightings.
Am I better off just ditching the sitemap.xml file altogether so Google doesn't make false assumptions based on my incomplete sitemap.xml file?
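For context, the alternative I have been considering is to generate a second sitemap just for the static HTML pages and submit it alongside the CMS one. Below is a rough sketch of what I mean, not something we run in production. The function name `build_static_sitemap`, the document root path, and the `https://www.example.com` base URL are all placeholders I made up for illustration, and it assumes the static files are served at the same relative paths they have on disk.

```python
import os
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from urllib.parse import quote

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_static_sitemap(root_dir, base_url):
    """Return sitemap XML (as a string) listing every .html/.htm file under root_dir.

    Hypothetical helper: assumes each file is reachable at
    base_url + its path relative to root_dir.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for dirpath, dirnames, filenames in os.walk(root_dir):
        dirnames.sort()  # sort in place so the walk order is deterministic
        for name in sorted(filenames):
            if not name.lower().endswith((".html", ".htm")):
                continue  # skip anything that is not a static HTML page
            full_path = os.path.join(dirpath, name)
            rel_path = os.path.relpath(full_path, root_dir).replace(os.sep, "/")
            url = ET.SubElement(urlset, "url")
            ET.SubElement(url, "loc").text = base_url.rstrip("/") + "/" + quote(rel_path)
            # Use the file's modification time as a best-guess <lastmod>.
            mtime = datetime.fromtimestamp(os.path.getmtime(full_path), tz=timezone.utc)
            ET.SubElement(url, "lastmod").text = mtime.strftime("%Y-%m-%d")
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Placeholder path and domain; adjust to the real document root and site URL.
    xml_body = build_static_sitemap("/var/www/html/archive", "https://www.example.com/archive")
    with open("sitemap-static.xml", "w", encoding="utf-8") as fh:
        fh.write('<?xml version="1.0" encoding="UTF-8"?>\n' + xml_body)
```

The idea would be to leave the CMS-generated sitemap.xml untouched and submit the extra file separately (or reference both from a sitemap index file), but I don't know whether that is actually better than having no sitemap at all.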