The key is no obstacles -- good solid text-based navigation, a detailed site map and/or product/service category pages with links into the various reports.
Previously they had 90% of the site blocked to robots -- within 2 months we went from about 20k pages to 1.5 million. The rest of the gains have been incremental as we have reduced other barriers.
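If you want to see how much of your own site is blocked, here is a rough sketch using Python's standard urllib.robotparser. The robots.txt rules and the sample paths are made up for illustration; they are not from the site discussed here.

```python
from urllib.robotparser import RobotFileParser


def blocked_paths(robots_txt, paths, agent="*"):
    """Return the paths that the given robots.txt text disallows for `agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [p for p in paths if not parser.can_fetch(agent, p)]


# Hypothetical rules and paths, for illustration only.
ROBOTS = """User-agent: *
Disallow: /reports/
Disallow: /products/
"""
SAMPLE = ["/", "/reports/q1.html", "/products/widget.html", "/about.html"]

blocked = blocked_paths(ROBOTS, SAMPLE)
share_blocked = len(blocked) / len(SAMPLE)
```

Run it against a full list of your URLs and the blocked share gives you a quick read on how much the spiders cannot reach.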
This specific case has 3 site maps with about 50 entries on each of them -- products, services and solutions. The entries are logical and link to the start page for that product or service category.
For each category page there is another sub map in the left navigation that takes the users/spiders into the deeper information and specific product pages.
They just keep cascading that way. We have watched the spiders move through the site and they seem to like this "terraced" approach to aggregating information.
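To make the "terraced" idea concrete, here is a small sketch that models the cascade as a link graph and counts how many clicks each page sits from the home page. The page names are hypothetical and the structure is heavily simplified; it only shows the shape, not the real site.

```python
from collections import deque

# Hypothetical link graph: each page lists the pages it links to.
SITE = {
    "home": ["products-map", "services-map", "solutions-map"],
    "products-map": ["widgets", "gadgets"],
    "widgets": ["widget-a", "widget-b"],
    "gadgets": ["gadget-a"],
    "services-map": ["consulting"],
    "consulting": ["consulting-audit"],
}


def click_depths(links, start):
    """Breadth-first walk returning the fewest clicks from `start` to each page."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


depths = click_depths(SITE, "home")
deepest = max(depths.values())
```

With a site map, category pages and left-navigation sub maps, every product page stays a small, predictable number of plain links from the top.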
No magic in this case -- we just took a lot of time to think like a user and built a site that allows users to find content. The original approach was to use pull-downs etc. that showed them too many unrelated products/services at the same time.
I do not recommend a single site map or even a few for cataloging your 40k products. I would start with looking for logical buckets of information and develop an "org chart" type of navigation structure to get as many of them covered as possible.
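As a starting point for the "org chart" exercise, something like the sketch below can group a product list into buckets and flag any page that would end up with too many entries. The category/subcategory fields and the cap of 50 are assumptions for illustration (50 is borrowed from the size of the site maps described above, not a rule).

```python
from collections import defaultdict


def build_nav(products):
    """Group (category, subcategory, name) tuples into a two-level tree."""
    nav = defaultdict(lambda: defaultdict(list))
    for category, subcategory, name in products:
        nav[category][subcategory].append(name)
    return nav


def oversized_pages(nav, max_entries=50):
    """List the navigation pages that would carry more than max_entries links."""
    too_big = []
    if len(nav) > max_entries:
        too_big.append("(top level)")
    for category, subcategories in nav.items():
        if len(subcategories) > max_entries:
            too_big.append(category)
        for subcategory, names in subcategories.items():
            if len(names) > max_entries:
                too_big.append(f"{category}/{subcategory}")
    return too_big


# Hypothetical catalog: one crowded bucket and one small one.
catalog = [("hardware", "fasteners", f"bolt-{i}") for i in range(60)]
catalog += [("hardware", "hinges", f"hinge-{i}") for i in range(5)]

nav = build_nav(catalog)
crowded = oversized_pages(nav)
```

Any bucket it flags is a candidate for splitting into another level, which is how the cascade grows.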