> the optimal solution is one where WP is modified in some way
This is really the only bulletproof solution, since only WP knows whether a URL resolves to a 'valid' page or not. The solution is to take the requested URL-path, look it up in the CMS database, and if it does not *exactly* match the stored title of an existing blog entry, then return a 404-Not Found.
Unfortunately, unless you want to code this patch and then re-install it every time your WP is upgraded, about all you can do is reject the requests based on the form of the URL itself. For example, you could look for "//" appearing in the requested URL-path, or "too many directory levels" in the requested URL-path, and reject those requests outright:
RewriteCond %{REQUEST_URI} // [OR]
RewriteCond $1 ^([^/]*/){2,}
RewriteRule ^(([^/]*/)+[^.]+\.html)$ - [F]
Or, if "legit-folder" is a single folder or a small, easily-enumerated set of folders, then:
RewriteCond $1 !^(legit-folder1|legit-folder2|legit-folder3)/$
RewriteRule ^(([^/]*/)+)[^.]+\.html$ - [F]
The real problem is finding an approach selective enough that it *will not* affect other valid requested URLs on your site (the ones not handled by WP), but that *will* catch all or most of the bogus URL requests.
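One rough way to check selectivity before touching the server is to run the same patterns against a list of known-good and known-bogus paths offline. The following is only a sketch in Python, not Apache itself. It assumes the rules sit in a top-level .htaccess (so the leading slash is stripped before RewriteRule sees the path), it ignores any slash-merging the server may do, and the function names and sample paths are made up for illustration.

```python
import re

# Patterns copied from the two rule sets above.
RULE_DEPTH = re.compile(r"^(([^/]*/)+[^.]+\.html)$")
TOO_DEEP = re.compile(r"^([^/]*/){2,}")
RULE_FOLDER = re.compile(r"^(([^/]*/)+)[^.]+\.html$")
LEGIT = re.compile(r"^(legit-folder1|legit-folder2|legit-folder3)/$")


def strip_leading_slash(request_uri):
    # Assumption: per-directory (.htaccess) context at the site root.
    return request_uri[1:] if request_uri.startswith("/") else request_uri


def forbidden_by_depth(request_uri):
    """Emulate the first rule set: reject '//' or two-plus directory levels."""
    m = RULE_DEPTH.match(strip_leading_slash(request_uri))
    if not m:
        return False  # RewriteRule pattern did not match, so nothing happens
    return "//" in request_uri or TOO_DEEP.search(m.group(1)) is not None


def forbidden_by_folder(request_uri):
    """Emulate the second rule set: reject .html outside the listed folders."""
    m = RULE_FOLDER.match(strip_leading_slash(request_uri))
    if not m:
        return False
    return LEGIT.search(m.group(1)) is None


if __name__ == "__main__":
    samples = ["/blog/post.html", "/a/b/post.html", "/foo//bar.html",
               "/legit-folder1/post.html", "/a/b/style.css"]
    for uri in samples:
        print(uri, forbidden_by_depth(uri), forbidden_by_folder(uri))
```

Feeding it real paths from your access log, both valid and bogus, shows which side of the line each one lands on before any visitor is affected.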
BTW, to change the 403-forbidden response to a 404-Not Found response, simply change the substitution path to point to a filepath that you know does not (and will never) exist. For example, the first RewriteRule line above becomes:
RewriteRule ^(([^/]*/)+[^.]+\.html)$ /nonexistent-file-path.html [L]
This 404-invocation method works on all versions of Apache. You could also use "RewriteRule ^(([^/]*/)+[^.]+\.html)$ - [R=404,L]" on Apache 2.0 and later.
These snippets are intended only as examples. Again, making the rules and conditions sufficiently-selective for *your site* is the challenge...
Jim