Don't confuse URLs with files. It would almost be accurate to say that they are two utterly different things. In fact, they are not related at all -- except through the primary action of the server.
A URL "exists" as soon as you put it in a link on a page. It does not matter if that URL is valid or if it will resolve to a file somewhere, on some server. It has been defined and now exists.
A file exists as soon as you create it on or upload it to your server. It makes no difference if there is a URL associated with that filename on that server -- a link in other words. The file exists independent of the Web.
The server "associates" files with URLs. It has a 'default method' for doing this: Remove the protocol and domain from the requested URL, add the defined DocumentRoot for that domain, and use the result as a filepath. This is the basic function of a server: To translate requested URLs to filepaths.
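As a rough illustration of that default translation, here is a minimal sketch. The hostname www.example.com and the directory /var/www/example are made up for the example, so substitute your own:

```apache
# Hypothetical virtual host -- the domain and the paths are examples only.
<VirtualHost *:80>
    ServerName www.example.com
    DocumentRoot "/var/www/example"
</VirtualHost>

# With no rewriting in effect, the default translation works like this:
#   Requested URL:               http://www.example.com/widgets/blue.html
#   Remove protocol and domain:  /widgets/blue.html
#   Add the DocumentRoot:        /var/www/example/widgets/blue.html
# The server then tries to serve the file at that filepath.
```

Note that nothing here says the file has to exist. The translation happens either way, and a missing file simply produces a 404 response.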
mod_rewrite is a way to change this default URL-to-filepath translation. It can do three main things:
1) Modify the URL-to-filepath translation.
2) Redirect a request for one URL to another URL by responding with a redirect response and terminating the current HTTP transaction.
3) "Forward" the client's request to another server -- either out on the Web, or inside the server's local network -- perhaps a back-end application server. This is the reverse-proxy through-put function, which we'll set aside at this point for the sake of simplicity.
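To make the three functions concrete, here is one rule of each kind. This is only a sketch: it assumes Apache 2.x, a .htaccess file in the DocumentRoot, and invented URLs, filepaths and back-end hostname. The rules are shown in the same order as the list above, purely for illustration.

```apache
RewriteEngine On

# 1) Internal rewrite: the URL /catalog is served from the file /shop/index.php.
#    The client never sees the new path.
RewriteRule ^catalog$ /shop/index.php [L]

# 2) External redirect: the server answers with a 301 response naming the new URL,
#    and the client is expected to make a fresh request for it.
RewriteRule ^old-page\.html$ /new-page.html [R=301,L]

# 3) Reverse proxy: the request is passed through to another server.
#    The [P] flag needs mod_proxy (and mod_proxy_http) to be loaded.
RewriteRule ^app/(.*)$ http://backend.example.internal:8080/$1 [P]
```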
Taking your example above, the correct steps to "change from html URLs to php files" without creating duplicate content would be:
1) Create a RewriteRule to internally rewrite requests for URLs ending in ".html" to filepaths ending in ".php"
2) To prevent duplicate content, create a second RewriteRule, this one to externally redirect only direct client requests for URLs ending in ".php" to URLs ending in ".html". Because of other requirements and basic organizational simplicity, this external redirect rule should -- along with all other external redirects -- precede any internal rewrites. On a new site or on a site where the change is well-planned, this second rule isn't required. It would serve mainly to guarantee that none of the new .php filepaths would ever get listed as URLs due to coding errors or accidents, and that if they did, the redirect would signal search engines to quickly get rid of these "wrong" URLs.
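The two steps above might look something like the following. Treat it as a sketch under some assumptions, not a drop-in solution: it assumes Apache 2.x, a .htaccess file in the DocumentRoot, and that every ".html" URL on the site has a matching ".php" file.

```apache
RewriteEngine On

# External redirect first: only direct client requests for .php URLs.
# THE_REQUEST holds the original request line sent by the client,
# e.g. "GET /page.php HTTP/1.1". Internal rewrites do not change it,
# so this rule cannot be triggered by the rewrite rule below.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/([^?\s]+)\.php[?\s]
RewriteRule \.php$ /%1.html [R=301,L]

# Internal rewrite second: .html URLs are served from .php files.
RewriteRule ^(.+)\.html$ /$1.php [L]
```

The RewriteCond is what makes the redirect apply only to "direct client requests". Without it, the internally-rewritten ".php" filepath would itself match the redirect rule on the next pass through .htaccess, and the two rules would chase each other in a loop.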
So an external redirect is a URL-to-URL translation involving the client (the server sends a redirect response containing the new URL and terminates the current HTTP transaction, and the client then usually issues a second HTTP request, now using the new URL just provided by the server).
An internal rewrite is a (non-default) URL-to-filepath translation occurring solely inside the server.
Now note that I've used the somewhat-redundant phrases "internal rewrite" and "external redirect" and I've been careful to distinguish URLs "out on the Web" from filepaths inside the server. If you adopt this (or a similar) framework, your experience with mod_rewrite will be much simplified, less "mysterious" and/or stressful, and likely more successful.
You *will* see phrases like "internal redirects" in Apache error messages and logs. Just understand that this is a reference to an internal rewrite, and carry on... They also mis-spelled "referrer" as "referer" in the HTTP header specifications, and this error was carried forward into the %{HTTP_REFERER} server variable name -- No-one's perfect... :)
Jim
[edited by: jdMorgan at 1:30 am (utc) on Feb 17, 2010]