AddType text/html .html
AddHandler server-parsed .html
In searching for more info on what AddType and AddHandler do, I found these are associated with the Apache mod_mime module... and here I'm at the limits of my understanding, and I'd like to get more comfortable.
First, can anyone clarify what this module, AddType, and AddHandler each do?
What I'm concerned about is that I believe I've encountered mod_mime before, on a host where I think it was set by default... perhaps set incorrectly. It seemed to look at a file, and if the file appeared to be an html file, to return it as html, but with whatever filename had been entered.
I discovered this when Google dropped a page called "domain.com/products.html" in favor of "domain.com/products" (without an extension)... for no reason that I could discover. E.g., there were no links to "domain.com/products" that I could find.
I was told by the host's "support" that this was due to the mod_mime module, and that it was a system-wide setting that couldn't be changed. We've changed hosts, so I never did diagnose that problem.
I want to make sure now that, by modifying the SSI extensions with AddType/AddHandler as described above, I'm not also inadvertently creating an infinite number of mirror pages that might get indexed.
Where do I need to be cautious?
I think your host was confused. The "automatic" function that caused you problems was most likely mod_mime_magic, not mod_mime. Check out the Apache documentation [httpd.apache.org] for both.
AddType simply associates a file extension with a MIME-type. In simple terms, it tells the server what MIME-type header to return when a file with that extension is served. This is the header you can view with the Server Header checker. It tells the client browser what kind of file it is, so the browser can decide to handle it internally, use a plug-in, or pass the data to an external application for display. For example, html files can be directly displayed, but pdf files need to be handled by a browser plug-in, or passed to Adobe Reader for display.
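As a rough sketch of that extension-to-header mapping, assuming mod_mime is loaded and the lines go in httpd.conf or an .htaccess file (the .pdf line is only an illustration, not part of the config being discussed):

```apache
# Each line maps a file extension to the Content-Type header Apache returns
AddType text/html .html
# Illustrative only: tells the browser to hand .pdf files to a PDF viewer
AddType application/pdf .pdf
```

With these in place, a request for a .html file should come back with "Content-Type: text/html" in the response headers, which is what a header checker would show.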
AddHandler tells Apache that you want that file 'handled' in some special way prior to -- or instead of -- serving it directly to the client. For example, you might want .html extension files to be parsed (scanned) for SSI includes, so you use AddHandler to inform Apache of that fact.
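A minimal sketch of the SSI case, assuming the host lets you set Options in .htaccess (that permission is an assumption about the server, not something from this thread):

```apache
# SSI has to be permitted for the directory before any parsing happens
Options +Includes
# Have Apache scan .html files for SSI directives before sending them
AddHandler server-parsed .html
```

On Apache 2.x the mod_include documentation describes "AddOutputFilter INCLUDES .html" as the usual way to do this, with the server-parsed handler kept for backwards compatibility, so either form may turn up depending on the server version.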
mod_mime_magic, however, is a completely different beast, and could cause the problem you describe if there was just one link pointed to an extensionless version of your page. It 'infers' the MIME-type of a file by reading a few bytes from the file, and sets the MIME-type server response header accordingly. On a system with mod_mime_magic unconditionally enabled, file extensions become meaningless as MIME-type associations, and the MIME-type header in the server response is set based on the *content* of the files.
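For reference, a sketch of how the module is typically switched on in the main server config (it is not an .htaccess directive). The conf/magic path is the conventional example location and is an assumption about any particular host:

```apache
# mod_mime_magic is only active once a magic file has been configured
<IfModule mod_mime_magic.c>
    # Path is relative to ServerRoot; your host's location may differ
    MimeMagicFile conf/magic
</IfModule>
```

If a block like this appears in the host's httpd.conf, content-based type detection is available on that server; if it doesn't, the module isn't doing anything.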
Jim
Thanks for another of your typically thorough and enlightening answers.
I think your host was confused. The "automatic" function that caused you problems was most likely mod_mime_magic, not mod_mime.
No, I'm the one who was confused. ;-(
I'd remembered the hosting situation from over a year ago, and I vaguely remembered the word "magic," along with "mod" and "mime"... but with those underscores in the module names, a partial search match didn't do it. I kept coming up with "use some htaccess magic" and the like when searching for the term in relation to the SSI and .htaccess info I was unearthing.
I think I understand the syntax of the AddType line...
In the second line, I assume that all .html files become server-parsed...
AddHandler server-parsed .html
Does "server-parsed" apply only to SSI includes, or are there other things the server might look for when it scans the file?
In the case of the site we'll be using this on, all of the pages will have includes, so there aren't any unnecessary operations... but, if, say, only 10 of 1000 pages had includes, and the rest were plain vanilla hard-coded html, would the server still be scanning each file it served, even when most weren't SSI pages?
The way I had assumed this conversion would be done would be that the pages with includes would have shtml extensions, and that these only would be converted or renamed. This obviously isn't the way Apache does it. If there is simple further background info on why it doesn't work that way, I'd find it interesting and probably helpful.
In the second line, I assume that all .html files become server-parsed...
AddHandler server-parsed .html
Does "server-parsed" apply only to SSI includes, or are there other things the server might look for when it scans the file?
In the case of the site we'll be using this on, all of the pages will have includes, so there aren't any unnecessary operations... but, if, say, only 10 of 1000 pages had includes, and the rest were plain vanilla hard-coded html, would the server still be scanning each file it served, even when most weren't SSI pages?
For 'sparse' SSI parsing, a better solution might be to use XBitHack -- see Apache mod_include.
The downside of XBitHack is that your maintenance staff needs to be sharp -- or they need to work from a formal maintenance procedures script -- to make sure the files with Xbits set keep them set and those without don't get them set unexpectedly.
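A minimal sketch of the XBitHack setup, assuming it goes in .htaccess or a directory block and that Options overrides are allowed on the host:

```apache
# SSI still has to be permitted for the directory
Options +Includes
# Parse only those text/html files whose user-execute bit is set
XBitHack on
```

A page is then opted in by setting its execute bit, e.g. "chmod +x products.html" (the filename is just an example), and opted out again with "chmod -x". Files without the bit are served without being scanned.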
Another option is to set AddHandler on a per-directory basis (.htaccess) so that only files in certain subdirectories are parsed for SSI.
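As a sketch of the per-directory approach, assuming the SSI pages live together in one subdirectory (the /includes-pages/ name below is hypothetical):

```apache
# File: /includes-pages/.htaccess  (hypothetical directory name)
# Applies to this directory and anything beneath it
Options +Includes
AddHandler server-parsed .html
```

The .html files elsewhere on the site are not covered by this file, so they would be served without being parsed.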
The way I had assumed this conversion would be done would be that the pages with includes would have shtml extensions, and that these only would be converted or renamed. This obviously isn't the way Apache does it. If there is simple further background info on why it doesn't work that way, I'd find it interesting and probably helpful.
Jim
The way I had assumed this conversion would be done would be that the pages with includes would have shtml extensions, and that these only would be converted or renamed....
Not sure I understand this q.
I was just re-inventing Apache in my mind, in a way that ultimately wouldn't be useful. ;)
If the server changed all the SSI page extensions from .shtml to .html on the fly, you'd have a hell of a time building the site and testing the links without a server running. As it is, I assume you need a server to try out your includes.