I'm using the following to hide the .xhtml extension from a site's addresses.
AddType text/html .xhtml
DirectoryIndex perfil.xhtml
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}\.xhtml -f
RewriteRule ^(.*)$ $1.xhtml
The xhtml files themselves are valid, but the lack of an extension gives me this error (and other related ones) when I try to validate them with the W3C's validation tool:
1. Unable to Determine Parse Mode! The validator can process documents either as XML (for document types such as XHTML, SVG, etc.) or SGML (for HTML 4.01 and prior versions). For this document, the information available was not sufficient to determine the parsing mode unambiguously, because:
* the MIME Media Type (text/html) can be used for XML or SGML document types
* the Document Type (-//W3C//DTD XHTML; 1.0 Strict//EN) is not in the validator's catalog
* No XML declaration (e.g <?xml version="1.0"?>) could be found at the beginning of the document.
As a default, the validator is falling back to SGML mode.
2. Warning: Namespace Found in non-XML Document. Namespace "" found, but the -//W3C//DTD XHTML; 1.0 Strict//EN document type is not an XML document type!
Validation Output: 78 Error
1. Error Line 237, Column 27: omitted tag minimization parameter can be omitted only if OMITTAG NO is specified.
... And the list goes on and on with omitted tag errors (again, if I check the xhtml files, they're valid).
Someone over at Kirupa's forum said I should try:
AddType text/html .xhtml
DirectoryIndex perfil.xhtml
Options +FollowSymLinks +Indexes
RewriteEngine on
RewriteBase /
RewriteRule ^([^.]+)\.xhtml$ $1 [L]
But this gives me a 404 error.
How can I:
1. Hide the xhtml extension?
2. Automatically add a slash at the end of the address, if one doesn't exist?
I've found a thread on this forum about it [webmasterworld.com], but it's not working either (gets stuck in an infinite loop, adding slashes to the end of the address).
Thank you very much for any assistance!
What were the results of adding the page and server headers (such as "<?xml version="1.0"?>") requested by the validator?
Be aware that AddType works by filename, so the fact that your URLs are extensionless will not stop it from working properly. However the content-type you add should be a valid XML type, not "text/html".
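As a minimal sketch of that change, assuming the pages are meant to be served as true XHTML: application/xhtml+xml is the registered XML media type for XHTML, and the line below would replace the existing "AddType text/html .xhtml" line rather than sit beside it. Be aware that IE 8 does not render this type, so this trades one problem for another unless IE support is not needed.

```apache
# Map the .xhtml filename extension to the XHTML (XML) media type
# instead of text/html. AddType is provided by mod_mime.
AddType application/xhtml+xml .xhtml
```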
Jim
I'm wondering about the slash because adding a slash gives me an internal server error (500). People will often type a slash at the end of addresses, so I don't want them to get a nasty error page thrown back at them. I suppose another way would be removing any slashes if the user is trying to reach a page and not a directory? That would work too, and if it's more correct, all the better.
The files themselves are valid (validated them before trying to hide the extension) and have all the necessary headers, so the problem is with the .htaccess settings I'm using. I added the text/html bit because I was having trouble with IE 8 not opening xhtml files and found an article that had that as a solution -- it works, but is it not correct? (It's forcing .xhtml files to be treated as text/html ones?)
Sorry, Apache is really not my area of expertise, this is just a personal site and I'm trying to get things done as properly as possible. :)
Just not quite sure if I understand all that I'm trying to accomplish.
Thanks very much for your reply.
XML and XHTML *are not* "the latest and greatest version" of HTML -- so you'll need to decide what you're doing and why, here. The fact that IE does not support XHTML should be telling you something.
Yes, you can redirect to remove trailing slashes on file requests that have a trailing slash, but as I stated, let's leave that for later... There is no use trying to solve two unrelated problems at once, and the result is often a very-confusing thread.
The first snippet of code you posted above, with the directory and file-exists checks, should work, assuming that you have "Options +FollowSymLinks -MultiViews" already set on your server.
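For reference, here is a commented sketch of that combination. It assumes the .htaccess file sits in the site's document root and that the host allows Options and FileInfo overrides; neither is confirmed anywhere in this thread. The [L] flag is an addition that the posted snippet did not have.

```apache
# mod_rewrite in .htaccess needs FollowSymLinks (or SymLinksIfOwnerMatch).
# MultiViews is switched off so content negotiation does not resolve
# extensionless URLs before mod_rewrite gets to see them.
Options +FollowSymLinks -MultiViews

DirectoryIndex perfil.xhtml

RewriteEngine on
# Condition 1: the request does not map to a real directory.
RewriteCond %{REQUEST_FILENAME} !-d
# Condition 2: "<requested path>.xhtml" exists as a file on disk.
RewriteCond %{REQUEST_FILENAME}.xhtml -f
# Both true: internally append .xhtml. The URL shown in the
# visitor's address bar stays extensionless.
RewriteRule ^(.*)$ $1.xhtml [L]
```

Because the second condition fails once the path already ends in .xhtml (there is no "page.xhtml.xhtml" file), the rule does not keep re-applying itself.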
Jim
A URL that ends in a slash is for a folder.
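Following from that, one possible way to handle visitors who type a slash after a page name is to redirect them to the slashless URL. This is only a sketch, assuming the .htaccess file is in the document root (hence the leading "/" in the target) and that the rule is placed before the extension-hiding rules:

```apache
RewriteEngine on
# Leave real directories alone; their URLs are supposed to end in a slash.
RewriteCond %{REQUEST_FILENAME} !-d
# Anything else ending in a slash gets an external 301 redirect
# to the same path without the trailing slash.
RewriteRule ^(.+)/$ /$1 [R=301,L]
```

With this in place, a request for a page name followed by a slash would first be redirected to the bare name, and the follow-up request would then be handled by the extension-hiding rules.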
* the Document Type (-//W3C//DTD XHTML; 1.0 Strict//EN) is not in the validator's catalog
* No XML declaration (e.g <?xml version="1.0"?>) could be found at the beginning of the document.
It wasn't clear from this error report whether or not you specified a DTD reference in your DocType declaration, either.
So unfortunately, I'm being vague because there are just too many loose ends to grab... :(
You need a valid DocType including a DTD link before your page <head>. The MIME-type (HTTP Content-Type header) should agree with that DocType, and you also need an xmlns namespace declaration before the page <head>. Only then can you avoid the validator going into fallback mode and throwing all those errors at you. Your extensionless-URL mod_rewrite code should work just fine, and I suspect that the reason you're now getting errors is that the validator can no longer fall back on the file extension to figure out what kind of page it's looking at in the absence of the other required headers.
I don't know if this will help, but I've only ever used xml+xhtml for mobile-device pages. Here's the stuff in the first three lines of each page before the <head> section, as described above. But again, these are mobile site declarations, and would need to be adjusted for your site:
<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE html PUBLIC "-//WAPFORUM//DTD XHTML Mobile 1.0//EN" "http://www.wapforum.org/DTD/xhtml-mobile10.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-US">
There is, alas, no such thing as "Apache for Dummies." Web server configuration cannot be both flexible and simple at the same time.
Jim
I'm sorry if I wasn't clear enough in my reply. The valid doctype was there, but now I see that I didn't add the xml declaration on the page I was validating. Since it was validating before the .htaccess tinkering (guess if the file has the extension, the validator is happy enough to continue?) I didn't notice this. At least now I know the source of the problem and corrected it.
It's possible to explain practical examples to people who just want to tweak a thing here or there. But generally speaking, all information I've found on mod-rewrite on the web has two problems: too technical with cryptic explanations or too simple with no explanations given (very few examples are commented like: "this bit is doing _this_ which will achieve _this_ result" and sometimes that's really all that's needed for people to understand). Think somewhere in between would be ideal!
I figure if I stick with XHTML, later on I can start doing more complex things with XML. So I'd rather stay on this path and learn as I go.
Thanks for your patience in trying to explain this to me!
Yes, as I noted above: "I suspect that the reason you're now getting errors is that the validator can no longer fall back on the file extension to figure out what kind of page it's looking at in the absence of the other required headers."
Mod_rewrite is, unfortunately, rather cryptic. First, it uses regular-expressions, which are in themselves a big challenge to learn (learn well). Second, mod_rewrite code is unique in that there is no other 'similar' language -- only the most fundamental programming techniques carry over from previous experience. Third, mod_rewrite code modifies your server's behavior -- often in complex and unintended ways. Without a good grounding and lots of experience, it can be very difficult to diagnose and debug.
I'm not saying mod_rewrite is rocket science -- it is accessible with effort. But the documentation is terse and cryptic in order to satisfy two requirements: First, to *not* be 2000 pages long, and second, to avoid violating Einstein's dictum: "Make everything as simple as possible, but no simpler." :)
Jim
I've done some string manipulation with regular expressions, but it's been so long, I've forgotten most of it. Like a lot of people, the more complex expressions make my head hurt, as I'm not fond of maths and they start to look too math-like.
Thanks again for your explanation.