HTM loading as HTML and vice versa

works on some but not all Apache servers

smallcompany

7:12 am on Jan 28, 2009 (gmt 0)

I always thought that, even though I saved all my pages with the HTML extension, they would load just fine if I requested a page with the HTM extension.

This is true with some hosting companies, but not all.

My .htaccess files are quite similar on all of those servers, so I assume it is not "my" configuration causing this.

Would the server handler matter (CGI vs. Apache), or could it be something else?

Thanks

g1smd

1:36 pm on Jan 28, 2009 (gmt 0)

Serving the same file via multiple URLs is not recommended.

Each URL should bring up different content. Invalid URLs should return a 404 status.

Having said that, you can set up a 301 redirect so that if the wrong name is requested, the user is redirected to the correct version of the URL.

I use that to redirect "/(index|home|default)\.(html?|php[45]?|[ja]spx?|cfm)" to just "/" for all requests.

I link to "/" as that is the "correct" URL. If you request one of the other versions then you are redirected and still get to see the content.
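A minimal .htaccess sketch of that kind of index redirect might look like the following. It assumes mod_rewrite is enabled, that the file sits in the document root, and that "www.example.com" stands in for the real canonical hostname; none of those details come from the post above.

```apache
# Assumes mod_rewrite is available and .htaccess overrides are allowed.
RewriteEngine On

# Look at the original request line, so that an internal DirectoryIndex
# lookup of index.html does not trigger a redirect loop.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/([^/]+/)*(index|home|default)\.(html?|php[45]?|[ja]spx?|cfm)[?\s]

# Redirect to the bare directory URL. www.example.com is a placeholder.
RewriteRule ^(([^/]+/)*)(index|home|default)\.(html?|php[45]?|[ja]spx?|cfm)$ http://www.example.com/$1 [R=301,L]
```

With this in place, a request for "/index.html" or "/blog/default.php" would be answered with a 301 to "/" or "/blog/" respectively, while the directory URL itself is served normally.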

jdMorgan

4:55 pm on Jan 28, 2009 (gmt 0)

There are various reasons why a server might return .html files in response to requests for .htm URLs. Among them are mod_rewrite (RewriteRule directive), mod_alias (Alias directives), mod_negotiation (MultiViews option), mod_speling, and the AcceptPathInfo directive (on Apache 2.x).

So, the likely cause is a difference in your server configurations.
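One way to test that on a given host is to switch the candidate features off in .htaccess and see whether the .htm request starts returning a 404. The fragment below is only a sketch: it assumes the host allows these overrides (Options and FileInfo), and a shared host may reject some of these lines with a 500 error.

```apache
# Disable MultiViews content negotiation (mod_negotiation).
Options -MultiViews

# Disable mod_speling's URL correction, which tolerates a single
# inserted, omitted, or wrong character, such as .htm versus .html.
<IfModule mod_speling.c>
    CheckSpelling Off
</IfModule>

# Reject trailing path information after a real file name (Apache 2.x).
AcceptPathInfo Off
```

Any RewriteRule or Alias directives in the main server configuration are outside the reach of .htaccess, so those would have to be checked with the hosting company.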

As g1smd recommends, any given 'page' or object on your site should be directly accessible at one and only one URL. Any and all variations of domain, subdomain, FQDN (a period following the domain, as in "example.com."), port number (e.g. "example.com:80"), URL-path, and query-string should be 301-redirected to the canonical URL. This includes "index.html" versus "/", upper- and lower-case variations of the URL-path, and the parameter order in the query-string as well. (The fragment, also called a "named anchor", is handled by the browser and never sent to the server, so it cannot be redirected server-side.)

To be clear, after the scheme (e.g. "http://"), a URL is composed as <sub-subdomain(s)>.<subdomain>.<domain>.<top-level domain><period indicating FQDN>:<port-number>/<URL-path>?<query-string>#<fragment>. The sub-subdomains, subdomain, FQDN period, port number, URL-path, query string, and fragment identifier are all optional in different cases.
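As an illustration of the hostname part of that advice, the sketch below redirects every hostname variation (missing "www", trailing FQDN period, explicit port) to one canonical host. It assumes mod_rewrite in a document-root .htaccess, and "www.example.com" is a placeholder for the real canonical hostname.

```apache
RewriteEngine On

# Match any Host header that is not exactly the canonical hostname.
# The optional group lets requests with an empty Host header through,
# so HTTP/1.0 clients are not redirected in a loop.
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$

# Redirect to the same path on the canonical host. The query string,
# if any, is carried over automatically.
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```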

Jim

smallcompany

6:08 pm on Jan 28, 2009 (gmt 0)

So, put simply, the right way is to serve a 404 if the HTM version is requested, since all pages are actually HTML (as in my case)?

jdMorgan

7:41 pm on Jan 28, 2009 (gmt 0)

You could do a 404 (or 410), but if there are any links to .htm URLs, then you would want to use a 301-redirect as I stated above, to recover both the traffic and the PageRank/link-popularity.
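A hedged sketch of such a redirect, assuming mod_rewrite in a document-root .htaccess and using "www.example.com" as a placeholder hostname:

```apache
RewriteEngine On

# Redirect only when the matching .html file really exists, so that
# unknown .htm URLs still fall through to a 404.
RewriteCond %{DOCUMENT_ROOT}/$1.html -f

# Send /page.htm to /page.html with a permanent redirect.
RewriteRule ^(.+)\.htm$ http://www.example.com/$1.html [R=301,L]
```

The file-exists check is a design choice rather than a requirement; without it, every .htm request would be redirected and a missing page would return its 404 only after the extra hop.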

Jim

smallcompany

5:52 am on Jan 29, 2009 (gmt 0)

Thanks.

All of the pages on all of my sites are HTML, period. It is just that, from time to time, I see a request for a single HTM page on some of them, with no referrer.

Now, with this better understanding that a request for a non-existent HTM page should return a 404, I wonder whether the one (shared) hosting package of mine that returns the HTML page for an HTM request is actually a problem?

Thanks