Forum Library, Charter, Moderators: Ocean10000 & incrediBILL & phranque

Apache Web Server Forum

    
Extensionless URLs not working
Lorel




msg:4634890
 7:18 pm on Jan 3, 2014 (gmt 0)

I'm trying extensionless urls for the first time and it doesn't appear to be working. I loaded the files on my server (they are all in the same directory including the htaccess) but when I run the mouse over the URLs it shows up with the extension on the file (other items in htaccess are working). Is this correct? seems to me the file name extension should disappear once it's on the server.

I have the following in htaccess:

------------

ErrorDocument 404 /missing.html
AddHandler server-parsed .html

Options +Includes +FollowSymLinks
RewriteEngine on

RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^(.+)$ /$1.php [L,QSA]

----------------
Am I doing something wrong?

The redirect from non-www to www (and other redirects) is not included yet as the site is still on my server.

 

lucy24




msg:4634900
 8:37 pm on Jan 3, 2014 (gmt 0)

For starters, change the body of the rule to say

^([^.]*)$

This should eliminate the need for all conditions, because all non-page files contain a literal . in the name.
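Put together, that suggestion might look like the sketch below. It assumes the files on disk end in .html (as the -f test in the original post suggests) and uses + rather than * so that a request for the bare site root is left alone; both choices are assumptions, not part of the suggestion as posted.

```apache
RewriteEngine On

# Rewrite any request that contains no literal period to the matching .html file.
# No RewriteCond lines: the pattern alone decides which requests qualify.
# (.html target is an assumption based on the -f test in the original post.)
RewriteRule ^([^.]+)$ /$1.html [L]
```

A dot-free request for a directory, such as images/, would also match this pattern, so a site with real subdirectories would need a tighter pattern or a condition.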

> RewriteCond %{REQUEST_FILENAME}.html -f
> RewriteRule ^(.+)$ /$1.php [L,QSA]


I don't understand this. Is there a typo? "If there exists a physical file whose name equals the request plus one wildcard (un-escaped period) plus 'html' then serve content from request plus '.php'".

Dideved




msg:4634907
 9:20 pm on Jan 3, 2014 (gmt 0)

> because all non-page files contain a literal . in the name.

Careful. That assumption isn't actually true. Real files are not required to have a period in their name (e.g., "README" or "LICENSE" are common), nor are generated pages forbidden from using periods (e.g., ".xml" or ".json" are commonly generated on the fly, minified versions of ".css" and ".js" are commonly generated on the fly, and even some generated pages still use ".html" if only for cosmetic reasons).

@Lorel

Probably what you want is something like this:

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteRule ^(.*)$ $1.php


That is, if the current request isn't for a real file, but the current request ".php" is a real file, then rewrite to that php file.

JD_Toims




msg:4634972
 12:35 am on Jan 4, 2014 (gmt 0)

I haven't really thought all the way through the code above because, at a glance, even if it's "the solution" I wouldn't do this with Apache. Invoking a file-system walk for *every single request* [and all the extra code that goes with it], then another file-system walk if an actual file is not found, will likely range from "slow" to "Are you kidding me?! Slow" on a site with any traffic at all.

Personally, I'd rewrite everything to a PHP file and manage things from there, because the checks can likely be done much faster in the script than with mod_rewrite in this type of situation.
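A minimal sketch of that "send everything to one script" approach follows. The script name router.php and the list of static extensions are hypothetical, chosen only for illustration; the script would then inspect the originally requested path itself (PHP exposes it as $_SERVER['REQUEST_URI']).

```apache
RewriteEngine On

# Leave the handler script itself alone so the rule cannot loop.
# (router.php is a hypothetical name, not something from this thread.)
RewriteCond %{REQUEST_URI} !^/router\.php$
# Send every request that does not look like a static asset to the one script.
# No -f or -d tests are used, so mod_rewrite does no file-system checks here.
RewriteRule !\.(css|js|png|jpe?g|gif|ico)$ /router.php [L]
```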

lucy24




msg:4634986
 3:36 am on Jan 4, 2014 (gmt 0)

> if we're so concerned about file system reads, then we probably shouldn't be using htaccess files at all

And there should be no such thing as shared hosting?

Lorel, is it your own server or shared?

Even if it is your own server, htaccess in selected directories is absolutely appropriate when you're making major changes. You want to be able to test your new rules on the fly without having to restart the server every time.

> seems to me the file name extension should disappear once it's on the server.

I am a little uneasy about this line.

incrediBILL




msg:4635015
 7:08 am on Jan 4, 2014 (gmt 0)

<mod>
Off topic benchmarking comments removed.
We're not starting this nonsense again, take it back to this thread:
[webmasterworld.com...]

If you want a benchmarking thread, start one, but it's inappropriate to add to every thread. All further off topic benchmarking remarks not in a thread designated for such a discussion will be removed without notice.
</mod>

[edited by: incrediBILL at 7:27 am (utc) on Jan 4, 2014]

incrediBILL




msg:4635019
 7:25 am on Jan 4, 2014 (gmt 0)

I run the following for extensionless URLs on my server and it works fine:

# check for file not existing
RewriteCond %{REQUEST_FILENAME} !-f
# check for directory not existing
RewriteCond %{REQUEST_FILENAME} !-d
# check for filename not ending in .php to avoid looping
RewriteCond %{REQUEST_FILENAME} !\.php$
# if all of the above conditions are met (no
# matching file or directory, and not already a .php request),
# then rewrite to a .PHP file
RewriteRule ^(.*)$ $1.php [L,QSA]


However, I don't have the RewriteBase set, don't remember why, so maybe that's the issue, but I suspect it's the "/" in "/$1.php" which might be throwing it a curve.

I'm no Apache guru, nor do I play one on TV, but I manage to muddle my way through and my sites magically run.

phranque




msg:4635021
 7:34 am on Jan 4, 2014 (gmt 0)

> I loaded the files on my server (they are all in the same directory including the htaccess) but when I run the mouse over the URLs it shows up with the extension on the file (other items in htaccess are working). Is this correct? seems to me the file name extension should disappear once it's on the server.


it appears nobody has addressed your actual problem.
the primary purpose of your mod_rewrite directives is to redirect non-canonical requests or rewrite an external canonical request to an internal file path or to a script with query string.

your documents served should refer only to canonical urls.
it sounds like your documents are referring to the urls with file extensions.
.htaccess is too late - this has to be fixed in the document.

your .htaccess should have an external redirect to 301 the requests for urls with file extensions to the extensionless urls.
you should also have a redirect to canonicalize your hostname.
these redirects should precede the internal rewrite to the file name with extension.
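A rough sketch of that ordering for a site whose files are .html is shown below. The hostname www.example.com and the http scheme are placeholders, not details from this thread, and the patterns are one possible way to write each step rather than a tested drop-in.

```apache
RewriteEngine On

# 1. External redirect: a client that asks for /page.html is sent to /page.
#    THE_REQUEST holds the original request line, so the internal rewrite
#    in step 3 cannot trigger this rule again.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/([^?\s]+)\.html[?\s]
RewriteRule \.html$ http://www.example.com/%1 [R=301,L]

# 2. External redirect: canonicalize the hostname (placeholder host).
RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

# 3. Internal rewrite: map the extensionless URL onto the real .html file.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^(.+)$ /$1.html [L]
```

The two redirects come first so that a visitor is moved to the canonical URL before any internal rewrite runs.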

> RewriteBase /
> RewriteCond %{REQUEST_FILENAME} !-f
> RewriteCond %{REQUEST_FILENAME} !-d
> RewriteCond %{REQUEST_FILENAME}.html -f
> RewriteRule ^(.+)$ /$1.php [L,QSA]

the RewriteBase directive is unnecessary.
i'm as confused as everyone else about the rest.
please explain.

lucy24




msg:4635023
 8:00 am on Jan 4, 2014 (gmt 0)

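> I don't have the RewriteBase set, don't remember why

If you don't have a RewriteBase set, a target that starts with, er, naked text is resolved against the directory the htaccess file lives in; for an htaccess in the site root that works out the same as / (root slash). A RewriteBase only kicks in for that kind of relative target. So, in the root htaccess, there should be absolutely no difference between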

RewriteBase /
RewriteRule blahblah target.php


and

RewriteRule blahblah /target.php

That's the theory, at least ;)

In the original post, he said "html" in one line and "php" in another. If this carried over from the actual htaccess file, rather than being an artifact of posting, it's enough by itself to make the rule fail.

Lorel




msg:4635055
 3:24 pm on Jan 4, 2014 (gmt 0)

My apologies to everyone. I found the code I used on another site that was designed for this purpose and changed it to fit my needs (my files are all html), but I forgot to change the .php in the last line to .html. Now that I've changed it to .html, it still doesn't work.

I also tried incrediBILL's suggestion (without the RewriteBase set), which didn't work either.

To clarify:

I have a canonical tag on the webpage as instructed in the original article, so the original code (minus my typo) should work.

I don't have a server on my computer. I loaded the files to my website in a folder to test them. the site isn't live yet.

I'm confused by some of the comments above and not sure they are related to my typo or not. Can someone please clarify as to what I should change.

Here are two different sets of code I tried:

------incredibills code set for html ----------

# check for file not existing
RewriteCond %{REQUEST_FILENAME} !-f
# check for directory not existing
RewriteCond %{REQUEST_FILENAME} !-d
# check for filename not ending in .html to avoid looping
RewriteCond %{REQUEST_FILENAME} !\.html$
# if all of the above conditions are met (no
# matching file or directory, and not already a .html request),
# then rewrite to a .html file
RewriteRule ^(.*)$ $1.html [L,QSA]

-------my original set---------

RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^(.+)$ /$1.html [L,QSA]

lucy24




msg:4635089
 6:47 pm on Jan 4, 2014 (gmt 0)

> I don't have a server on my computer. I loaded the files to my website in a folder to test them. the site isn't live yet.

I strongly recommend getting MAMP/WAMP (or Linux equivalent, uhm, is it LAMP?) and running it locally. I don't know about WAMP but MAMP has a perfectly normal non-intimidating GUI; you do not need to set foot in Terminal or type one syllable of command-line stuff. The free version is set up for one site at a time.

With a pseudo-server you can test your RewriteRules and other elements of htaccess exactly as on your "real" site, and use root-absolute links throughout the site. You can also use things like directory names ending in / that may not work properly if you're testing the code locally. (Again, don't know about Windows, but Mac browsers respond to slash-final local links by showing you the index of the specified directory. Not fatal, but not optimal either.)

> I'm confused by some of the comments above and not sure they are related to my typo or not. Can someone please clarify as to what I should change.

Oops. Ahem. Cough-cough. You blundered into a pre-existing argument and were lucky not to see it before the moderators got involved. (I was one of the offenders. It took two moderators to clean up the blood.)

Try this alternative form, without any conditions:

RewriteRule ^(([^./]+/)*[^./]+)$ /$1.html [L]

The [QSA] should not be necessary, because where's the sense in having pretty extensionless URLs if you're going to go and slap a query on the end?

Once that's working we'll hammer out the redirect for requests with unwanted .html at the end.

Lorel




msg:4635092
 7:13 pm on Jan 4, 2014 (gmt 0)

Hi Lucy,

Are you saying I have to set up a pseudo-server before this will work - or have it loaded on its own domain? I really don't want to do this unless absolutely necessary as I'm also trying to set this site up as responsive - big headache! I could wait and deal with this once it's loaded on its own domain (with no index) if that's the case.

I tried the rewrite rule (only). But still not working.

I'm working on a mac.

yes I missed the argument -- got busy working and didn't check my email.

lucy24




msg:4635102
 8:29 pm on Jan 4, 2014 (gmt 0)

> Are you saying I have to set up a pseudo-server before this will work

No, not at all. I'm just recommending that you install this program for testing. You'll get the same information that you get from a live site but everything happens faster and you've got more control. You can also do things like change your cache headers that you would never want to do on your real site. Honestly, once you've installed a local pseudo-server you will wonder how you ever existed without it.

Now then...

Can you check your error logs? You may or may not see anything useful, but it will give us an idea where to look. Also try something like Live Headers in Firefox. It will not say anything about rewrites -- since the whole point of a rewrite is that the user doesn't know it's happening -- but you will see response headers like 404 "Ain't no such file".

Also

:: ahem, cough-cough, major oversight on everyone's part ::

What exactly does "not working" mean? You know what you mean, but in fact there are many ways something can not work. One of them is "the rule simply does not execute" but there are many others.

Another thing you can try for testing purposes only is to change the [L] rewrite to a [R=301,L] redirect. This is an easy way to test if a rewrite is happening, because then your browser's address bar changes. It's not 100% reliable (some patterns behave differently on rewrites than on redirects, don't ask why), but again, worth a try.
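As a sketch of that test, assuming the files are .html as described earlier in the thread: the version below swaps in a temporary 302 instead of the 301 mentioned above, because browsers tend to cache a 301 and may keep redirecting after the rule has been changed back.

```apache
RewriteEngine On

# TESTING ONLY: an external redirect makes the rule's effect visible in the
# browser's address bar. Change the flags back to [L] once the rule is seen to fire.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^(.+)$ /$1.html [R=302,L]
```

If a request for /page lands on /page.html in the address bar, the pattern and conditions are matching and the remaining problem lies elsewhere, for example in the links inside the pages.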

phranque




msg:4635115
 9:33 pm on Jan 4, 2014 (gmt 0)

Lorel, do the internal links on your page refer to urls with the .html extension?

Lorel




msg:4635127
 10:28 pm on Jan 4, 2014 (gmt 0)

yes. Do they have to come off in order to work? I tried it already, then I can't work on the pages on my computer.

lucy24




msg:4635132
 11:17 pm on Jan 4, 2014 (gmt 0)

> I tried it already, then I can't work on the pages on my computer.

That's why you need your pseudo-server ;) Once you get into rewriting, you may no longer be able to test pages by using your text editor's html preview or by opening the file in a browser. Sure, individual hard-coded html pages still work; but you're looking at a whole site, with links. That's where the [A-Z]AMP comes in.

Links on your own site have to point to the URL you want people to see and use. Search engines get annoyed if all your links lead to 301s.

phranque




msg:4635133
 11:46 pm on Jan 4, 2014 (gmt 0)

> Do they have to come off in order to work? I tried it already, then I can't work on the pages on my computer.

the .html extensions stay in the file name but you don't use the file extensions in the url which refers to that file.

then you fix your mod_rewrite directives so a request for an extensionless url gets rewritten into a filepath with an extension.

you also add mod_rewrite directives to redirect external requests for urls with extensions to the extensionless url.

Lorel




msg:4635606
 9:55 pm on Jan 6, 2014 (gmt 0)

Thanks everyone. I'll try this again after I get a pseudo server set up.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved