Apache Web Server Forum

    
Add trailing slash to URLs
glimbeek



 
Msg#: 4240018 posted 9:23 am on Dec 8, 2010 (gmt 0)

I know this has been asked and answered before, and that Google provides plenty of examples; however, I'm confused as to which code is "correct".

I found so many examples, but all of them are (slightly) different, and not all of them properly explain the code they use.

For instance, I found the following examples, all of which should "just" add a trailing slash:

RewriteRule ^(.*[^/])$ /$1/ [L,R=301]

RewriteEngine On
RewriteBase /
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !example.php
RewriteCond %{REQUEST_URI} !(.*)/$
RewriteRule ^(.*)$ http://domain.com/$1/ [L,R=301]


RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} !(\.[a-zA-Z0-9]{1,5}|/)$
RewriteRule (.*)([^/])$ http://example.org/$1$2/ [R=301,L]


RewriteEngine On
RewriteBase /
RewriteRule ^([a-zA-Z0-9]+)/$ /$1 [L]
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([a-zA-Z0-9]+)
RewriteRule ^([a-zA-Z0-9]+)$ /%1/? [R=301,L]


# Redirect adding trailing slash if missing
rewriteCond %{REQUEST_URI} ^/[^\.]+[^/]$
rewriteCond %{REQUEST_URI} !/somefolder$
rewriteRule ^(.*)$ http://%{HTTP_HOST}/support/$1/ [R=301,L]

# Redirect adding leading www to domain or any subdomain if missing
rewriteCond %{HTTP_HOST} !^www\.
rewriteCond %{REQUEST_URI} !/somefolder$
rewriteRule ^(.*)$ http://www.%{HTTP_HOST}/somefolder/$1 [R=301,L]


# Externally redirect to add missing trailing slash
RewriteRule ^([a-z_]+/([0-9]+/)*[0-9]+)$ http://www.example.com/$1/ [NC,R=301,L]
RewriteRule ^([a-z_]+)$ http://www.example.com/$1/ [NC,R=301,L]


The second example seems to be the one to use. It works and I understand what it does. Source: [enarion.net...]

I changed it a little because I didn't need all of the code used. I ended up with:

#add trailing slash to all the URL's
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !(.*)/$
RewriteRule ^(.*)$ http://www.mysite.com/$1/ [L,R=301]


What about the other examples above, though? For instance, the third example uses {1,5} and the fourth uses {3,9}. What does that do? Why 1,5 or 3,9? Is it as simple as "{3,9} matches three, four, five, six, seven, eight or nine [A-Z]'s"? And what's the impact of this? If the request is longer than 9 characters, will the condition not be met?

And is there a "better" solution for adding a trailing slash to URLs? One more commonly accepted as a good solution?

Thanks in advance!
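
On the {1,5} and {3,9} question above: a bounded quantifier {m,n} repeats only the item directly before it, between m and n times. The sketch below is not part of any quoted rule-set; it uses Python's re module as a stand-in for the regex engine behind mod_rewrite (an assumption that holds for patterns this simple), and the sample request lines and the helper names requested_name and has_extension_or_slash are made up for illustration. In the fourth example, {3,9} bounds the HTTP method at the start of THE_REQUEST, not the length of the URL; in the third example, {1,5} bounds the length of a file extension.

```python
import re

# Condition pattern from the fourth example. THE_REQUEST is the raw request
# line, e.g. "GET /products HTTP/1.1", so [A-Z]{3,9} covers the method name.
REQUEST_LINE = re.compile(r"^[A-Z]{3,9}\ /([a-zA-Z0-9]+)")

# Pattern from the third example's last RewriteCond, without the leading "!"
# negation: a dot plus 1 to 5 alphanumerics, or a slash, at the very end.
EXTENSION_OR_SLASH = re.compile(r"(\.[a-zA-Z0-9]{1,5}|/)$")


def requested_name(request_line):
    """Return what %1 would hold for this request line, or None if no match."""
    m = REQUEST_LINE.match(request_line)
    return m.group(1) if m else None


def has_extension_or_slash(uri):
    """True if the URI ends in a short file extension or in a slash."""
    return EXTENSION_OR_SLASH.search(uri) is not None


# The quantifier limits the method, not the URL that follows it.
print(requested_name("GET /products HTTP/1.1"))
print(requested_name("GET /averyveryverylongname HTTP/1.1"))
# Hypothetical 14-letter method: too long for {3,9}, so the condition fails.
print(requested_name("VERYLONGMETHOD /products HTTP/1.1"))

# {1,5} accepts ".css" but not a ten-character "extension".
print(has_extension_or_slash("/style.css"))
print(has_extension_or_slash("/file.backup2010"))
print(has_extension_or_slash("/page"))
```

So a long URL does not stop the {3,9} condition from matching; only a method name shorter than 3 or longer than 9 capital letters would.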

 

jdMorgan




 
Msg#: 4240018 posted 1:49 pm on Dec 8, 2010 (gmt 0)

Some of these examples have functional errors, some have performance-affecting errors. Some are targeted to site-specific URL-structures.

The key to answering this question is to *understand* exactly what each code snippet actually does. Without that basic understanding, you cannot make an informed decision. This is the problem with copying and pasting code -- How can you ever know if it is right for your site unless you can read it, understand it, and determine whether it is appropriate to your specific task?

The rule-sets with "-f" and "-d" exists-checks can invoke disk reads, which are slow. Further, as coded, they do this before exhausting all other qualifications -- meaning that they do disk checks when it is not even certain if it is necessary to do so... And on most sites, checking the disk won't be necessary at all for this function.

A couple of them include a RewriteCond to exclude a particular URL or URL-path from the "add a slash" function -- Do you need such an exclusion? Most sites won't.

Out of all of them, I like a single-line modified version of #4, with more-accurate comments:

# Externally redirect to add missing trailing slash to requested URL-paths
# with no filetype (i.e. no period) in the final URL-path-part
RewriteRule ^(([^/]+/)*[^./]+)$ http://www.example.com/$1/ [R=301,L]

This eliminates unnecessary disk checks and RewriteConds for efficiency. The pattern says, "Match only URLs starting with any number (including zero) of <one or more non-slash characters followed by a slash>, followed by one or more characters, none of which are a period or a slash."
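
To see which requests the pattern above would redirect, here is a minimal sketch that runs it against a few sample URL-paths. It uses Python's re module as a stand-in for mod_rewrite's regex engine (an assumption that holds for a pattern this simple). The sample paths and the helper name redirect_target are made up for illustration, and the paths have no leading slash because that is how a RewriteRule in .htaccess sees them.

```python
import re

# The pattern from the rule above, copied unchanged.
PATTERN = re.compile(r"^(([^/]+/)*[^./]+)$")


def redirect_target(path, host="www.example.com"):
    """Return the URL the rule would redirect to, or None if it does not match."""
    m = PATTERN.match(path)
    if m is None:
        return None
    # $1 in the rule is the whole matched path; the rule appends the slash.
    return "http://" + host + "/" + m.group(1) + "/"


samples = [
    "about",            # single segment, no slash, no period
    "blog/2010/post",   # deeper path, final part has no period
    "v1.2/docs",        # a period in an earlier part is allowed
    "blog/",            # already ends with a slash
    "index.html",       # final part contains a period
    "img/logo.png",     # final part contains a period
]

for path in samples:
    print(repr(path), "->", redirect_target(path))
```

The first three samples are redirected to the same path with a slash added; the last three are left alone, with no disk check involved.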

By understanding the purpose and the directives, and by using regular expressions to their maximum capability, the whole function reduces to a single line...

Jim

glimbeek



 
Msg#: 4240018 posted 2:56 pm on Dec 8, 2010 (gmt 0)

Hi Jim,

Thank you for your reply.
I was thinking along the same lines. That's why I asked for some insight from people who know more about the subject than I do. Just copying and pasting a bit of code and hoping for the best isn't really a good way to do things.

# Externally redirect to add missing trailing slash to requested URL-paths
# with no filetype (i.e. no period) in the final URL-path-part
RewriteRule ^(([^/]+/)*[^./]+)$ http://www.example.com/$1/ [R=301,L]

Looking at your reply, the above example is the one you'd prefer, correct?

Before I use it, I'll try to see if I understand what it does. Correct me if I'm wrong.

^(([^/]+/)*[^./]+)$

^ the beginning of the string
You wrap it all in () so it can be used later on with the $1
[^/] means if the string does not end with a slash
+/ means one more preceding character that's a slash
[^./]+ means if not any character followed by a slash with one or more preceding character that's a dot. The "filetype" check.
$ ends the string

$1 Externally redirect it to http://www.example.com/ and add the rest of the string + a slash at the end.

g1smd




 
Msg#: 4240018 posted 7:43 pm on Dec 8, 2010 (gmt 0)

^(([^/]+/)*[^./]+)$

Pattern matches "one or more characters that is not a slash, followed by a slash; the preceding whole, repeating one or times if it appears, and is allowed to be missing" [i.e there is an optional "path" of any depth],

followed (after the final slash), "by one or more characters that are not a dot or slash" [i.e. the filename part of the URL has no "extension" on the end].

StaceyJ




 
Msg#: 4240018 posted 8:45 pm on Dec 17, 2010 (gmt 0)

^(([^/]+/)*[^./]+)$

Pattern matches "one or more characters that is not a slash, followed by a slash; the preceding whole, repeating one or times if it appears, and is allowed to be missing" [i.e there is an optional "path" of any depth],

followed (after the final slash), "by one or more characters that are not a dot or slash" [i.e. the filename part of the URL has no "extension" on the end].


I was having trouble understanding the "0 or more times" thing (like why in the world would you want to match something 0 times?) and thought I finally got it, but this confuses me. I finally understand pretty well what this whole pattern does, thanks to your and jdMorgan's explanations and quite a bit of time reading and banging myself in the head. But when you said "repeating one or times if it appears, and is allowed to be missing" I got confused. * repeats 0 or more times, not 1 or more, correct? I understand what you mean by "is allowed to be missing", that's the 0 of the 0 or more times. But when you said "repeating one or times if it appears" it threw me for a loop. Shouldn't that have been 0 or more times?

Excuse me if this is a really stupid question or I seem dense, I'm making a lot of progress understanding all this and just want to make sure I'm not misunderstanding something.

Thank you!

g1smd




 
Msg#: 4240018 posted 9:46 pm on Dec 17, 2010 (gmt 0)

^(([^/]+/)*[^./]+)$

Pattern matches "one or more characters that is not a slash, followed by a slash; the preceding whole, repeating one or *more* times if it appears, and is allowed to be missing" [i.e there is an optional "path" of any depth],

followed (after the final slash), "by one or more characters that are not a dot or slash" [i.e. the filename part of the URL has no "extension" on the end].

There was a typo in my answer - a word missing (shown in bold).

The + means "one or more times".

The * means "zero or more times".

I wrote a slightly wordier answer than was maybe required.

repeating one or more times if it appears, and is allowed to be missing

Yes, it simplifies to "zero or more times"; but I thought that might confuse.
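
The difference between + and * on the "path" group can be checked directly. This is a small sketch, again using Python's re module as a stand-in for mod_rewrite's regex engine. The names ONE_OR_MORE, ZERO_OR_MORE, SPLIT and path_depth are made up for illustration, and SPLIT regroups the pattern with a non-capturing (?:...) group so that the optional path and the final part can be inspected separately; that regrouping is not part of the rule discussed in this thread.

```python
import re

# The "path" group on its own, once with + and once with *.
ONE_OR_MORE = re.compile(r"^([^/]+/)+$")
ZERO_OR_MORE = re.compile(r"^([^/]+/)*$")

# Same overall pattern as in the thread, regrouped for illustration:
# group 1 is the optional path, group 2 is the final extension-less part.
SPLIT = re.compile(r"^((?:[^/]+/)*)([^./]+)$")


def path_depth(path):
    """Number of 'name/' units before the final part, or None if no match."""
    m = SPLIT.match(path)
    if m is None:
        return None
    return m.group(1).count("/")


# An empty string is "zero repetitions": fine for *, not enough for +.
print(ZERO_OR_MORE.match("") is not None)
print(ONE_OR_MORE.match("") is not None)

# With *, a URL-path with no folders at all still matches (depth 0).
print(path_depth("page"))
print(path_depth("a/b/page"))
```

Matching "zero times" is what lets a single rule cover both a bare page name and a page several folders deep.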

StaceyJ




 
Msg#: 4240018 posted 10:03 pm on Dec 17, 2010 (gmt 0)

There was a typo in my answer - a word missing (shown in bold).

Thanks, I thought that might be the case.

I wrote a slightly wordier answer than was maybe required.

Not at all! Please continue, wordy explanations like that really help me to understand it so much better than the docs do. Between your and Jim's explanations I actually "got it" finally.

repeating one or more times if it appears, and is allowed to be missing

Yes, it simplifies to "zero or more times"; but I thought that might confuse.

Thanks, that makes sense; after I read it a few more times, I thought that's what you meant.

Thank you again!

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved