I've searched and still do not understand - mod rewrite/trailing slash

         

sharkboy

10:36 pm on Feb 28, 2007 (gmt 0)

10+ Year Member



Hi all, first time post - LONG time lurker (3+ years). I have always lurked because I am pretty good at searching and finding answers to my questions in previous posts. But alas, I have run up against a wall on this one.

This might be blasphemy, but I am running a little script called ISAPI Rewrite on an *ahem* IIS 6.0 Windows box. It uses the same syntax as Apache mod_rewrite. I have had success in implementing some nice clean query string urls, but I am having trouble combining that with a routine that will automatically add a trailing slash when one is not present.

Can any of you help me out and modify my existing (and working) code so that it will add on trailing slashes when they are not present in the url?

Here is the code:

RewriteRule ^/([^/]+)/$ /index.cfm?action=$1 [L, NC]
RewriteRule ^/([^/]+)/([^/]+)/$ /index.cfm?action=$1&category=$2 [L, NC]
RewriteRule ^/([^/]+)/([^/]+)/([^/]+)/$ /index.cfm?action=$1&category=$2&page=$3 [L, NC]
RewriteRule ^/([^/]+)/([^/]+)/([^/]+)/([^/]+)/$ /index.cfm?action=$1&category=$2&page=$3&options=$4 [L, NC]

Any help would be appreciated. Thanks!

jdMorgan

10:55 pm on Feb 28, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



How are you going to determine when to add a slash and when not to add a slash? For example, we need to avoid redirecting to example.com/robots.txt/. And there are plenty of other cases as well... .css files, .jpg files, .js files, etc.

The most common answer is to add a trailing slash if one is not present *and* there is no "." anywhere past the last existing "/" in the URL. Will that work for you? (consider very carefully, now... take your time...) :)
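For illustration only, the rule described above can be written as a plain string check. This is a minimal Python sketch, not ISAPI Rewrite or mod_rewrite syntax, and the function name `needs_trailing_slash` is made up for the example:

```python
def needs_trailing_slash(path: str) -> bool:
    """Return True if a trailing slash should be added to the URL-path.

    Hypothetical helper: a slash is needed when the path does not already
    end with one and there is no "." after the last "/".
    """
    last_segment = path.rsplit("/", 1)[-1]  # text after the last "/"
    return last_segment != "" and "." not in last_segment


for path in ("/robots.txt", "/products", "/products/", "/css/site.css"):
    print(path, needs_trailing_slash(path))
```

Of these sample paths, only "/products" qualifies; the file-like paths and the path that already ends in a slash are left alone.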

ISAPI Rewrite syntax is *similar* to Apache mod_rewrite, but by no means is it the same, so take care on that count as well...

Jim

sharkboy

2:22 am on Mar 1, 2007 (gmt 0)

10+ Year Member



Thanks for the reply Jim.

After *very careful* consideration, yes - adding a trailing slash if one is not present and there is no "." anywhere past the last existing "/" in the url - will work for my situation.

And it is duly noted that ISAPI Rewrite syntax isn't the same - but I am thankful that there's something even remotely similar to mod_rewrite for Windows :)

I feel a bit embarrassed to ask outright - I've always liked to take unknown language or markup and poke and prod it in order to get it to do as I want. But the mod_rewrite syntax absolutely baffles me - I think it's because of the lack of any alpha-numeric characters in it.

So, how would you fit the above into my existing code?

jdMorgan

3:13 am on Mar 1, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Actually, I'm not sure. You will need to adjust this to generate a 301 external redirect response, because this code probably won't do it. However, the regular-expressions pattern (which I assume is what's baffling you) should be close:

RewriteRule ^/(([^/]+/)*([^/.]+))$ http://www.example.com/$1/ [R=301,L]

The regular expression pattern reads, "a slash, followed by any number (including zero) of sequences made of one or more characters not equal to a slash and then a slash, followed by one or more characters not equal to a slash or a literal period, and then the end of the URL-path." The effect of this is to match URL-paths like
/dir/sub.dir1/sub.dir2/subdir3
or
/dir
but not
/dir/subdir1/sub.dir/subdir3.gif
So only the first two example URL-paths above will get redirected to add the slash.
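The pattern can be tried outside the server. The following is a small Python sketch that runs the same regular expression against the three example URL-paths; Python's `re` module accepts the pattern as written, but the helper name `add_slash` is an assumption made for this demo, and it says nothing about how ISAPI Rewrite handles the substitution or flags.

```python
import re

# The pattern from the rule above, copied unchanged.
PATTERN = re.compile(r"^/(([^/]+/)*([^/.]+))$")


def add_slash(path: str) -> str:
    """Return the path with a trailing slash added if the pattern matches,
    otherwise return it unchanged (hypothetical helper for this demo)."""
    match = PATTERN.match(path)
    return f"/{match.group(1)}/" if match else path


for path in (
    "/dir/sub.dir1/sub.dir2/subdir3",
    "/dir",
    "/dir/subdir1/sub.dir/subdir3.gif",
):
    print(path, "->", add_slash(path))
```

Periods in the earlier directory names do not prevent a match; only a period in the final path segment does.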

Now the bit you need to look into is the correct form for a canonical substitution URL (shown as "http://www.example.com/$1/" above), and the correct flags to generate an immediate 301-Moved Permanently redirect without processing any more rules. I am zero percent sure that both of these are correct (I'm an unapologetic 100%-Apache person). ;)

Take a look at the regular expressions tutorial cited in our forum charter [webmasterworld.com]. A passing familiarity with this very powerful and very common pattern-matching notation will stand you in good stead with mod_rewrite, ISAPI Rewrite, Perl, PHP, SSI, and many other programming languages. There are a few minor 'flavor variations' of regular expressions, but they actually differ very little between languages and applications -- the primary differences being in what literal characters must be escaped and in what contexts those characters must be escaped.

If you get stuck on the substitution URL and the ISAPI Rewrite flags, take a look at our Microsoft IIS Web Server and ASP.NET forum [webmasterworld.com] for actual examples, or try posting over there for more-qualified help on the IIS-dependent part of the problem.

Jim

sharkboy

2:25 pm on Mar 1, 2007 (gmt 0)

10+ Year Member



Thanks Jim - works like a charm. Here is the resulting code. All I did was take the "L" off your snippet, so that it isn't the last rule processed, and make it the first line of code. (I guess I am learning a *little* about the syntax)... :)

RewriteRule ^/(([^/]+/)*([^/.]+))$ /$1/ [R=301]
RewriteRule ^/([^/]+)/([^/]+)/$ /index.cfm?action=$1&category=$2 [L, NC]
RewriteRule ^/([^/]+)/([^/]+)/([^/]+)/$ /index.cfm?action=$1&category=$2&page=$3 [L, NC]
RewriteRule ^/([^/]+)/([^/]+)/([^/]+)/([^/]+)/$ /index.cfm?action=$1&category=$2&page=$3&options=$4 [L, NC]

Thanks again.

jdMorgan

2:34 pm on Mar 1, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



RED ALERT! -- Put the [L] back, or you will get a nasty surprise from the search engines... Without the [L], your script URLs may get 'exposed' and will replace your static URLs in the search results listings...

Jim

sharkboy

2:54 pm on Mar 1, 2007 (gmt 0)

10+ Year Member



Thanks for the heads up Jim. The site I am developing using this is not live yet, so no harm done.

So if I understand correctly, the Flags are only triggered when the regular expressions match the url string - correct?

This is of course the danger of modifying code you don't fully understand :)

Here is what I now have:

RewriteRule ^/(([^/]+/)*([^/.]+))$ /$1/ [R=301,L]
RewriteRule ^/([^/]+)/([^/]+)/$ /index.cfm?action=$1&category=$2 [L,NC]
RewriteRule ^/([^/]+)/([^/]+)/([^/]+)/$ /index.cfm?action=$1&category=$2&page=$3 [L,NC]
RewriteRule ^/([^/]+)/([^/]+)/([^/]+)/([^/]+)/$ /index.cfm?action=$1&category=$2&page=$3&options=$4 [L,NC]

I will be sure to check out the threads you suggested at a later date - I am up to my knees in other issues in advance of this launch. This will give me what I need for the time being.

Also, thanks for the follow up - that would have been a nasty surprise when I checked the SERPs a couple of weeks from now. Please let me know if you see any other red alerts in the code.

jdMorgan

3:34 pm on Mar 1, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> So if I understand correctly, the Flags are only triggered when the regular expressions match the url string - correct?

Yes, correct.
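To make that concrete, here is a rough Python model of the matching logic only. It is a sketch under stated assumptions, not a description of ISAPI Rewrite internals: it assumes the rules are tried top to bottom and that, because every rule carries L, processing stops at the first match. The names `RULES` and `process` are invented for the example, "$1" is written as "\1" (Python's backreference syntax), and NC is modelled with `re.IGNORECASE`.

```python
import re

# (pattern, substitution, is_redirect) for each rule in the rule set above.
RULES = [
    (re.compile(r"^/(([^/]+/)*([^/.]+))$"), r"/\1/", True),
    (re.compile(r"^/([^/]+)/([^/]+)/$", re.IGNORECASE),
     r"/index.cfm?action=\1&category=\2", False),
    (re.compile(r"^/([^/]+)/([^/]+)/([^/]+)/$", re.IGNORECASE),
     r"/index.cfm?action=\1&category=\2&page=\3", False),
    (re.compile(r"^/([^/]+)/([^/]+)/([^/]+)/([^/]+)/$", re.IGNORECASE),
     r"/index.cfm?action=\1&category=\2&page=\3&options=\4", False),
]


def process(path: str) -> tuple[str, str]:
    """Return (kind, target), where kind is 'redirect', 'rewrite' or 'pass'."""
    for pattern, substitution, is_redirect in RULES:
        match = pattern.match(path)
        if match is None:
            continue  # no match: this rule's flags have no effect at all
        target = match.expand(substitution)
        # Assumed behaviour of L: stop at the first rule that matches.
        return ("redirect" if is_redirect else "rewrite", target)
    return ("pass", path)  # no rule matched; the request is left untouched


for path in ("/shop/widgets", "/shop/widgets/", "/robots.txt"):
    print(path, "->", process(path))
```

In this model a request without the slash is answered by the redirect rule alone, and only the follow-up request for the slash-terminated URL reaches the rewrite to /index.cfm.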

> This will give me what I need for the time being.

I hope you'll make the time to do a very thorough check of your site before launching. Make sure all URLs are canonicalized (only *one* URL for each page), proper server headers are returned for all conditions (check 200-OK, 401-Authentication Required, 403-Forbidden, 404-Not Found, 410-Gone, 500-Server Error, and all other appropriate conditions to make sure the right headers and response codes are returned -- the Live HTTP Headers extension for Firefox is a good tool for this), MIME-types are correct on all files, and cache-control headers are correct for all resources -- stuff like that.

The vast majority of technical threads on this site pertain to "damage repair" for sites that were not thoroughly checked out prior to going live. It's too bad, because getting it right the first time is far easier than "repairing" a "bad reputation" with the search engines caused by fairly-trivial mistakes.

Jim