
Apache Web Server Forum

    
mod rewrite with a variable number of variables
noyearzero




msg:3887443
 8:06 pm on Apr 7, 2009 (gmt 0)

I have a couple scenarios i'd like to cover with one rewrite rule

http://example.com/folder/a/detail.php
http://example.com/folder/a/b/detail.php
http://example.com/folder/a/b/c/detail.php
http://example.com/folder/a/b/c/list.php

http://example.com/folder/ is a real folder that i want to have my .htaccess file in.

basically i'd like to have everything after folder and before the scriptname written to a variable

http://example.com/folder/detail.php?var=a
http://example.com/folder/detail.php?var=a/b
http://example.com/folder/detail.php?var=a/b/c
http://example.com/folder/list.php?var=a/b/c

it would be nice if i could grab the name of "folder" dynamically too since this will be implemented in multiple directories.

 

jdMorgan




msg:3887465
 8:35 pm on Apr 7, 2009 (gmt 0)

Please post your best-effort code as a basis for discussion.

It looks like the pattern would be ^(([^/]+/)*[^/]+)/[^./]+\.php$ and you would make a back-reference to $1 to populate the value for "var=".

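If you're willing to accept a trailing slash on the var= value, then that pattern simplifies to ^(([^/]+/)+)[^./]+\.php$ (the outer parentheses keep $1 pointing at the whole path, since a repeated group on its own retains only its last repetition).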

If the code goes into /folder/.htaccess, then "folder/" will not appear in the URL-path examined by the RewriteRule, and so does not need to be matched explicitly. It will, however, need to appear in the substitution (target) URL-path.
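
To see what these patterns actually capture, here is a minimal sketch using Python's re module as a stand-in for mod_rewrite's regex engine (an assumption; the syntax used here should behave the same in both). The paths are the ones from the first post with "folder/" already stripped, as happens in /folder/.htaccess. The helper name first_group is made up for illustration. One thing worth checking with the trailing-slash form: a repeated group keeps only its last repetition, so the back-reference needs an outer pair of parentheses around the repeated part.

```python
import re

# First pattern from the post: $1 is everything before the script name.
NO_SLASH = re.compile(r'^(([^/]+/)*[^/]+)/[^./]+\.php$')
# Trailing-slash form written as a single repeated group.
REPEATED = re.compile(r'^([^/]+/)+[^./]+\.php$')
# Same form with an outer group around the repetition (illustrative variant).
WRAPPED = re.compile(r'^(([^/]+/)+)[^./]+\.php$')

def first_group(pattern, path):
    """Return what $1 would hold, or None if the pattern does not match."""
    m = pattern.match(path)
    return m.group(1) if m else None

for path in ('a/detail.php', 'a/b/detail.php', 'a/b/c/list.php'):
    print(path,
          first_group(NO_SLASH, path),
          first_group(REPEATED, path),
          first_group(WRAPPED, path))
```

For a/b/c/list.php the three columns come out as a/b/c, c/ and a/b/c/ respectively.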

See the references cited in our Forum Charter for more information.

Jim

[edited by: jdMorgan at 11:47 pm (utc) on April 7, 2009]

noyearzero




msg:3888361
 7:53 pm on Apr 8, 2009 (gmt 0)

Option 1 is looking close... I tweaked it a bit

RewriteRule ^(([^/]+/)*[^/]+)/([^./]+\.php)$ /$2$3?var=$1 [QSA]

however $2 returns the second folder deep and not the first... unless i wrote the right side of the argument wrong.

the request
http://example.com/folder/test/105/detail.php
tries to resolve to
/test/detail.php and not /folder/detail.php

Thanks for your help so far!

jdMorgan




msg:3888481
 10:18 pm on Apr 8, 2009 (gmt 0)

As I stated above, if the "first folder deep" is "/folder" and this code is located in "/folder/.htaccess", then that first folder level is not visible to the RewriteRule for pattern-matching, and will have to be hard-coded by sticking "folder/" onto the substitution path and query string on the right side:

Something like:

RewriteRule ^(([^/]+/)*[^/]+)/([^./]+\.php)$ /folder/$3?var=folder/$1 [QSA,L]

although that may not be exactly what you want in the query string name/value pairs.
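
A rough sketch of why $2 came out as "test/" rather than "folder/", again with Python's re standing in for the rewrite engine (an assumption). The helpers per_dir_path and groups are hypothetical names that only imitate the prefix stripping done for a rule in /folder/.htaccess.

```python
import re

RULE = re.compile(r'^(([^/]+/)*[^/]+)/([^./]+\.php)$')

def per_dir_path(url_path, htaccess_dir):
    """Strip the directory prefix, as happens before a per-directory rule runs."""
    prefix = htaccess_dir.rstrip('/') + '/'
    return url_path[len(prefix):] if url_path.startswith(prefix) else None

def groups(url_path, htaccess_dir='/folder'):
    """Return the tuple ($1, $2, $3) seen by the rule, or None if no match."""
    local = per_dir_path(url_path, htaccess_dir)
    m = RULE.match(local) if local is not None else None
    return m.groups() if m else None

g = groups('/folder/test/105/detail.php')
print(g)
# Hard-coding "folder/" in the target and the query string, as in the rule above.
print('/folder/%s?var=folder/%s' % (g[2], g[0]))
```

The rule never sees "folder/", so $1 is test/105, $2 is test/ (the last repetition of the inner group) and $3 is detail.php.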

Jim

noyearzero




msg:3889142
 5:44 pm on Apr 9, 2009 (gmt 0)

I guess that was part of my problem... I thought the matching was done on the value of REQUEST_URI minus the query string.

So then if I wanted it to be more dynamic I could put the rewrite rules in the root .htaccess file? I tried this and several other iterations with no success.

RewriteRule ^([^/]+)/(([^/]+/)*[^/]+)/([^./]+\.php)$ /$1/$4?var=$2 [QSA]

jdMorgan




msg:3889179
 6:23 pm on Apr 9, 2009 (gmt 0)

There's nothing intrinsically wrong with your rule except for a missing [L] flag. But "no success" doesn't tell me much about what the remaining problem is.

The output of this RewriteRule won't match the rule's own pattern, so it doesn't look like it will cause an infinite loop, which is the most common problem. So how did you test (what were the URLs you used), what did you expect, what were the results, and how did those results differ from your expectations?

RewriteRule ^([^/]+)/(([^/]+/)*[^/]+)/([^./]+\.php)$ /$1/$4?var=$2 [QSA,L]
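
As a sanity check on the back-reference numbering, here is a small sketch of this rule as it would see paths from the root .htaccess (no leading slash). Python's re is used as a stand-in for the rewrite engine, and the function name rewrite is just for illustration.

```python
import re

RULE = re.compile(r'^([^/]+)/(([^/]+/)*[^/]+)/([^./]+\.php)$')

def rewrite(local_path):
    """Apply /$1/$4?var=$2 to a path as seen from /.htaccess, or return None."""
    m = RULE.match(local_path)
    if m is None:
        return None
    return '/%s/%s?var=%s' % (m.group(1), m.group(4), m.group(2))

print(rewrite('folder/a/b/detail.php'))   # the request from the thread
print(rewrite('folder/detail.php'))       # the rule's own output path
```

The first call gives /folder/detail.php?var=a/b. The second gives None, since the rewritten path has no middle segment left for the pattern to match.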

Note: Do remember to completely flush your browser cache after any change to your server-side code (any kind of server-side code).

Jim

noyearzero




msg:3889980
 4:00 pm on Apr 10, 2009 (gmt 0)

Yeah I figured you'd call me out on the 'no success' thing.

i tried
http://example.com/folder/a/b/detail.php
and every time it told me that
/folder/a/b/detail.php could not be found. and from what i gather if it tries to resolve to the original request, that means it couldn't match anything to the pattern.

I expected that adding the first part in () would make $1 the value of the first folder and push all the other variables up a number. But since it never tried to resolve to anything other than the original request, I couldn't tell.

I did have the L in there to start off with but it was giving me a server error so I took it out and the server error went away... kinda miffed me.

jdMorgan




msg:3889982
 4:12 pm on Apr 10, 2009 (gmt 0)

Have you enabled mod_rewrite?

Either both of the following lines or only the second line --it varies depending on server set-up-- is required ahead of your rule:

Options +FollowSymLinks
RewriteEngine on

Again, you may or may not need the first line. The only way to find out is to test. On some servers it's required, and on others, it's not allowed. On servers where it's required but not allowed, you cannot use mod_rewrite.

BTW, to avoid confusing test results if you're trying to add these lines, use a super-simple rule such as

RewriteRule ^foo\.html$ http://www.webmasterworld.com/ [R=301,L]

If you then request /foo.html from your server, you should land back here at WebmasterWorld.
Jim

noyearzero




msg:3890174
 8:57 pm on Apr 10, 2009 (gmt 0)

It is enabled. I have other rewrites in the same file that are working correctly.

I think I may have just found the problem. I tried this rewrite

RewriteRule ^(.*)$ /folder/detail.php [QSA,L]

while folder is a real directory and my request was
http://example.com/folder/a/b/detail.php

however if i changed my request to
http://example.com/folder1/a/b/detail.php
it resolved to
http://example.com/folder/detail.php

is this because rewrite rules outside of real folders are ignored? Like my example... it would only obey rewrite rules found in /folder/.htaccess if i was making a request to http://example.com/folder/anything

if thats true, is there anyway to accomplish what i want at all?

g1smd




msg:3890176
 9:00 pm on Apr 10, 2009 (gmt 0)

The /folder/.htaccess file can only deal with requests for URLs that resolve to that folder.

You'll need the .htaccess file to be in the root if it needs to deal with other URLs.

You need to think both about URLs as they appear 'on the web' and the folder/file structure on the server. They are merely 'related' and not the same thing.

.

This rewrite creates an infinite loop:

RewriteRule ^(.*)$ /folder/detail.php [QSA,L]

The output matches the input pattern and it rewrites again.
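
A quick way to check a rule for this on paper is to feed its output back into its own pattern. The sketch below does that with Python's re as a stand-in for the rewrite engine (an assumption); rematches is a hypothetical helper, and the second pattern is the more specific rule discussed earlier in the thread.

```python
import re

def rematches(pattern, target_path):
    """True if the rewritten path would match the rule's own pattern again.

    The leading slash is dropped because a rule in the root .htaccess
    sees the path without it.
    """
    return re.match(pattern, target_path.lstrip('/')) is not None

CATCH_ALL = r'^(.*)$'
SPECIFIC = r'^([^/]+)/(([^/]+/)*[^/]+)/([^./]+\.php)$'

print(rematches(CATCH_ALL, '/folder/detail.php'))
print(rematches(SPECIFIC, '/folder/detail.php'))
```

The catch-all pattern matches its own output (True); the more specific pattern does not (False).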

jdMorgan




msg:3890187
 9:13 pm on Apr 10, 2009 (gmt 0)

You said above that you'd moved the rule from /folder/.htaccess to /.htaccess.

If you make a major change -- or any change at all, for that matter -- please say so.

What is your current rule, and where is it currently located?

Jim

[edited by: jdMorgan at 6:45 pm (utc) on April 11, 2009]

noyearzero




msg:3891226
 3:37 am on Apr 13, 2009 (gmt 0)

sorry, i wasn't too clear that i was switching directories.... here is my current rule

/.htaccess
RewriteRule ^folder1/(([^/]+/)*[^/]+)/([^./]+\.(.*))$ /_folder1/$3?rewrite_id=$1 [QSA,L]
RewriteRule ^folder2/(([^/]+/)*[^/]+)/([^./]+\.(.*))$ /_folder2/$3?rewrite_id=$1 [QSA,L]

/_folder1 and /_folder2 are the real folder names. this rule solves exactly what i originally wanted, but its created a new problem. requests to something like /folder1/file.php fail since its matching multiple folders deep only. any ideas to tweak the rule to account for this?

or would it be possible to progressively rewrite the URL?... ex: request /folder1/a/b/detail.php
step 1: rewrite /folder1/a/b/detail.php to /_folder1/a/b/detail.php (but don't submit yet)
step 2: check if /_folder1/a/b/detail.php actually exists. if it does, then submit the rewrite.
step 3: if its not a real file rewrite to /_folder1/detail.php?a=a/b
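
The three steps can be sketched outside Apache to pin down the intended behaviour. This is only a model of the logic, not mod_rewrite configuration: resolve is a made-up name, the folder names are the ones from this post, and the temporary directory stands in for the document root.

```python
import os
import tempfile

def resolve(docroot, url_path, public='folder1', real='_folder1'):
    """Follow the three steps for one request and return the final target."""
    prefix = '/' + public + '/'
    if not url_path.startswith(prefix):
        return url_path
    rest = url_path[len(prefix):]
    # Step 1: swap the public folder name for the real one (not final yet).
    candidate = '/' + real + '/' + rest
    # Step 2: if that is a real file, stop here.
    if os.path.isfile(os.path.join(docroot, candidate.lstrip('/'))):
        return candidate
    # Step 3: otherwise send the file name to the script in the real folder
    # and pass the directory part as the "a" parameter.
    dirs, _, filename = rest.rpartition('/')
    return '/%s/%s?a=%s' % (real, filename, dirs)

with tempfile.TemporaryDirectory() as docroot:
    os.makedirs(os.path.join(docroot, '_folder1', 'images'))
    open(os.path.join(docroot, '_folder1', 'images', 'img.jpg'), 'w').close()
    print(resolve(docroot, '/folder1/images/img.jpg'))
    print(resolve(docroot, '/folder1/a/b/detail.php'))
```

The existing image is served from its real path, while the non-existent /folder1/a/b/detail.php ends up as /_folder1/detail.php?a=a/b.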

g1smd




msg:3891273
 8:08 am on Apr 13, 2009 (gmt 0)

You can make a part of the pattern optional by enclosing in () and adding ? directly after.

The problem is that as coded you have a / after folder1, then another / after the bracketed part. This means the bracketed part has to exist to match the pattern.

You need to get the second / moved inside the brackets and then apply something (? or *) to make that part of the pattern optional.

Write out a complete list of different URL formats that could be requested, mark which ones have to be reused as backreferences, look for patterns, and think it through again.
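
One possible reading of that advice, sketched with Python's re as a stand-in for the rewrite engine (an assumption): the second "/" moves inside the bracketed part and the whole group gets a "?". The function name target is hypothetical, and an unmatched optional group is treated as an empty string here.

```python
import re

# Rule as posted: a "/" is required after folder1 and again after the group.
REQUIRED = re.compile(r'^folder1/(([^/]+/)*[^/]+)/([^./]+\.(.*))$')
# Second "/" moved inside the group, and the group made optional with "?".
OPTIONAL = re.compile(r'^folder1/(([^/]+/)*[^/]+/)?([^./]+\.(.*))$')

def target(pattern, path):
    """Return the rewritten URL for path, or None if the pattern fails."""
    m = pattern.match(path)
    if m is None:
        return None
    return '/_folder1/%s?rewrite_id=%s' % (m.group(3), m.group(1) or '')

for path in ('folder1/file.php', 'folder1/a/b/detail.php'):
    print(path, target(REQUIRED, path), target(OPTIONAL, path))
```

Note the side effect: with the slash inside the group, the captured value now carries a trailing slash (a/b/ instead of a/b).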

noyearzero




msg:3891407
 2:34 pm on Apr 13, 2009 (gmt 0)

That did the trick, but I still have one problem. requests to real directories after the first directory level fail because it is always rewriting the subdirectories.

my current ruleset
/.htaccess
RewriteRule ^folder1/(([^/]+/)*[^/]+/)?([^./]+\.(.*))$ /_folder1/$3?rewrite_id=$1 [QSA,L]
RewriteRule ^folder2/(([^/]+/)*[^/]+/)?([^./]+\.(.*))$ /_folder2/$3?rewrite_id=$1 [QSA,L]

so if i have the real directories
/_folder1/images/
requests to http://example.com/folder1/images/img.jpg will be rewritten to http://example.com/_folder1/img.jpg?rewrite_id=images/

i always want to rewrite the folder name, but then only rewrite the rest of the request IF it doesn't resolve to a real file. So I guess i will need to do a progressive style rewrite, but i'm not sure how to accomplish this.

(i do realize it could be done by rewriting the directory name from within the root .htaccess file and then within each folder have an .htaccess file that checks whether its a real file and further processes it. However I will have many folders like this and it seems like bad form to duplicate the same ruleset so many times. It also seems like it would create extra requests that are not necessary.)

g1smd




msg:3891527
 5:18 pm on Apr 13, 2009 (gmt 0)

I think you need another level of ( ) somewhere near the question mark, but I am not sufficiently clear on all of the URL formats that you need to match.

The alternative is to have several rules... each matching a specific 'type' of URL.

jdMorgan




msg:3891601
 7:31 pm on Apr 13, 2009 (gmt 0)

Too complicated. One rule should suffice:

RewriteRule ^(folder1¦folder2)/(([^/]+/)*)([^./]+\..+)$ /_$1/$4?rewrite_id=$2 [QSA,L]

Replace the broken pipe "¦" characters with solid pipes before use; Posting on this forum modifies the pipe characters.
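
For anyone who wants to check the captures before touching the server, here is a small sketch of the single rule with solid pipes, using Python's re as a stand-in for the rewrite engine (an assumption). The function name rewrite is only for illustration.

```python
import re

# Solid pipe restored, as the note above says to do before use.
RULE = re.compile(r'^(folder1|folder2)/(([^/]+/)*)([^./]+\..+)$')

def rewrite(path):
    """Apply /_$1/$4?rewrite_id=$2, or return None if the rule does not match."""
    m = RULE.match(path)
    if m is None:
        return None
    return '/_%s/%s?rewrite_id=%s' % (m.group(1), m.group(4), m.group(2))

for path in ('folder1/a/b/detail.php', 'folder2/file.php', 'folder3/file.php'):
    print(path, rewrite(path))
```

Because the directory part is matched with "*", a file directly under folder2 also matches and simply gets an empty rewrite_id.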

--

The problem with "real" directories needs to be described in detail, with examples. It's not at all clear.

If you mean that folder1 and folder2 have physically-existing subdirectories and that those subdirectories should not be affected by this rule, then you could use:

RewriteCond %{DOCUMENT_ROOT}/$1/$2 !-d
RewriteRule ^(folder1¦folder2)/(([^/]+/)*)([^./]+\..+)$ /_$1/$4?rewrite_id=$2 [QSA,L]

However, this depends on the DOCUMENT_ROOT + URL-path part in the RewriteCond being 100% correct, is hard to debug, and results in a disk check for every request to folder1 or folder2 that matches the rule's pattern, to see whether the subdirectory exists.
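
A rough model of that condition plus rule, to make the per-request disk check visible. This is a sketch only: Python's re and os.path.isdir stand in for the rewrite engine and the -d test, the function name rewrite is made up, and a temporary directory plays the part of DOCUMENT_ROOT.

```python
import os
import re
import tempfile

RULE = re.compile(r'^(folder1|folder2)/(([^/]+/)*)([^./]+\..+)$')

def rewrite(docroot, path):
    """Rewrite unless DOCUMENT_ROOT/$1/$2 is an existing directory."""
    m = RULE.match(path)
    if m is None:
        return None
    # Stand-in for: RewriteCond %{DOCUMENT_ROOT}/$1/$2 !-d
    # This is one disk check for every request that matches the pattern.
    if os.path.isdir(os.path.join(docroot, m.group(1), m.group(2))):
        return None
    return '/_%s/%s?rewrite_id=%s' % (m.group(1), m.group(4), m.group(2))

with tempfile.TemporaryDirectory() as docroot:
    os.makedirs(os.path.join(docroot, 'folder1', 'images'))
    print(rewrite(docroot, 'folder1/images/img.jpg'))   # real subdirectory
    print(rewrite(docroot, 'folder1/a/b/detail.php'))   # virtual path
```

The request under the real images subdirectory is left alone (None), and the virtual path is rewritten to the script.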

For this reason, you should consider either moving the 'real' directories and files out of /folder1 and /folder2, or using different 'virtual directories' -separate from folder1 and folder2- to create your 'virtual' URLs. This would prevent having to make the 'directory-exists' disk check for every request.

If you mix 'real' and 'virtual' URL-paths, then you also have the problem that you can never create a virtual subdirectory that has the same name as a real subdirectory, and vice-versa. This leads to errors and to maintenance headaches.

In other words, use a directory structure such as
/folder1-real/<all physically-existing files and folders>
/folder1-virtual/<all paths here and below are rewritten to the script>

-or-

/folder1/real/<all physically-existing files and folders>
/folder1/virtual/<all paths here and below are rewritten to the script>

In this way, the URL carries all the information needed by a RewriteRule to determine whether it should execute.

Jim

noyearzero




msg:3891783
 2:47 am on Apr 14, 2009 (gmt 0)

unfortunately my complicated multiple rules will have to stay since i have scenarios like /(folderA¦folderB¦folder1)/ will be rewritten to /_folder1/ then other items will be rewritten to /_folder2/ etc.

I definitely see the issues involved with virtual/real directories. i can avoid most issues with overlapping names ahead of time and i can't even think of a scenario when this would be a problem. I just have that weird feeling that it could come up in the future so i'd like to have my ruleset account for it ahead of time. so for that reason and pure curiosity i'm curious how to accomplish this. here is my best effort so far.

RewriteRule ^folder1/(.*)$ /_folder1/$1
RewriteRule ^folder2/(.*)$ /_folder2/$1
RewriteCond %{SCRIPT_FILENAME} !-f
RewriteRule ^/([^/]+/)/(([^/]+/)*[^/]+/)?([^./]+\..*)$ /$1/$4?rewrite_id=$2 [QSA,L]

i know whats in the first set of () in the second rewriterule is not correct... i want to grab the name of the first folder. ex. _folder1. and also i don't think %{SCRIPT_FILENAME} is what i'm looking for in the rewritecond. i want to test against the currently written url at the point that it matches the second rewriterule

noyearzero




msg:3892025
 1:31 pm on Apr 14, 2009 (gmt 0)

I found my error with the second rewrite rule... but i still can't get the RewriteCond right. heres where i'm at. Its a twist on your suggestion jd.

RewriteRule ^folder1/(.*)$ /_folder1/$1
RewriteRule ^folder2/(.*)$ /_folder2/$1
RewriteCond /$1/$4 -f
RewriteRule ^/([^/]+)/(([^/]+/)*[^/]+/)?([^./]+\..*)$ /$1/$4?rewrite_id=$2 [QSA,L]

What i want to accomplish is only apply the second rule if what would result is a real file. If i comment out the Cond, it rewrites properly. But as is it doesn't think the Cond is true.

jdMorgan




msg:3892042
 1:48 pm on Apr 14, 2009 (gmt 0)

If you're testing for a file or a directory, then you must include the base server filepath in the RewriteCond. That path is available as %{DOCUMENT_ROOT}.
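
The difference is easy to reproduce outside Apache. In this sketch, os.path.isfile stands in for the -f test (an assumption about equivalence, for illustration only), file_test is a made-up name, and a temporary directory plays the part of the document root.

```python
import os
import tempfile

def file_test(path):
    """Rough equivalent of RewriteCond <path> -f: a plain filesystem check."""
    return os.path.isfile(path)

with tempfile.TemporaryDirectory() as docroot:
    os.makedirs(os.path.join(docroot, '_folder1'))
    open(os.path.join(docroot, '_folder1', 'detail.php'), 'w').close()

    url_path = '/_folder1/detail.php'       # what "/$1/$4" expands to
    print(file_test(url_path))              # looked up from the filesystem root
    print(file_test(docroot + url_path))    # document root prepended
```

The first check fails on any machine that has no /_folder1 at its filesystem root; only the second one finds the file.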

If you're having trouble getting that path right, then temporarily change the rule to a redirect instead of a rewrite, and copy that %{DOCUMENT_ROOT}/$1/$4 path expression into a 'fake' query string variable appended to the RewriteRule substitution. The redirect will result in the new URL being sent to your browser, where you can inspect the composite filepath in the query string.

Jim

noyearzero




msg:3892124
 3:24 pm on Apr 14, 2009 (gmt 0)

I'm using this in a virtual server environment. So my DOCUMENT_ROOT is not the actual root for this account. I set up the VirtualDocumentRoot with v_host_alias. I was never able to grab the value of the VirtualDocumentRoot in PHP, however all my pages were served from the proper directories. I always thought this was kind of strange.

when i plug in an absolute directory for the rewrite conditions it works properly. so can you think of a work around for how to grab the VirtualDocumentRoot?

g1smd




msg:3892374
 7:39 pm on Apr 14, 2009 (gmt 0)

*** I have scenarios like /(folderA¦folderB¦folder1)/ will be rewritten to /_folder1/ ***

You say "rewrite" here?

Does this mean that multiple URLs serve the same content?

jdMorgan




msg:3892385
 8:02 pm on Apr 14, 2009 (gmt 0)

> work around for how to grab the VirtualDocumentRoot?

If the v_host to document-root "map" is small and simple, then you could reproduce it in mod_rewrite by checking requested %{HTTP_HOST} to get the hostname, and base which filepath you check on that. The simplest case would be one rule-set for each domain, with the 'file-exists' path customized for each. Code economy might then be improved after working out the basic function.

If the v_host to document-root map isn't simple and small, then it sounds like it's time to re-architect your server configuration, URLs, and file structure from the ground up based on your current needs; there are some 'tangles' you can get into that are impossible or just too troublesome to fix in any other way.

Jim

noyearzero




msg:3892479
 10:26 pm on Apr 14, 2009 (gmt 0)

g1:
multiple URLs could serve the same content...however for each site this will be applied to, only one will be used. i just don't want to have to change this file for every site. however if i had said 'yes', what would you have said?

jim:
the "map" is simple. it is:
[dir.example.com...] = /dir/example.com
it would be super delicious if i could actually have a function in my server httpd.conf or vhosts.conf that modifies the value of DOCUMENT_ROOT without having to create an entry for each domain. it seems like that defeats the purpose of having a VirtualDocumentRoot.

i guess second best option would be to have a rule-set in the root .htaccess file that modifies the DOCUMENT_ROOT value. but if this is possible, wouldn't it be possible to have it in the httpd.conf and work the same?

if neither of those are options, i would have to use backreferences to the parts of the HTTP_HOST to generate the full document root anytime i wanted to call it?

g1smd




msg:3892497
 10:57 pm on Apr 14, 2009 (gmt 0)

I'd have started banging on about potential Duplicate Content problems.

jdMorgan




msg:3892601
 2:20 am on Apr 15, 2009 (gmt 0)

Here is an example to build the filepath including the subdomain, domain, and TLD as you show above. Note that I 'invented' the "www/public/html" part to show where the common root path would go. Change that to whatever the real path might be.

RewriteCond %{HTTP_HOST} ^([^.]+)\.([^.]+\.[^.:]+)\.?(:[0-9]+)?$
RewriteCond /www/public/html/%1/%2/$1/$4 -f
RewriteRule ^/([^/]+)/(([^/]+/)*[^/]+/)?([^./]+\..*)$ /$1/$4?rewrite_id=$2 [QSA,L]

As for having to re-generate this path everywhere and all the time, you could look into setting an environment variable. E.g. put:

RewriteCond %{HTTP_HOST} ^([^.]+)\.([^.]+\.[^.:]+)\.?(:[0-9]+)?$
RewriteRule ^.? - [E=myDocRoot:/www/public/html/%1/%2]

at the top, ahead of all the other rules so that it always executes. Then you can refer to "myDocRoot" anywhere any other environment variable can be referenced. You could then re-code the previously-discussed RewriteCond as

RewriteCond %{myDocRoot}/$1/$4 -f
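
To check which hostnames the HTTP_HOST pattern accepts and what path it builds, here is a minimal sketch with Python's re standing in for the rewrite engine (an assumption). BASE is the invented "/www/public/html" root from above, and my_doc_root is a hypothetical helper named after the environment variable.

```python
import re

HOST = re.compile(r'^([^.]+)\.([^.]+\.[^.:]+)\.?(:[0-9]+)?$')
BASE = '/www/public/html'   # the invented common root from the post

def my_doc_root(http_host):
    """Return BASE/<subdomain>/<domain.tld>, or None if the host does not fit."""
    m = HOST.match(http_host)
    if m is None:
        return None
    return '%s/%s/%s' % (BASE, m.group(1), m.group(2))

for host in ('dir.example.com', 'dir.example.com:8080',
             'dir.example.com.', 'example.com'):
    print(host, my_doc_root(host))
```

The first three hosts all map to /www/public/html/dir/example.com (a port or a trailing dot is ignored). A bare example.com does not match, so in Apache the condition would fail and the variable would not be set for that request.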

Jim

noyearzero




msg:3893246
 8:58 pm on Apr 15, 2009 (gmt 0)

g1:
I was hoping you were going to delve into the wonders of Aliases. Which I know almost nothing about yet, however it seems like it might apply.

Jim, that works like a charm. I thought i read somewhere that you can put rewrites in the httpd.conf. I tried using the code below but it doesn't have the same result as putting it in the .htaccess. I'm testing this in a windows environment if that makes a difference.

<IfModule rewrite_module>
Options +FollowSymLinks
RewriteEngine on
RewriteCond %{HTTP_HOST} ^([^.]+)\.([^.]+\.[^.:]+)\.?(:[0-9]+)?$
RewriteRule ^.? - [E=myDocRoot:%{DOCUMENT_ROOT}/%1/%2]
</IfModule>

jdMorgan




msg:3893272
 9:32 pm on Apr 15, 2009 (gmt 0)

I can only guess. And that's usually not productive.

If putting that code in httpd.conf didn't have the same result as in .htaccess, then what result *did* it have?

The only thing that differs with respect to this code is that you may not need to declare the Options in this block of code if it's in httpd.conf; You can do if you like, but it will still affect only .htaccess files, and not the present code. (See Options directive in Apache Core documentation.)

You can also dump the <IfModule> container. It's only needed if you want the code to fail silently when mod_rewrite is not installed on a server.

Jim
