help with a mod rewrite design

Need a little help with Mod Rewrite Design

         

j_a_c_k

2:01 am on Jul 28, 2006 (gmt 0)

10+ Year Member



I have come to appreciate search engines' reluctance to index dynamic pages. I have a site that passes a variable query string of zero to six parameters.
All the parameters are numbers greater than 0 and less than 9999.
My current URLs look like this:
[mysite.com...]

Usually I have 1 to 3 parameters, in no particular order. In this redesign I can put them in a fixed sequence if need be, but I am not sure how to handle the missing parameters. I thought of always including all six, using zero for null, since no real value will ever be zero.

I would like to get to a more friendly URL like
[mysite.com...]
or better yet
[mysite.com...]

which I would like to translate back to:
[mysite.com...]

I cannot figure out how to do this without 6*5*4*3*2*1=720 RewriteRule statements.

Is there no way to pass the output of one RewriteRule statement to another RewriteRule?

I tried creating a Perl program to handle this but could not get it to work, probably because the site is on a name-based shared VPS with half a dozen other sites. The Apache documentation says that the program is started when the server starts and handles the rewrites as they come in.

jdMorgan

4:18 am on Jul 28, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The output of one RewriteRule passes to the next by default, unless you use the [L] flag on the rule. Combined with the [QSA] flag (see the mod_rewrite documentation), a stack of six rules would work.

You could also use a series of six RewriteConds testing REQUEST_URI, with each one passing any previously-found variables to the next RewriteCond so as to 'preserve them', followed by a single RewriteRule. This method would parse out the variables in any order (3 variables shown here for example):


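# Each condition captures one tagged value and carries earlier captures forward;
# the .* after the <> separators lets a tag appear anywhere in the path
RewriteCond %{REQUEST_URI} /b([0-9]{1,4})/?
RewriteCond %1<>%{REQUEST_URI} ([^<]*)<>.*/e([0-9]{1,4})/?
RewriteCond %1<>%2<>%{REQUEST_URI} ([^<]*)<>([^<]*)<>.*/d([0-9]{1,4})/?
RewriteRule -([a-z]+\.html)$ /dodah.php?b=%1&e=%2&d=%3&page=$1 [L]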

As you surmise, you'll either need to use a fixed-order 'friendly-url' format, or 'tag' each variable with a unique identifier (the letters 'b' and 'e' in your example URL). If you use a fixed-order, variable-always-present friendly URL, then you can do the whole thing with a single RewriteRule.
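As a rough, untested sketch of that single-rule option: the path layout below is only an assumption for illustration (six numeric slots in fixed a-f order, with 0 standing in for an unused parameter, as you suggested), and it assumes the rule lives in an .htaccess file in the document root and that page names are lowercase letters.

```apache
# Sketch only, untested. Assumes .htaccess in the document root
# (in httpd.conf the pattern would begin with ^/ rather than ^).
# Hypothetical friendly URL: /0/12/0/0/345/0/modelpage.html
# Six slots in fixed order a-f; 0 marks an unused parameter.
RewriteEngine On
RewriteRule ^([0-9]{1,4})/([0-9]{1,4})/([0-9]{1,4})/([0-9]{1,4})/([0-9]{1,4})/([0-9]{1,4})/([a-z]+\.html?)$ /dodah.php?a=$1&b=$2&c=$3&d=$4&e=$5&f=$6&page=$7 [L]
```

The script would then have to treat a value of 0 as "not supplied". Because the substitution carries its own query string and no [QSA] flag, any query string on the incoming request is dropped.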

Jim

j_a_c_k

1:57 pm on Aug 1, 2006 (gmt 0)

10+ Year Member



Thank you for helping me to understand these basics.

After a great deal of discussion here, we have come to a design objective that I am not sure we can meet.

We want the shortest possible pseudo page names; we want to omit unused parameters, not put them in as zeros.

Again there are from zero to six parameters and they are always followed by a page name in the form modelpage.html. (And sometimes modelpage.htm, although we could clean this up and be consistent)

The desired URL would look like this:
[mysite.com...]

This is a great objective. I cannot figure out how to build the output:
[mysite.com...]

I am going insane trying to build these rules and conditions when I may or may not have parameters for a, b, c, d, e, f and, ideally, we would like them to appear in any sequence, except that the page=modelpage.htm will always be last.

In your example above, as soon as a parameter is omitted the rule is aborted. I am back to my original problem of hundreds of permutations.

jdMorgan

2:52 am on Aug 2, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



OK, one last try, despite our charter [webmasterworld.com].

The following code should work, given that each URL parameter is 'tagged' with a letter (a-f), followed by one to four digits, and then a slash. It will be ungodly-inefficient and slow, but if your team insists on a variable-order, optional-parameter-presence URL, that's what you get, because it makes parsing the URL quite difficult.


# First rule clears any query string on modelpage request
# (needed to prevent possible query string injection exploits)
RewriteRule ^(.*/modelpage\.html?)$ $1?
# Next, check for each of six parameter tags (a-f) and add to query string if present
RewriteRule ^(.*a([0-9]{1,4})/.*modelpage\.html?)$ $1?a=$2 [QSA]
RewriteRule ^(.*b([0-9]{1,4})/.*modelpage\.html?)$ $1?b=$2 [QSA]
RewriteRule ^(.*c([0-9]{1,4})/.*modelpage\.html?)$ $1?c=$2 [QSA]
RewriteRule ^(.*d([0-9]{1,4})/.*modelpage\.html?)$ $1?d=$2 [QSA]
RewriteRule ^(.*e([0-9]{1,4})/.*modelpage\.html?)$ $1?e=$2 [QSA]
RewriteRule ^(.*f([0-9]{1,4})/.*modelpage\.html?)$ $1?f=$2 [QSA]
# Now rewrite modelpage.html to dodah.php, leaving the accumulated query string intact
RewriteRule /modelpage\.html?$ /dodah.php [L]

This is untested. See also this thread [webmasterworld.com] for a serious warning which explicitly addresses this kind of rewrite.

Jim

jdMorgan

2:55 pm on Aug 2, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I just noticed an omission in the final rule above. It should read:

RewriteRule /modelpage\.html?$ /dodah.php?page=modelpage.html [QSA,L]

or perhaps

RewriteRule /modelpage\.(html?)$ /dodah.php?page=modelpage.$1 [QSA,L]

if you wish to carry the htm/html extension option into the script's "page=" value.
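To make the accumulation concrete, here is roughly what I would expect for one request as it passes down the rule stack with the corrected final rule. The path and values are made up for illustration, and the trace assumes the rules sit in server config context, so the matched path keeps its leading slash.

```apache
# Hypothetical request (path and values invented for illustration):
#   /b12/e345/modelpage.html?x=1
#
# Clear-query rule:  /b12/e345/modelpage.html            (x=1 discarded)
# Rule a:            no match, nothing added
# Rule b:            /b12/e345/modelpage.html?b=12
# Rules c, d:        no match
# Rule e:            /b12/e345/modelpage.html?e=345&b=12
#                    (QSA puts the new pair first, then the old string)
# Rule f:            no match
# Final rule:        /dodah.php?page=modelpage.html&e=345&b=12
```

The parameters reach the script in reverse order of the rules that added them, which should not matter as long as the script reads them by name.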

Jim