redirect problem - unable to get the correct variables

I wrote a redirect that got a specific variable however wh

         

nigelt74

8:17 am on Sep 13, 2007 (gmt 0)

10+ Year Member



Hi all

I get a URL like this

roast/hedgehog/224-4010.html

and I rewrite it to this

roast/hedgehog/224.html

using this rule
RewriteRule ^roast/hedgehog/([^/\.]+)-([^/\.]+)\.html$ /roast/hedgehog/$1.html

But now the boss wants to see it work like this

roast/hedgehog/224-4010.html

and I rewrite it to this

roast/hedgehog/4010.html

Now if I change the above rule to

RewriteRule ^roast/hedgehog/([^/\.]+)-([^/\.]+)\.html$ /roast/hedgehog/$2.html

it works OK until I get a URL like this

roast/hedgehog/224-4010-123.html

It then produces

roast/hedgehog/123.html

but the boss wants it to produce

roast/hedgehog/4010-123.html

and that is where I am stumped. I know why it does it, but I can't figure out a way around it, and there are URLs that have many more hyphens in them. I basically just need to lose the first one.

Cheers

jdMorgan

1:56 pm on Sep 13, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



That's a difficult problem if the rule is, "For number sequences of two parts, keep only the final part, and for number sequences of more than two parts, keep only the final two parts" -- Expressing the problem concisely is 90% of the work.

I played with it for a while, and the simplest solution may be to use two rules:


RewriteRule ^boiled/badger/[^-/]+-([^./]+)\.html$ /boiled/badger/$1.html
RewriteRule ^boiled/badger/([^-/]+-)+([^-/]+-[^./]+)\.html$ /boiled/badger/$2.html
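For anyone who wants to see what each pattern captures before putting it on a live server, here is a minimal sketch in Python. It assumes Python's `re` module treats these particular patterns the same way Apache's regex engine does, and it only exercises the regular expressions, not mod_rewrite's rule ordering or per-directory handling. The names `TWO_PART`, `MULTI_PART` and `captured` are made up for the illustration.

```python
import re

# Patterns copied from the two rules above; only the regex part is exercised.
TWO_PART = re.compile(r"^boiled/badger/[^-/]+-([^./]+)\.html$")
MULTI_PART = re.compile(r"^boiled/badger/([^-/]+-)+([^-/]+-[^./]+)\.html$")


def captured(pattern, path, group):
    """Return the text of the given capture group, or None if there is no match."""
    m = pattern.match(path)
    return m.group(group) if m else None


for path in (
    "boiled/badger/224-4010.html",
    "boiled/badger/224-4010-123.html",
    "boiled/badger/224-4010-123-678-975.html",
):
    # Show what $1 of the first rule and $2 of the second rule would hold.
    print(path, captured(TWO_PART, path, 1), captured(MULTI_PART, path, 2))
```

Checked in isolation like this, the second pattern keeps the final two parts, and the first pattern's `[^./]+` also accepts hyphens, so it matches names with more than two parts as well.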

I'll have the fried budgie then, if you please. :)

Jim

nigelt74

11:05 pm on Sep 13, 2007 (gmt 0)

10+ Year Member



Hi Jim

Thanks for that. I have it working now after stretching out the code you gave me.

I guess I didn't express it concisely enough in my explanation (that's definitely 90% of my problem); writing it as a rule makes it a bit more logical.


"For number sequences of two parts, keep only the final part, and for number sequences of more than two parts, keep only the final two parts"

The rule should be
"for number sequences of 2 or more parts discard only the first part"

eg

roast/hedgehog/224-4010-123.html
should go to
roast/hedgehog/4010-123.html
and
roast/hedgehog/224-4010-123-678-975.html
should go to
roast/hedgehog/4010-123-678-975.html

Basically, the first group of digits (in this case the 224) needs to be discarded.
There can be a varying number of groups of digits within a file name.

Using the below rules it now works (I have just expanded on what you gave me)


RewriteRule ^boiled/badger/[^-/]+-([^./]+)\.html$ /boiled/badger/$1.html
RewriteRule ^boiled/badger/([^-/]+-)+([^-/]+-[^./]+)\.html$ /boiled/badger/$2.html
RewriteRule ^boiled/badger/([^-/]+-)+([^-/]+-[^./]+-[^./]+)\.html$ /boiled/badger/$2.html
RewriteRule ^boiled/badger/([^-/]+-)+([^-/]+-[^./]+-[^./]+-[^./]+)\.html$ /boiled/badger/$2.html
RewriteRule ^boiled/badger/([^-/]+-)+([^-/]+-[^./]+-[^./]+-[^./]+-[^./]+)\.html$ /boiled/badger/$2.html
RewriteRule ^boiled/badger/([^-/]+-)+([^-/]+-[^./]+-[^./]+-[^./]+-[^./]+-[^./]+)\.html$ /boiled/badger/$2.html
RewriteRule ^boiled/badger/([^-/]+-)+([^-/]+-[^./]+-[^./]+-[^./]+-[^./]+-[^./]+-[^./]+)\.html$ /boiled/badger/$2.html

Nigel

jdMorgan

2:25 am on Sep 14, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Oh, we're on about the badgers again, are we? :)

(Rarely are URLs "fun" -- Honestly, I could have answered your first post much faster had I not been laughing so hard.)

Note that in the patterns, "[^-/]+" should be used preceding an expected hyphen, and "[^./]+" should be used preceding an expected period. The rules will run quite a bit faster with the correct subpatterns, as they allow parsing the requested URL-path in a single left-to-right pass, rather than repeated "fit and try" attempts.

So, for example, that makes your last rule:


RewriteRule ^herbed/hedgehog/([^-/]+-)+([^-/]+-[^-/]+-[^-/]+-[^-/]+-[^-/]+-[^-/]+-[^./]+)\.html$ /herbed/hedgehog/$2.html

The usual description of a subpattern like "[^./]+" is to say, "Match one or more characters not equal to a period or a slash." But in this case, it's equally accurate to simply say, "Match one or more characters up to, but not including, the next period or slash." So a negative-match pattern like that allows a quick left-to-right scan of the entire URL-path, and is much more efficient than, for example, a pattern of "(.*)-(.*)" where the whole string is initially matched into the first subpattern, fails to match, and is then parceled out one character at a time into the second subpattern until a match is finally found or all possibilities are exhausted, requiring many trials to parse a long URL-path.
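The difference is easy to see by running both styles of pattern against the same file name. This is a rough sketch in Python rather than Apache, on the assumption that the two engines behave alike for patterns this simple. It shows where each pattern ends up splitting the string, not how long it takes.

```python
import re

name = "224-4010-123"

# Greedy ".*" first swallows the whole string, then backs off one character
# at a time until a hyphen fits, so it ends up splitting at the LAST hyphen.
greedy = re.match(r"^(.*)-(.*)$", name)

# The negated class cannot cross a hyphen, so it stops at the FIRST hyphen
# in a single left-to-right pass.
negated = re.match(r"^([^-/]+)-([^./]+)$", name)

print(greedy.groups())   # ('224-4010', '123')
print(negated.groups())  # ('224', '4010-123')
```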

And actually, if the desired behaviour is "for number sequences of 2 or more parts discard only the first part"
then only two rules are needed:


RewriteRule ^herbed/hedgehog/[^-/]+-([^./]+)\.html$ /herbed/hedgehog/$1.html
RewriteRule ^herbed/hedgehog/[^-/]+-(([^-/]+-)+[^./]+)\.html$ /herbed/hedgehog/$1.html

Note the added layer of parentheses with a "+" quantifier on the center subpattern, allowing one or more repeats of "one or more characters not a hyphen or slash, followed by a hyphen."
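As a quick sanity check of the nested-parentheses pattern, here is a small Python sketch. It assumes Python's `re` matches this pattern the same way Apache does, and it tests the second rule's pattern on its own, outside mod_rewrite. `NESTED` and `rewrite` are hypothetical names used only for this example.

```python
import re

# Pattern from the second rule: skip the first part, then capture one or more
# "part-" repeats followed by the final part.
NESTED = re.compile(r"^herbed/hedgehog/[^-/]+-(([^-/]+-)+[^./]+)\.html$")


def rewrite(path):
    """Return the substituted path, or None if the pattern does not match."""
    m = NESTED.match(path)
    if m is None:
        return None
    # group(1) is the outer parentheses, i.e. what $1 refers to in the rule.
    return "/herbed/hedgehog/" + m.group(1) + ".html"


print(rewrite("herbed/hedgehog/224-4010-123.html"))
print(rewrite("herbed/hedgehog/224-4010-123-678-975.html"))
print(rewrite("herbed/hedgehog/224-4010.html"))
```

The two-part name does not match this pattern, because the repeated group needs at least one more hyphen after the first part. That case is left to the first rule.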

Jim