Subdomain Redirect problem in php suexec environment

Redirect looping for one but not another in exact same setup.

         

twohawks

8:10 pm on Oct 25, 2008 (gmt 0)

10+ Year Member



Hi All,
I am having a problem with subdomain redirect looping and/or __?
I am not proficient with rewrite code or methods, but I have assembled this code over time and it has worked quite well for me up to now. However, it is apparently failing in one single instance, and in that one instance only.
What's baffling is that it's failing right next to an identical setup that exists in (basically) the same place.
After all manner of troubleshooting I feel I must conclude that some configuration is sneaking into my server .conf file(s), and that I must address something there, but it eludes me.

Following is relevant information and some questions. Maybe someone can spot what I am failing to see or address.

Thanks for checking in....

I am on shared hosting (as a reseller): Apache running on Unix (maybe Linux), recently migrated to PHP suexec, and we use WHM and cPanel.
I don't care to rely on cPanel so much, but I believe that in order to write a vhost configuration (or config changes) into the server file, if that's what's desired, one possibly must use it. I am uncertain whether cPanel checks the DNS record for changes and subsequently auto-generates vhost entries, but it may do so (anyone?). The reason I bring this up is that I hear cPanel rewrites the conf file(s) every day (aside from rewriting on the fly as changes are made manually from within cPanel), so I figure it must be checking on things periodically and making adjustments accordingly. I know it sees redirects that I write and adds some of them to its redirects listing, but I believe this has nothing to do with the conf files (or does it?).

My procedure typically is...
I set up the subdomain in the DNS record from within WHM.
I set up the rewrites in .htaccess in the root directory of the hosted site.
Voilà - all is good.

In this case I am uncertain as yet if cPanel scans and realizes the subdomain at some point and then adds it to its own list of subdomains, and/or does anything further with that info/record.

OR

I set up a subdomain from within cPanel.
I set up the rewrites in .htaccess in the root directory.
I check the DNS records for the subdomain entries injected by cPanel.
Voilà - all is good.

(If cPanel generates any rewrites in htaccess files anywhere I remove them, preferring to rely on my own.)
-------------
It didn't seem to matter which way I approach it, that is, for all but the one subdomain configuration that won't work no matter what I do.

Also note that, I took one that worked and the one that doesn't, emptied the directories of all but a single html document (hello world type of thing), proceeded to transpose the relevant directory names (via renaming them) to test if it was something to do with whatever may be installed in the uncooperative directory group, but it made no difference. Only the one (referenced in section 3 below) ever fails... with either looping error from the server or a 404-file not found.

Some settings to note are...
Options +FollowSymLinks -MultiViews (tested variations on this)
RewriteEngine On
#RewriteBase / (tested on/off, ..don't need it on)
UseCanonicalName Off ...more on this below...

I am aware of clearing cache and cookies and dns cache (meticulously) while testing, and also any relevant server caches created by installed applications.

-------------------------------
Regarding UseCanonicalName directive...
I noticed in the requested copies of the vhost settings (from my host) that UseCanonicalName Off appeared in the working items, but not in the failing one, which I found strange. When I asked about this, the host told me that it is off by default anyway and he didn't know why it would even be appearing in there for any of the subdomain entries. I asked them to place it in there for my 'broken' one anyway, just to humor me while I am testing, because it really seemed to be my last option; there seem to be no more potential issues, at least none I can think of.
Anyway, it didn't make any difference, and now I am at a loss as to what I can possibly do next.

Per this last item... I think UseCanonicalName can only be set in a vhost in, say, httpd.conf or *.conf, but I am interested in whether I can set this directive in php.ini. Since we are using PHP suexec I think I should be able to do this; however, I am uncertain whether the server needs a restart or how exactly this would work. And I don't know if I would have to write out the whole relevant vhost configuration for that subdomain, or just include that setting (but I think it's the former).

------------------------
Personally, I suspect the UseCanonicalName directive may be the culprit and needs to be nailed down better somehow, but I need better server access to review the changes taking place in the *.conf file(s), and I am uncertain how to obtain this. As it is now I have to submit a ticket and wait an uncertain number of hours to obtain a copy, and by then too many other things could be happening (sigh).

Other questions are,
What might cause one to loop, and not the other?
And what would cause one to perchance 404 (file not found) error, when the other never does?

Following is the relevant rewrite code being used (everything is based on the section 2 code). Keep in mind I use some other code for handling www or blank prefixes, which I am not showing here.
I hope I have provided thorough enough information.
Jeez... any help would be greatly appreciated.
Cheers,
TwoHawks

##################################################
# Section 2: Testdirect Subdomain and /test_nbhforum folder
##################################################
#### REDIRECT SUBDOMAIN CALL TO A FOLDER AND DISPLAY AS SUBDOMAIN####
#### Calls to TESTDIRECT.example.COM = contents /TESTFORUM/ ####
RewriteCond %{REQUEST_URI} !^/test_nbhforum/.*
RewriteCond %{HTTP_HOST} testdirect\.example\.com [NC]
RewriteRule (.*) /test_nbhforum/$1 [L]

#### REDIRECT CLIENT SUBDIRECTORY REQUESTS TO SUBDOMAIN (VIA ROOT HTACCESS)####
#### Non-server Calls to /TESTDIRECT = TESTDIRECT.DOM.COM
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /testdirect/
RewriteRule ^(.*)$ "http\:\/\/www\.example\.com\/test_nbhforum" [R=301,L]

#### Non-server Calls to /TEST_NBHFORUM = TESTDIRECT.DOM.COM (as defined in #3 above)
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /test_nbhforum/
RewriteRule ^test_nbhforum/(.*) http://testdirect.example.com/$1 [R=301,L]

################################################################
##### Section 3: NBHealers Live Section #####
################################################################
#### REDIRECT SUBDOMAIN CALL TO A FOLDER AND DISPLAY AS SUBDOMAIN####
#### Calls to NBHEALERS.example.COM = contents /NBHFORUM####
RewriteCond %{REQUEST_URI} !^/nbhforum/.*
RewriteCond %{HTTP_HOST} nbhealers\.example\.com [NC]
RewriteRule (.*) /nbhforum/$1 [L]

#### REDIRECT CLIENT SUBDIRECTORY REQUESTS TO SUBDOMAIN (VIA ROOT HTACCESS)####
#### Non-server Calls to /NBHEALERS = NBHEALERS.DOM.COM (as defined in #3 above)
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /nbhealers/
RewriteRule ^(.*)$ "http\:\/\/www\.example\.com\/nbhforum" [R=301,L]

#### Non-server Calls to /NBHFORUM = NBHEALERS.DOM.COM (as defined in #3 above)
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /nbhforum/
RewriteRule ^nbhforum/(.*) http://nbhealers.example.com/$1 [R=301,L]

[edited by: jdMorgan at 7:29 pm (utc) on Nov. 1, 2008]
[edit reason] Please use example.com [/edit]

twohawks

10:17 am on Oct 29, 2008 (gmt 0)

10+ Year Member



WHOA... Don't answer post #:3775582 yet... I just figured something else out... I thought one had to use some sort of placeholder in order to work the match in a rule because I see that so often being done, but I just discovered it doesn't work that way - jeez, this could have saved me some time, but I didn't understand that from the manual or other things I've read. Obviously I have not understood the purpose of when people are using a placeholder...

So I changed my rule and now all six examples work (wow), but I really would like some feedback on both approaches 'cause some of what I did is based on approaches I have seen used here, and also I don't know what else may be askew in my understanding - for all I know I am approaching this in some unwittingly insecure manner(?).

Here's the rewritten Rule and understanding...
Old rule:
RewriteRule ^test_nbhforum(/?$|.*/$|.*/(.*)) http://testdirect.example.com/$2 [NC,R=301,L]

New rule:
RewriteRule (/?$|.*/(.*)) http://testdirect.example.com/$2 [NC,R=301,L]

Logic:
If the path in the request
1) ends in one or no slash, OR
2) is followed by any string of characters followed by a slash "and any string of characters thereafter"
then redirect to url http://testdirect.example.com, include a trailing slash, and include the quoted portion in #2 there if it exists.

Thanks for looking in.
TwoHawks

[edited by: jdMorgan at 3:05 pm (utc) on Oct. 30, 2008]
[edit reason] example.com [/edit]

g1smd

11:06 am on Oct 29, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



OK on not following how things function... I spent a year or two in that phase. In fact one core revelation had escaped me until only the last few months: processing order is "left side of Rule", any "conditions" above the rule (RewriteCond), "right side of Rule".
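
To make that order concrete, here is a minimal annotated sketch. The directory names "olddir" and "newdir" are hypothetical placeholders (they are not from this thread), and the rules are assumed to sit in the root .htaccess.

```apache
RewriteEngine On

# Step 2: the condition is tested only after the rule pattern below has matched.
RewriteCond %{HTTP_HOST} ^www\.example\.com$ [NC]
# Step 1: the pattern ^olddir/(.*)$ is tested first against the URL-path.
# Step 3: the substitution /newdir/$1 is applied only if steps 1 and 2 both succeed.
RewriteRule ^olddir/(.*)$ /newdir/$1 [L]
```

One consequence of this order is that $1 from the rule pattern is already available for use in a RewriteCond test string.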

As for the last piece of code above; while that could work, remember that a pattern like .*/.* says something like this: "match everything in the whole string, followed by a slash, followed by matching the entire string". It is inefficient as it has to try multiple tests to finally get a match.

The parser is having this sort of conversation with itself...

OK. Start off by matching "everything".
Right. Got it.
Am I done now?
Err... "...followed by a slash?"
Umm. What slash?
I just grabbed "everything" like what you said.
Eh?
OK. Back up one. Can you see a slash?
No.
OK. Back up one. Can you see a slash?
No.
OK. Back up one. Can you see a slash?
No.
OK. Back up one. Can you see a slash?
Yeah.
Great. Grab it.
Are we done now?
No. It says "followed by everything" again.
Err, but we had "everything" before the slash.
Have we got the right slash?
Dunno.
Backup one.
OK
What do you see?

... and so on.

Since expressions are evaluated "from the left" it is much better to code this like "get everything that isn't a slash (and there must be one or more characters present), followed by a slash" ... something like.

([^/]+)/

and then get it to deal with the next part of the requested URL, without having to back up at any point.
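
A rough sketch of the two forms side by side, assuming the rules live in the root .htaccess and the main site answers on www.example.com. The redirect target is for illustration only. Note that the two patterns do not split the path at the same place, so they are not drop-in replacements for each other.

```apache
RewriteEngine On

# Only act on the main host, so the redirect target is not matched again.
RewriteCond %{HTTP_HOST} ^www\.example\.com$ [NC]
# "[^/]+" reads up to the first slash and stops there, with no backing up.
# For the path "forum/a/b": $1 = "forum", $2 = "a/b".
RewriteRule ^([^/]+)/(.*)$ http://$1.example.com/$2 [R=301,L]

# Greedy alternative, shown for comparison only (left commented out):
# ".*" first swallows "forum/a/b" whole, then backs up to the last slash,
# so $1 = "forum/a" and $2 = "b".
# RewriteRule ^(.*)/(.*)$ http://$1.example.com/$2 [R=301,L]
```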

jdMorgan

1:43 pm on Oct 29, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> Logic:
> If the path in the request
> 1) ends in one or no slash, OR
> 2) is followed by any string of characters followed by a slash "and any string of characters thereafter"
> then redirect to url http://testdirect.example.com, include a trailing slash, and include the quoted portion in #2 there if it exists.

The problem with this is that #1 is always true, since any requested URL-path "will end with a slash or no slash".
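
As an illustration of why the first alternative always succeeds, here is the new rule annotated with a few sample paths. The paths are made up for the trace, the per-directory (.htaccess) form without a leading slash is assumed, and the RewriteCond that accompanies the rule in the thread is left out.

```apache
# Path seen by the rule        -> what matches, and what $2 holds
#   "test_nbhforum"            -> no slash, but "/?$" matches at the very end; $2 = ""
#   "test_nbhforum/"           -> ".*/(.*)" matches; $2 = ""
#   "test_nbhforum/a.html"     -> ".*/(.*)" matches; $2 = "a.html"
#   "test_nbhforum/a/b.html"   -> greedy ".*" runs to the last slash; $2 = "b.html"
#   "anything-else"            -> "/?$" matches at the very end again; $2 = ""
RewriteRule (/?$|.*/(.*)) http://testdirect.example.com/$2 [NC,R=301,L]
```

Since the pattern is unanchored and "/?$" can match the empty end of any string, no path can ever fail it; only the accompanying condition limits what gets redirected.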

I don't see any way to resolve the following two cases, because "anything here" is too ambiguous:

3.http://www.example.com/test_nbhforum_anything_here... = error 404 file not found
4.http://www.example.com/test_nbhforum_anything_here.../ = http://testdirect.example.com/

The only cases where that would not be ambiguous is if the "test_nbhforum" is a literal (and the code is placed in .htaccess *above* that directory, so that the rule can check for the literal "test_nbhforum" in the requested path), or if the leading underscore of "_anything here" is a literal underscore, in which case the rule could check for it.

Often, the problem is not one of coding, but of defining the problem accurately so that a solution can be coded. Here, it appears that the URL-structure is an impediment to a solution. It is almost always a mistake to begin coding before the problem is thoroughly and concisely defined.

Jim

[edited by: jdMorgan at 2:27 pm (utc) on Oct. 30, 2008]

twohawks

5:31 pm on Oct 29, 2008 (gmt 0)

10+ Year Member



@g1smd...

> OK on not following how things function... I spent a year or two in that phase. In fact one core revelation had escaped me until only the last few months: processing order is "left side of Rule", any "conditions" above the rule (RewriteCond), "right side of Rule".

Wow... you know I just read somewhere in one of those rewrite manual/tutorials the author was reciting the logic and I blew it off as 'I don't get it, must just be me'... and it was because it resembled what you just said. I had not yet seen this mentioned otherwise. I appreciate your mention of it.
Thanks for the encouragements ;^)

> As for the last piece of code above; while that could work, remember that a pattern like .*/.* says something like this: "match everything in the whole string, followed by a slash, followed by matching the entire string". It is inefficient as it has to try multiple tests to finally get a match. <...snip-example...>

I arrived at that because I had read that the "!" NOT operator is only allowed at the beginning of a condition or rule pattern. Thanks for pointing out the recursive parsing - ick!
See, I 'knew' I wasn't being brilliant, although I had hoped so. Oh, and it gets worse... JD has more butt-kicking awaiting my attention next (I'm wearing my thick diapers today [how embarrassing])
8^P

> Since expressions are evaluated "from the left" it is much better to code this like "get everything that isn't a slash (and there must be one or more characters present), followed by a slash" ... something like.
> ([^/]+)/
> and then get it to deal with the next part of the requested URL, without having to back up at any point.

And so here you are with "not" in the middle - so it is allowed after all! I wonder what I would have come up with had I sorted that understanding out (trying ^ instead of ! just escaped me), but as I said, I thought I had to ditch that logical pursuit.

Thank you, thank you!

jdMorgan

6:05 pm on Oct 29, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I *really* think you would benefit from reading the mod_rewrite documentation --top to bottom, at least once-- and the regular-expressions tutorial linked from our Apache Forum Charter [webmasterworld.com], rather than having to pick it all up piecemeal from 'hints' in forum posts.

The "!" NOT operator 'belongs to' mod_rewrite. The "^" NOT operator --valid only inside [alternate character groups] in regular expressions-- 'belongs to' the regex parser. Therefore, you can use "!" only at the beginning of a pattern in mod_rewrite, while you can use negative-match [alternate character groups] as much as you like within patterns.

If you use a parenthesized construct such as "!(<some-pattern>)", be aware that you cannot back-reference the matching contents of that pattern, because in the case of the RewriteCond being TRUE, the pattern itself *did not* match, and the "!" operator then inverted the RewriteCond match state from FALSE to TRUE. So in this case, the "matched contents" of the parenthesized sub-expression don't exist, and the back-reference value will be blank.

Also be aware that, as the Apache mod_rewrite documentation states, all RewriteCond back-references are taken from the last-matched RewriteCond only, so whatever you want to back-reference from your RewriteConds must be matched in the final RewriteCond that will match (or you must "carry them forward" from sequential RewriteCond to RewriteCond, which is an advanced topic that we don't need in your application).
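
A small sketch of the two kinds of "not", and of where %1 comes from. It assumes the rules sit in the root .htaccess; the "from" query parameter is a hypothetical name used only to make the %1 value visible.

```apache
RewriteEngine On

# "!" is mod_rewrite's NOT: valid only as the first character of the pattern.
# A negated condition captures nothing, so it can never supply %1.
RewriteCond %{HTTP_HOST} !^testdirect\.example\.com$ [NC]

# "[^.]" is the regex parser's NOT: a negated character class, valid anywhere
# in the pattern. This is the last condition matched, so %1 is read from here.
RewriteCond %{HTTP_HOST} ^([^.]+)\.example\.com$ [NC]

# %1 = the first label of the requested host (for example "www").
RewriteRule ^testdirect/(.*)$ http://testdirect.example.com/$1?from=%1 [R=301,L]
```

A request for the bare example.com host fails the second condition as written, so this sketch would not redirect it.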

Jim

twohawks

6:33 pm on Oct 29, 2008 (gmt 0)

10+ Year Member



@JDM

> Logic:
> If the path in the request
> 1) ends in one or no slash, OR
> 2) is followed by any string of characters followed by a slash "and any string of characters thereafter"
> then redirect to url http://testdirect.example.com, include a trailing slash, and include the quoted portion in #2 there if it exists.
>
> The problem with this is that #1 is always true, since any requested URL-path "will end with a slash or no slash".

Hmmm... I think the code is correct and what you are pointing out is my failing to express it at all accurately, yes?

What it does (.*/$ part of the rule) is it looks at the beginning of 'the-path-to-file' portion of what's returned from the condition and follows/reads on up, first looking for a match to the return value being checked (which is /test_nbhforum), then it keeps going matching anything up to either the first slash it finds, or if there is no subsequent slash it will match everything to the end.
. . . The situation of no subsequent slash will only occur, of course, if the user enters that path in the url, possibly with a typo on the end (or whatever) and no trailing slash. Every other situation will necessarily include more slashes in a meaningful url here, so we're covered.

Of course, that's not intended as the 'proper' logical explanation really, but trying to describe what I feel I observe happening, and it is the intended result.
I.e., this actually appears to handle examples 3 + 4 (discussed next) without interfering with the other examples.
. The examples are possible representations of the intended situations I wish to have addressed, with the results indicated,
excepting that 3 and 4 now return the same as #4 without error by using this in the rule....
.*/$ (the one being addressed here)

Is the correct, or better expressed, logic for the rule then,
/?$ : match if it begins with the returned value ending in one or no slash, which addresses /test_nbhforum or /test_nbhforum/ when occurring directly after domain.com,
.*/$ | : match if it begins with the returned value followed by any characters, stopping at the first encountered slash, which addresses 3+4 as discussed below
...and then I need to work on the third one a la g1smd's earlier response. So the question is: is this the correct, or better expressed, logic for this portion of the rule?


> I don't see any way to resolve the following two cases, because "anything here" is too ambiguous:
>
> 3.http://www.example.com/test_nbhforum_anything_here... = error 404 file not found
> 4.http://www.example.com/test_nbhforum_anything_here.../ = http://testdirect.example.com/
>
> The only cases where that would not be ambiguous is if the "test_nbhforum" is a literal (and the code is placed in .htaccess *above* that directory, so that the rule can check for the literal "test_nbhforum" in the requested path), or if the leading underscore of "_anything here" is a literal underscore, in which case the rule could check for it.

What those examples mean to represent might be culled from...
3.http://www.example.com/test_nbhforum.* = http://testdirect.example.com/ (this is updated from 404 error with the newer code issued)
4.http://www.example.com/test_nbhforum.*/ = http://testdirect.example.com/
...where .* means your typical 'followed by 0 or more of any character', hence I had used 'anything here' - meant to be ambiguous.

Before continuing, as mentioned somewhere else, I keep all rules in htaccess in the root, so this is assumed at this point.

The above examples 3 and 4 are intended to describe possible matches a la my responses just above this one, i.e, while using .*/$ in the rule as discussed there.
. I am uncertain whether the ambiguity you mention is simply with reference to the communication (guilty as charged!), or whether you mean it really shouldn't work, but I would guess it's the former, since you folks seem to be focused on trying to help me think more carefully and the sample also works without conflict. (This is my best response to this one.)

> Often, the problem is not one of coding, but of defining the problem accurately so that a solution can be coded. Here, it appears that the URL-structure is an impediment to a solution. It is almost always a mistake to begin coding before the problem is thoroughly and concisely defined.

. Hmmm, I thought I defined the problem fairly well earlier. I recall no 'consternation' after my last revamping of it, but I will go back and review that.
. The examples, of course, were meant to indicate reflections of how the rule might be affected in those cases, such acting as representations of the most prominent potential problems I wish to address.

Wow, when I read your response I thought, 'jeez, he's kicking my butt'. Just what I came for... I am grateful for your patience and effort. Thanks, JD.

Cheers for now,
TwoHawks

[edited by: jdMorgan at 3:06 pm (utc) on Oct. 30, 2008]
[edit reason] example.com [/edit]

g1smd

6:48 pm on Oct 29, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



*** What it does (.*/$ part of the rule) is ... ***

... grabs the whole input value and says "got it all. finished".

There's no "matching to first slash"; the .* says "get the lot".

I am thinking that none of the alternatives, separated by pipes, get a look in, as this regex satisfies all inputs.

twohawks

8:34 am on Oct 30, 2008 (gmt 0)

10+ Year Member



@g1smd <Sigh> Man, that's rough, but alas, correct. Figured that out before coming back here and seeing your post! Just couldn't put my finger on it, but I guess that's regexp 101 isn't it...
. . Where I tripped up on that was thinking .* in a rule following a condition meant match the condition, not match if condition true - which then means, as you point out, grab everything.

Okay, sorted something out, I think this one is accurate for a change.... But...

. . before I go on I just want to say, @JD, honestly, I did perform a readthru on those references, and a few others as well, and more than once, before launching into this particular post. I am not saying I studied them well, but I did look thru, and I am referencing these materials judiciously as I wade thru this because I really want to understand it rather than just copy stuff I don't understand. Maybe I am not the brightest crayon in the box - only little bits have stuck at a time, but I have been doing my best to respect the time of those whom I might be asking for help by rtm's before presuming upon folks here.

Also, thank you for laying down those critical tips, as well as for checking in and asking me to actually read the basic pertinent documentation - I agree that's very important to mention, and that is a substantive post -for me or anyone who may happen upon this thread.

-----------------------
Now, here's where I have gotten to on my rule for Section 2. This successfully handles all 6 examples (presented in post#:3775582).

RewriteCond %{THE_REQUEST} ^[A-Z]+\ /(testdirect) [NC]
RewriteRule ([^/]+/?)(.*) http://%1.example.com/$2 [NC,R=301,L]

# If the condition matches:
# ([^/]+/?) starts grabbing everything in 'the return' that is not a slash character, then takes 0 or 1 occurrences of a slash.
# This effectively matches "/testdirect" plus anything possibly added on to it, until either a slash appears or it's the end of the line, whichever comes first.
# (.*) is empty unless the prior group found a slash, in which case this (and only then) grabs everything after that slash.
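
A worked trace of that rule pair may help. It assumes the rules sit in the root .htaccess (so the rule pattern sees the path without its leading slash) and that the client sent the request line "GET /testdirect/xyz HTTP/1.1".

```apache
RewriteEngine On

# THE_REQUEST is "GET /testdirect/xyz HTTP/1.1": the condition matches
# and %1 = "testdirect".
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /(testdirect) [NC]
# The rule pattern is tested against "testdirect/xyz":
#   ([^/]+/?) -> $1 = "testdirect/"
#   (.*)      -> $2 = "xyz"
# Result: a 301 redirect to http://testdirect.example.com/xyz
RewriteRule ([^/]+/?)(.*) http://%1.example.com/$2 [NC,R=301,L]
```

Because the condition pattern has nothing after "(testdirect)", a request for "/testdirect_typo" also passes it; in that case $1 = "testdirect_typo", $2 is empty, and the redirect goes to the subdomain root.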

As simple as it is, you really got my attention with that [^/]+, g1smd. Did a lot of little tests with that - couldn't figure a substitution, but it helped lead me to a better understanding and new resolution. I had a serious problem with a bad rule string that wouldn't fail testing and so led me to believe things that are just plain wrong. Hope I'm moving past that now.

I think this is spot on, if not close, professors, ...eh?

[edited by: jdMorgan at 2:22 pm (utc) on Oct. 30, 2008]
[edit reason] Please use example.com [/edit]

jdMorgan

2:21 pm on Oct 30, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Look into start-anchoring your RewriteRule pattern, and including "testdirect" in it as well.

I'm kind of lost as to what your intent is (it's hard to keep up with many and long open threads), but I would think you'd want to do this... Something like:


RewriteRule ^testdirect/([^/]+/?)(.*) http://%1.example.com/$2 [NC,R=301,L]
- or maybe -
RewriteRule ^testdirect/?(.*) http://%1.example.com/$1 [NC,R=301,L]

(I didn't show the RewriteCond, as it remains unchanged)

It's simply a good idea to anchor patterns as solidly as possible and to make them as specific as possible, in order to preclude unexpected matches, and therefore, unexpected "side-effects."

Jim

twohawks

7:39 pm on Oct 31, 2008 (gmt 0)

10+ Year Member



Thanks JD.
I have added and tested anchoring and I get that.
I do not wish to include "testdirect" in there just yet... I am trying to achieve one or two more things that this would not (yet) suit.

I'm pretty close to posting the problem and solution I am setting up for now, but I am trying to take it a bit further and I have a question....

The example below resembles the kind of solution I would like to leverage, in that I wish to create a variable "subdomainX" and retrieve it later, even if (subdomainX) doesn't match, i.e., assign it a custom value if need be (clarified below).

Here is a picture-example of 'the idea' of it, but obviously it will not work, because if the intended variable value "subdomainX" is not actually in the return, then the variable comes back empty....

Reference: Input URL is domain.com/forumX/xyz
. . . . . . intended rewrite is "subdomainX.example.com/xyz..."
. . . . . . actual result ".example.com/xyz"

Example1: ineffective, variable is empty
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /(subdomain1)|forum1 [NC]
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /(subdomain2)|forum2 [NC]
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /(subdomain3)|forum3 [NC]
RewriteRule ^([^/]+/?)(.*) http://%1.example.com/$2 [NC,R=301,L]

I want it to be that if the condition is true, but the return is not subdomainX in that line, then assign subdomainX the value of its variable label.
Well there's the idea of it, I hope that is clear.
I tried variations with doing it in the rule, and using chaining, but if the return is empty you get nada, and I could not figure out how to assign my own variable.

I already have a solution that requires more lines, but I thought something along this tack might be more elegant and succinct.

***Is there some way to approach this problem in a similar manner to the idea proposed here?
I would think it would require the ability to assign a variable to a custom value, i.e., something that is not in the return. If so, can it be done, and what tool/method is needed?

Cheers,
TwoHawks

[edited by: jdMorgan at 7:33 pm (utc) on Nov. 1, 2008]
[edit reason] example.com [/edit]

twohawks

9:39 pm on Oct 31, 2008 (gmt 0)

10+ Year Member



Addendum to the above post...
It should be obvious, but the idea is subdomainX variable value is picked up by %1 in the rewrite rule,
...for anyone reading this ;^)

g1smd

9:49 pm on Oct 31, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The rule will only run if all of the conditions are true, so that rule can never run -- -- true.

If you "collect" a backreference in a condition, you can only reuse it in the rule if the rule is on the very next line, so in this case I think that you can't change the [NC] to be [NC,OR] to get it working -- -- see correction below.

This could be split into three condition/rule pairs (i.e. the rule is replicated three times), with one condition before each.
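
A sketch of what that split might look like, assuming the placeholder names subdomainN and forumN used in this thread and rules placed in the root .htaccess. As a variation on the idea, each rule here names its target host literally instead of using %1, since every pair serves exactly one subdomain.

```apache
RewriteEngine On

# Pair 1: client requests for /subdomain1/... or /forum1/...
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /(subdomain1|forum1)/ [NC]
RewriteRule ^[^/]+/(.*)$ http://subdomain1.example.com/$1 [R=301,L]

# Pair 2: client requests for /subdomain2/... or /forum2/...
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /(subdomain2|forum2)/ [NC]
RewriteRule ^[^/]+/(.*)$ http://subdomain2.example.com/$1 [R=301,L]

# Pair 3: client requests for /subdomain3/... or /forum3/...
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /(subdomain3|forum3)/ [NC]
RewriteRule ^[^/]+/(.*)$ http://subdomain3.example.com/$1 [R=301,L]
```

With a literal host in each substitution, it no longer matters whether the back-reference would have been empty. As written, these conditions require a slash after the directory name.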

[edited by: g1smd at 10:28 pm (utc) on Oct. 31, 2008]

jdMorgan

10:18 pm on Oct 31, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> If you "collect" a backreference in a condition, you can only reuse it in the rule if the rule is on the very next line, so in this case you can't change the [NC] to be [NC,OR] to get it working.

Not quite: The back-reference is taken from the last-matched RewriteCond, so the code will work fine if [OR] flags are added to all but the final RewriteCond and any one is true. As long as you only need to back-reference matched values in a single RewriteCond (when only one will be true), this approach works fine.

Otherwise, you have to code each RewriteCond to back-reference the value(s) from previous RewriteConds, re-match them, and 'carry them along' -- But that's a subject for another time, and when it's needed.
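
For reference, a minimal sketch of the [OR] form, again using the thread's placeholder names. The conditions are narrowed here to the capturing "subdomainN" part only; the "forumN" alternative is left out because it captures nothing and so would leave %1 empty.

```apache
RewriteEngine On

# Whichever condition matches is the last-matched one, so %1 comes from it.
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /(subdomain1)/ [NC,OR]
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /(subdomain2)/ [NC,OR]
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /(subdomain3)/ [NC]
RewriteRule ^([^/]+/?)(.*) http://%1.example.com/$2 [R=301,L]
```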

Jim

g1smd

10:25 pm on Oct 31, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Cheers for the correction Jim. Important point that I missed.

In this case then, adding OR would fix it.

twohawks

11:56 pm on Oct 31, 2008 (gmt 0)

10+ Year Member



Hmm... yeah, that was overlooked due to copying and not rechecking, I meant to include the [NC,OR] in the first two conditions, i.e., all but the last condition.

So that fixes an oversight, but still doesn't solve the problem addressed in the question, because %1 will still always be empty when the Reference Url offered in the above example is used.
Check that reference again.

I have been looking at how to either set a value in a previous RewriteCond and back reference that (been doing some learning-up on that), and also a method for setting the value as a regexp backreference in the same line.

I'll get back to you. In the meantime, I am looking for suggestions ;^)

Cheers,
HTH

jdMorgan

12:20 am on Nov 1, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It is not clear how the reference input URL domain.com/forumX/xyz
is related to the pattern in
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /(subdomain1)|forum1 [NC]
in that "forumX" does not appear to match "subdomain1", and I'm also not sure you intended to use "|" instead of "/" ahead of "xyz".

Your examples must be consistent, and exactly-correct, otherwise we waste a lot of time.

Also, I suggest that you write and debug the simplest possible code: Throw out the subdomain2 and subdomain3 RewriteConds for now and just get one simple case working, then add complexity to it later once you've established a working baseline. Again, this can save a lot of time.

Jim

twohawks

1:27 am on Nov 1, 2008 (gmt 0)

10+ Year Member



Yes I intend to use the Or Pipe in this example.

response@ "It is not clear how the reference input URL domain.com/forumX/xyz is related to the pattern in RewriteCond %{THE_REQUEST} ^[A-Z]+\ /(subdomain1)|forum1 [NC]"

Well, For the examples given above I intended the understanding:
- forumX where X is a number in the example
- subdomainX where X is number
...Thought that would come across intuitively, sorry.

Maybe this helps clarify... and remember - I realize this won't actually work, but I thought it would be a reasonable way to picture the idea of it (so I can phrase the question)...
------------------------
REDO FOR CLARITY>>>>
------------------------
Input URL here matches the rule and 2nd condition in the following example...
- "domain.com/forum2/xyz"
Input URL here ALSO matches the rule and 2nd condition....
- "domain.com/subdomain2/xyz"
(this is desired behavior)

RewriteCond %{THE_REQUEST} ^[A-Z]+\ /(subdomain1)|forum1 [NC]
# * the next (2nd) condition is the one that both input URLs above match
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /(subdomain2)|forum2 [NC]
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /(subdomain3)|forum3 [NC]
RewriteRule ^([^/]+/?)(.*) http://%1.example.com/$2 [NC,R=301,L]

In this example
Each "forumX" (where X is a number) is a directory with a forum in it, AND
Each "subdomainX" (where X is a number) is a subdomain name corresponding to the relevant forumX in the same line.

IN THE SECOND CONDITION IN THE EXAMPLE ABOVE,
1) /(subdomain2)|forum2 MEANS matching /subdomain2 OR /forum2 (**that's why either URL input offered here is true in Condition2),
2) AND place the return for (subdomain2) into a variable to be used as the subdomain name in the rule (where %1 appears)

I've tested fully and OR'ing (using the pipe) in the condition pattern is not a problem.

INPUT RETURN RESULTS...
- example.com/subdomain2/xyz RETURNS subdomain2.example.com/xyz
- example.com/forum2/xyz RETURNS .example.com/xyz <--**PROBLEM: I want subdomain2 at the beginning here,

BUT that cannot happen, obviously, because in the case of input 2 there can be no return value for variable "subdomain2" because it's not there.

REASONING: It would cut out having to write multiple rules.

SO I WAS PONDERING THIS PROBLEM... Is it possible (without using a RewriteMap file) to create my own variable value for the subdomain name and tie that in to the desired condition in each case (as my example is meant to portray)?

Currently I am working on a back-reference within the regular expression in the condition pattern using an idea offered here, [rewrite.example.com...]
...but, well, I am working on it...

I hope this clarifies things.
CHEERS!
HTH

[edited by: jdMorgan at 2:29 am (utc) on Nov. 1, 2008]
[edit reason] example.com [/edit]

g1smd

2:11 am on Nov 1, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If
subdomainX
has X as a number, you can do away with all the mucking about with multiple conditions.

(subdomain([0-9]+))/

(subdomain([^/]+))/

The above two examples do almost the same thing. One may be better than the other in your application.
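
A minimal sketch of how the first pattern might slot into a complete rule, assuming the code lives in the document root's .htaccess file and that the directory names really do follow the "subdomain" plus digits form. The host condition is an added assumption, not something taken from the posts above; it keeps the rule from firing on requests that already arrive on a subdomain.

```apache
# Sketch only: one redirect for every directory named subdomain<digits>.
# Assumes .htaccess context, where the path seen by RewriteRule
# has no leading slash.
RewriteCond %{HTTP_HOST} ^(www\.)?example\.com$ [NC]
RewriteRule ^(subdomain[0-9]+)/(.*)$ http://$1.example.com/$2 [R=301,L]
```

Here $1 carries the matched directory name into the hostname and $2 carries the rest of the path, so no per-subdomain RewriteCond is needed.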

jdMorgan

2:27 am on Nov 1, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



OK, like I said, I'm not going in for the complication of multiple subdomains until you get one of them working. Here's a trick to get forum1 to "map" to subdomain1:

RewriteCond %{THE_REQUEST} ^[A-Z]+\ /(subdomain1)/ [NC,OR]
RewriteCond subdomain1>%{THE_REQUEST} ^(subdomain1)>[A-Z]+\ /forum1/ [NC]
RewriteRule ^([^/]+/?)(.*) http://%1.example.com/$2 [R=301,L]

If "subdomain1" is present in the requested URL, then %1 is populated with "subdomain1" from the URL itself.
If "forum1" is present in the requested URL, then %1 is populated by matching the literal "subdomain1" string provided on the left side of the RewriteCond.

The ">" is an arbitrary-but-unique character used only as a delimiter between the two combined variables; It has no special meaning to regular-expressions, and you could use other characters such as "~" or "<" or any of the others which are not allowed in HTTP URL-paths without being pre-encoded.

I'm really not sure why THE_REQUEST has crept into this thread, but although I don't think it's required and REQUEST_URI or a back-reference to the entire RewriteRule pattern could be used, it won't hurt anything to use it.
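
For comparison, a hedged sketch of the RewriteRule back-reference variant mentioned above, assuming .htaccess context in the document root. The host condition is an added assumption: unlike THE_REQUEST, the path seen by RewriteRule changes after an internal rewrite, so without some guard a later internal rewrite into /forum1/ could re-trigger the redirect.

```apache
# Sketch only, not tested on the poster's server.
# Guard (assumption): act only on requests for the main host.
RewriteCond %{HTTP_HOST} ^(www\.)?example\.com$ [NC]
# $1 is the whole path matched by the RewriteRule below, e.g. "forum1/xyz".
# %1 comes from the last RewriteCond and is always the literal "subdomain1".
RewriteCond subdomain1>$1 ^(subdomain1)>(subdomain1|forum1)/ [NC]
RewriteRule ^([^/]+/(.*))$ http://%1.example.com/$2 [R=301,L]
```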

Jim

[edited by: jdMorgan at 2:28 am (utc) on Nov. 1, 2008]

twohawks

2:58 am on Nov 1, 2008 (gmt 0)

10+ Year Member



@g1smd "is... has X as number"...
TwoHawks: Thank you, but no, that's not it. That's only for discussion reference.

==========================================================
@JD: "...until you get one of them working..."
TwoHawks: One current working model I am using now...
#### 2.REDIRECT SUBDOMAIN-NAMED SUBDIRECTORY REQUESTS TO THE DNS SUBDOMAIN ####
#### Non-server Calls to example.com/subdomain2(.*)/(.*) = http://SUBDOMAIN2.example.com/$2
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /(subdomain2) [NC]
RewriteRule ^([^/]+/?)(.*) http://%1.example.com/$2 [NC,R=301,L]

# ===========================================================
#### 1.REDIRECT 'NON'-SUBDOMAIN-NAMED SUBDIRECTORY REQUESTS TO DNS SUBDOMAIN ####
#### Non-server Calls to /forum2/.* (forum location) = subdomain2.example.com/.*
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /forum2 [NC]
RewriteRule ^([^/]+/?)(.*) http://subdomain2.example.com/$2 [NC,R=301,L]

# ===========================================================
#### 3.REDIRECT IF NO SUB-DOMAIN (inject prefix www.) ####
RewriteCond %{HTTP_HOST} ^([^.]+)\.([^.]+)$ [NC]
RewriteRule ^(.*)$ http://www.%1.%2/$1 [R=301,L]

# ===========================================================
#### 4.Remove "www." from any subdomain requests ####
RewriteCond %{HTTP_HOST} ^www\.([^.]+)\.example\.com [NC]
RewriteRule (.*) http://%1.example.com/$1 [R=301,L]

# ===========================================================
#### 5.REWRITE SUBDOMAIN CALLS TO THEIR RELEVANT FOLDERS ####
#### Calls to subdomain2.example.com/.* = contents /forum2/.* ####
RewriteCond %{REQUEST_URI} !^/forum2/
RewriteCond %{HTTP_HOST} subdomain2\.example\.com [NC]
RewriteRule (.*) /forum2/$1 [L]

===========================================
And I am doing multiple forums, but that's not shown. I am working on the code to compact it tighter (this current discussion)

======================================================
@JD: re, discussion of string in teststring section of condition, and the ">" as delimiter...

TwoHawks: that's very interesting. Thank you for the suggestion. I am going to study this.

And regarding "THE_REQUEST"... I will revisit/check my use of that.

Thanks for your help and direction ;^)

Cheers,
HTH

[edited by: jdMorgan at 7:26 pm (utc) on Nov. 1, 2008]
[edit reason] Please use example.com ONLY [/edit]

twohawks

7:13 am on Nov 1, 2008 (gmt 0)

10+ Year Member



@JD... per post #:3777978 where "...then %1 is populated by matching the literal "subdomain1" string provided on the left side of the RewriteCond ..."

You know, I checked that out and realized what I had been working on was related, per [rewrite.example.com...] that I referenced above your post.

But I couldn't get it working and figured the literal must appear in the return or it simply will not work. But there's your example that works without it being returned.
Not only did I have the placement of the variables reversed, I simply do not understand how it's working.

You show...
RewriteCond subdomain1>%{THE_REQUEST} ^(subdomain1)>[A-Z]+\ /forum1/ [NC]

I had tried these previously, which didn't work, of course...
RewriteCond testdirect,%{THE_REQUEST} (testdirect),^[A-Z]+\ /test_nbhforum [NC]
RewriteCond %{THE_REQUEST},testdirect ^[A-Z]+\ /test_nbhforum,(testdirect) [NC]

Then, after your example, I tried this one, which does work...
RewriteCond testdirect,%{THE_REQUEST} ^(testdirect),[A-Z]+\ /test_nbhforum [NC]

(I am using comma as a delimiter.)
I note that the subdomain literal must come first in the test string, and in the pattern it must come immediately after the start anchor "^".

Can you point me to where I can specifically read about this to learn how and why it is working?

Thanks!

twohawks

7:03 pm on Nov 1, 2008 (gmt 0)

10+ Year Member



============================================
Followup: While I am trying to find out more about this method, now that we know we can 'inject a literal value' (in a manner of speaking, until I learn more about it), here is the one-line rewrite condition I have been seeking.
It handles all 6 problem examples (relisted below), which cover the range of cases I need to settle,
while also allowing me to merge two whole sections (Sections 2 + 1 in my "working model" shown a couple of posts back).

The new RewriteCond Statement and its relevant rule:


RewriteCond subdomain1,%{THE_REQUEST} ^(subdomain1),[A-Z]+\ /(subdomain1|forum1) [NC]
RewriteRule ^([^/]+/?)(.*) http://%1.example.com/$2 [NC,R=301,L]

And so now... Sections 1 and 2 Merged:
Code is reduced
from 12 lines: 6 conditions and 6 rules
to 4 lines: 3 conditions and 1 rule....
(all subdomain and forum names are explicitly unique; the numbered names are for reference here only)

INSTEAD OF THIS (sections 2+1)...
SECTION 2


. RewriteCond %{THE_REQUEST} ^[A-Z]+\ /(subdomain1) [NC]
. RewriteRule ^([^/]+/?)(.*) http://%1.example.com/$2 [NC,R=301,L]

. RewriteCond %{THE_REQUEST} ^[A-Z]+\ /(subdomain2) [NC]
. RewriteRule ^([^/]+/?)(.*) http://%1.example.com/$2 [NC,R=301,L]

. RewriteCond %{THE_REQUEST} ^[A-Z]+\ /(subdomain3) [NC]
. RewriteRule ^([^/]+/?)(.*) http://%1.example.com/$2 [NC,R=301,L]

SECTION 1


. RewriteCond %{THE_REQUEST} ^[A-Z]+\ /forum1 [NC]
. RewriteRule ^([^/]+/?)(.*) http://subdomain1.example.com/$2 [NC,R=301,L]

. RewriteCond %{THE_REQUEST} ^[A-Z]+\ /forum2 [NC]
. RewriteRule ^([^/]+/?)(.*) http://subdomain2.example.com/$2 [NC,R=301,L]

. RewriteCond %{THE_REQUEST} ^[A-Z]+\ /forum3 [NC]
. RewriteRule ^([^/]+/?)(.*) http://subdomain3.example.com/$2 [NC,R=301,L]


============================================================
I NOW HAVE THIS (Sections 2+1 merged):

RewriteCond subdomain1,%{THE_REQUEST} ^(subdomain1),[A-Z]+\ /(subdomain1|forum1) [NC,OR]
RewriteCond subdomain2,%{THE_REQUEST} ^(subdomain2),[A-Z]+\ /(subdomain2|forum2) [NC,OR]
RewriteCond subdomain3,%{THE_REQUEST} ^(subdomain3),[A-Z]+\ /(subdomain3|forum3) [NC]
RewriteRule ^([^/]+/?)(.*) http://%1.example.com/$2 [NC,R=301,L]

That's what I'm talkin' about!

Examples of Results:


1. http://www.example.com/forum1 = http://subdomain1.example.com/
2. http://www.example.com/forum1/ = http://subdomain1.example.com/
3. http://www.example.com/forum1_anything_here... = http://subdomain1.example.com/
4. http://www.example.com/forum1_anything_here.../ = http://subdomain1.example.com/
5. http://www.example.com/forum1/edit.php?id=145 = http://subdomain1.example.com/edit.php?id=145
6. http://www.example.com/forum1_anything_here/edit.php?id=145 = http://subdomain1.example.com/edit.php?id=145

BTW, using the code as presented so far you cannot use REQUEST_URI instead of THE_REQUEST.

========================================================================
Furthermore, I tried to use a back-reference for "subdomainX" (where X is a number) in the condition pattern (right side), trying it with both %1 and \1, but it will not work. Why is that?
Example:


RewriteCond subdomain2,%{THE_REQUEST} ^(subdomain2),[A-Z]+\ /(%1|forum2) [NC]
RewriteCond subdomain2,%{THE_REQUEST} ^(subdomain2),[A-Z]+\ /(\1|forum2) [NC]

...tried many variations to no avail. Shouldn't it work?

[edited by: jdMorgan at 7:35 pm (utc) on Nov. 1, 2008]
[edit reason] example.com [/edit]

jdMorgan

7:39 pm on Nov 1, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'd recommend studying-up on regular-expressions -- particularly "anchoring", and on back-references in mod_rewrite. Links to resources are available in our Forum Charter.

Back-references such as $1 or %1 cannot be used in regular-expression patterns.

You may use "atomic back-references" such as \1 to reference pattern-matches within the current pattern, but they will only work if your server's operating system includes the POSIX 1003.2 regular-expressions library or later. Because of this dependency, I cannot recommend their use.
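
For illustration only, a sketch of an in-pattern back-reference, assuming a server where it is supported (Apache 2.x uses the PCRE library for its regular expressions, and PCRE does support \1). The "<>" delimiter and the rule itself are illustrative assumptions, not code from this thread.

```apache
# Sketch only. TestString expands to something like:
#   subdomain2.example.com<>/subdomain2/page.html
# \1 re-matches the text captured by group 1 (the first host label), so
# the condition is true only when the path begins with that same name.
RewriteCond %{HTTP_HOST}<>%{REQUEST_URI} ^([^.]+)\.example\.com<>/\1/ [NC]
RewriteRule ^[^/]+/(.*)$ http://%1.example.com/$1 [R=301,L]
```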

The correct form of the RewriteCond using REQUEST_URI would be:


RewriteCond subdomain1,%{REQUEST_URI} ^(subdomain1),/(subdomain1¦forum1) [NC,OR]

Note: Please use "example.com" only when posting here. We do not allow links to non-authoritative sites, and "domain.com" is a live site which we do not wish to link to. I have edited all previous posts to comply with this policy, but it is not a good use of time.

Jim

[edited by: jdMorgan at 8:00 pm (utc) on Nov. 1, 2008]

twohawks

8:19 pm on Nov 1, 2008 (gmt 0)

10+ Year Member



Sorry about the domain dot com thing... I had looked in the rules and did not see anything about this. I will use example.com from now on.

Posix.. that's probably why \1 is not doing it for me.

I've downloaded just about all of google on mod_rewrite back references, but finding references to the method you proffered has been extremely scant, that's why I asked something more direct if you or anyone here had it to offer. ...guess I am going to have to download the rest of google on my time off! I will scour the forum charter more closely.

I will run tests later with request_uri and post back.
Also have to run the vHosts testing later for comparison (per related stuff mentioned halfway back in this thread).

Thank you for the attention to following this problem through. This has been a great learning experience.
Cheers,
TwoHawks

jdMorgan

8:33 pm on Nov 1, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The explanation I offered with the initial posting of that method is the best I can do. There's nothing special about it; The described behavior of those RewriteCond lines is intrinsic in how RewriteConds, regular expressions, and back-references work.
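
As a rough illustration of that behavior (the sample request line is an assumption), the comments below trace how the test string is expanded before the pattern is applied:

```apache
# For a request line of "GET /forum1/xyz HTTP/1.1", the TestString
#   subdomain1>%{THE_REQUEST}
# expands to
#   subdomain1>GET /forum1/xyz HTTP/1.1
# The pattern is then matched against that whole string from the left:
#   ^(subdomain1)      matches the literal prefix and captures it as %1
#   >                  matches the delimiter
#   [A-Z]+\ /forum1/   matches "GET /forum1/"
RewriteCond subdomain1>%{THE_REQUEST} ^(subdomain1)>[A-Z]+\ /forum1/ [NC]
RewriteRule ^([^/]+/?)(.*) http://%1.example.com/$2 [R=301,L]
# A "^" placed after the literal, as in (subdomain1)>^[A-Z]+..., can never
# match, because "^" only matches at the very start of the test string.
```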

The best source for mod_rewrite info is apache.org -- Tutorials and code posted elsewhere are of widely-ranging accuracy and quality, with a decided majority being poor in both respects.

Jim
