virtual subdomains using mod rewrite

need help to debug my rewrite rules

         

dawnray

12:12 am on Sep 27, 2004 (gmt 0)

10+ Year Member



Hi, I'm trying to get mod_rewrite to handle subdomains by internally rewriting each request to a file in a directory of subdomains, but only if the file exists. Otherwise the rule should leave the URL alone. I have the following code:

RewriteEngine on
RewriteCond %{HTTP_HOST} ^([a-z0-9]+)\..+\..*$ [NC]
RewriteCond /path/to/public_html/%1 !-d
RewriteCond /path/to/public_html/%1 !-l
RewriteCond /path/to/public_html/subs/%1.php -f
RewriteRule ^(.*)$ [www,mydomain.info...]

This gives me an internal server error when I request [foo.mydomain.com...] and foo.php exists in subs. I've tried it without the www and without [www...], and both seem to just go away and hang.

Any ideas what's wrong or how to fix it?

jdMorgan

12:48 am on Sep 27, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



dawnray,

Welcome to WebmasterWorld!

Yes, the problem is that you can't have a variable back-referenced in the pattern of a RewriteCond. You can create a back-reference in the pattern, but you can't use one.

Well, sort of... There is a "tricky" work-around, which we discussed in a thread titled Rewriting arbitrary subdomains to subdirectories [webmasterworld.com], related to your project. You will have to add your "check for directory exists logic" to the code, but the basic technique may be helpful.

Jim

dawnray

5:00 pm on Sep 27, 2004 (gmt 0)

Thanks for writing back, Jim. I've read the thread and it's clear you guys know mod_rewrite. I'm new to it and not that hot on regular expressions either. I copied the code from someone else's recommendation and made sure I understood it, but I had real trouble following your thread. I've seen about 4 different approaches to this same problem and it's hard to know which strategy is the best to follow.

So, going back to first principles: I have a PHP script which from time to time creates and writes a new PHP script called newsubdomain and puts it in directory subs. Now anytime someone requests newsubdomain.mydomain.info I want them to see subs/newsubdomain.php without any external redirect.

If there's no file, then the default Apache behaviour returns index.htm, which is cool.

As far as I can tell the rewritecond clauses are working. If there's no file in the subs directory I see the default index. When there is a file, apache hangs, presumably disappearing into an endless recursion.

How is this prevented?

jdMorgan

7:58 pm on Sep 27, 2004 (gmt 0)

The recursion problem is in large part the reason for the code in the thread I cited. In addition to your requirement to check for subdomain-exists, you must also check to make sure that you are not rewriting subdomain1.example.com/subdirectory1 to subdomain1.example.com/subdirectory1/subdirectory1 to subdomain1.example.com/subdirectory1/subdirectory1/subdirectory1, etc. Without a check of whether the subdirectory name already matches the subdomain name, this will occur.

So, you'll need to integrate this function of the code I cited into your subdomain->subdirectory RewriteRule.
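
As a rough sketch of that check (not the code from the cited thread; example.com, the one-directory-per-subdomain layout, and .htaccess context are assumptions for illustration), the subdomain label and the requested path can be joined into one test string and compared with a regex-internal backreference:

```apache
RewriteEngine on
# Skip the main site (example.com is a placeholder domain).
RewriteCond %{HTTP_HOST} !^(www\.)?example\.com [NC]
# Capture the subdomain label; it becomes %2 in the lines below.
RewriteCond %{HTTP_HOST} ^(www\.)?([^.]+)\.example\.com [NC]
# Join label and path, then require that the path does NOT already
# begin with /<label>/ (\1 refers back to the group inside this pattern).
RewriteCond %2<->%{REQUEST_URI} !^([^<]+)<->/\1/ [NC]
# Rewrite sub.example.com/page to /sub/page (.htaccess-style pattern,
# i.e. the matched path has no leading slash).
RewriteRule ^(.*)$ /%2/$1 [L]
```

The negated condition sets no backreferences of its own, so %2 in the RewriteRule still comes from the HTTP_HOST condition. A side effect of this guard is that a genuine request for sub.example.com/sub/... is left unrewritten.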

Jim

dawnray

10:14 pm on Sep 27, 2004 (gmt 0)

ok

can you explain how this recursion happens in the first place - this would probably help me get a grip on the code.

thanks
philip

dawnray

10:34 pm on Sep 27, 2004 (gmt 0)

"the problem is that you can't have a variable back-referenced in the pattern of a RewriteCond"

I'm not sure this is right. Probably I should have one in the extra cond that I didn't know I needed to write ... am I on the right lines here?

dawnray

12:45 am on Sep 28, 2004 (gmt 0)

OK, I've been through the thread more carefully, and the one it refers to, and I've now got this code, which returns a 500 error just from www.mydomain.info, as well as from everything else I request:

# rewrite only if host is not empty
RewriteCond %{HTTP_HOST} !^$
# rewrite only if host is not main server
RewriteCond %{HTTP_HOST} !^(www\.)?mydomain\.info$ [NC]
# extract subdomain and path, ignore leading www
RewriteCond %{HTTP_HOST}<->%{REQUEST_URI} ^(www\.)?([^.]+).*<->/([^/]+)
# don't rewrite if already rewritten
RewriteCond %3<->%4 !^(.*)<->subs/\1.php$ [NC]
# check for directory or symlink
RewriteCond /path/to/public_html/%3 !-d
RewriteCond /path/to/public_html/%3 !-l
# check file exists
RewriteCond /path/to/public_html/subs/%3.php -f
RewriteRule ^(.*)$ /subs/%3.php

I've tried commenting rewriteconds until I'm left with just the rule, and I still get a 500 error.

any ideas?

jdMorgan

2:19 am on Sep 28, 2004 (gmt 0)

What does your server error log say when you get a 500-Server Error?

Are you running this code in .htaccess or in httpd.conf? There is a subtle syntax difference in RewriteRule patterns between the two.

Jim

dawnray

2:56 am on Sep 28, 2004 (gmt 0)

It's in .htaccess.

I don't know where to look for the error log or whether I have permission to read it. I'll check.

dawnray

3:37 pm on Sep 28, 2004 (gmt 0)

The error log message looks like this

[Tue Sep 28 01:18:05 2004] [alert] [client 195.137.20.104] /home/dawnray/public_html/.htaccess: RewriteCond: cannot compile regular expression &#039;!^http://(www\.)?([^\.]\.mydomain\.info/.*$&#039;

Don't know what the #039 is. The error log seems to have added a ( in the middle. I have tried this without the (www\.) but that doesn't seem to help.

jdMorgan

3:49 pm on Sep 28, 2004 (gmt 0)

Looks like you created the file with a word processor-type application, instead of a plain text editor, thus introducing a special character that needs to be coded as &#039. Try using NotePad instead. You can also use WordPad or Word, but only if you select the option to save as plain-text only.

Reviewing your application, the following simplified code should work. Since your subdomain-subdirectory names are made unique by prepending "subs", you probably won't need the fancy anti-recursion code I cited. I'm also not sure you need to check for "exists as directory" or "exists as symlink". So try the simplified version first, and then add one or both of those checks back in only if needed.


Options +FollowSymLinks
RewriteEngine on
RewriteCond %{HTTP_HOST} .
RewriteCond %{REQUEST_URI} !^/subs/
RewriteCond %{HTTP_HOST} !^(www\.)?mydomain\.info [NC]
RewriteCond /path/to/public_html/subs/%1.php -f
RewriteRule .* /subs/%1.php [L]

Jim

dawnray

9:00 pm on Sep 28, 2004 (gmt 0)

Thanks Jim

Well, I've put the code in, exactly as written, using TextEdit (the Mac equivalent of Notepad), and I'm just getting 500 errors. I have to ask for the error logs to be emailed to me, so I'm waiting for that.

dawnray

9:57 pm on Oct 3, 2004 (gmt 0)

One of the problems I'm struggling with in this code is a RewriteCond like

RewriteCond %{HTTP_HOST} !^(www\.)?mydomain\.info$ [NC]

Presumably the ! means there is no backreference generated in this rule. However, with

RewriteCond %{HTTP_HOST} ^(www\.)?([^.]+).mydomain\.info$ [NC]

is there a backreference associated with the (www\.)? part?

I can't work out what number to use in my backreference to pick up the subdomain name.

jdMorgan

11:28 pm on Oct 3, 2004 (gmt 0)

> RewriteCond %{HTTP_HOST} !^(www\.)?mydomain\.info$ [NC]

This means, "If the HTTP_HOST is NOT www.mydomain.info or mydomain.info, upper, lower, or mixed-case, and does not include the optional port number, then the following RewriteRule applies."

"!" means NOT when it precedes regular expressions in mod-rewrite. Backreferences are created any time parentheses are used. Only the backreferences in the last-matched RewriteCond will be available to RewriteRule; if you use and back-reference multiple RewriteConds, their order can be critical. In the example you show, %1 will contain "www." or blank, and %2 will contain the subdomain name.
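
A small sketch of that numbering, assuming the subs layout discussed earlier in this thread and .htaccess context; the capturing condition is placed last so that its groups are the ones the rule sees:

```apache
RewriteEngine on
# Negated conditions: they filter requests but supply no backreferences.
RewriteCond %{REQUEST_URI} !^/subs/
RewriteCond %{HTTP_HOST} !^(www\.)?mydomain\.info [NC]
# Group 1 = optional "www.", group 2 = subdomain label.
RewriteCond %{HTTP_HOST} ^(www\.)?([^.]+)\.mydomain\.info [NC]
# %2 is the label captured by the condition directly above.
RewriteRule .* /subs/%2.php [L]
```

For a request to foo.mydomain.info, %1 is empty and %2 is foo, so the rule targets /subs/foo.php.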

Some browsers and other user-agents will append a port number to the HTTP_HOST variable, and your rule will fail in that case. I suggest you omit the end-anchor from the pattern above, and just use:

RewriteCond %{HTTP_HOST} !^(www\.)?mydomain\.info [NC]

This will allow a port number to be appended with no ill effect.
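
If dropping the end-anchor feels too loose (the unanchored pattern also matches longer host names that merely start with mydomain.info), one possible alternative, offered here as an untested sketch, is to keep the anchor and allow an optional numeric port explicitly:

```apache
# Matches mydomain.info or www.mydomain.info, with or without ":port".
# The "!" negates the test, so the rule applies to every other host.
RewriteCond %{HTTP_HOST} !^(www\.)?mydomain\.info(:[0-9]+)?$ [NC]
```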

These issues are well-covered in the documentation cited in our forum charter [webmasterworld.com].

Jim

dawnray

12:01 am on Oct 4, 2004 (gmt 0)

Thanks Jim

I have read the docs, some several times, but it's easy to miss details like the last-matched RewriteCond thing.

So now I'm thinking I should put the subdomain extraction into the RewriteRule so it goes

RewriteRule ^(www\.)?([^.]+)\.mydomain\.info /subs/$2.php [L]

Then I can do whatever tests I like in the rewriteconds and backreference $2 for my subdomain name.

The other question I have is whether I need to use rewritebase to prevent the subdomain being reinjected into the URL after the rewrite is completed.

jdMorgan

12:48 am on Oct 4, 2004 (gmt 0)

> so now I'm thinking I should put the subdomain extraction in to the rewriterule

You don't have that choice, unfortunately. The subdomain is not directly available to RewriteRule, that's why we use RewriteCond %{HTTP_HOST} to access it.

Mod_rewrite can be seen as a very powerful, very precise, but small "language." Therefore there are few right ways to do things, and many wrong ways. Once you get used to "the rules," it's fairly easy to use, because regular expressions and mod_rewrite are both compact-but-powerful "languages" and can be fully learned in a few months.

I should also note that both share another characteristic: They are both far easier to write than to read. This is one reason they are not easy to learn.

I'm afraid it takes some dedication to sit down and read through the documentation of "API phases," "RewriteCond," and "RewriteRule" thoroughly and carefully -- and several times, and then make reference to those sections while working. After a few months, you won't have to look stuff up. However, when I want something to work the first time, I usually look it up and check it anyway.

If you run into a brick wall with RewriteCond back-reference order, remember that you can use multiple RewriteRules in sequence to do a rewrite step-by-step, and you can also create and reference server variables using mod_rewrite, allowing you to create extremely complex rules.
See RewriteRule...[E=Var:Val] and RewriteCond %{ENV:Var}
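
As an illustration of that approach (a sketch only; SUBNAME is a made-up variable name, and the paths follow the placeholders used earlier in this thread), one rule can store the subdomain and a later rule can read it back:

```apache
RewriteEngine on
# Step 1: capture the subdomain label and store it. The "-" substitution
# leaves the URL unchanged; only the E flag has an effect.
RewriteCond %{HTTP_HOST} !^(www\.)?mydomain\.info [NC]
RewriteCond %{HTTP_HOST} ^(www\.)?([^.]+)\.mydomain\.info [NC]
RewriteRule .* - [E=SUBNAME:%2]

# Step 2: use the stored value without worrying about which
# RewriteCond matched last.
RewriteCond %{ENV:SUBNAME} .
RewriteCond %{REQUEST_URI} !^/subs/
RewriteCond /path/to/public_html/subs/%{ENV:SUBNAME}.php -f
RewriteRule .* /subs/%{ENV:SUBNAME}.php [L]
```

Splitting the work this way keeps each rule's conditions short, at the cost of one extra rule evaluated per request.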

Let's see what the log file says about the most recent 500 error. In the meantime, don't add further complexity until that problem is resolved.

Jim

dawnray

3:02 pm on Oct 4, 2004 (gmt 0)

Thanks Jim

The specific confusion about backreferences has arisen as I have seen quite a few rewrite rules which appear to backreference the same variable in a sequence of rewrite conds. I can't find any reference to this in the rewrite mod documentation which as you point out says you can only backreference the last matched condition.

Does this work or not? If I backreference a pattern in one rewritecond is it available to the next rewritecond?

dawnray

10:44 pm on Oct 4, 2004 (gmt 0)

The server errors from yesterday were not that helpful; they simply say illegal option rewrite engine.

I'm currently trying to get around the backreferencing problem with a repeat of an earlier condition just before the rewrite rule.

Still gives me 500 errors when I test it.

my rewrite code currently is
RewriteEngine On
RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST}!^(www\.)?mydomain\.info [NC]
RewriteCond %{REQUEST_URI}!^/subs/
RewriteCond %{HTTP_HOST} ^(www\.)?([^\.]+)\.mydomain [NC]
RewriteCond /path/to/public_html/subs/%2.php -f
RewriteCond %{HTTP_HOST} ^(www\.)?([^\.]+)\.mydomain [NC]
RewriteRule ^rewriteme(.*)$ /teachers/%2.php [L]

I stuck rewriteme into the rewrite rule to try to simplify things. As I understand it the whole rewrite thing should fail on the first condition unless I have rewriteme in the request. However I still get a 500 error just from requesting www.mydomain.info.

Once again, we need to wait for the logs.

jdMorgan

11:43 pm on Oct 4, 2004 (gmt 0)

The code above seems to be mixing "teachers" and "subs." It only really makes sense if you have one or the other.

Just double-checking here, but do you have the required space after each "}" in the RewriteConds? Posting code on this board removes spaces preceding "!" and always leaves this question open.

Do you have access to another host where you could test this code without having to e-mail them for a copy of your log files? This is a rather serious handicap to deal with, and it's hard for me to believe you can work without access to your own log files!

Jim

jdMorgan

12:09 am on Oct 5, 2004 (gmt 0)

I tested a version of this code on one of my servers, and it appeared to work. No server error was returned. However, in addition to different domain name and paths, the following highlighted changes were made:

[b]Options +FollowSymLinks[/b]
RewriteEngine On
RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST} [b]![/b]^(www\.)?mydomain\.info [NC]
RewriteCond %{REQUEST_URI} [b]![/b]^/subs/
RewriteCond %{HTTP_HOST} ^(www\.)?([b][^.][/b]+)\.mydomain [NC]
RewriteCond /path/to/public_html/subs/%2.php -f
RewriteCond %{HTTP_HOST} ^(www\.)?([b][^.][/b]+)\.mydomain [NC]
RewriteRule ^rewriteme(.*)$ /[b]subs[/b]/%2.php [L]

Jim

jamesa

1:37 am on Oct 5, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>> using textedit

Just in case you didn't know: In TextEdit be sure you've selected "Make Plain Text" from the Format menu. Otherwise you'll end up with an RTF file.

dawnray

7:08 am on Oct 5, 2004 (gmt 0)

Yes, the teachers / subs difference arose because I was anonymising the code and didn't finish the job.

I have TextEdit preferences set to plain text.

I have apache running on my local development machine, but I'm not too sure how to set it up for testing. I'll look into it.

dawnray

8:10 am on Oct 7, 2004 (gmt 0)

I've searched high and low for some error logs on my dev machine and can't see them. It's a Mac running Mac OS X 2 (not OS X Server).

The last error logs I got back from my host said

RewriteCond: bad flag delimiters

jdMorgan

2:18 pm on Oct 7, 2004 (gmt 0)

"Bad flag delimiters" usually indicates a syntax error, specifically, either a missing space preceding a "!" in a RewriteCond pattern, or an unescaped space in a regular-expressions pattern.
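
For reference, a RewriteCond line has up to three whitespace-separated parts: the test string, the pattern, and an optional flags field in square brackets. The sketch below (the page name is a made-up example) shows the spacing that keeps those parts apart:

```apache
# TestString, a space, then the pattern with its leading "!", then [flags].
RewriteCond %{HTTP_HOST} !^(www\.)?mydomain\.info [NC]
# A literal space inside a pattern is escaped with a backslash so that
# the text after it is not taken as the flags field.
RewriteCond %{REQUEST_URI} ^/my\ page\.html$
```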

Posting on this board removes spaces preceding any "!"; make sure you correct those deletions where they occur. You can work around this problem when posting on this forum by using two spaces or by preceding the "!" with a bold-unbold tag pair. I have done so in all the examples I posted.

Jim

dawnray

7:10 pm on Oct 9, 2004 (gmt 0)

It's working at last.

The problem was Options +FollowSymLinks. Once I switched this off and built up the rule gradually from nothing, it all fell into place.

Thanks Jim, for all the help and suggestions and hanging in with me. I would probably have given up by now without you.

Best wishes

Philip