Mixed up redirects with mod_rewrite

rewrite works but new 301 line does not...

         

cousins

9:56 am on May 30, 2004 (gmt 0)

10+ Year Member



I have been struggling to understand mod_rewrite for some time but through trial and error and a little restructuring of my web sites, I have finally got something working the way I want it. However this one has me really puzzled and I need to implement something quickly to ensure continuity...

I have a site, a.b.com, that was originally built as a static site for my local club. I recently converted it to PHP and MySQL and used this rewrite to send everything to the server that now handles the site:


RewriteEngine On
RewriteBase /
RewriteCond %{HTTP_HOST} ^(www\.)?a.b.com$ [NC]
#RewriteRule ^http://a\.b\.com/classifieds/(.+)$ http://www.classifieds.zzz.com/$1 [R=301,L]
RewriteRule ^$ http://a.b.com/23.html [P,L]
RewriteRule ^(.*)$ http://yyy.com/server/$1 [P,L]

When I discovered that requests for the classifieds section would not work, I tried working out an exclusion for that directory but couldn't get one working. So I established a whole new site called classifieds.zzz.com, changed a few links, and bingo, it worked. However, I then realised that requests bookmarked for the classifieds section would get a 401 error, so I figured that redirecting requests for a.b.com/classifieds to the new site (classifieds.zzz.com) before sending things off to the new server would catch these. My attempt at doing so is commented out in the above .htaccess. Obviously it doesn't work, as I'm getting the error "The requested URL /server/classifieds does not exist", which indicates to me that the request has been sent to the server at yyy.com instead of the 301 redirect being invoked.

Where have I gone wrong? I don't want the club to miss out on the regular visitors to its classifieds section (the club is non-profit, and I built and maintain the site for free as a hobbyist enthusiast).

gergoe

4:03 pm on May 30, 2004 (gmt 0)

10+ Year Member



You've made quite a few mistakes; I'll go through them point by point:
1.) In a RewriteRule pattern you should not use the full URL, only the local part of it. So [domain.com...] will never be matched, but if you use only the path part, then it will match (in .htaccess the pattern is tested against the path without its leading slash, as in the corrected rules below). See the RewriteRule syntax [httpd.apache.org]
2.) Rewriting is organized into groups, and each group is closed by a RewriteRule. RewriteConds apply only to the RewriteRule that immediately follows them; if you stack several RewriteConds, they are all evaluated together with that one RewriteRule, and any RewriteRule after it is not conditional. If you want each RewriteRule evaluated only when the request is for a specific domain, put the RewriteCond in front of each RewriteRule. See the RewriteCond syntax [httpd.apache.org]. By the way, I don't think you really need this RewriteCond; you only need it if the directory containing this .htaccess file has more than one hostname pointing to it and you want the rules evaluated for only some of those hostnames.
3.) The Proxy flag on a RewriteRule means that Apache will fetch the URL from a remote server through its internal proxy. Using a proxy request for the local Apache is a bit silly, and using it for a remote server when you don't need it puts unnecessary load on the server. Use the Proxy flag only if that's the only way to implement something.
4.) You've made some (theoretical and practical) mistakes with the regular expressions. I suggest you find some resources on this topic, for example by searching for "regular expressions" on Google; it will give you a lot of choices.
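
To illustrate point 2: if the host check really is needed, a minimal sketch would repeat the condition before every rule it should guard. The hostnames and targets are the ones from the question; escaping the dots in the hostname is an added assumption, and whether the condition is needed at all depends on how many hostnames point at the directory.

```apache
RewriteEngine On
RewriteBase /

# A RewriteCond applies only to the RewriteRule that follows it,
# so the host check is repeated for each rule that should be conditional.
# Assumption: dots are escaped so that "." matches a literal dot only.
RewriteCond %{HTTP_HOST} ^(www\.)?a\.b\.com$ [NC]
RewriteRule ^classifieds/(.*)$ http://www.classifieds.zzz.com/$1 [R=301,L]

# Without this second RewriteCond, the rule below would run for every hostname.
RewriteCond %{HTTP_HOST} ^(www\.)?a\.b\.com$ [NC]
RewriteRule ^(.+)$ http://yyy.com/server/$1 [P,L]
```

If only one hostname points at the directory, both RewriteCond lines can simply be dropped.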

After correcting the rules, it will look like this:


RewriteEngine On
RewriteBase /
RewriteRule ^classifieds/(.*)$ http://www.classifieds.zzz.com/$1 [R=301,L]
RewriteRule ^$ 23.html [L]
RewriteRule ^(.+)$ http://yyy.com/server/$1 [P,L]

cousins

10:06 am on Jun 10, 2004 (gmt 0)

10+ Year Member



Gergoe - my apologies for the delay in responding - I don't want you to think I'm the "ask question then disappear" type - I've been ill for the last month. :(

First let me thank you for taking the time to respond - I found your reply very informative and helpful in pointing out my errors. I have re-read the syntax for both RewriteCond and RewriteRule and hope my understanding has improved a little.

I've read your comments about rules and conditions; however, I'm still puzzled by this comment in point #2:

[i]By the way, I don't think you really need this RewriteCond; you only need it if the directory containing this .htaccess file has more than one hostname pointing to it and you want the rules evaluated for only some of those hostnames.[/i]

I was under the impression that this would make sure that requests for both www.a.b.com and just a.b.com would be accepted.... a lot of sites seem to be set up so that a 404 is issued if you try to get there without using the www. Am I wrong in my understanding?

[i]3.) The Proxy flag on a RewriteRule means that Apache will fetch the URL from a remote server through its internal proxy. Using a proxy request for the local Apache is a bit silly, and using it for a remote server when you don't need it puts unnecessary load on the server. Use the Proxy flag only if that's the only way to implement something.[/i]

I think this is in reference to my using the proxy on the line that reads "RewriteRule ^$ [a.b.com...] [P,L]", and I take your point that for the local redirect it's not necessary (after re-reading the syntax and some tutorials).

[i]4.) You've made some (theoretical and practical) mistakes with the regular expressions. I suggest you find some resources on this topic, for example by searching for "regular expressions" on Google; it will give you a lot of choices.[/i]

Thanks, I'll do that when I have more time. I've always struggled with regular expressions and the use of various slashes, dots, etc.

So here is my current attempt at the file which appears to work in the way I want it to - if you have time I'd appreciate any comments you might wish to make.

[b]RewriteEngine On

RewriteBase /
# This line sends all requests for classifieds to the new url
RewriteRule ^classifieds/(.*)$ http://www.classifieds.zzz.com/$1 [R=301,L]

# This line sends requests for files under directory "articles" straight there
# (I found I had some hard linked photos that have not yet been taken care of)
RewriteRule ^articles/(.+)$ - [L]

# This line points everything that is not covered by the lines above to 23.html
RewriteRule ^$ 23.html [L]

# This line ensures the requests are proxied properly so graphics, css, etc
# links are picked up - wasn't sure if it should be a + or *
RewriteRule ^(.+)$ http://yyy.com/server/$1 [P,L]
[/b]

One further question: I have a series of directories on yyy.com that will need to be excluded in a similar fashion to articles/ above - is it best to process these as a directory per line in the access file or to do it as one line in the format "(articles/¦jobs/¦auctions/)" - I'm looking for a solution that will put the least strain on the server.... :)

Once again thanks for taking the time to help me....

jdMorgan

4:59 pm on Jun 10, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



# This line points everything that is not covered by the lines above to 23.html
RewriteRule ^$ 23.html [L]

No, that only points your default index file to 23.html, since "^$" means "blank". You probably want:

RewriteRule .* 23.html [L]

However, that might lead to a "loop", so you'll want to exclude "23.html" itself from being redirected:
RewriteRule !^23\.html$ 23.html [L]
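
The loop comes from the way per-directory (.htaccess) rewriting works: [L] only ends the current pass, and the rewritten URL is then run through the rules again from the top. As a sketch, the same exclusion can also be spelled with a RewriteCond, assuming the .htaccess file sits in the document root so that the rewritten request is /23.html:

```apache
# Skip the rewrite when the request is already for 23.html;
# otherwise the catch-all would match its own result on the next pass.
# Assumption: this file lives in the document root.
RewriteCond %{REQUEST_URI} !^/23\.html$
RewriteRule .* 23.html [L]
```

Either spelling should stop the second pass from rewriting 23.html to itself again.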

# This line ensures the requests are proxied properly so graphics, css, etc
# links are picked up - wasn't sure if it should be a + or *
RewriteRule ^(.+)$ [yyy.com...] [P,L]

".+" means "one or more of any character." ".*" means "any number (including zero) of any character." In this case, since you're looking for css files, etc., you know that the local path won't be blank, so ".+" is fine.

One further question: I have a series of directories on yyy.com that will need to be excluded in a similar fashion to articles/ above - is it best to process these as a directory per line in the access file or to do it as one line in the format "(articles/¦jobs/¦auctions/)" - I'm looking for a solution that will put the least strain on the server

The local OR version will be slightly faster. But use as many lines as you need to make the code maintainable and 'neat'.

RewriteRule ^(articles¦jobs¦auctions)/. - [L]

Make sure to change all broken pipe "¦" characters you see on this board to solid pipes before use in .htaccess; Posting on this board modifies the character, and the broken pipe will cause a 500-Server Error.
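
For reference, a sketch of the two forms with the solid pipe typed by hand. The directory names are the ones from the question; the per-line variant is shown commented out so that only one form is active.

```apache
# One rule using alternation (solid pipes, not the board's broken pipes)
RewriteRule ^(articles|jobs|auctions)/. - [L]

# The same exclusions written one directory per line:
# RewriteRule ^articles/. - [L]
# RewriteRule ^jobs/. - [L]
# RewriteRule ^auctions/. - [L]
```

Note that the trailing "." requires at least one character after the slash, so a request for the bare directory (for example articles/) is not covered by these rules.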

Refs:
Apache mod_rewrite documentation [httpd.apache.org]
Apache URL Rewriting Guide [httpd.apache.org]
Regular Expressions Tutorial [etext.lib.virginia.edu]

Jim

cousins

10:16 pm on Jun 10, 2004 (gmt 0)

10+ Year Member




# This line points everything that is not covered by the lines above to 23.html
RewriteRule ^$ 23.html [L]

No, that only points your default index file to 23.html, since "^$" means "blank". You probably want:

RewriteRule .* 23.html [L]

Thanks JD - you are right - I actually wanted to say this line points all "blank" requests to 23.html - all other requests are handled by the line that points and proxies to the "server" location.

The local OR version will be slightly faster. But use as many lines as you need to make the code maintainable and 'neat'.

RewriteRule ^(articles¦jobs¦auctions)/. - [L]

Make sure to change all broken pipe "¦" characters you see on this board to solid pipes before use in .htaccess; Posting on this board modifies the character, and the broken pipe will cause a 500-Server Error.

Again thanks for the clarification - I'd picked up that the board changes pipes to the broken pipe character and the Linux access file contains the correct character.

Now to carry on and try to sort out some other stuff.... :)