

excluding and htaccess rewrites

having trouble understanding what specific effect order and filename have

         

Sardine

3:10 pm on Feb 21, 2009 (gmt 0)

10+ Year Member



hi all,

I've been reading topics in this forum for a couple of days and have been exploring the documentation over at the Apache site regarding rewrites in htaccess as well. I'm new to this, but I'm really trying to understand.

I've moved my site content over to a completely new server/domain, and want to redirect most requests, but not all (for example, there are very old databases that need to remain operational on the old site because of the volume of the user base and bandwidth issues).

So basically I have three rules.
1: exclude requests for 4 specific files from redirect
2: exclude requests for any files in one particular directory (these are php files that access the mysql database) from redirect
3: exclude requests to the mysql database from redirect

All other requests are forwarded to the root of my new site.

So, here's what I have so far:

[code]
Options +FollowSymLinks
RewriteEngine on
RewriteCond $1 !^http://www.oldsite.com/path/to/directory
RewriteCond %{HTTP_HOST} ^mysql\.oldsite\.com
RewriteRule .* - [L]
RewriteCond $1 !^file1.zip¦file2.dmg.zip¦file3.sit¦file4.tar.gz
RewriteRule (.*) [newsite.com...] [R=301,L]
{/code]

Sorry for that long preamble. So I have a few problems:

1. The code above does not work perfectly. Specifically, I have never gotten the exclusion of the mysql database to work. (The php files do not seem to be accessible and consequently don't seem to be returning any data.) Does mysql.oldsite.com not count as a subdomain?

2. Depending on where I put the mysql RewriteCond and RewriteRule that corresponds to it, *some* but not all of my file excludes stop working -- if any work, file1.zip works, no matter where in the 'list' of 4 files it is placed. The others simply redirect to the main page. I understand that this might be due to the ordering of my conditions, but I am having trouble understanding exactly why.

3. I understand that order of conditions should go from most to least specific, but when I modify the order at all, redirects generally stop working in one area or another.

4. Do I need to worry about redirects and email accounts which are still used on the old domain?

I'm sorry about all these questions, I'm sort of approaching the end of my rope -- I'd appreciate any help at all!

Sardine

3:11 pm on Feb 21, 2009 (gmt 0)

10+ Year Member



oops, I screwed up that code:


Options +FollowSymLinks
RewriteEngine on
RewriteCond $1 !^http://www.oldsite.com/path/to/directory
RewriteCond %{HTTP_HOST} ^mysql\.oldsite\.com
RewriteRule .* - [L]
RewriteCond $1 !^file1.zip¦file2.dmg.zip¦file3.sit¦file4.tar.gz
RewriteRule (.*) http://www.newsite.com/ [R=301,L]

Sardine

4:14 pm on Feb 21, 2009 (gmt 0)

10+ Year Member



Hold on, I may have stumbled onto something here... Like I said, I figured it was likely something related to order, and I thought combining the two rewrite rules might work.


Options +FollowSymLinks
RewriteEngine on
RewriteCond %{HTTP_HOST} ^mysql\.oldsite\.com
RewriteRule .* - [L]
RewriteCond $1 !^file1.zip¦file2.dmg.zip¦file3.sit¦file4.tar.gz¦path/to/directory
RewriteRule (.*) http://www.newsite.com/ [R=301,L]

Seems to work! Can anyone help me out with why, exactly?

jdMorgan

4:15 pm on Feb 21, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Order should *generally* be from most- to least-specific, but not always. Since the effect of your first rule above is to "quit processing mod_rewrite" if the first rule's conditions match, it's clear that moving it would change the overall function of the code.

The major problem with this code is likely the mis-use (or mis-understanding) of what values the mod_rewrite variables contain. Another problem is that the file-type pattern anchoring was mis-scoped, in that only "file1.zip" was start-anchored.
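
A minimal sketch of the anchoring point, written with solid pipe characters. The file names are the placeholders used in this thread, and the comments describe how the regex engine groups each pattern; treat it as an illustration rather than a drop-in fix.

```apache
# Ungrouped alternation (shown commented out):
# "^" applies only to the first alternative, and nothing is end-anchored.
# The pattern is read as:
#   (^file1\.zip) OR (file2\.dmg\.zip) OR (file3\.sit) OR (file4\.tar\.gz)
# so "file1.zip" must start the path, while the other names may appear anywhere in it.
# RewriteCond $1 !^file1\.zip|file2\.dmg\.zip|file3\.sit|file4\.tar\.gz

# Grouped alternation:
# "^" and "$" apply to the whole group, so each name must match the path exactly.
RewriteCond $1 !^(file1\.zip|file2\.dmg\.zip|file3\.sit|file4\.tar\.gz)$
RewriteRule (.*) http://www.newsite.com/ [R=301,L]
```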

Try the following changes, replacing two rules with one, and see if they get rid of most of the problems:


Options +FollowSymLinks
RewriteEngine on
#
# Redirect all requests to corresponding URL on newsite.com, except for requests for the
# mysql subdomain, files in "/path/to/directory" and specific .zip, .sit, and .gz files
RewriteCond %{HTTP_HOST} !^mysql\.oldsite\.com
RewriteCond $1 !^path/to/directory
RewriteCond $1 !^(file1\.zip¦file2\.dmg\.zip¦file3\.sit¦file4\.tar\.gz)$
RewriteRule (.*) http://www.newsite.com/$1 [R=301,L]

Replace the broken pipe "¦" characters above with solid pipes before use; posting on this forum modifies the pipe characters.

Check my comments. The code will do exactly what they say, and that may or may not be what you want. Please carefully and concisely comment any code you post back here, to make your intent clear.

Make sure that all URL-paths in the regex patterns are correct and complete. URL-paths must be relative to the directory where this code resides, and must not include the protocol or domain name.
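
As a rough illustration of what "relative to the directory where this code resides" means, the sketch below assumes the .htaccess file sits in the document root of oldsite.com. The file name page.php is hypothetical and only serves as an example request.

```apache
# Assumption: this .htaccess is in the document root of oldsite.com.
#
# Requested URL:             http://www.oldsite.com/path/to/directory/page.php
# %{HTTP_HOST} holds:        www.oldsite.com
# RewriteRule pattern sees:  path/to/directory/page.php
#
RewriteEngine on
# Ineffective: the tested URL-path never contains the protocol or hostname,
# so this pattern never matches and the negated condition is always true.
# RewriteCond $1 !^http://www.oldsite.com/path/to/directory
#
# Effective: test the path alone, with no leading slash.
RewriteCond $1 !^path/to/directory/
RewriteRule (.*) http://www.newsite.com/ [R=301,L]
```

The hostname, if it needs testing at all, belongs in a separate condition on %{HTTP_HOST}.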

I cannot comment on your questions about "particular php files" and such, because I don't know how you define those terms. Apache mod_rewrite works based on URLs or their associated filepaths, and everything must be defined in terms of either the client-requested HTTP_HOST and URL-path, or the corresponding internal file-path. If you need to exclude a particular URL-path, directory, or file, then it must be explicitly named -- both here for clear discussion, and also in the code.

[edit] Cross-posted. See code update. [/edit]

Jim

[edited by: jdMorgan at 4:19 pm (utc) on Feb. 21, 2009]

Sardine

4:18 pm on Feb 21, 2009 (gmt 0)

10+ Year Member



oops, forgot to add one thing -- I did a little more research and learned that my redirects don't affect email, so that's solved. Hooray!

Sardine

5:08 pm on Feb 21, 2009 (gmt 0)

10+ Year Member



Thanks for your help, Jim, I really really appreciate it.

There are some things I was not clear on that I'll clarify here: The new site is entirely different (pages on the new site don't correspond to pages on the old site), so I can't rewrite to newsite.com/$1 -- not a problem, redirecting to the main page is totally fine by me and desired behaviour.

About your mysql condition+rule -- sorry, I was unclear on that. I need both [any requests to the mysql database] and [any requests to the php directory] to be ignored and served without redirect/rewrite.

I've tested this code further and have to change my assessment to: it works, sort of:


Options +FollowSymLinks
RewriteEngine on
RewriteCond %{HTTP_HOST} ^mysql\.oldsite\.com
RewriteRule .* - [L]
RewriteCond $1 !^file1\.zip¦file2\.dmg\.zip¦file3\.sit¦file4\.tar\.gz¦path/to/directory
RewriteRule (.*) http://www.newsite.com/ [R=301,L]

Using this code, almost everything functions exactly as I'd like it to. The only problem is that the first file specified in the second RewriteCond line never works (it simply redirects to the main page of newsite), but if it is repeated at the end of that condition, it functions as I'd like (doesn't redirect). But obviously, that's hacky and inelegant.

So I tried making my code more closely match yours.
If I enclose the files and directory in () and add a $ to the end of the line, so:


RewriteCond $1 !^(file1\.zip¦file2\.dmg\.zip¦file3\.sit¦file4\.tar\.gz¦path/to/directory)$

then it stops working. Similarly, if I add ! to the mysql condition, like so:


RewriteCond %{HTTP_HOST} !^mysql\.oldsite\.com

or any combination of those two changes, then all redirects stop working -- the old site is up and visible as if there was nothing in my .htaccess.

I know there's something wrong with the syntax, since removing


RewriteRule .* - [L]

makes everything stop working, but shouldn't that omission cause everything to redirect, including the files and directories I'd like to exclude?

So... I'm still trying to figure out how my hacky code functions when it is so obviously faulty.

jdMorgan

5:36 pm on Feb 21, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Use what I posted in its entirety and exactly as shown except for the obscured "example" domain name, and please do not try to "guess" at solutions. The chances of guessing correct code are essentially zero.

[quote]
If I enclose the files and directory in () and add a $ to the end of the line, so:

RewriteCond $1 !^(file1\.zip¦file2\.dmg\.zip¦file3\.sit¦file4\.tar\.gz¦path/to/directory)$

then it stops working.
[/quote]


That clearly indicates that these URL-paths are incorrect or incomplete. Also, you have incorrectly assumed that you could add "¦path/to/directory" to that line. You can't, because then only *exactly* "example.com/path/to/directory" will be excluded, and not "example.com/path/to/directory/" or "example.com/path/to/directory/<some directory or file>". Please do not make changes to the code structure, just test it as-is and report the results, so we here can all follow along.

Look carefully at those URL-paths, though:

[quote]
Make sure that all URL-paths in the regex patterns are correct and complete. URL-paths must be relative to the directory where this code resides, and must not include the protocol or domain name.
[/quote]

If the comments and code I posted don't make sense, or if you do not understand exactly what the directives do or how they work, or if my comments do not agree *exactly* with what you are trying to do, then stop and let's address these issues before doing any more code tweaks. Coding and testing are the last things you do; precisely defining the problem and researching all the correct variable values has to be done first. Rushing through the requirements and research phases directly to coding almost always results in a huge waste of time.

Jim

Sardine

5:52 pm on Feb 21, 2009 (gmt 0)

10+ Year Member



Can you explain to me why it works, even though the syntax is so wrong? That doesn't make sense to me.

Okay I decided to remove the mysql database. That cleared some things up for me.

This works:


Options +FollowSymLinks
RewriteEngine on
RewriteCond $1 !^path/to/directory
RewriteCond $1 !^(file1\.tar\.gz¦file2\.dmg\.zip¦file3\.sit¦file4\.zip)$
RewriteRule (.*) http://www.newsite.com/ [R=301,L]

Does that address people entering 'path\to\directory' with a '\' and without?

Thank you again for your help!

jdMorgan

6:02 pm on Feb 21, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



[quote]
Does that address people entering 'path\to\directory' with a '\' and without?
[/quote]

No, because backslashes are not used in URLs. However, it will "address" people entering path/to/directory<anything>, because that pattern is no longer end-anchored.
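
The difference between the end-anchored group tried earlier in the thread and a start-anchored directory prefix can be sketched as follows. The paths are the thread's placeholders and the pipes are written as solid characters.

```apache
# End-anchored inside the group (shown commented out):
# excludes only the exact path "path/to/directory", not "path/to/directory/"
# and not any file or subdirectory below it.
# RewriteCond $1 !^(file1\.zip|file2\.dmg\.zip|file3\.sit|file4\.tar\.gz|path/to/directory)$

# Start-anchored only: excludes every path that begins with "path/to/directory".
RewriteCond $1 !^path/to/directory
# Exact-match list for the individual files, kept as a separate condition.
RewriteCond $1 !^(file1\.zip|file2\.dmg\.zip|file3\.sit|file4\.tar\.gz)$
RewriteRule (.*) http://www.newsite.com/ [R=301,L]
```

Note that a bare prefix also matches sibling names such as "path/to/directory-old"; adding a trailing slash to the pattern would narrow it to the directory's contents, at the cost of no longer excluding a request for the directory without the slash.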

Please define all of your statements in terms of URLs or file-paths. What URL(s) are used for "the mysql database"?

Jim

[edited by: jdMorgan at 6:03 pm (utc) on Feb. 21, 2009]

Sardine

6:24 pm on Feb 21, 2009 (gmt 0)

10+ Year Member



Sorry, that URL is still the same:

mysql.oldsite.com

Jim, thanks again for your help -- frankly, I'm used to programming with a debugger, or the ability to trace strings and vars -- when I have to program blind like this, research, trial and error, plus a little deductive reasoning seemed like the way to go. Thanks for helping me, and all of us, make sense of this.

jdMorgan

6:33 pm on Feb 21, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



So what code did you end up with after including the mysql hostname?

Jim

Sardine

7:14 pm on Feb 21, 2009 (gmt 0)

10+ Year Member



Well, to be honest, I sort of cheated -- because of the particulars of oldsite, I discovered that I can avoid having to exclude mysql.oldsite.com by simply moving the .htaccess file from oldsite.com/ to oldsite.com/folder1/ (which happens to be where all my other exclusions are located) and then updating paths to affected files/directories in the RewriteCond lines. This leaves oldsite.com/index.html as it always was, but the nature of the site is such that the only things that *really* need to be redirected are located in oldsite.com/folder1, so it's actually fine this way.
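
A sketch of what the relocated file might look like. It assumes the excluded directory and the four files live directly under folder1, which the thread does not confirm, so the paths are placeholders.

```apache
# Hypothetical location: oldsite.com/folder1/.htaccess
#
# Patterns in this file are matched against the path with "folder1/" already
# stripped, so a request for /folder1/path/to/directory/x.php is tested as
# "path/to/directory/x.php". Requests outside /folder1/ never reach these rules.
Options +FollowSymLinks
RewriteEngine on
RewriteCond $1 !^path/to/directory
RewriteCond $1 !^(file1\.tar\.gz|file2\.dmg\.zip|file3\.sit|file4\.zip)$
RewriteRule (.*) http://www.newsite.com/ [R=301,L]
```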

Ultimately, I am still not sure what the correct solution is, which does disappoint me... However, it looks like my problems have been pretty much solved, for which I am extremely grateful. Thanks again for your patience and help!
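
For reference, a minimal sketch of how the pieces discussed in this thread might be combined in a document-root .htaccess. It is Jim's single-rule structure with the redirect target changed to the new site's home page. It assumes that mysql.oldsite.com is served from the same document root, and it uses the thread's placeholder paths with solid pipes.

```apache
Options +FollowSymLinks
RewriteEngine on
# Skip the redirect for the database hostname.
RewriteCond %{HTTP_HOST} !^mysql\.oldsite\.com [NC]
# Skip the redirect for anything under the PHP directory.
RewriteCond $1 !^path/to/directory
# Skip the redirect for the four downloadable files (exact names only).
RewriteCond $1 !^(file1\.zip|file2\.dmg\.zip|file3\.sit|file4\.tar\.gz)$
# All conditions are ANDed; any other request goes to the new home page.
RewriteRule (.*) http://www.newsite.com/ [R=301,L]
```

If mysql.oldsite.com is used only as a database host by the PHP scripts, those connections presumably use the MySQL protocol rather than HTTP and never pass through Apache, in which case the first condition is harmless but unnecessary.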