minor caveat to working rewrites

         

ragnarokx

4:26 am on Jun 11, 2010 (gmt 0)

10+ Year Member



Hi! I've been searching the internet and this looks like the most informative Apache forum, so here goes (2 questions):

1) I finally made a working .htaccess file to force https on my password protected pages and http on all other pages. Here it is:


RewriteEngine On
RewriteBase /

RewriteCond %{SERVER_PORT} !^443$
RewriteRule ^(login/(.*)|membersection/(.*)|eboardarchives/(.*)|)$ [%{HTTP_HOST}%{REQUEST_URI}...] [R=301,L]

RewriteCond %{SERVER_PORT} ^443$
RewriteRule !^(login/(.*)|membersection/(.*)|eboardarchives/(.*)|)$ http://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]


Example: [csun.edu...]
This works great, except that when http://www.csun.edu/phide/ is requested, it redirects to https://www.csun.edu/phide/ and I can't figure out why. How do I modify my code to force non-secure for the "naked root" URL?

2) The original address for my website was "http://www.csun.edu/~phide/..." and I got the alias "http://www.csun.edu/phide/..." from my school's IT dept.
How do I redirect from "/~phide/" to "/phide/" for the entire site/all subdirectories? Is there a way to do this without specifying a specific redirect to "http://www.csun.edu/phide/..." since some of my pages use http and some use https?

This is a website for my pre-med/philanthropy fraternity. Any help is greatly appreciated!

[edited by: jdMorgan at 3:03 pm (utc) on Jun 11, 2010]
[edit reason] Disabled smiley-faces in code [/edit]

g1smd

6:57 am on Jun 11, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The "|)" at the end of the pattern is responsible for the redirect when "/" is requested. The empty alternative after the final | matches "or blank path".

The pattern is maybe better expressed as

^(login|membersection|eboardarchives)/(.*)$


or just

^(login|membersection|eboardarchives)/


if you don't need to capture a backreference.


Beware that your current ruleset uses %{HTTP_HOST} in the target and therefore does not fix any www and non-www duplicate content issues. The www.example.com hostname should be explicitly coded into the rule's target URL to fix that particular problem.
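Putting those two points together, a minimal sketch of the revised ruleset might look like the following. It assumes the rules stay in the site's own .htaccess file, and www.example.edu is only a placeholder for the real canonical hostname.

```apache
RewriteEngine On
RewriteBase /

# Force https for the three protected sections.
# The pattern no longer ends in an empty alternative, so a blank path does not match.
RewriteCond %{SERVER_PORT} !^443$
RewriteRule ^(login|membersection|eboardarchives)/ https://www.example.edu%{REQUEST_URI} [R=301,L]

# Force http for everything else, including the bare directory root.
# The hostname is hard-coded (placeholder) instead of echoing %{HTTP_HOST}.
RewriteCond %{SERVER_PORT} ^443$
RewriteRule !^(login|membersection|eboardarchives)/ http://www.example.edu%{REQUEST_URI} [R=301,L]
```

With the empty alternative gone, a request for the directory root falls through the first rule and is only touched by the second rule when it arrives on port 443.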

[edited by: jdMorgan at 3:03 pm (utc) on Jun 11, 2010]
[edit reason] Changed reference to smiley-faces. [/edit]

ragnarokx

7:17 am on Jun 11, 2010 (gmt 0)

10+ Year Member



Great! That worked perfectly, thank you. Now for my other obstacle: how do I always redirect "csun.edu/~phide/..." to "csun.edu/phide/..."?

g1smd

7:35 am on Jun 11, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You could do this with a single redirect ahead of your existing redirects and retain the originally requested protocol, but that might lead to a redirection chain for some requests.

The alternative is to have two rules ahead of the existing rules. One redirects specific requests to https at the new URL, and the other redirects specific requests to http at the new URL. They will each need a "list" like the existing two rules have.
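As a rough sketch of that two-rule alternative (not code from this thread), the pair below would go ahead of the existing rules. It assumes the file is the site's own .htaccess, that %{REQUEST_URI} still shows the originally requested /~phide/ path there, and that www.example.edu is a placeholder hostname. This particular sketch picks the target protocol from the path list rather than from the port.

```apache
# Secure sections requested under /~phide/ go straight to https at /phide/
RewriteCond %{REQUEST_URI} ^/~phide/((login|membersection|eboardarchives)/.*)$
RewriteRule ^ https://www.example.edu/phide/%1 [R=301,L]

# Anything else requested under /~phide/ goes straight to http at /phide/
RewriteCond %{REQUEST_URI} ^/~phide(/.*)?$
RewriteRule ^ http://www.example.edu/phide%1 [R=301,L]
```

Each request under /~phide/ is then answered with a single redirect to its final protocol and path, which is what avoids the redirection chain.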

ragnarokx

8:28 am on Jun 11, 2010 (gmt 0)

10+ Year Member



The alternative is to have two rules ahead of the existing rules. One redirects specific requests to https at the new URL, and the other redirects specific requests to http at the new URL. They will each need a "list" like the existing two rules have.


I've been searching through the forum and found a few examples. I hope this is roughly what you were talking about:

#RewriteRule ^https://www.csun.edu/~phide(.*)$ [csun.edu...] [R]
#RewriteRule ^http://www.csun.edu/~phide(.*)$ http://www.csun.edu/phide(.*) [R]

I tried this and it didn't work, but hopefully I'm close.

Beware that your current ruleset uses %{HTTP_HOST} in the target and therefore does not fix any www and non-www duplicate content issues. The www.example.com hostname should be explicitly coded into the rule's target URL to fix that particular problem.


Sorry, I missed your edit to your previous post. If requesting "http://csun.edu/phide/..." always gives a 404 error with "Not Found The requested URL /phide was not found on this server." do I still need to worry about duplicate content? And thanks so much for your help!

g1smd

9:39 am on Jun 11, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



OK, now that you've seen the 404 error for that, you know why the /phide/ requests were redirected to https: so that the visitor got where they wanted anyway.

I am not sure why you coded the new rules the way you did. There are multiple errors in there that you didn't make in your original post.

The RegEx pattern cannot 'see' domain names.

[R] should be [R=301,L].

You'll need the RewriteCond, like last time, to test the port number.

You'll also need to test REQUEST_URI for a match with "~phide" and redirect if true.

In fact, I was expecting to see two new rules based on the code in your very first post, modified to detect the "~phide", etc.

jdMorgan

4:11 pm on Jun 11, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



If I understand the problem, then it takes something like...

# Redirect non-SSL requests for secure pages to https
RewriteCond %{SERVER_PORT} !=443
RewriteRule ^(~([^/]+/))?((login|membersection|eboardarchives)(/.*)?)$ https://www.example.edu/$2$3 [R=301,L]
#
# Redirect SSL requests for non-secure pages to http
RewriteCond %{SERVER_PORT} =443
RewriteCond $1 !^(~[^/]+/)?(login|membersection|eboardarchives)(/.*)?$
RewriteRule ^(~([^/]+/))?(.*)$ http://www.example.edu/$2$3 [R=301,L]
#
# Redirect all other ~userdir requests to /userdir/ retaining original protocol
RewriteCond %{HTTP_PORT}s ^(443(s)|[0-9]+s)$
RewriteRule ^~(.+)$ http%2://www.example.edu/$1 [R=301,L]
#
# Redirect all other non-canonical hostname requests to canonical hostname, retaining original protocol
RewriteCond %{HTTP_HOST} !^(www\.example\.edu)?$
RewriteCond %{HTTP_PORT}s ^(443(s)|[0-9]+s)$
RewriteRule ^~(.+)$ http%2://www.example.edu/$1 [R=301,L]

The optional-subpattern "(~[^/]+/)?" you see in several places represents the "userdir" and its trailing slash. If it is present in the requested URL-path, it is copied into the redirect URL, dropping the leading "~".

Note that the code above depends on a big assumption. And that is that this code is placed in /userdir/.htaccess, and that URL-path-part /~userdir/ is mapped to filepath-part /userdir/ at the server config level.

Also, it assumes that no objects such as images, favicons, CSS files, or external JavaScripts are shared between the SSL and non-SSL pages. If they are, then the URL-paths for these objects will need to be excluded from the first two redirects above, either by directory location or by "filetype." The filetype exclusion may be explicit, or perhaps you could implement it in the RewriteRule by requiring ".html" and/or ".php" extensions on the RewriteRule pattern if any "filetype" is present in the request. I'd prefer the explicit exclusion myself, as it is a lot less limiting for future needs. Something like:

RewriteCond $1 !\.(gif|jpe?g|png|ico|css|js)$

This exclusion need only list filetypes that are to be shared between SSL and non-SSL pages, in order to prevent "Mixed secure/non-secure content" warnings in the browser.
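For illustration only, here is one way the exclusion might be slotted into a pair of https/http rules. This sketch tests %{REQUEST_URI} rather than $1 so that it does not depend on which group the RewriteRule captures. The hostname www.example.edu is a placeholder, and the extension list is the one given above.

```apache
# Redirect non-SSL requests for secure pages to https, leaving shared assets alone
RewriteCond %{SERVER_PORT} !=443
RewriteCond %{REQUEST_URI} !\.(gif|jpe?g|png|ico|css|js)$
RewriteRule ^(login|membersection|eboardarchives)/ https://www.example.edu%{REQUEST_URI} [R=301,L]

# Redirect SSL requests for non-secure pages to http, leaving shared assets alone
RewriteCond %{SERVER_PORT} =443
RewriteCond %{REQUEST_URI} !\.(gif|jpe?g|png|ico|css|js)$
RewriteRule !^(login|membersection|eboardarchives)/ http://www.example.edu%{REQUEST_URI} [R=301,L]
```

Requests for the listed filetypes are then served on whichever protocol the page asked for, so neither rule bounces them.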

I threw the last rule in as a 'freebie' to force hostname canonicalization.

I hope all the parentheses and subpatterns are correct... It was a bit difficult to 'compact' the rules from six to four for efficiency, and my brain hurts.
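One detail worth checking is the protocol-retaining condition. The sketch below isolates that trick and tests %{SERVER_PORT}, the variable used in the first two rules and the one listed in the mod_rewrite documentation. As far as I know, HTTP_PORT is not a variable mod_rewrite defines, so treat the substitution as an assumption to verify on your own server. The RewriteRule line is kept as posted and still relies on the placement assumption stated above.

```apache
# Append a literal "s" to the port number before matching.
# For port 443 the first alternative matches and %2 captures that "s".
# For any other port the second alternative matches and %2 is empty.
RewriteCond %{SERVER_PORT}s ^(443(s)|[0-9]+s)$
# "http%2" therefore expands to "https" on port 443 and "http" otherwise.
RewriteRule ^~(.+)$ http%2://www.example.edu/$1 [R=301,L]
```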

Jim

ragnarokx

8:44 pm on Jun 11, 2010 (gmt 0)

10+ Year Member



First off thank you for taking the time to type that code! I've spent the last several hours examining it and learning what almost all of its language means (or tried at least).

The optional-subpattern "(~[^/]+/)?" you see in several places represents the "userdir" and its trailing slash. If it is present in the requested URL-path, it is copied into the redirect URL, dropping the leading "~".

Note that the code above depends on a big assumption. And that is that this code is placed in /userdir/.htaccess, and that URL-path-part /~userdir/ is mapped to filepath-part /userdir/ at the server config level.


The .htaccess file is in a folder called "phide", which is a shared/group directory for my fraternity and the "root folder" where all my webpage files are located. The default URL for a directory on my school's server is csun.edu/~folder/, so my website folder "phide" is linked to URL /~phide/. It is also linked to URL /phide/ (the alias I had to apply for). So I hope we're good there.

# Redirect non-SSL requests for secure pages to https
RewriteCond %{SERVER_PORT} !=443
RewriteRule ^(~([^/]+/))?((login|membersection|eboardarchives)(/.*)?)$ [example.edu...] [R=301,L]
#
# Redirect SSL requests for non-secure pages to http
RewriteCond %{SERVER_PORT} =443
RewriteCond $1 !^(~[^/]+/)?(login|membersection|eboardarchives)(/.*)?$
RewriteRule ^(~([^/]+/))?(.*)$ http://www.example.edu/$2$3 [R=301,L]
#
# Redirect all other ~userdir requests to /userdir/ retaining original protocol
RewriteCond %{HTTP_PORT}s ^(443(s)|[0-9]+s)$
RewriteRule ^~(.+)$ http%2://www.example.edu/$1 [R=301,L]
#
# Redirect all other non-canonical hostname requests to canonical hostname, retaining original protocol
RewriteCond %{HTTP_HOST} !^(www\.example\.edu)?$
RewriteCond %{HTTP_PORT}s ^(443(s)|[0-9]+s)$
RewriteRule ^~(.+)$ http%2://www.example.edu/$1 [R=301,L]


I put this code into my .htaccess file (with "RewriteEngine On" and "RewriteBase /" preceding it) but ran into a couple of problems.

When a secure request is redirected to http, the "phide" or "~phide" is always removed. So instead of "http://www.csun.edu/~phide/about/about.html" it points to "http://www.csun.edu/about/about.html" giving a 404 error.

The redirect from /~phide/ to /phide/ does not seem to be working at all.

The (login|membersection|eboardarchives) pages don't seem to redirect to https, but they do redirect and remove the /~phide/ or /phide/ giving a 404 error.


I tried to modify the code you gave and came up with this:

RewriteEngine On
RewriteBase /

# Redirect non-SSL requests for secure pages to https
RewriteCond %{SERVER_PORT} !=443
# Next line added to stop excising of phide; before: "csun.edu/login/" now: "csun.edu/phide/login/"
# RewriteCond $1 ^(~[^/]+/)?(login|membersection|eboardarchives)(/.*)?$

RewriteRule ^(~([^/]+/))?((login|membersection|eboardarchives)(/.*)?)$ [csun.edu...] [R=301,L]
#
# Redirect SSL requests for non-secure pages to http
RewriteCond %{SERVER_PORT} =443
RewriteCond $1 !^(~[^/]+/)?(login|membersection|eboardarchives)(/.*)?$
# Redirects were not inserting /phide/ so added it to the end of http://csun.edu in next line
# But redirect loop for (login|membersection|eboardarchives) caused by next line?

RewriteRule ^(~([^/]+/))?(.*)$ http://www.csun.edu/phide/$2$3 [R=301,L]
#
# Redirect all other ~userdir requests to /userdir/ retaining original protocol
RewriteCond %{HTTP_PORT}s ^(443(s)|[0-9]+s)$
RewriteRule ^~(.+)$ http%2://www.csun.edu/$1 [R=301,L]
#
# Redirect all other non-canonical hostname requests to canonical hostname, retaining original protocol
RewriteCond %{HTTP_HOST} !^(www\.csun\.edu)?$
RewriteCond %{HTTP_PORT}s ^(443(s)|[0-9]+s)$
RewriteRule ^~(.+)$ http%2://www.csun.edu/$1 [R=301,L]

Secure and non-secure pages do share CSS, JS, etc but I did not yet add in "RewriteCond $1 !\.(gif|jpe?g|png|ico|css|js)$" because I didn't want to complicate things more before getting them working.

With my revisions I was able to fix the missing /~phide/ or /phide/ after redirects, but now get a redirect loop error when requesting the (login|membersection|eboardarchives) pages. I'm guessing this is due to the $2$3 values being the same in the redirect URL for both http and https rules, but I could not figure out how to correctly change this.

I really tried, but I have a sinking feeling my corrections will cause more problems than good. I've left your unaltered code in the .htaccess file if you'd like to observe the website behavior. I'm at the forum's mercy - any ideas?

ragnarokx

12:00 am on Jun 15, 2010 (gmt 0)

10+ Year Member



I reverted to the first revision of my code in this thread and added a Redirect 301. This appears to solve all my problems (/~phide/ redirects to /phide/, https is forced for 3 pages and http is forced for all else). The only thing is, I've read online that it is not good to mix Redirect and RewriteRule in the same .htaccess file.


RewriteEngine On
RewriteBase /

# redirects /~phide/ to /phide/
Redirect 301 /~phide [csun.edu...]

# redirects login/.*, membersection/.*, eboardarchives/.* to secure server
RewriteCond %{SERVER_PORT} !^443$
RewriteRule ^(login|membersection|eboardarchives)/(.*)$ [%{HTTP_HOST}%{REQUEST_URI}...] [R=301,L]

# redirects NOT {above} to non-secure server
RewriteCond %{SERVER_PORT} ^443$
RewriteRule !^(login|membersection|eboardarchives)/(.*)$ [%{HTTP_HOST}%{REQUEST_URI}...] [R=301,L]


Would I be ok sticking with this? If not, what would be the RewriteRule equivalent for my Redirect 301?

jdMorgan

1:44 am on Jun 15, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I don't recommend mixing mod_alias "Redirects" with mod_rewrite RewriteRules.

The reason is that you never know which modules will execute first, so you don't know which of your redirects and rewrites will be applied first.

Directive execution is per-module, not in the line-by-line order of your config code.

I'd suggest changing your new Redirect 301 to use a RewriteRule.
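A minimal sketch of what that could look like, not taken from the thread: two rules that replace the Redirect 301 and keep whichever protocol was requested. It assumes the rules sit in the same .htaccess file ahead of the existing https/http rules, that www.example.edu stands in for the real hostname, and that the /phide/ alias is not itself implemented as an internal rewrite to /~phide/. If it is, testing %{THE_REQUEST} instead of %{REQUEST_URI} would be needed to avoid a loop.

```apache
# http request for /~phide/... -> same path under /phide/, still http
RewriteCond %{SERVER_PORT} !^443$
RewriteCond %{REQUEST_URI} ^/~phide(/.*)?$
RewriteRule ^ http://www.example.edu/phide%1 [R=301,L]

# https request for /~phide/... -> same path under /phide/, still https
RewriteCond %{SERVER_PORT} ^443$
RewriteCond %{REQUEST_URI} ^/~phide(/.*)?$
RewriteRule ^ https://www.example.edu/phide%1 [R=301,L]
```

Here %1 refers to the group captured by the last matched RewriteCond, which is the part of the path after /~phide. A request that arrives on the wrong protocol for its section would then take a second hop through the existing https/http rules, which is the redirection chain mentioned earlier in the thread.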

Jim