Apache Web Server Forum
Mod Rewrite - http to https + vice versa .htaccess rewrite
Can anyone help please?
shadowlight
Msg#: 4577614 posted 11:19 pm on May 24, 2013 (gmt 0)

OK, I have been setting up some rewrites using .htaccess. Basically I want some pages to be forced to use https and others to use http. This is what I have so far:


## FORCE SSL CONNECTION

RewriteCond %{SERVER_PORT} 80
RewriteCond %{REQUEST_URI}%{REQUEST_FILENAME} ^(/dirc/(.*)/)$
RewriteRule ^(.*)$ https://%{HTTP_HOST}%{REQUEST_URI} [R=301]

## FORCE NON SSL CONNECTION

RewriteCond %{SERVER_PORT} 443
RewriteCond %{REQUEST_URI} !^(/dirc/(.*)/)$
RewriteCond %{REQUEST_FILENAME} !-f

RewriteRule ^(.*)$ http://%{HTTP_HOST}%{REQUEST_URI} [R=301]


The above code works as intended when I visit www.example.com (the index page) coming from https://www.example.com/dirc/direx/, in that it changes the protocol and http://www.example.com is displayed.

Also, when I visit http://www.example.com/dirc/direx/, it successfully redirects to https://www.example.com/dirc/direx/ and a secure connection is made.

However, if I visit any other page apart from www.example.com (the index page) from https://www.example.com/dirc/direx/, it breaks and I get a non-secure connection, e.g. https://www.example.com/anyotherpage.php.

Can anyone help? I've spent hours trying to figure this out!

TIA

 

lucy24
Msg#: 4577614 posted 1:30 am on May 25, 2013 (gmt 0)

Before anything else: Can you please add an [L] flag to each of those rulesets? I want to be rock-certain that everything is happening in the intended rule, rather than spilling over to a following rule. ([R] does not imply [L]. Counter-intuitive, but there you are.)
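
On your first rule, for example, that just means adding the flag alongside the existing [R=301]:

RewriteRule ^(.*)$ https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]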

Why are you capturing the request in each pattern (.*) if you're not going to use it in the target?

What's the significance of !-f in the second condition?

I'm also a bit leery about the mismatch between checking port number in the condition vs. redirecting with a protocol in the target. But then, https makes me nervous anyway ;)

shadowlight
Msg#: 4577614 posted 7:18 am on May 25, 2013 (gmt 0)

The script performs the same whether I check the port number or check the protocol using RewriteCond %{HTTPS} off etc.

If I add the [L] flag I get a 500 internal server error?

I wouldn't use (.*) in the last target anyway (FORCE NON SSL), would I? Because I am only redirecting from https to http if %{REQUEST_URI} does not match ^(/dirc/(.*)/)$.

I am new to mod rewrite and I don't really know what I am doing tbh. I put this together from info gathered via search.

Basically I want everything in /dirc, including subdirectories, files, etc., to be forced to use https, and everything else to use the http protocol. Files in /dirc actually use includes to pull in JavaScript, CSS, etc. that reside in another directory as well.

As for using !-f in the second condition: all I know is that if I don't, then the relevant page uses the https protocol but there's no secure connection (padlock) without it!

lucy24
Msg#: 4577614 posted 5:22 pm on May 25, 2013 (gmt 0)

As for using !-f in the second condition: all I know is that if I don't, then the relevant page uses the https protocol but there's no secure connection (padlock) without it!

Now, that makes no sense at all, because
!-f
means "if the request is for a file that does not physically exist". This in turn implies that some other part of the rule is not doing what it's supposed to do, since your https status is supposed to be based purely on "location". (I put this in quotes because when Apache says "location" it doesn't mean the physical location of a file or directory; it means some part of the URL path.)

If I add the [L] flag I get a 500 internal server error?

Uh-oh, that's bad news. Did you understand that I meant [L] IN ADDITION TO the existing [R] flag?

I wouldn't use (.*) in the last target anyway (FORCE NON SSL), would I? Because I am only redirecting from https to http if %{REQUEST_URI} does not match ^(/dirc/(.*)/)$.

It's always safer to use captures from the body of the rule rather than from a condition. If the condition doesn't match, the whole rule is simply ignored. The only reason you need the Condition at all is that a capture inside a negated pattern, !(blahblah), will always be empty: you have to put the capture () and the negation ! in different places.

So everything in ^directory gets https and everything else gets http? Then you can shift things into the body of the rule, as in

RewriteCond {not-https stuff here}
RewriteRule ^(directory/.*) https://www.example.com/$1 [R=301,L]

paired with

RewriteCond {http stuff here}
RewriteCond %{REQUEST_URI} !^/directory
RewriteRule (.*) http://www.example.com/$1 [R=301,L]

Yes, in htaccess you omit the leading / from the pattern but include it in %{REQUEST_URI}.

The optimal http(s) redirect also cross-checks the port number to ensure that anything using 80 is http and anything on 443 is https. There are earlier threads on this subject, but it isn't one of the Top Ten Questions Asked Day In And Day Out :: realizing after-the-fact that I could have just typed that as-is and then used Title Case function :: so it may involve a little searching.
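
Filling in the placeholders above with that cross-check, it might look something like this (a sketch only, not gospel; adjust the directory name and hostname to your own):

RewriteCond %{HTTPS} !on [OR]
RewriteCond %{SERVER_PORT} !^443$
RewriteRule ^(directory/.*) https://www.example.com/$1 [R=301,L]

RewriteCond %{HTTPS} on [OR]
RewriteCond %{SERVER_PORT} ^443$
RewriteCond %{REQUEST_URI} !^/directory
RewriteRule (.*) http://www.example.com/$1 [R=301,L]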

Does your https directory contain non-page files such as images used by https pages? Or is it strictly for pages? Will there be included material from other directories? I ask this because browsers sometimes make a fuss about encrypted pages with "non-encrypted content".

shadowlight
Msg#: 4577614 posted 7:59 am on May 27, 2013 (gmt 0)

Thank you for the reply. Yes, the pages in the https directory contain non-page files that are included from other directories, such as CSS, JS and images. However, I have tested in Opera, FF, IE, Safari and Chrome; no warnings or errors appear, and I get the SSL connection with padlock OK.

It's just that when I click from the secure page to what I want to be non-secure pages (apart from the homepage, which works fine and uses the http protocol), it is still trying to make the connection using the https protocol. That obviously comes with warnings saying not all the content is secure, and it's not what I want; I want these pages to use http.

Yes I knew you meant [L] in addition to the [R] flag. This is in place now and does not cause 500 errors.

I am still suffering from the original problem though. More experimenting with this today. I have spent approx. 2.5 days on this over the past week and still do not have it functioning correctly!

I will get there eventually! I hope lol

lucy24
Msg#: 4577614 posted 9:27 am on May 27, 2013 (gmt 0)

It's just that when I click from the secure page to what I want to be non-secure pages (apart from the homepage, which works fine and uses the http protocol), it is still trying to make the connection using the https protocol. That obviously comes with warnings saying not all the content is secure, and it's not what I want; I want these pages to use http.

Isn't that your
!-f
flag at work?

The Condition says "only deploy this rule when the request is for a file that doesn't exist". So if you ask for a file that does exist-- this is assuming you've got real physical pages in those other directories-- the rule won't execute. I kinda think the front page is exempt because you didn't really ask for it by name ("index.html"). What about other directory index pages (linked with final / not as "index.html")?

As for using !-f in the second condition: all I know is that if I don't, then the relevant page uses the https protocol but there's no secure connection (padlock) without it!

The more I look at this passage, the more it gives me a headache. I think it's got too many negatives :) If you leave out the !-f does the whole rule flipflop? That is, https for the front page but http for everything else? Or do all requests then get handled the same?

shadowlight
Msg#: 4577614 posted 9:41 am on May 27, 2013 (gmt 0)

lol, it's giving me a headache full stop!

If I leave out the !-f then the pages I want to be https actually try to be https, but I get warnings about not all the content being secure. The rest of it works as I want it to.

So basically, if I leave out the !-f the whole thing works as it should but without a totally secure connection: no padlock on the pages that need to be secure, but all other URLs are redirecting as I want them to, using http.

If I leave in the !-f then I get the padlock on the pages I want, but all other pages requested from any secure page are requested using the https protocol with security warnings (apart from the homepage), as opposed to using the http protocol as I would like.

shadowlight
Msg#: 4577614 posted 10:48 am on May 27, 2013 (gmt 0)

How do I create a condition and rule so that any files requested via an https secure page are requested using https instead of http, regardless of where they reside in the directory structure?

Does anyone know?

lucy24
Msg#: 4577614 posted 11:19 am on May 27, 2013 (gmt 0)

any files requested via an https secure page

The simplest way is to look at the referer. It won't work if the user doesn't send a referer-- browser privacy settings, type of connection, et cetera. But for everyone else it's

RewriteCond %{SERVER_PORT} 80
RewriteCond %{HTTP_REFERER} /name-of-protected-directory/
RewriteRule \.(css|js|png|jpg)$ https://www.example.com%{REQUEST_URI} [R=301,L]

Constrain the rule to everything other than pages, so you list the non-page extensions that you actually use.

The referer for a non-page file is the page you're currently on. So this is really just a spin on the classic hotlinking routine where you divide referers into Good Guys and everyone else.

Matter of fact this is why I asked whether you have non-page files in the secure directory. If it isn't convenient to move them elsewhere, you should probably make a mirror-imaged rule for those files: make an http connection if the referring page is in a non-secure directory.
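
Something along these lines, perhaps (a sketch only; the extension list and directory name are placeholders for whatever you actually use):

RewriteCond %{SERVER_PORT} ^443$
RewriteCond %{HTTP_REFERER} !/name-of-protected-directory/
RewriteRule \.(css|js|png|jpg)$ http://www.example.com%{REQUEST_URI} [R=301,L]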

And then go back and constrain your original pair of rules to page requests:

(^|/|\.html)$

I've left out the icky messy part of the Regular Expression: the part where you have to capture the whole thing, but only if the end of the request matches a particular pattern. Matter of fact it may really be simpler to use %{REQUEST_URI} as in your original rule.
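
With %{REQUEST_URI} doing the work, the non-secure rule might become something like this (a sketch only; it sidesteps the capture problem entirely):

RewriteCond %{SERVER_PORT} ^443$
RewriteCond %{REQUEST_URI} !^/directory
# pages only: the request ends in a slash (a directory, including the root "/") or in .html
RewriteCond %{REQUEST_URI} (/|\.html)$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]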


At some time when it is no longer 4 AM (my time zone) I will delve more deeply into the overall http(s) situation. So far we've got:

--referring page (assuming it's elsewhere on your site) is secure or non-secure, meaning that the incoming request will be either http or https
--requested file is either in secure directory or in non-secure directory
--requested file may be either a page-- whose security level is determined by its directory-- or a non-page-- whose security level is determined by the requesting page
--one of the rules either has or does not have the !-f condition, and this affects the working of the rule in unexpected ways

I make that four dimensions. It would definitely be cleaner if the secure directory used only its own secure non-page files, and vice versa for the non-secure pages.

shadowlight
Msg#: 4577614 posted 12:54 pm on May 27, 2013 (gmt 0)

Thanks for taking the time to reply and trying to help. It is 1.45pm here in my time zone so I have the rest of the day to fry my mind with mod-rewrite and try and get it working lol.

I will get there eventually, and if I get it working I will update this thread to let you know, for the benefit of anyone else who may come across a similar problem.

lucy24
Msg#: 4577614 posted 9:24 pm on May 27, 2013 (gmt 0)

Meanwhile, after a full night's sleep at my end...

The line that's stuck in my mind is: The purpose of a RewriteCond is, paradoxically, to prevent a rule from executing. A conditionless rule executes all the time. So I went back to the beginning.

#1 given a secure directory-- which I'll call /lockdown/ --within a non-secure site, the first step is

RewriteRule ^lockdown/(.*) https://www.example.com/lockdown/$1 [R=301,L]
paired with
RewriteRule (.*) http://www.example.com/$1 [R=301,L]

No condition needed, because the first rule has already intercepted all requests for /lockdown/. You could capture "lockdown/" as well, but no point since it is always the same.

#2 This pair of rules will, of course, not work as written. Both will create infinite redirects, the kind that makes your browser give up after ten-or-so tries and put up a message saying "This is going nowhere fast".

So then you add a condition to the first rule, the https redirect, saying something like

RewriteCond %{SERVER_PORT} ^80$
RewriteRule ^lockdown/(.*) https://www.example.com/lockdown/$1 [R=301,L]

(adding anchors because port numbers can have more than two digits). Now, I kinda think that this single condition really ought to be

RewriteCond %{SERVER_PORT} !^443$ [OR]
RewriteCond %{HTTPS} !on

but don't quote me. It depends on how imaginative the user's browsers and/or your server can be. ("Correct requests are all alike. Incorrect requests are all different in their own way".--Tolstoy.) We'll continue saying ^80$ and ^443$ to keep things simple.

And conversely for the second rule

RewriteCond %{SERVER_PORT} ^443$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]

#3 But now that you've done this, there will be some requests for /lockdown/ that get past the first rule, so you need to exclude them in the second rule:

RewriteCond %{REQUEST_URI} !/lockdown
RewriteCond %{SERVER_PORT} ^443$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]


#4 You are not done yet. (This part is, ahem, your own fault ;)) The /lockdown/ directory contains non-page files that are used by other directories, and those other directories contain non-page files that are used by /lockdown/ pages. So...

#4a constrain both rules to requests for pages. Assuming their names end in .html:

RewriteRule (^|/|\.html)$ https et cetera

"request is for front page-- i.e. request is empty --or for a directory, or for an .html page" Unfortunately that only works for non-capturing rules; capturing is messier. If you're lucky it's simply

RewriteRule ^lockdown/([^.]+(?:\.html)?)?$ et cetera
and
RewriteRule ^([^.]+(?:\.html)?)$ et cetera

But if your name is apache dot org and you have carelessly allowed literal periods to sneak into your file and/or directory names (hostname doesn't count), you'll have to allow for some server backtracking:

RewriteRule ^(.+(?:\.html|/))?$ et cetera

:: pause here to yawn, twiddle thumbs, check watch, allow a few minutes to elapse ::

#4b take all non-page requests and make them match the requesting page, not their own location. This time you're looking for

\.(?:css|js|jpg|png)$

(et cetera depending on what extensions are involved) with no option for slash alone: there has to be an extension.

Now your two rules have expanded to four. Sticking with the dotless version for simplicity's sake, and grouping them by protocol first, filetype second:

RewriteCond %{SERVER_PORT} ^80$
RewriteRule ^lockdown/([^.]+(\.html)?)?$ https://www.example.com/lockdown/$1 [R=301,L]

RewriteCond %{HTTP_REFERER} https://
RewriteCond %{SERVER_PORT} ^80$
RewriteRule ^([^.]+\.(?:css|js|jpg|png))$ https://www.example.com/$1 [R=301,L]

(breakthrough here as I realize you don't have to check the non-page file's location, only its port and/or protocol)

RewriteCond %{REQUEST_URI} !/lockdown
RewriteCond %{SERVER_PORT} ^443$
RewriteRule ^([^.]+(\.html)?)$ http://www.example.com/$1 [R=301,L]

RewriteCond %{HTTP_REFERER} http://
RewriteCond %{SERVER_PORT} ^443$
RewriteRule ^([^.]+\.(?:css|js|jpg|png))$ http://www.example.com/$1 [R=301,L]

I think that covers all bases.

Since each of the four rules is free-standing, you can arrange them in order of likelihood. Similarly, when a rule has more than one condition, list the conditions in order of likelihood to fail. If you have a set of conditions linked with [OR], list the members of the set in order of likelihood to succeed. The object in each case is to let the server finish its job and get out of there faster.
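
For instance, on a site where most traffic is plain http, the port test is the one most likely to fail, so putting it first in the third rule above lets the server stop after a single check (purely an ordering illustration; the rule's behaviour is unchanged):

RewriteCond %{SERVER_PORT} ^443$
RewriteCond %{REQUEST_URI} !/lockdown
RewriteRule ^([^.]+(\.html)?)$ http://www.example.com/$1 [R=301,L]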

Any one nanosecond on any one request will not be noticeable. But if you've ever stumbled across a site that was loaded down with hotlinked images, you've seen what happens when those nano-, micro- and milliseconds start adding up :)

Dideved
Msg#: 4577614 posted 11:38 pm on May 27, 2013 (gmt 0)

lucy24: But if your name is apache dot org and you have carelessly allowed literal periods to sneak into your file and/or directory names...


Once upon a time I confidently linked to some jQuery documentation...

api.jquery.com/category/version/1.9/

Little did I know they were being "careless". Apparently PHP too is "careless" with their URLs.

php.net/manual/en/language.basic-syntax.php

And of course Apache itself is oh so "careless".

httpd.apache.org/docs/2.4/

I'm mocking you, of course, but only because evidence and reason have already failed to get through to you.

There is absolutely nothing wrong with using literal periods in URLs. They are legal. They are safe. They are useful. And whether you like it or not, people use them. And when your regex patterns deliberately forbid periods, that's a bug in your regex.

lucy24
Msg#: 4577614 posted 12:48 am on May 28, 2013 (gmt 0)

Over two hours. I'm disappointed.

shadowlight
Msg#: 4577614 posted 11:58 am on May 28, 2013 (gmt 0)

Thanks, I got it working yesterday but was too busy/tired to post. Here is the code that I ended up with:

RewriteCond %{HTTPS} =off [OR]
RewriteCond %{SERVER_PORT} ^80$
RewriteCond %{REQUEST_URI} ^(/directory/(.*)/(.*))$
RewriteRule ^(.*) https://www.example.com/$1 [R=301,L]

RewriteCond %{HTTPS} =on [OR]
RewriteCond %{SERVER_PORT} ^443$
RewriteCond %{REQUEST_URI} !\.(js|css|jpe?g|png|bmp|gif)$ [NC]
RewriteCond %{REQUEST_URI} !^(/directory/(.*)/(.*))$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]

Working just how I wanted it to work now.

Thanks again for your help :)

lucy24
Msg#: 4577614 posted 6:53 pm on May 28, 2013 (gmt 0)

RewriteCond %{REQUEST_URI} ^(/directory/(.*)/(.*))$
RewriteRule ^(.*) https://www.example.com/$1 [R=301,L]

Uhm... You do realize, don't you, that in a Regular Expression-- including a RewriteCond-- the . matches any character, including directory slashes? All you need in this location is
^/directory/[^/]+/
without closing anchor. But as previously noted, this element can and should go in the Rule instead. Since you're matching from the front of the pattern, the capture is straightforward.
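
Moved into the Rule, with your own conditions kept, that might read (a sketch only):

RewriteCond %{HTTPS} =off [OR]
RewriteCond %{SERVER_PORT} ^80$
RewriteRule ^(directory/[^/]+/.*)$ https://www.example.com/$1 [R=301,L]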

I should also note that the present wording of the RewriteCond allows it to match requests for
www.example.com/directory//
(two slashes, nothing between them). This is probably not what you intended.

Did you always mean to constrain the rule to subdirectories within the first /directory/ or is this a recent addition?

shadowlight
Msg#: 4577614 posted 2:29 pm on Jun 1, 2013 (gmt 0)

Yes, anything within the first directory should be secure including subdirectories and any files contained therein. I'm not sure whether the RewriteCond matching requests for www.example.com/directory// will be much of a problem, but I shall tinker with it once I get the rest of the development sorted.

lucy24
Msg#: 4577614 posted 7:42 pm on Jun 1, 2013 (gmt 0)

anything within the first directory should be secure including subdirectories

The rule as written will only match subdirectories within /directory/ --not top-level files like
/directory/filename.html
or even
/directory/ (the index page)
itself.
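
If the whole of /directory/ really is meant to be secure, the pattern needs to be loosened, roughly along these lines (a sketch only; the companion http rule's exclusion would need the same treatment):

RewriteCond %{HTTPS} =off [OR]
RewriteCond %{SERVER_PORT} ^80$
RewriteRule ^(directory(/.*)?)$ https://www.example.com/$1 [R=301,L]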

g1smd
Msg#: 4577614 posted 8:07 am on Jun 3, 2013 (gmt 0)

Rather than testing for ports ^443$ and ^80$, the code is more robust if you check for ^443$ and !^443$. Secure will always be 443, whereas standard might not always be 80.
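
Applied to the rule pair from earlier in the thread, that would be (a sketch only):

RewriteCond %{SERVER_PORT} !^443$
RewriteRule ^(directory/.*) https://www.example.com/$1 [R=301,L]

RewriteCond %{SERVER_PORT} ^443$
RewriteCond %{REQUEST_URI} !^/directory
RewriteRule (.*) http://www.example.com/$1 [R=301,L]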
