rel="canonical" or 301 htaccess

         

imbckagn

1:05 am on Mar 7, 2011 (gmt 0)

10+ Year Member



I recently installed a security certificate on my root domain. I'm using WordPress and have rel="canonical" on all pages. I am still worried about the search engines deciding for themselves whether to index the https or the http version.

I have several pages that already use an .htaccess rule to redirect from http to https. Is there an easy .htaccess fix I could implement to point the rest of the URLs to http?

Yes, I searched, and I'm a newb with .htaccess, so I'm trying to learn.

g1smd

1:22 am on Mar 7, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Yes, and there are several very detailed code samples in the WebmasterWorld Apache forum answering exactly this question.

In short, your site should force https for some URLs and http for the rest. To do otherwise would let some URLs work with both protocols and that is never a good thing.

imbckagn

4:46 am on Mar 7, 2011 (gmt 0)

10+ Year Member



Thanks g1smd

I did search and was able to force the folder I need to https, but I can't figure out how to force the rest of the site to http.

# Point Non WWW to WWW
RewriteCond %{HTTP_HOST} ^website.com
RewriteRule (.*) http://www.website.com/$1 [R=301,L]

# Secure Folder Http to Https
RewriteCond %{SERVER_PORT} !443$
RewriteRule ^/?(apply/.*) https://www.website.com/$1 [R=301,L]


The above code forces the pages I want to https and works OK. Would you happen to know how to force the rest of the website to http?

g1smd

8:39 am on Mar 7, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Your second rule forces https if PORT is NOT 443 and URL IS /apply/.*

You need another rule to force http if PORT IS 443 and URL is NOT /apply/.*

Your first rule forces http for all URLs so some requests will see a double redirect. That should be avoided. Adding an extra condition to both rules fixes the problem.

# Secure Folder http or non-www to https and www
RewriteCond %{SERVER_PORT} !443$ [OR]
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule ^(apply/.*) https://www.example.com/$1 [R=301,L]

# Rest of Site https or non-www to http and www
RewriteCond %{REQUEST_URI} !^/apply
RewriteCond %{SERVER_PORT} 443$ [OR]
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]


Make sure that links to images and scripts begin with a leading slash and do NOT contain protocol or domain information. This avoids the "mixed security" warning pop-up in the user's browser.

Make sure that internal links to pages DO contain protocol and domain information so that clicking internal links does NOT result in a redirect.

imbckagn

7:05 pm on Mar 7, 2011 (gmt 0)

10+ Year Member



The code above seems to work except for one important thing: any files in the /apply/ directory get redirected back to the root domain. So when a user tries to visit a secure page, they get redirected to the home page.

Side note: Google is already starting to index my https pages. That didn't take long :(

[edited by: imbckagn at 7:45 pm (utc) on Mar 7, 2011]

g1smd

7:45 pm on Mar 7, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Apart from the code in my post, what other code is present in the .htaccess file?

imbckagn

7:48 pm on Mar 7, 2011 (gmt 0)

10+ Year Member



This is the complete file with your rules from above:

<Files 403.shtml>
order allow,deny
allow from all
</Files>

# Secure Folder http or non-www to https and www
RewriteCond %{SERVER_PORT} !443$ [OR]
RewriteCond %{HTTP_HOST} !^(www\.WEBSITE\.com)?$
RewriteRule ^(apply/.*) https://www.WEBSITE.com/$1 [R=301,L]

# Rest of Site https or non-www to http and www
RewriteCond %{REQUEST_URI} !^/apply
RewriteCond %{SERVER_PORT} 443$ [OR]
RewriteCond %{HTTP_HOST} !^(www\.WEBSITE\.com)?$
RewriteRule (.*) http://www.WEBSITE.com/$1 [R=301,L]

# Point IP to Domain
rewritecond %{http_host} ^IP ADDRESS HERE [nc]
rewriterule ^(.*)$ http://www.WEBSITE.com/$1 [r=301,nc]
# End Point IP to Domain

# iPhone Home Page Redirect
redirect 301 "/redirect-to-application/" https://www.WEBSITE.com/apply/
# iPhone Home Page Redirect

# Deny IP Address
deny from 74.220.207.159
# End Deny IP Address

# BEGIN Deflate
<ifmodule mod_deflate.c>
<filesmatch "\.(js|css|png|jpg|gif|jpeg|htm|html|php)">
SetOutputFilter DEFLATE
</filesmatch>
</ifmodule>
# END Deflate

# Caching for One Year
<FilesMatch "\.(flv|gif|jpeg|png|ico|swf)$">
Header set Cache-Control: "max-age=29030400"
</FilesMatch>
# End Image Caching

# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>

g1smd

7:59 pm on Mar 7, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



# iPhone Home Page Redirect
redirect 301 "/redirect-to-application/" [example.com...]
The above code should be:
# iPhone Home Page Redirect
RewriteRule ^redirect-to-application https://www.example.com/apply/ [R=301,L]

and this code should be first in the .htaccess file.

# Point IP to Domain
rewritecond %{http_host} ^75.125.132.51 [nc]
rewriterule ^(.*)$ http://www.example.com/$1 [r=301,nc]

The above code should be:
# Point IP to Domain
RewriteCond %{HTTP_HOST} ^75\.125\.132\.51
RewriteRule (.*) http://www.example.com/$1 [R=301,L]

but in fact it isn't needed as the other two redirects already fix all requests that are not for exactly www.example.com.

Use "Live HTTP Headers" for Firefox to see the HTTP transaction between browser and server. It might give some clues.

imbckagn

8:08 pm on Mar 7, 2011 (gmt 0)

10+ Year Member



Thanks for the tips. I did make the suggested changes and I still have the same problem.

EDIT

Narrowed it down to this being the code that is causing the /apply/ directory to redirect to the root.

# Rest of Site https or non-www to http and www
RewriteCond %{REQUEST_URI} !^/apply
RewriteCond %{SERVER_PORT} 443$ [OR]
RewriteCond %{HTTP_HOST} !^(www\.WEBSITE\.com)?$
RewriteRule (.*) http://www.WEBSITE.com/$1 [R=301,L]

g1smd

8:34 pm on Mar 7, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Ah, it's an interaction with the later rewrite, I suspect.

See if changing the single line
RewriteCond %{REQUEST_URI} !^/apply

to
RewriteCond %{REQUEST_URI} !^/(apply|index\.php)

fixes it.
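
For anyone following along, here is a rough sketch of why the exclusion probably matters. It assumes the /apply/ URLs are served by WordPress rather than by real files, and /apply/form is only a made-up example path. The comments trace what most likely happens to such a request; this is an illustration, not tested on the poster's server.

```apache
RewriteEngine On

# Rest of Site https or non-www to http and www
#
# Likely sequence for https://www.example.com/apply/form WITHOUT index\.php
# in the exclusion (hypothetical path, for illustration only):
#   pass 1: REQUEST_URI is /apply/form, so the first condition fails and
#           nothing is redirected. The WordPress block further down then
#           rewrites the request internally to /index.php.
#   pass 2: in .htaccess the rules run again after an internal rewrite.
#           REQUEST_URI is now /index.php, the port is still 443, so the
#           request is redirected to http://www.example.com/index.php.
# Excluding index.php stops that second pass from redirecting.
RewriteCond %{REQUEST_URI} !^/(apply|index\.php)
RewriteCond %{SERVER_PORT} 443$ [OR]
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
```

WordPress usually sends a bare /index.php request on to the home page, which would match the "redirected to the home page" symptom reported above.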

imbckagn

8:56 pm on Mar 7, 2011 (gmt 0)

10+ Year Member



Awesome, thank you, that was the problem. Unfortunately, that brings me to my next problem.

Now that I've implemented the rule I get security warnings; it was fine before.

Make sure that links to images and scripts begin with a leading slash and do NOT contain protocol or domain information. This avoids the "mixed security" warning pop-up in the user's browser.

Make sure that internal links to pages DO contain protocol and domain information so that clicking internal links does NOT result in a redirect.


1. I am using WordPress, and the images I can control through the theme already use a leading slash. I don't suppose there is a way to control this through .htaccess, is there?

2. Again, since I'm using WordPress, the URLs it generates don't contain the http://www.website.com/ protocol and domain, except for the URLs I placed manually. Is there a way to control these?

g1smd

9:02 pm on Mar 7, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You'll need to exclude image folder requests from being redirected whatever protocol they are requested with.

Just add that as another exclusion.
RewriteCond %{REQUEST_URI} !^/(apply|images|index\.php)

imbckagn

9:13 pm on Mar 7, 2011 (gmt 0)

10+ Year Member



g1smd I'm in over my head with this. I added the exclusion above and still get the security warnings. If you could provide a service to fix my problems please PM me.

imbckagn

2:09 am on Mar 10, 2011 (gmt 0)

10+ Year Member



Just wanted to say thanks for all your help. I think I finally got it all figured out. It wasn't just the images directory; I had to go all the way down to wp-includes and wp-content to get rid of the security warnings.

Too bad Google indexed about 600 https pages before I figured it out.

g1smd

8:30 am on Mar 10, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Glad you got there. Google will find the redirects and fix everything up over the next few weeks.

Maybe you could post your code for a final check.

imbckagn

7:31 pm on Mar 10, 2011 (gmt 0)

10+ Year Member



Sure here it is:

<Files 403.shtml>
order allow,deny
allow from all
</Files>

# Secure Folder http or non-www to https and www
RewriteCond %{SERVER_PORT} !443$ [OR]
RewriteCond %{HTTP_HOST} !^(www\.WEBSITE\.com)?$
RewriteRule ^(apply/.*) https://www.WEBSITE.com/$1 [R=301,L]
# Secure Folder http or non-www to https and www

# Rest of Site https or non-www to http and www
RewriteCond %{REQUEST_URI} !^/(apply|wp-content|wp-includes|index\.php)
RewriteCond %{SERVER_PORT} 443$ [OR]
RewriteCond %{HTTP_HOST} !^(www\.WEBSITE\.com)?$
RewriteRule (.*) http://www.WEBSITE.com/$1 [R=301,L]
# Rest of Site https or non-www to http and www

# Deny IP Address
deny from 74.220.207.159
# End Deny IP Address

# BEGIN Deflate
<ifmodule mod_deflate.c>
<filesmatch "\.(js|css|png|jpg|gif|jpeg|htm|html|php)">
SetOutputFilter DEFLATE
</filesmatch>
</ifmodule>
# END Deflate

# Caching for 28 Weeks (max-age is in seconds)
<FilesMatch "\.(flv|gif|jpeg|png|ico|swf)$">
Header set Cache-Control: "max-age=16934400"
</FilesMatch>
# End Image Caching

# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress

g1smd

9:09 pm on Mar 10, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Looks good but do add some extra RewriteCond patterns ahead of the very slow and inefficient -f and -d "exists" checks to stop those being performed for every request hitting your server.

Exclude requests for images and media files as a bare minimum. There's another thread with the code and detailed description.

In short, adding the extra code speeds up your site, preserves the life of your hard drive and is altogether A Good Thing.


I'd move your single "deny from IP" code to the very top of the page to be next to the other "allow/deny" code.

I'd also move the "caching" and "deflate" code to be the very last stuff in the file - after the WordPress rewrite stuff.
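
As a minimal sketch of the "extra RewriteCond ahead of -f and -d" idea, something along these lines might do. The extension list is an assumption and should be adjusted to whatever static file types the site actually serves. Conditions are ANDed and checked in order, so a request ending in one of these extensions fails the first condition and the two filesystem checks are never run for it.

```apache
# BEGIN WordPress (sketch with a filetype exclusion added)
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
# Assumed list of static types that WordPress does not generate itself;
# requests for these skip the slower -f and -d "exists" checks below.
RewriteCond %{REQUEST_URI} !\.(gif|jpe?g|png|ico|css|js|swf|flv)$ [NC]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
```

One side effect to be aware of: a request for a missing image then gets the server's own 404 response instead of being handed to WordPress. Also, WordPress may rewrite the section between its BEGIN and END markers when permalink settings are saved, so a manual change like this may need to be re-applied.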

jdMorgan

8:44 pm on Mar 17, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



RewriteCond %{SERVER_PORT} 443$ [OR]

Both this and the corresponding "Not 443" pattern are incompletely anchored, which means that unexpected port numbers ending in 443 (8443, for example) will be matched. In fact, it would be faster to use an "exact-string match" instead in both cases, e.g.
RewriteCond %{SERVER_PORT} =443 [OR]

and
RewriteCond %{SERVER_PORT} !=443 [OR]

Where an exact match is desired, exact-string matches are both simpler and faster.

The point about excluding filetypes from being checked for "file exists" is a good one. In fact, excluding the script-path itself (or the script's .php filetype) and *all filetypes that the script cannot generate or create* from this rule will improve performance. In addition, you should also use filetype exclusion in your http/https rules, so that included objects shared between SSL and non-SSL pages will not be redirected, regardless of what directory you may place them in...

Jim
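
Putting those two suggestions together, the http/https rules might end up looking roughly like the sketch below. This is untested; the extension list is an assumption, and example.com stands in for the real hostname as elsewhere in the thread.

```apache
RewriteEngine On

# Secure folder: http or non-www to https and www
# Assumed list of shared static types that should never be redirected.
RewriteCond %{REQUEST_URI} !\.(gif|jpe?g|png|ico|css|js)$ [NC]
# Exact-string comparison instead of the unanchored !443$ pattern.
RewriteCond %{SERVER_PORT} !=443 [OR]
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule ^(apply/.*) https://www.example.com/$1 [R=301,L]

# Rest of site: https or non-www to http and www
RewriteCond %{REQUEST_URI} !\.(gif|jpe?g|png|ico|css|js)$ [NC]
RewriteCond %{REQUEST_URI} !^/(apply|index\.php)
RewriteCond %{SERVER_PORT} =443 [OR]
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
```

With the filetype condition in place, images, stylesheets and scripts are served on whichever protocol and hostname they were requested with, wherever they live. Whether that fully replaces the wp-content and wp-includes exclusions used earlier depends on what those directories serve, so it is worth checking with a header viewer before relying on it.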