Forum Moderators: Ocean10000 & phranque

Mod Rewrite Quicky

     
12:37 am on Dec 17, 2010 (gmt 0)

Full Member

10+ Year Member

joined:Apr 9, 2003
posts:336
votes: 0


I need to redirect non-www to www, and recently added some code to my .htaccess file. However, I noticed soon after that the shopping cart for this site started behaving strangely (items were not being added to the cart). I can only think I mucked up the code. I changed it back and the cart started working again, so I'm wary of trying again.

At the top of my .htaccess file is some code for the affiliate program we have. This is as follows:

# Start Affiliate SEO Code

RewriteEngine On
RewriteCond %{HTTP_HOST} ^example.com
RewriteRule (.*) http://www.example.com/$1 [R=301,L]

I know that this will redirect the visitor to www.example.com when using the affiliate code.

My question is, what should I put above this code for the standard non-www to www redirect?

Any ideas?
1:34 am on Dec 17, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2002
posts:18903
votes: 0


That *is* a standard non-www to www redirect (though you do need to remember to escape the period in the pattern).

It's a redirect where requests with an attached port number in the URL will fail, but for ordinary http non-www requests it should work just fine.

However, the more important question is how are HTTPS requests handled? That may be where the problem is.
10:50 am on Dec 17, 2010 (gmt 0)

Full Member

10+ Year Member

joined:Apr 9, 2003
posts:336
votes: 0


> That *is* a standard non-www to www redirect (though you do need to remember to escape the period in the pattern).


I'm not sure what you mean by "the period in the pattern".

At the moment, if I type example.com into the browser it doesn't redirect to www.example.com. But I have discovered that if I use an affiliate link without the www then it does redirect. So this code seems to be just for the affiliate code (which is its purpose). I also require it to work for regular non-www to www.

But I'm not sure what to write above this code. Last time I tried it made the shopping cart behave strangely.
3:24 pm on Dec 17, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 11, 2001
posts:5496
votes: 3


> I'm not sure what you mean by "the period in the pattern".


NOTE: backslash (escape) preceding period.

RewriteCond %{HTTP_HOST} ^example\.com

The escape is explained in the Forum Library [webmasterworld.com] and specifically in the Mod_Rewrite & Regular Expressions [webmasterworld.com].

Just do a page search on "escape"
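
As a minimal sketch of what the escape changes (the extra hostname in the comments is made up for illustration, and the rule is assumed to live in the site's top-level .htaccess file):

```apache
RewriteEngine On

# "^example.com"  (unescaped): "." matches any single character, so a host
#                 such as "example-com.net" would also satisfy the condition.
# "^example\.com" (escaped):   "\." matches only a literal period.
RewriteCond %{HTTP_HOST} ^example\.com [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```

Only the pattern side of the directive is a regular expression; the substitution URL is plain text, so its periods are left unescaped.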
6:49 pm on Dec 17, 2010 (gmt 0)

Full Member

10+ Year Member

joined:Apr 9, 2003
posts:336
votes: 0


Thanks for the reply. I have looked at the forum library and have a better understanding of the subject. Yet this still doesn't redirect. So, do you think this code would work:

RewriteEngine On

RewriteCond %{HTTP_HOST} ^example\.com [NC]
RewriteRule ^(.*)$ http://www.example.com/ [L,R=301]

RewriteEngine On

RewriteCond %{HTTP_HOST} ^example.com [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [L,R=301]


I'm basically duplicating, but the second rewrite is for the affiliate code, which works. The first section is just to redirect every non-www to www.

I have a feeling I've broken some rules. Maybe the L? Or do I not need to duplicate the line "RewriteEngine On"?
8:48 pm on Dec 17, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 11, 2001
posts:5496
votes: 3


> Or do I not need to duplicate the line "RewriteEngine On"?


ONLY once per htaccess file.

You have also failed to escape the period in your affiliate COND line.
8:48 pm on Dec 17, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2002
posts:18903
votes: 0


The first ruleset redirects ALL non-www URLs to the root of www.

The second ruleset redirects ALL non-www URLs to the www and preserves the page name in the request.

The second ruleset will never run, because the first ruleset matches all non-www requests.

The second ruleset is missing the escaping.

The first ruleset is redundant.

You also need the
RewriteEngine On
directive exactly once.
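
Putting those points together, a single ruleset along the following lines is probably all that is needed. This is a sketch, assuming the rules sit in the document-root .htaccess file; the /cart/view.php path in the comments is a hypothetical example, not a URL from the site in question.

```apache
# One "RewriteEngine On" is enough for the whole file.
RewriteEngine On

# A single ruleset handles every non-www request, whatever its path.
# For a request to http://example.com/cart/view.php (hypothetical path):
#   substitution "http://www.example.com/"   -> http://www.example.com/
#   substitution "http://www.example.com/$1" -> http://www.example.com/cart/view.php
RewriteCond %{HTTP_HOST} ^example\.com [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```

Because the [L] flag ends rule processing for a request once a rule has matched and redirected, any later ruleset with the same condition would never get a chance to run.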
2:17 pm on Dec 19, 2010 (gmt 0)

Full Member

10+ Year Member

joined:Apr 9, 2003
posts:336
votes: 0


> The first ruleset redirects ALL non-www URLs to the root of www.

> The second ruleset redirects ALL non-www URLs to the www and preserves the page name in the request.

> The second ruleset will never run, because the first ruleset matches all non-www requests.

> The second ruleset is missing the escaping.

> The first ruleset is redundant.

> You also need the RewriteEngine On directive exactly once.


I had to re-read this several times before I understood what you meant. You certainly got my brain working, so thanks for that.

But this doesn't explain why my existing code fails to redirect non-www to www, unless the \ (escape) will make the difference. After the problems I had last time I'm wary of changing it. So do you think this is the only issue?

I also found another redirect which has the following syntax:

RewriteEngine On
RewriteCond %{HTTP_HOST} !^www\.
RewriteRule ^(.*)$ [%{HTTP_HOST}...] [R=301,L]

I might try that. Thanks for your help.
3:37 pm on Dec 19, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 11, 2001
posts:5496
votes: 3


> unless the \ (escape) will make the difference.


The escape is required in order for the line to function correctly.

Syntax errors are odd creatures!
I've had syntax errors cause an immediate 500, taking the site down; others cause surrounding lines or even distant lines to not function.

I've even had syntax errors that sat in place for months, and then I added a new line and other lines went haywire.
I checked the recent additions and everything was OK.
I started poring over the syntax and found the cause was something that had been in place for months and didn't react (even though it never functioned as originally intended) until some unforeseen limit was reached.
12:20 am on Dec 21, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Mar 31, 2002
posts:25430
votes: 0


> After the problems I had last time I'm wary of changing it.

I'd advise that you test the code suggested in our members' responses exactly as-posted. If it causes a problem, then report that problem, and it will get fixed. Our regularly-contributing members are not beginners in any sense of that word, averaging seven or more years each of mod_rewrite/htaccess experience. So if you are hesitant to try suggestions posted here, then there is really no use asking for our members' help...

The likely reason your original code failed is that there are probably two problems: First, the rule redirects all requests for any page in the non-www domain to the home page of the www domain. Second, I suspect that your cart may be linking to or including non-www URLs. The effect of this would be that all cart links or includes would end up accessing only the www home page, regardless of whether another page or an image, or a stylesheet, or a JavaScript was being requested.

The first fix is to use a correct canonicalization redirect -- one that preserves the requested "page" or "object." The second is to look at your cart configuration, and make sure that it links and includes only objects from "www."

RewriteEngine on
#
# Redirect requests for non-blank non-canonical non-www hostname to same URL-path on canonical host
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

The original code you posted has nothing to do with affiliates. It is simply a redirect from "example.com/<anything-or-nothing-here>" to "www.example.com/" (your www home page).

Also, please do note that g1smd has given you a warning that if your site uses HTTPS (secure pages), then this code will need to be modified to preserve the http/https in the incoming requests.

RewriteEngine on
#
# Redirect requests for non-blank non-canonical non-www hostname to same
# URL-path on canonical host, preserving requested http/https protocol
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteCond %{SERVER_PORT}>s ^(443>(s)|[0-9]+>s)$
RewriteRule ^(.*)$ http%2://www.example.com/$1 [R=301,L]
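
For anyone puzzling over the second condition: the test string is the server port followed by the literal text ">s". On port 443 the first alternative matches and the inner parentheses capture "s" into %2; on any other port the second alternative matches and %2 stays empty, so "http%2" expands to "https" or "http". A more verbose sketch of the same idea, swapping in mod_rewrite's %{HTTPS} variable instead of the port test (this assumes an Apache version that provides %{HTTPS}, and that http and https requests are served from the same .htaccess file), might look like this:

```apache
RewriteEngine On

# Secure requests: keep https in the redirect target.
RewriteCond %{HTTPS} =on
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule ^(.*)$ https://www.example.com/$1 [R=301,L]

# All other requests: redirect over plain http.
RewriteCond %{HTTPS} !=on
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```

Two rules are arguably easier to read but repeat the hostname test; the single-rule port version keeps everything in one place.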

Jim
8:18 pm on Dec 30, 2010 (gmt 0)

Moderator

WebmasterWorld Administrator ergophobe is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Apr 25, 2002
posts:8627
votes: 274


Hey Jim,

If you don't mind explaining a little, why make the hostname optional in the first rule? Or to ask the more fundamental question, what leads to a blank %{HTTP_HOST}?

Do I avoid rewriting that because it means it's not getting passed and I'm running the risk of an infinite loop?

And thanks so much g1smd for flagging the https problem and Jim for answering - I've rarely had need of https on sites so haven't paid any attention to the implications for mod_rewrite. However, tomorrow I am setting up rewrites for an e-commerce site (so https for checkout pages). I had wondered about preserving the protocol, but hadn't had the chance to go research it yet and voila! My answer right here. Probably saved me an hour or two and countless headaches just with that offhand comment.
8:47 pm on Dec 30, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2002
posts:18903
votes: 0


HTTP/1.0 requests do not send a hostname in the request. Making the hostname optional in the pattern allows for a blank hostname, and stops an infinite redirect loop should an HTTP/1.0 request be made to your server.

Another problem with HTTP/HTTPS is that often these two types of requests may resolve to different folders within the webserver filesystem. That always makes things fun and interesting.

Since I have no idea about your folder structure, domain/protocol mapping to folders, nor what any of your cart URLs look like, it seemed like a good idea to suggest being wary of HTTPS issues.

They are often overlooked and have tripped me up multiple times in an infinite number of ways over the years. The trick is to try to remember all those things and then try to avoid them next time - not always successfully.

As Jim has said several times before "Error 500? Great! Only 499 to go..."
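
The optional-hostname pattern discussed above can also be written as two separate conditions, which may make the logic easier to follow. This is only an equivalent spelling sketched for illustration; consecutive RewriteCond lines are implicitly ANDed together.

```apache
RewriteEngine On

# Skip requests that carry no hostname at all (true HTTP/1.0, or malformed):
RewriteCond %{HTTP_HOST} !^$
# Skip requests that already use the canonical hostname:
RewriteCond %{HTTP_HOST} !^www\.example\.com$
# Everything else is redirected, keeping the requested path:
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```

Without the first condition, a client that never sends a hostname would be redirected, come back without a hostname again, and be redirected once more, which is the loop the optional group is there to prevent.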
2:38 am on Jan 2, 2011 (gmt 0)

Moderator

WebmasterWorld Administrator ergophobe is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Apr 25, 2002
posts:8627
votes: 274


>> "Error 500? Great! Only 499 to go..."

:-)

Okay, didn't realize that about HTTP/1.0.

As for URL/folder structure - all virtual folders on a database-driven site (except for static resources such as images). The script will take the page protocol and apply it to static resources and already checks that if someone is on a page that takes personal info, it has to be https.

So far so good in testing. Thanks again for the heads up though - I can't believe I hadn't heard of the blank hostname issue before, but then the things I don't know would fill many many volumes!
6:47 pm on Jan 5, 2011 (gmt 0)

Senior Member

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Mar 31, 2002
posts:25430
votes: 0


You'll never see a true HTTP/1.0 request on a name-based shared server, because a name-based host requires a hostname to even reach your site. But handling the blank-hostname requests is cheap insurance should your site ever move to a dedicated IP address (IP-based hosting, shared or non-shared).

In that case, either a true HTTP/1.0 request from a very old Web client, or a badly-written or even malicious request with a blank hostname could put you into a client-server 'infinite' redirection loop if the blank hostname case isn't handled in your code. I see a lot of requests to my IP-based servers using just their IP address, and often with a blank hostname.

So again, it's just cheap insurance...

BTW, you will often see requests logged as HTTP/1.0 even on name-based hosts, but these are not 'true HTTP/1.0' requests as defined above. They are requests from HTTP/1.1-capable clients which *do* include the HTTP Host header containing your hostname, but claim to be HTTP/1.0 for maximum legacy-server support. As such, they typically come from legitimate search engine spiders and malicious scrapers as well, since both want to fetch pages and objects from as many sites as possible.

Jim
6:55 pm on Jan 6, 2011 (gmt 0)

Moderator

WebmasterWorld Administrator ergophobe is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Apr 25, 2002
posts:8627
votes: 274


As always, thanks for the patience and erudition.

This site is on a shared server, but has its own IP and the DNS A record goes straight to that IP (rather than a general mapping to the DNS server at the host) because email is on a separate server.

Not sure how that fits into your scenario, but I have it set up to handle the blank hostname, so it should be good.

Suddenly realizing that despite all the rewrites I've done over the years, they've generally been quite simple situations.