Forum Moderators: Ocean10000 & incrediBILL & phranque

Redirect non-www page to end with /

     
9:36 am on May 1, 2017 (gmt 0)

Preferred Member

10+ Year Member Top Contributors Of The Month

joined:Aug 16, 2004
posts:354
votes: 1


Following on from my other questions relating to HTACCESS, I have just one more, but thought it'd be better to start a new topic for this.

I currently redirect all traffic to the www. version of my site. However, upon checking redirected links in a specific folder, I've noticed something that may / may not be an issue.

Basically, if I do a header check on www.example.com/example/examplepage/ it shows 200 OK

If I check www.example.com/example/examplepage (without "/") it shows 301 redirect to the above as it should.

BUT, if I check example.com/example/examplepage (without www and "/"), it redirects to www.example.com/example/examplepage (without the "/" at the end)...which then in turn redirects to www.example.com/example/examplepage/ (the correct page).

I hope this makes sense?

a) is this OK? b) am I missing something in my htaccess?


RewriteOptions inherit

RewriteEngine On

RewriteCond %{THE_REQUEST} ^.*/index.htm

RewriteRule ^(.*)index.htm$ http://www.example.com/$1 [R=301,L]

RewriteCond %{HTTP_HOST} ^example\.com [NC]

RewriteRule (.*) http://www.example.com/$1 [R=301,L]


RewriteCond %{THE_REQUEST} \.php
RewriteRule ^tips/([^.]+)\.php$ http://www.example.com/example/$1/ [R=301,L]

RewriteRule ^tips/([^.]+[^./])$ http://www.example.com/example/$1/ [R=301,L]

RewriteRule ^(tips/[^.]+)/$ /$1.php [L]


Please note: the above php rule was created to "hide" php extensions on a specific folder (didn't want them hidden anywhere else).
9:50 am on May 1, 2017 (gmt 0)

Senior Member

WebmasterWorld Senior Member topr8 is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Apr 19, 2002
posts:3287
votes: 23


think about combining rules
9:58 am on May 1, 2017 (gmt 0)

Senior Member

WebmasterWorld Senior Member topr8 is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Apr 19, 2002
posts:3287
votes: 23


eg. do the 2 in one redirect


RewriteCond %{HTTP_HOST} ^example\.com [NC]
# redirect non-www tips to www + correct
RewriteRule ^tips/([^.]+)\.php$ http://www.example.com/example/$1/ [R=301,L]
# a RewriteCond applies only to the next RewriteRule, so repeat it here
RewriteCond %{HTTP_HOST} ^example\.com [NC]
# rewrite all other non-www to www
RewriteRule (.*) http://www.example.com/$1 [R=301,L]

# carry on with the_request rules
10:04 am on May 1, 2017 (gmt 0)

Preferred Member

10+ Year Member Top Contributors Of The Month

joined:Aug 16, 2004
posts:354
votes: 1


Thanks, topr8 - I'll take a look.
11:05 am on May 1, 2017 (gmt 0)

Preferred Member

10+ Year Member Top Contributors Of The Month

joined:Aug 16, 2004
posts:354
votes: 1


Aahh...thought I'd found the solution by moving part of the htaccess around, but nope....so infuriating...
11:21 am on May 1, 2017 (gmt 0)

Preferred Member

10+ Year Member Top Contributors Of The Month

joined:Aug 16, 2004
posts:354
votes: 1


I'm wondering if this is even an issue worth wondering about? I use "<link rel='canonical'" in the header of each page which shows the URL with the "/"....htaccess gives me a headache - it may as well be in another language :-(

Basically, I'm worried that having the two 301 redirects to get to the final page isn't a good idea, i.e. redirecting http://example.com/example/examplepage to http://www.example.com/example/examplepage (with the www) and then again to http://www.example.com/example/examplepage/ (with the www and the trailing slash at the end).
1:26 pm on May 1, 2017 (gmt 0)

Preferred Member

10+ Year Member Top Contributors Of The Month

joined:Aug 16, 2004
posts:354
votes: 1


My God, I think I've done it...well...that is until I find something else I've missed. In case this helps anyone else in the same boat, this is what got it working (just involved moving some of the rules around in htaccess). So now, my htaccess shows the following to ensure all redirects work properly:


RewriteOptions inherit
RewriteEngine On

RewriteCond %{THE_REQUEST} ^.*/index.htm

RewriteRule ^(.*)index.htm$ http://www.example.com/$1 [R=301,L]

RewriteCond %{THE_REQUEST} \.php

RewriteRule ^tips/([^.]+)\.php$ http://www.example.com/example/$1/ [R=301,L]

RewriteRule ^tips/([^.]+[^./])$ http://www.example.com/example/$1/ [R=301,L]

RewriteRule ^(tips/[^.]+)/$ /$1.php [L]

RewriteCond %{HTTP_HOST} ^example\.com [NC]

RewriteRule (.*) http://www.example.com/$1 [R=301,L]


If anyone can see anything wrong with this or anything that could potentially cause problems, please let me know...but beforehand, let me take something to calm my nerves as I'm not sure how much more of htaccess I can take! Again, I appreciate everyone's help with this, and with my other couple of posts.
3:10 pm on May 1, 2017 (gmt 0)

Preferred Member

10+ Year Member Top Contributors Of The Month

joined:Aug 16, 2004
posts:354
votes: 1


*insert crying face*

Nope - spoke too soon...still not working on http://example.com/example/examplepage/

This STILL redirects to http://www.example.com/example/examplepage.php and then again to http://www.example.com/example/examplepage/

I swear when I checked earlier it worked?!? This must be driving me crazy.

I suppose, going back to my earlier question: does it matter that for one specific redirect (without www but with /) there's an extra 301 step to get to the final destination? As much as I play around with it, I just can't seem to get it to redirect in one.
9:06 pm on May 1, 2017 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:10846
votes: 61


Order your rules so that the external redirects precede the internal rewrites and then the most specific rules precede the most general rules.
9:13 pm on May 1, 2017 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:10846
votes: 61


The first two RewriteCond directives are redundant.
The pattern match of the URL path in the RewriteRule will fire first making the additional conditional unnecessary.
9:53 pm on May 1, 2017 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13674
votes: 439


and then the most specific rules precede the most general rules

Let me give a little more detail here, because I believe eight different people have already pointed out it's a rule-ordering problem ;)

External redirects ([R] flag) ordinarily go in this order (#3 will not apply to all sites)

#1 redirects that apply to specific pages

#2 index redirect (requests for index.html, index.php or whatever applies to your site)

#3 extension-related redirects, such as .php to extensionless, or without final slash to with final slash (other than real, physical directories)

#4 domain-name canonicalization and/or protocol (with/without www, http vs https). This is always the very last external redirect

All redirect targets should give the full correct hostname, so #4 is purely a catchall for requests that are perfectly correct in every way except that they're using the wrong form of your hostname (and/or wrong protocol, if applicable).

I recommend leaving a blank line after each RewriteRule, but NO blank line between a rule and its accompanying RewriteCond. The server doesn't care, but it makes it much easier for humans to read--and also makes it more likely that each ruleset stays together when you're rearranging.
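
As a rough sketch of that ordering, here are the rules already posted earlier in this thread, rearranged into the #2, #3, #4 sequence and followed by the internal rewrite. The patterns are the original poster's own and are not a recommendation. Step #1 (page-specific redirects) is left out because the thread shows none, and the /tips/ targets are an assumption about the intended URLs.

```apache
RewriteEngine On

# 2: index redirect
RewriteCond %{THE_REQUEST} ^.*/index.htm
RewriteRule ^(.*)index.htm$ http://www.example.com/$1 [R=301,L]

# 3: extension-related redirects (.php to trailing slash, then missing slash)
RewriteCond %{THE_REQUEST} \.php
RewriteRule ^tips/([^.]+)\.php$ http://www.example.com/tips/$1/ [R=301,L]

RewriteRule ^tips/([^.]+[^./])$ http://www.example.com/tips/$1/ [R=301,L]

# 4: hostname canonicalization, the last external redirect
RewriteCond %{HTTP_HOST} ^example\.com [NC]
RewriteRule (.*) http://www.example.com/$1 [R=301,L]

# internal rewrite, only after every external redirect has had its chance
RewriteRule ^(tips/[^.]+)/$ /$1.php [L]
```

Because every redirect target already carries the full www hostname, a non-www request for a /tips/ URL is corrected by rule #3 in a single 301 and never needs rule #4.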
10:56 pm on May 1, 2017 (gmt 0)

Preferred Member

10+ Year Member Top Contributors Of The Month

joined:Aug 16, 2004
posts:354
votes: 1


I won't throw a party just yet but I think that thanks to everyone's help here, it's now working as it should!

In relation to a previous post about some lines being redundant, if I removed anything, the redirects didn't work. In the end, what did work was this:


RewriteEngine On

RewriteCond %{THE_REQUEST} \.php

RewriteRule ^tips/([^.]+)\.php$ http://www.example.com/tips/$1/ [R=301,L]

RewriteRule ^tips/([^.]+[^./])$ http://www.example.com/tips/$1/ [R=301,L]

RewriteCond %{THE_REQUEST} ^.*/index.htm
RewriteRule ^(.*)index.htm$ http://www.example.com/$1 [R=301,L]

RewriteCond %{HTTP_HOST} ^example\.com [NC]
RewriteRule (.*) http://www.example.com/$1 [R=301,L]

RewriteRule ^(tips/[^.]+)/$ /$1.php [L]


What has made this difficult for me is that I do not understand re-write rules / htaccess at all - it may as well be in Chinese. Even with everyone here kindly helping me, I often look back and just think "huh?"

I used to have a web developer employed to help me, but he messed things up worse than I would have and ever since I've never trusted anyone else to tweak anything on the website.

Anyway, thanks AGAIN to everyone who has helped here. I suppose the one upside to my multiple posts / questions recently is that it might help someone else in the future in a similar conundrum!

I will be glad to not have to spend any more time staring at my htaccess file now!
2:36 am on May 2, 2017 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13674
votes: 439


I do not understand re-write rules / htaccess at all - it may as well be in Chinese

Welcome to the club. People who have been here for a long time already know my particular dirty secret, which is: I don't speak a word of Apache. What I do know, thanks to massive text-editing experience, is Regular Expressions. Fortunately, at least 90% of mod_rewrite questions are really just about how to construct the perfect RegEx.

The quirk about Apache, as opposed to just about any other computer-related language you can name, is that it doesn't have operators. No + signs or = signs or {put-stuff-in-brackets-and-do-stuff-to-it} or {if A then B else C}. Instead, every blank space in every line carries meaning. When you meet something like
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
you can think of it as four cells in a table. (This is assuming you know at least a tiny bit of html.) And then you think of the whole "table" as having a header:
<th>who's in charge (“RewriteRule” tells you it's mod_rewrite)</th>
<th>what you're looking for (the "pattern")</th>
<th>what you're doing to it (the "target")</th>
<th>extra stuff to make the server happy</th>
and the only sign that you've moved from column 1 to column 2 to column 3 in your hypothetical table is those blank spaces.
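
The same four "cells" can be lined up directly in the file. This is only a layout sketch of the rule quoted above; the server ignores the extra spaces and the comment line.

```apache
# directive   pattern   target                      flags
RewriteRule   (.*)      http://www.example.com/$1   [R=301,L]
```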

:: :: ::

I've just put an 800-page book into the marinade after ignoring it for several years. By comparison, everything else in the world looks blissfully simple and straightforward.
9:39 am on May 2, 2017 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:10846
votes: 61


i would delete those first two RewriteCond directives:
RewriteCond %{THE_REQUEST} \.php
RewriteRule ^tips/([^.]+)\.php$ http://www.example.com/tips/$1/ [R=301,L]

this says if the url path ends with .php make sure THE_REQUEST also contains .php before redirecting the request.
try this instead:
RewriteRule ^tips/([^.]+)\.php$ http://www.example.com/tips/$1/ [R=301,L]


RewriteCond %{THE_REQUEST} ^.*/index.htm
RewriteRule ^(.*)index.htm$ http://www.example.com/$1 [R=301,L]

this says if the url path ends with index.htm make sure THE_REQUEST starts with nothing-or-anything followed by /index.htm before redirecting the request.

try this instead:
RewriteRule ^(.*)index\.htm$ http://www.example.com/$1 [R=301,L]

(also note the added backslash in the Pattern regex)


i would suggest this change to your hostname canonicalization redirect ruleset:
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$ [NC]
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
9:51 am on May 2, 2017 (gmt 0)

Preferred Member

10+ Year Member Top Contributors Of The Month

joined:Aug 16, 2004
posts:354
votes: 1


Thanks, phranque.

If I change those first two lines at all, it messes everything up.

In relation to the index.htm redirect to / and canonicalization - these both work with your rules and look good. Could I just ask why these rules are better than what I already had? (sorry - this just helps me understand better).
1:04 pm on May 2, 2017 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:10846
votes: 61


If I change those first two lines at all, it messes everything up.

what response do you get?

In relation to the index.htm redirect to / ... Could I just ask why these rules are better than what I already had?

the dot is a special character in a regular expression which means any character.
by preceding it with a backslash that makes it a literal dot.

RewriteCond %{HTTP_HOST} ^example\.com [NC]

this fires the RewriteRule if HTTP_HOST begins with example.com (case insensitive); there is no end anchor, so anything after "example.com" is also accepted.

RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$ [NC]

this fires the RewriteRule if HTTP_HOST is anything other than exactly www.example.com (case insensitive) or an empty value.

the "not exactly the canonical hostname" test is more robust than "this exact noncanonical hostname"
the nothing at all case is to avoid a chained/infinite redirect for HTTP/1.0 user agent requests which don't supply a hostname.
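
To make the difference concrete, here is the suggested ruleset again with a few sample Host values traced through it in comments. The hostname sub.example.com is hypothetical and only matters if such a name is actually served from the same directory.

```apache
# Host header sent      pattern matches?   condition (negated)   result
# (none, HTTP/1.0)      yes                false                 no redirect
# www.example.com       yes                false                 no redirect
# WWW.Example.COM       yes, via [NC]      false                 no redirect
# example.com           no                 true                  301 to www
# sub.example.com       no                 true                  301 to www
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$ [NC]
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
```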
1:09 pm on May 2, 2017 (gmt 0)

Preferred Member

10+ Year Member Top Contributors Of The Month

joined:Aug 16, 2004
posts:354
votes: 1


Thanks! In response to the first question - by changing those rules, it suddenly comes up with a 404 error on any redirected pages (the redirects work but the pages come up as 404).
1:25 pm on May 2, 2017 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:10846
votes: 61


It redirects to the same URL but the subsequent request has a different response?

if so it would be interesting to understand why...
1:31 pm on May 2, 2017 (gmt 0)

Preferred Member

10+ Year Member Top Contributors Of The Month

joined:Aug 16, 2004
posts:354
votes: 1


hmm...I'd love to know why. Leaving the rules as mentioned earlier, everything works fine?!?
6:22 pm on May 2, 2017 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13674
votes: 439


this says if the url path ends with .php make sure THE_REQUEST also contains .php before redirecting the request.

It has to, because otherwise you get an infinite loop when there's an internal request for .php files--which, after all, is what the entire site is built around. Unlike index redirects, it isn't covered by the [NS] flag.
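
A sketch of how that loop arises, using the two /tips/ rules from earlier in the thread. The page name "foo" is hypothetical.

```apache
# External redirect: fires only when the client itself asked for a .php URL.
# THE_REQUEST is the original request line, e.g. "GET /tips/foo/ HTTP/1.1",
# and it does not change when a rule rewrites the path internally.
RewriteCond %{THE_REQUEST} \.php
RewriteRule ^tips/([^.]+)\.php$ http://www.example.com/tips/$1/ [R=301,L]

# Internal rewrite: /tips/foo/ is served by /tips/foo.php. In .htaccess the
# rewritten path is run through the rules again, so without the condition
# above, tips/foo.php would match the redirect and bounce back to /tips/foo/.
RewriteRule ^(tips/[^.]+)/$ /$1.php [L]
```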

try this instead:

RewriteRule ^(.*)index\.htm$ http://www.example.com/$1 [R=301,L]

Oh, phranque, this time I have to vigorously disagree. A leading .* or .+ is an absolute last resort, because the server only works in one dimension; it doesn't know that "index.htm" is coming up, so it has already captured all the way to the end and then has to backtrack. And all this for a capture that, on the overwhelming majority of requests, will end up being thrown away.

Lee, have you ever at any time had URLs in "index.htm"? If no, there's really no reason for this rule at all; search engines won't make the request, and obviously humans won't. Search engines might request "index.html" (the default filename) for entrapment purposes, and here it gets tricky because the [NS] flag won't work if your real index files are called "index.php". But on 999 requests out of 1000--at a minimum--the capture will end up getting thrown away. So this is the rare case in which it may be more efficient to defer the capture until the condition--in fact, to create a Condition for the sole purpose of capturing from it. If the server never gets as far as evaluating a condition, no capture takes place. If you want to play it safe you can express the pattern as "index\.htm" without closing anchor, and then it works both ways.
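
One possible way to write such a capture-in-the-condition rule is sketched below. The exact THE_REQUEST pattern is an assumption, not something posted in this thread, and should be tested before use.

```apache
# The rule pattern is a cheap test with no capture; the condition is only
# evaluated when it matches. %1 is the directory part captured from the
# original request line (empty for /index.htm at the site root).
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/([^?\s]*/)?index\.html?[?\s]
RewriteRule index\.htm http://www.example.com/%1 [R=301,L]
```

Because the condition looks at the original request line, it should also keep the rule from firing when the server itself fetches the index file for a directory request.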

:: detour to check something in raw logs ::

Thought so. I have never, ever received a legitimate request for "index.htm" on any site. (In fact, since I've never used .htm, the only .htm requests at all were from assorted malign robots.) Requests for "index.php" are slightly more common--but those, of course, are from malign robots hoping to find CMS files. Oh, and piwik, which really does use index.php. Other than that, no humans, no legitimate search engines. Requests for "index.html" do occur, even if the site in question has never used "index.html" in visible URLs. (This is one of the things I had to check.)

Well, I do get some "index.php" requests from the BLEXBot. But since this robot is notoriously inept at managing its database, it is impossible to tell whether these are honest requests, or someone else's URL getting attached to my filepaths. Shrug.
 
