
rewrite rule fails on some browsers

     
2:01 pm on Jul 14, 2014 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Oct 15, 2004
posts:942
votes: 0


On our site I have a set of rules to add www to our domain and to remove index.
These rules are:

#######################################################
# remove index.htm/html from url
# rewrite non-www into www
#
# RULE A = combine both lack of www and presence of index
# RULE B = only lack of www
# RULE C = presence of index (htm or html)
######################################################
#rule A
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteCond %{REQUEST_URI} ^(.*/)(index\.html|index\.htm)$ [NC]
RewriteRule ^(.*)$ http://www.%{HTTP_HOST}/$1 [R=301,L]
#rule B
RewriteCond %{HTTP_HOST} !^www\.
RewriteRule ^(.*)$ http://www.%{HTTP_HOST}/$1 [R=301,L]
#rule C
RewriteCond %{REQUEST_URI} ^(.*/)(index\.html|index\.htm)$ [NC]
RewriteRule . %1 [R=301,NE,L]

This was working as intended until a few days ago.
Now the rule fails in IE8-IE11 and in Opera, but keeps working in the other browsers on my PC (Firefox, Chrome, Safari).


Note 1: when I say it fails, I mean only the www-adding part; removing index still works in all browsers. (In other words, example.com is NOT redirected to www.example.com.)

Note 2: all tests in all browsers are done with cleared cache/browser/website data.

I understand that .htaccess is executed on the server and is irrelevant to the user's browser. Yet this is happening.

Any ideas on what to look at/fix/update?
It's a shared server and I don't have access to anything except the FTP root.

[edited by: incrediBILL at 3:14 am (utc) on Jul 15, 2014]
[edit reason] fixed links [/edit]

2:51 pm on July 14, 2014 (gmt 0)

Senior Member

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 11, 2001
posts:5507
votes: 5


Not sure why you need three rules to accomplish what you can do in one.

There are numerous examples of the following in the archives.

RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]

P.S. If this is incorrect (it's been in place on my sites for more than a decade), then somebody will be along shortly to provide more info.
3:44 pm on July 14, 2014 (gmt 0)

Senior Member

WebmasterWorld Senior Member penders is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2006
posts: 3150
votes: 4


Rule A doesn't look quite correct... Shouldn't that be %1 in the RewriteRule substitution, not $1? Otherwise I can't see how it would remove the directory index?
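
For what it's worth, here is a minimal sketch of Rule A with just that one-character change (%1 instead of $1), purely to illustrate the point; %1 is the directory path captured from REQUEST_URI, which already starts with a slash:

RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteCond %{REQUEST_URI} ^(.*/)(index\.html|index\.htm)$ [NC]
RewriteRule ^(.*)$ http://www.%{HTTP_HOST}%1 [R=301,L]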

As far as browsers are concerned, rule C perhaps looks the most problematic, since you are rewriting to a root-relative, rather than absolute path - but you say this is working OK?

Is Opera the later WebKit version?

What do the HTTP response headers report?

Do you have other directives in your .htaccess file?

What has changed? ;)


Not sure why you need three rules to accomplish what you can do in one.


Well, the OP is doing more than just redirecting non-www to www; they are also removing the index document (and trying to do it in just one redirect, rather than two). Your rule is essentially the same as #rule B above.

[edited by: penders at 3:49 pm (utc) on Jul 14, 2014]

3:48 pm on July 14, 2014 (gmt 0)

Administrator from US 

WebmasterWorld Administrator not2easy is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Dec 27, 2006
posts:4527
votes: 350


Best to start at the top - there are two rules needed, not three. (One rule can work on some sites, depending on the structure, but generally it's two rules.)
First take care of the index, then come back for the www, or else you need to loop back. It's less work to process it once.
When you use %{REQUEST_URI} it only processes that rule for that one page. When you are capturing whatever page is being requested in order to apply the rule to it, it is better to use %{THE_REQUEST} in your RewriteCond.

If the rules for index apply to more than one directory, this is done by capturing whatever comes before index.html (or .php) and adding it in front of the resulting rewrite URL.

RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /.*index\.html\ HTTP/
RewriteRule ^(.*)index\.html$ /$1 [R=301,L]


I don't know what Rule B is supposed to do, but it is mostly saying "please loop", since the condition and target are the same (except that the server is looking for something else in that position).

Rule C:
The first line is only for a specific request so other pages won't have the same behavior.
The target for your redirect should simply be the domain name and not have %{HTTP_HOST} in it; you may be lucky that some browsers deciphered your intentions and stopped looping.
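
For example, Rule C with a literal hostname in the target might look something like this (a sketch only; www.example.com stands in for the real domain):

RewriteCond %{REQUEST_URI} ^(.*/)(index\.html|index\.htm)$ [NC]
RewriteRule . http://www.example.com%1 [R=301,NE,L]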

For the second rule, try something like:
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
4:17 pm on July 14, 2014 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15900
votes: 876


RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /.*index\.html\ HTTP/

And when you say ".*" you of course meant "\S+" ;) though for at least 90% of sites [^.\ ]+ is closer to what you want. In fact you can generally get away with "index\.html" alone, unless you have reason to fear that links or search engines are sending query strings containing the "index.html" text. (In fact if you did need to exclude these, your condition would have to be a lot more complicated.)
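
To make that concrete, the three flavours of the condition would be something along these lines (sketches, not drop-in code):

# anything non-space before "index.html" (the safe general form)
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ \S+index\.html\ HTTP/
# tighter: no dots or spaces before it (fine for most sites)
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ [^.\ ]+index\.html\ HTTP/
# loosest: matches the text anywhere in the request line, query string included
RewriteCond %{THE_REQUEST} index\.html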

I understand that .htaccess is executed on the server and is irrelevant to the user's browser.

Reason and logic say you are right. But this is not the first time someone has discovered a browser-specific issue in an area where the browser can't possibly have anything to say.

Have you got multiple domains passing through the same htaccess? If not, use the actual name rather than HTTP_HOST. (This is a general principle. Never do something with a lookup or capture that can be done with literal text.)

Redirects go from most specific to most general. So the index redirect --just one-- belongs immediately before the with/without www redirect, no matter what other rules you've got.

The form
(index\.html|index\.htm)
is never necessary. Just say
index\.html?
and then only if your site really uses both extensions. The earlier rules make it look as if it doesn't.

Oh yes and...
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]

Express this as a negative: "if the host is anything other than the one acceptable form". So it becomes
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
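
Putting those pieces together, the whole pared-down pair of redirects might look roughly like this (a sketch only, assuming a single-domain, .html-only site at example.com):

# index redirect first (most specific)
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ \S+index\.html\ HTTP/
RewriteRule ^(.*)index\.html$ http://www.example.com/$1 [R=301,L]

# then the hostname redirect (most general)
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]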
8:34 am on July 15, 2014 (gmt 0)

Junior Member

5+ Year Member

joined:Apr 6, 2013
posts:149
votes: 0


> though for at least 90% of sites [^.\ ]+ is closer to what you want.

You keep trying to tell yourself that, but it just isn't true. Facebook can have periods in the URL path. Wikipedia can have periods in the URL path. Amazon. Google forums. Alexa. In addition to the Apache and PHP websites that you should already know about. If you bother to look, you'll find many more. Fact is periods are perfectly valid and reasonably common, and if your code breaks in the presence of a period, then that's a bug in your code. You don't even get any benefit from a pattern such as [^.]. It isn't any faster, and it's less robust and less correct.
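
To illustrate with a made-up request: a perfectly valid URL like /docs/v1.2/index.html sails straight past the stricter condition because of the dot in "v1.2", while the \S+ form still catches it:

# never matches "GET /docs/v1.2/index.html HTTP/1.1" - the "." stops the class
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ [^.\ ]+index\.html\ HTTP/
# matches it as expected
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ \S+index\.html\ HTTP/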
10:01 am on July 15, 2014 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Oct 15, 2004
posts:942
votes: 0


First of all, thanks to all of you for your suggestions.
I tried some of your ideas, but I still haven't managed to solve my issue.

@penders - you are correct about the typo in rule A. Opera is version 12.17. The .htaccess is 170KB, so yeah... too many rules inside. As to what has changed, I cannot tell on the server side, but on my machines I upgraded to WinXP SP3 (previously I had SP2). I know I am ancient, but I am a creature of habit.
As for the response... "Could not locate remote server"

@not2easy. Thanks for the tips.

@lucy24. No, single domain, single .htaccess in the root. Thanks for the tips.


Some debugging of the response headers I get from various browsers follows:


For Chrome (v.35) I get this response:
Request URL:http://example.com/some-page.htm
Request Headers CAUTION: Provisional headers are shown.
Accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Cache-Control:max-age=0
Referer:
User-Agent:Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.153 Safari/537.36

and on screen: Oops! Google Chrome could not find example.com


For Safari (v.5.1.7) I get this response:
Request URL:http://example.com/some-page.htm
Request Header
Accept:text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
User-Agent:Mozilla/5.0 (Windows NT 5.1) AppleWebKit/534.57.2 (KHTML, like Gecko) Version/5.1.7 Safari/534.57.2

and on screen: Safari can't find server


In IE8, using Fiddler, I get:
Client:
Accept: */*
Accept-Encoding: gzip, deflate
Accept-Language: en-us
User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)
Transport:
Host: example.com
Proxy-Connection: Keep-Alive

I also get this:
HTTP/1.1 502 Fiddler - DNS Lookup Failed
Date: Tue, 15 Jul 2014 09:19:02 GMT
Content-Type: text/html; charset=UTF-8
Connection: close
Cache-Control: no-cache, must-revalidate
Timestamp: 12:19:02.203

[Fiddler] DNS Lookup for "example.com" failed. System.Net.Sockets.SocketException No such host is known

In IE9 on Win7 SP1 it works.
In IE11 on Win7 SP2 it doesn't work (I don't know the headers; I have to rely on a friend's findings to test it).


On Firefox (v.30) I get this (using the Live HTTP Headers plugin):
http://www.example.com/some-page.htm

GET /some-page.htm HTTP/1.1
Host: www.example.com
User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:30.0) Gecko/20100101 Firefox/30.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive

HTTP/1.1 200 OK
Date: Tue, 15 Jul 2014 09:44:25 GMT
Server: Apache
X-Powered-By: PHP/5.3.3-7+squeeze15
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: max-age=1, private, must-revalidate
Pragma: no-cache
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Length: 10371
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: text/html
12:17 pm on July 15, 2014 (gmt 0)

Senior Member

WebmasterWorld Senior Member penders is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2006
posts: 3150
votes: 4


The .htaccess is 170KB, so yeah... too many rules inside.


How do you know there isn't a conflict with another directive? (Although if nothing has changed in this respect then I guess this is unlikely?)

As for the response... "Could not locate remote server"


Hhhmmm... this looks like a more fundamental problem? Are you getting this error on the initial request (ie. before the server has responded with a redirect) or on the second request, after the server has responded with the redirect?

To be honest, this looks a lot like a DNS error? It is quite probable that some browsers have cached DNS responses - although you say you've cleared all caches? Including your machine's DNS cache?
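
On Windows the local resolver cache can be flushed from a command prompt, if that applies to your setup:

ipconfig /flushdns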

For Chrome (v.35) I get this response:


These are actually the request headers, not the response headers. If you are getting this error after the initial request and not actually receiving a response then (see above) the server/.htaccess isn't even getting a look-in.

On Firefox (v.30) I get this (using the Live HTTP Headers plugin)
http://www.example.com/some-page.htm


You need to test "http://example.com/some-page.htm" (ie. without the www). By testing the above URL you are not testing the redirects in your original post. I would expect to see a "Location:" header (and a status code of 301) if you are. If the "Location:" header is malformed then this might cause problems in browsers (as Lucy suggests, try "using the actual name rather than HTTP_HOST".)
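
One way to take the browser (and its caches) out of the picture completely - not something mentioned above, and assuming you have a command line with curl on it - is to request the non-www URL directly and look at the raw status line and headers:

curl -I http://example.com/some-page.htm

A healthy redirect should come back as "HTTP/1.1 301 Moved Permanently" with a "Location: http://www.example.com/some-page.htm" header. If curl instead reports that it cannot resolve the host, the request never reached Apache at all, and the .htaccess isn't the problem.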
12:57 pm on July 15, 2014 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Oct 15, 2004
posts:942
votes: 0


How do you know there isn't a conflict with another directive?

The whole site has been working with the same .htaccess for 15 months now. The problem with the www/non-www redirects appeared in the last week or so.

Are you getting this error on the initial request (ie. before the server has responded with a redirect) or on the second request, after the server has responded with the redirect?

These are actually the request headers, not the response headers

On the initial request. I don't see any response from the server; I can't see any 301 redirect or 404 or any other header.
Only the browser's error page.

You need to test "http://example.com/some-page.htm" (ie. without the www)

Although I keep clearing the cache, temporary internet files and whatever else I can think of, FF insists on adding www even though I type the URL without the www to test it.


A little while ago I was informed that the SSL library(?) on the server was updated recently. Is it possible that something there is causing a conflict/problem?

I'm starting to think I must contact the server administrator to help me debug the issue.
2:02 pm on July 15, 2014 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Oct 15, 2004
posts:942
votes: 0


It seems that Firefox also has the same problem. After creating a new profile, I tested and saw that non-www URLs are not redirected to www URLs. So as of now, all browsers behave the same, with the general error message: server not found.
2:31 pm on July 15, 2014 (gmt 0)

Senior Member

WebmasterWorld Senior Member penders is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2006
posts: 3150
votes: 4


This does look more like a DNS or server configuration issue. "example.com" (without the www) is not even reaching the server - no response from server - the .htaccess is not even processed. (?)

What does the DNS report for "example.com"?

(Wild curiosity... does "www.www.example.com" (Yes, 2 x www) return anything?!)
6:43 pm on July 15, 2014 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:11846
votes: 242


Have you tried using nslookup?
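
For example, from a command prompt (with example.com standing in for the real domain):

nslookup example.com
nslookup www.example.com

Both should resolve to an address. If the bare domain comes back with a "can't find" error while the www form resolves, a missing record for the bare domain would explain every browser reporting "server not found".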
6:51 pm on July 15, 2014 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15900
votes: 876


Wild curiosity... does "www.www.example.com" (Yes, 2 x www) return anything?!

If you know how this kind of thing works, I hope you're following ivanvias's thread next door. (The original one, not the follow-up.)
6:06 am on July 16, 2014 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Oct 15, 2004
posts:942
votes: 0


Yes, DNS seems the only logical explanation at the moment.

www.www.example.com returns "server not found"

nslookup - that's new to me; I don't know how to process that information.
I found a free online tool and ran two tests (example.com and www.example.com).
Both tests return the same name server, if I understand that correctly. example.com returns the MX record as well.
9:26 am on July 16, 2014 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Oct 15, 2004
posts:942
votes: 0


I got a response from the server admin.
It was indeed a DNS problem.
It should be OK in a few hours.

Thanks for tolerating my ignorance :)