Forum Moderators: phranque

RewriteCond %{HTTP_HOST} !=""

better than RewriteCond %{HTTP_HOST} . ?

dhigger

4:36 pm on Nov 14, 2007 (gmt 0)

10+ Year Member



My force-canonical URL commands look like this:

RewriteEngine on
Options +FollowSymLinks
RewriteCond %{HTTP_HOST} ^example\.com [NC]
RewriteCond %{HTTP_HOST} .
RewriteRule ^(.*) http://www.example.com/$1 [L,R=301]

Would

RewriteCond %{HTTP_HOST} !=""

be better for the 4th line? If so, why? (I have read that it is, but I haven't seen that advised before.)

jdMorgan

9:52 pm on Nov 14, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Only if you are absolutely sure that the hostname will be an exact match. For example, if the client appends a port number, your RewriteCond won't match because "example.com:80" is not exactly equal to "example.com".

For this reason, I recommend never end-anchoring the hostname when using the regular-expressions method. If end-anchoring is deemed necessary or desirable, account for the possibility of an appended port number in the pattern, e.g. ^example\.com(:80)?$ or ^example\.com(:[0-9]{1,5})?$ etc.
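
As a rough sketch of the port-tolerant, end-anchored pattern, the original positive-match rule could be written as follows. This assumes the rules live in a per-directory .htaccess file (so the RewriteRule pattern sees the path without its leading slash) and that www.example.com is the canonical hostname; adjust both for your own setup.

```apache
RewriteEngine on
Options +FollowSymLinks
# Match the bare domain, with or without an appended port such as :80 or :8080
RewriteCond %{HTTP_HOST} ^example\.com(:[0-9]{1,5})?$ [NC]
# Redirect to the canonical www hostname, preserving the requested path
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```

Dropping the trailing $ (and the port group with it) gives the simpler unanchored form, which matches "example.com:80" as well.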

If you had asked this question in the context of strings that could always be expected to be exact matches, then the answer would be that it's largely a matter of style.

Jim

jdMorgan

10:18 pm on Nov 14, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Actually, I miscounted and so answered a question you didn't actually ask.

The above discussion applies to the third line.

You don't need the fourth line at all, because it's redundant with any positive-match hostname compare: a pattern such as ^example\.com can never match a blank hostname. The only time you need to check the hostname for non-blank is when you're doing a negative compare, as in:


RewriteEngine on
Options +FollowSymLinks
RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST} !^www\.example\.com
RewriteRule (.*) http://www.example.com/$1 [R=301,L]

Here, the 'check for non-blank' is required if the host is reachable via HTTP/1.0 -- that is, if it has a unique IP address. This is because many HTTP/1.0 clients do not send a Host header, that header being undefined in HTTP/1.0. Without the blank check, the code above would cause a redirection loop: the client would make a request without a Host header, get redirected, make another request without the Host header, get redirected again, and so on, until either it or the server reached its maximum redirection limit.

Understand that the number of true HTTP/1.0 clients --as opposed to 'extended' HTTP/1.0 clients like Googlebot-- is very, very small. But because any true HTTP/1.0 client can put your server into a redirection loop, it's worth taking preventative measures when a negative-match hostname pattern is used.

Jim

[edited by: jdMorgan at 10:19 pm (utc) on Nov. 14, 2007]

dhigger

10:30 pm on Nov 14, 2007 (gmt 0)

10+ Year Member



Thanks Jim, very educational/helpful!

Dave

dhigger

10:58 pm on Nov 14, 2007 (gmt 0)

10+ Year Member



Just realised I still don't know (!) :

Is


RewriteCond %{HTTP_HOST} !=""

better than

RewriteCond %{HTTP_HOST} .

?

jdMorgan

11:19 pm on Nov 14, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



A matter of style -- and three additional characters that must be parsed.
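
For reference, here is a minimal sketch of how either form slots into a negative-match ruleset, assuming the same example hostnames and .htaccess context used earlier in the thread. Use one non-blank check or the other, not both. Note that mod_rewrite needs a space between the test string and the condition pattern.

```apache
RewriteEngine on
Options +FollowSymLinks
# Form 1: a regex that needs at least one character, so it fails on a blank Host
RewriteCond %{HTTP_HOST} .
# Form 2 (equivalent alternative): negated lexical comparison with the empty string
# RewriteCond %{HTTP_HOST} !=""
RewriteCond %{HTTP_HOST} !^www\.example\.com
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
```

Both forms skip the redirect when no Host header is received, which is what prevents the loop.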

Jim