
Apache Web Server Forum

    
Anything unexpected in here?
Quick check for anything unintentional please.
Wayder




msg:4559061
 10:56 pm on Mar 27, 2013 (gmt 0)

For a while I have been using a separate domain for final checking, and I 403 everyone except myself on the test domain. When I move the site to the correct domain I have had to change the domain name in .htaccess.

I have been playing with ENV and have come up with the following so I no longer have to change .htaccess and I can just upload to the live site.

I'm not too familiar with .htaccess, so could you have a quick look and tell me if I have done something detrimental that I haven't realised?

-------------------------------------------------------
SetEnvIfNoCase Host example\.com block
SetEnvIfNoCase Request_URI "^(/403\.htm|/robots\.txt)$" allow
SetEnvIf Cookie memyselfandi allow
SetEnvIfNoCase Host example\.com HTTP_DOMAIN=example.com
SetEnvIfNoCase Host example\.net HTTP_DOMAIN=example.net

order deny,allow
deny from env=block
allow from env=allow

Options All -Indexes -IncludesNOEXEC
RewriteEngine on

ErrorDocument 403 /403.htm
ErrorDocument 404 /404.htm

# Rewrite files that I do not wish accessed to a 404
RewriteRule (my_file.htm|my_other_file.htm) - [R=404,L]

# Redirect index.htm to folder
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.htm\ HTTP/
RewriteRule ^(([^/]+/)*)index\.htm$ http://www.%{ENV:HTTP_DOMAIN}/$1 [R=301,L]

# Redirect non-canonical to www
RewriteCond %{ENV:REDIRECT_STATUS} !(403|404)
RewriteCond %{HTTP_HOST} !^w{3}\. [NC]
RewriteRule (.*) http://www.%{ENV:HTTP_DOMAIN}/$1 [R=301,L]
-------------------------------------------------------

The 403|404 RewriteCond is to prevent the redirect. I'm not 100% sure if this is working the way I think it should, but in the testing I did, it seems to.


Thanks for your help

Ray...

[edited by: phranque at 1:00 pm (utc) on Apr 20, 2013]
[edit reason] unlinked urls [/edit]

 

phranque




msg:4566610
 1:51 pm on Apr 20, 2013 (gmt 0)

RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.htm\ HTTP/
RewriteRule ^(([^/]+/)*)index\.htm$ http://www.%{ENV:HTTP_DOMAIN}/$1 [R=301,L]


i'm not sure why you need the RewriteCond in that ruleset.

RewriteCond %{ENV:REDIRECT_STATUS} !(403|404)
RewriteCond %{HTTP_HOST} !^w{3}\. [NC]
RewriteRule (.*) http://www.%{ENV:HTTP_DOMAIN}/$1 [R=301,L]


REDIRECT_STATUS might be a CGI variable that doesn't require the ENV: prefix like other environment variables.

depending on how your DNS and web server are configured, it's possible that ruleset could fail to do the canonical hostname redirect of a request for http://www.www.example.com/

it might be better if you constructed a more specific CondPattern for HTTP_HOST.

lucy24




msg:4566673
 7:54 pm on Apr 20, 2013 (gmt 0)

Huh. How did this post get missed for almost a month?

i'm not sure why you need the RewriteCond in that ruleset.

You need to make sure the rule isn't messing with the results of mod_dir activity. (Belt and suspenders: mod_dir normally executes after mod_rewrite, but you should protect yourself anyway.) You can achieve the same result with the [NS] flag. This flag works on SSIs and on both parts of mod_dir, but it doesn't work on the results of internal rewrites.
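As a rough, untested sketch of the [NS] alternative, the index redirect could drop the THE_REQUEST condition like this. The literal hostname www.example.com is only a stand-in for the OP's %{ENV:HTTP_DOMAIN}:

-------------------------------------------------------
# [NS] = "nosubreq": skip this rule when the request is an internal
# subrequest, such as mod_dir looking up index.htm for a directory request
RewriteRule ^(([^/]+/)*)index\.htm$ http://www.example.com/$1 [R=301,NS,L]
-------------------------------------------------------

Because [NS] does not cover internal rewrites, a site that rewrites other URLs to index.htm would probably still want the THE_REQUEST condition.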

For a while I have been using a separate domain for final checking and I 403 everyone except myself on the test domain.

Hm... OK... But really, a 500-class response would be more appropriate.

The conventional format for a domain-name-canonicalization condition is
!^(www\.example\.com)?$
meaning "exactly this form or exactly nothing". The "exactly nothing" is to allow for HTTP/1.0 requests. These are becoming rare for humans, but do still exist. I looked through my own logs recently, prompted by another thread, and found only proxies. But I think some exceedingly remote satellite-only areas are also 1.0.
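For illustration, that condition might slot into a canonical-host redirect like this. It is a sketch with the hostname written out literally, not the OP's ENV-based version:

-------------------------------------------------------
# Redirect unless the Host header is exactly www.example.com,
# or is empty (HTTP/1.0 requests that send no Host header)
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$ [NC]
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
-------------------------------------------------------

Unlike a pattern that only checks for a leading "www.", this one also catches a request for www.www.example.com.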

Wayder




msg:4566688
 8:44 pm on Apr 20, 2013 (gmt 0)

Firstly, thanks for unlinking the URLs, phranque. And REDIRECT_STATUS is definitely not a CGI variable for me.

Lucy24
500 response

As I understand it: "Response status codes beginning with the digit 5 indicate cases in which the server is aware that it has erred or is incapable of performing the request."

I use the 403 Forbidden response because my server did not err:
"The server understood the request, but is refusing to fulfill it. Authorization will not help and the request SHOULD NOT be repeated."

So I figured 403 "I don't want you to be here, go away" to be the correct response. Willing to be enlightened on the 500 errors though.

The reason I used
RewriteCond %{HTTP_HOST} !^w{3}\. [NC]

is because when I tried using

RewriteCond %{HTTP_HOST} !^(www\.%{ENV:HTTP_DOMAIN})?$ [NC]

I got an infinite loop. If anyone has a solution for this, I would be grateful.

Thanks for your help

Ray...

lucy24




msg:4566738
 12:11 am on Apr 21, 2013 (gmt 0)

500 doesn't have to mean that the server is physically unable to meet the request. It simply means "sorry, can't help you". In particular it's the appropriate response if the site is closed for maintenance. A 403 might send the entirely wrong message, since it essentially says "I don't like your face". So the focus is on the visitor, not the host.

Suppose that, say, haha, a friend came to your house at a time you weren't able to receive them. Sure, slamming the door in their face conveys the message quickly and effectively. But what happens next week when the house is clean, your in-laws have gone back to Peoria and you're wondering why nobody wants to visit?

RewriteCond %{ENV:REDIRECT_STATUS} !(403|404)
RewriteCond %{HTTP_HOST} !^w{3}\. [NC]
RewriteRule (.*) http://www.%{ENV:HTTP_DOMAIN}/$1 [R=301,L]
-------------------------------------------------------

The 403|404 RewriteCond is to prevent the redirect.


I think you really don't need this. You definitely don't need the 404, since the server does not even look for the requested file until all rewrites, redirects and access-control business are out of the way. And there's no point to the 403, since the access-control parts of mod_rewrite will always come before the rewrite-or-redirect parts. Ahem. They do, don't they? If you've already locked the visitor out, they will never get this far. And, since each module is an island, the same goes for people who have been locked out by some earlier mod. That's assuming for the sake of discussion that there were earlier mods affecting the request; usually there aren't.

Oh, and the {number} construction is often useful: the form [A-Z]{3,9} for example. But "w{3}" is not only unnecessary; it's counterproductive. It's making the server do more work than if you'd said "www" in the first place. Not only does it have to read one more byte (four instead of three), it has to parse an expression instead of simply matching literal text.

when I tried using

RewriteCond %{HTTP_HOST} !^(www\.%{ENV:HTTP_DOMAIN})?$ [NC]

I got an infinite loop.

infinite loop = the rule executes over and over = the condition is always met. Since you've got a negative condition, that means the HTTP host is never "www" plus "http_domain". There are lots of possible ways for this to happen; it's enough to note that two of the possibilities are
http_domain = current hostname (including www, if any)
and
http_domain = undefined
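One more thing worth checking: as far as I know, mod_rewrite expands %{...} variables only in the TestString (the left side of a RewriteCond), never in the CondPattern. If so, the pattern was being compared against the literal text "%{ENV:HTTP_DOMAIN}", which no hostname will ever equal. A workaround sometimes used is to put both values in the TestString and compare them with a regex back-reference. This is an untested sketch; the "@" separator and the guard condition are my own assumptions:

-------------------------------------------------------
# Only act when SetEnvIf actually set HTTP_DOMAIN
RewriteCond %{ENV:HTTP_DOMAIN} .
# Both values go on the left, separated by "@"; \1 must repeat
# whatever followed "www." in the Host header
RewriteCond %{HTTP_HOST}@%{ENV:HTTP_DOMAIN} !^www\.([^@]+)@\1$ [NC]
RewriteRule (.*) http://www.%{ENV:HTTP_DOMAIN}/$1 [R=301,L]
-------------------------------------------------------

With Host www.example.com the pattern matches, so the negated condition fails and nothing is redirected. With example.com or www.www.example.com it does not match, so the redirect fires once.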

phranque




msg:4566743
 12:28 am on Apr 21, 2013 (gmt 0)

SetEnvIfNoCase Host example\.com HTTP_DOMAIN=example.com
SetEnvIfNoCase Host example\.net HTTP_DOMAIN=example.net


your CondPattern should only match hostnames for these two domains.

Wayder




msg:4566847
 7:31 pm on Apr 21, 2013 (gmt 0)

Thanks, I do enjoy the discussion and critique. Being locked in my bunker on my own doesn't give me much chance to discuss these things in detail very often. It takes me out of my own thought processes, and I do appreciate it.

lucy24
I think for me, a 403 is the right message because (to continue the analogy) all my friends have a key to my house. Anyone else doesn't even get the door opened, never mind slammed in their face. I'm a bit too long in the tooth to be polite to everyone I meet, but if I do meet anyone that I wish to allow into my house, I give them a key. I do use a 503 and set the retry header for the live domain when I am updating it. I just don't think it's appropriate here, as this domain is permanently closed to everyone other than myself and a changing select few.

I do think I need to look again at the 403 & 404 behaviour, because if I take the 403|404 RewriteCond out, I get a 301 to the www, then a 403/404 response, i.e. example.com will be www.example.com (forbidden/error). If I leave it in, I just get the single 403/404 response with or without www. I'm not sure if this behaviour is correct, but I would like a single response.

The deny/allow is the first thing I do as you can see in the OP.

If used, change w{3} to www: noted, thank you.

phranque
I wanted to define the domains once and use ENV:HTTP_DOMAIN rather than test for (example\.com|example\.net), as that means editing in multiple locations, which is something I try to avoid. But after searching intently, this seems to be a problem for a few people, and it seems the way I wish to use the ENV is not possible.

Unless someone has a solution.

Thank you

lucy24




msg:4566871
 9:23 pm on Apr 21, 2013 (gmt 0)

The deny/allow is the first thing I do as you can see in the OP.

Uh-oh. Big problem here.

Within any one module, commands execute in the order you put them. But each module as a whole executes in the order set by the server. It has nothing to do with the arrangement of directives within the htaccess or config file. The different modules pass through htaccess one at a time. When it's mod_rewrite's turn, it sees only mod_rewrite directives. When it's mod_alias's turn, it sees only mod_alias directives. When it's mod_setenvif's turn... and so on.

Deny/allow is mod_auth-whatsit. (Exact names will depend on Apache version. Go far back enough and it's mod_access instead.) It typically executes immediately before the core, and definitely after mod_rewrite has done its thing.
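To make the point about arrangement concrete, here is a sketch built from directives already in this thread (the literal hostname is a stand-in). Swapping the two blocks should not change which one the server acts on first:

-------------------------------------------------------
# Block 1: access control, handled when the access module takes its turn
Order deny,allow
Deny from env=block
Allow from env=allow

# Block 2: mod_rewrite, handled when mod_rewrite takes its turn
RewriteEngine on
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$ [NC]
RewriteRule (.*) http://www.example.com/$1 [R=301,L]

# Moving Block 2 above Block 1 gives the same behaviour. Only the order
# of directives belonging to one module (e.g. several RewriteRules) matters.
-------------------------------------------------------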

Wayder




msg:4567331
 6:22 pm on Apr 23, 2013 (gmt 0)

Hmmm. Panic now, or later?

lucy24




msg:4567353
 7:24 pm on Apr 23, 2013 (gmt 0)

Panic is a core function, so it doesn't execute until after anger, denial, bargaining, grief and acceptance.

Whoops! Got two scripts garbled there. Never mind, then.

Oh, btw: Rereading your initial post I see you've got a separate test domain, and that's the one returning the wholesale 403s. You are right, then. It's only when you are remodeling on the "real" domain that a temporary 503 is preferable.

WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved