Question regarding tidying up .htaccess.

how to rewrite IfModules

         

ipco

8:34 pm on Oct 2, 2017 (gmt 0)

10+ Year Member



My .htaccess doesn't have much in it - yet, but I want to have the opportunity to learn and find out what it's all about.

I am following Lucy24's boilerplate on cleaning up an htaccess file.


Step 2: Get rid of all <IfModule> envelopes. My IfModules look like this,

<Files ~ "\.xml$">
<IfModule !mod_authz_core.c>
Deny from all
</IfModule>
</Files>


How should they be rewritten?

Thanks

phranque

8:47 pm on Oct 2, 2017 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



how this should be rewritten depends upon whether or not that module is installed.

ipco

8:58 pm on Oct 2, 2017 (gmt 0)

10+ Year Member



Thanks for the quick reply Phranque,

I'm using a small CMS and this was preloaded on the install.
XML is the storage of choice for this CMS so I am assuming it is loaded.

I have just rummaged through apache.org docs and came up with something like this.

Would this work?

<Files ~ "\.xml$">
<RequireNone>
Require mod_authz_core
</RequireNone>
</Files>

phranque

9:22 pm on Oct 2, 2017 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



are there any other access control directives in your configuration?

lucy24

10:00 pm on Oct 2, 2017 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Would this work?
If you are running Apache 2.4, then yes. If you are not, then no. If you are on shared hosting, as implied by the use of htaccess, you are probably not on 2.4 yet.

It's more useful to start by spelling out in English what exactly you want to do. And then we work out the rules based on those plain-English intentions.

Are you trying to prevent people from accessing all .xml files? If so, what are they there for in the first place? (In particular, do you have an xml sitemap?) If, instead, you want the .xml files to be available internally, but not in response to an explicit request, that's a job for mod_rewrite. For example:
RewriteCond %{THE_REQUEST} \.xml
RewriteCond %{REQUEST_URI} !/sitemap\.xml
RewriteRule \.xml - [F]
But that may not be even remotely what you are trying to achieve; it's just one off-the-top-of-my-head ruleset.

ipco

10:42 pm on Oct 2, 2017 (gmt 0)

10+ Year Member



Phranque,
I have,
a 301 redirect
cache control for images, js and css
AddDefaultCharset UTF-8
Options -Indexes
Options +FollowSymLinks
IfModules to block direct access to the XML files
IfModules to allow access to sitemap.xml
IfModule to handle rewrites for fancy urls

It seems fairly generic stuff. Do I need to post the file? There is nothing specific in it.

ipco

11:37 pm on Oct 2, 2017 (gmt 0)

10+ Year Member



@lucy24, yes, I am on shared hosting, and no, I don't know which version. I'll try to find out.

Ok, in English,
I have one .htaccess file in root preloaded on install.

The site is a small CMS that uses xml for data storage - not mysql - so the xml files need to be blocked from the general public.

There is a <Files ~ "\.xml$"> envelope to 'deny from all' with IfModules containing:
!mod_authz_core.c
mod_access_compat.c
mod_authz_core.c

There is a <Files sitemap.xml> envelope that 'Allows from all' with IfModules containing:
!mod_authz_core.c
mod_access_compat.c
mod_authz_core.c
!mod_access_compat.c

There is a mod_rewrite.c to handle rewrites for fancy urls
This contains two RewriteCond and a RewriteRule

I have just added a 301 code redirect and
Cache control for image files, css and js

As I said in the previous post, It seems fairly generic stuff with nothing specific in it.

I don't know if I'm allowed to post it.

thanks

lucy24

12:43 am on Oct 3, 2017 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



There is a mod_rewrite.c to handle rewrites for fancy urls
This contains two RewriteCond and a RewriteRule
The <IfModule> envelope is most often used with directives that use that specific mod--but it doesn't have to be. (I once experimented on my test site. Basically it means "If this mod is not available, pretend you don't see the enclosed lines of text--independent of what they might happen to say.")

But you need to find out what mods you actually do have. Many of them are based on your apache version: 2.2 calls them by one name, 2.4 by another, and if you're still on 1.3 we are all in trouble. Here is one cute way to find out:

-- on your website create a directory with some made-up name that robots are not likely to guess within the next five minutes
-- create a file called index.shtml (it can be index.html if you already allow includes for the .html extension) which consists in its entirety of the line
<!--#printenv -->

-- also make an htaccess file for this made-up directory. It consists of a series of lines like this:
<IfModule mod_access.c>
SetEnv MOD_mod_access 1
</IfModule>
<IfModule mod_actions.c>
SetEnv MOD_mod_actions 1
</IfModule>
<IfModule mod_alias.c>
SetEnv MOD_mod_alias 1
</IfModule>
et cetera, using all the mod names you can think of--or, to keep it shorter, all the mod names that you currently find in your <IfModule> envelopes. If you want to be really thorough, open the apache docs for versions 2.2 and 2.4, and copy all the module names.

Upload the index.shtml and the htaccess. Now open your brand-new directory and, in the resulting page, View Source in your browser. (You can see it as-is, but when I've tried it, the source-code version is more readable.)

-- copy and paste this into the nearest text document to study at your leisure
-- delete the made-up directory; you don't need it any more.
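
As a shorter sketch for the case in this thread, the probe htaccess could list only the modules named in the <IfModule> envelopes under discussion. The MOD_ variable names simply follow the pattern above, and SetEnv itself comes from mod_env, so this assumes mod_env is available.

```apache
# Hypothetical probe htaccess for the made-up directory.
# Each variable shows up in the printenv output only if its module is loaded.
<IfModule mod_authz_core.c>
SetEnv MOD_mod_authz_core 1
</IfModule>
<IfModule mod_access_compat.c>
SetEnv MOD_mod_access_compat 1
</IfModule>
<IfModule mod_rewrite.c>
SetEnv MOD_mod_rewrite 1
</IfModule>
```

A MOD_mod_authz_core line in the output would point to 2.4, since that module is not part of 2.2.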

Memo to self: Try this again and see if I've still got the same settings as in 2014 when I last did it.

phranque

2:12 pm on Oct 3, 2017 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Do I need to post the file?

please post this part:
IfModules to block direct access to the XML files
IfModules to allow access to sitemap.xml

ipco

3:00 pm on Oct 3, 2017 (gmt 0)

10+ Year Member



IfModules for xml

# blocks direct access to the XML files - they hold all the data!
<Files ~ "\.xml$">
<IfModule !mod_authz_core.c>
Deny from all
</IfModule>
<IfModule mod_access_compat.c>
Deny from all
</IfModule>
<IfModule mod_authz_core.c>
<IfModule !mod_access_compat.c>
Require all denied
</IfModule>
</IfModule>
</Files>

<Files sitemap.xml>
<IfModule !mod_authz_core.c>
Allow from all
</IfModule>
<IfModule mod_access_compat.c>
Allow from all
</IfModule>
<IfModule mod_authz_core.c>
<IfModule !mod_access_compat.c>
Require all granted
</IfModule>
</IfModule>
</Files>

ipco

3:51 pm on Oct 3, 2017 (gmt 0)

10+ Year Member



@Lucy24, the script worked, but I did not see an Apache version in the output, so I called the host. It is 2.4.

lucy24

6:34 pm on Oct 3, 2017 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



It is 2.4
Ooh, I'm envious. (Hm, keyplyr, our mutual host seems to have missed their deadline. Wasn't there some blahblah about third quarter of 2017?)

If you are on 2.4, directives involving "require" can be used, and ones with "allow/deny" should be omitted. So that makes it easy.
<tangent>
If you are on shared hosting, you almost certainly have mod_access_compat, whose purpose is to give backward compatibility to old mod_authzzz_thingummy directives (Allow/Deny and Order). But that's just to keep older htaccess files from breaking; you still don't want both sets of directives lying around. (I don't know what happens if you have both; someone who actually speaks Apache will need to weigh in.)
</tangent>

Edit: The printenv script will not, by itself, tell you what Apache version you're on. But a full listing of mods allows you to deduce it.

ipco

9:20 pm on Oct 3, 2017 (gmt 0)

10+ Year Member



Ooh, I'm envious.
Don't be. I'm not there yet but that's probably (most likely is) my fault.
<Files ~ "\.xml$">
<RequireNone>
Require mod_authz_core
</RequireNone>
</Files>

This didn't work so I don't know if it's a syntax problem but I get a 500 error from it.
The host confirmed it is indeed 2.4

I ran the code through a validator and it didn't recognize <RequireNone> although apache.org say it is good.
Do you think the code is ok? If it is then I will open a ticket with the host.

lucy24

10:09 pm on Oct 3, 2017 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



For starters, try replacing the content of each envelope with a
# comment-line like this
to test whether it's the overall structure, or some syntactic detail, that's making your server unhappy.

Require mod_authz_core
I thought "Require" directives pertained to aspects of the request. The <RequireNone> envelope basically corresponds to anything that was previously introduced by "Deny from".

:: poring over docs [httpd.apache.org] ::

Oh, there it is. It was hiding.
Require ip various-numbers-here
is the exact replacement of your old "Deny from" IP-based lines. The other one that works as a direct translation is
Require env blahblah
replacing "Deny from env=blahblah".
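
A rough sketch of how those pieces might fit together on 2.4, going by the mod_authz_core docs. The address range and the variable name are placeholders, not taken from anyone's real config. As far as I can tell, <RequireNone> can only veto a request and never grant one, so it needs to sit inside a <RequireAll> next to something that does grant access.

```apache
<RequireAll>
# something has to grant access before the negative block can veto it
Require all granted
<RequireNone>
# placeholder address range (reserved for documentation)
Require ip 192.0.2.0/24
# hypothetical variable, set elsewhere with SetEnvIf or similar
Require env blahblah
</RequireNone>
</RequireAll>
```

This only illustrates the syntax: it lets everyone in except the listed cases. To block a set of files for everybody, "Require all denied" on its own is the simpler directive.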

I ran the code through a validator
An Apache validator, you mean?

ipco

11:51 pm on Oct 3, 2017 (gmt 0)

10+ Year Member



the code I put in is
<Files ~ "\.xml$">
<RequireNone>
Require mod_authz_core
</RequireNone>
</Files>

and it was previously stated that this should work on 2.4

That gave me a 500
I commented out Require mod_authz_core and still got a 500 (with and without the trailing .c)
I then commented out the <RequireNone> envelope and it worked.

I ran the code through an Apache validator - 2 actually - that I found on the web but don't know whether to trust or not. Neither validator understood <RequireNone> so I'm at a loss.

What I'm basically trying to do is deny access to the XML files. I could leave everything as is but I saw an opportunity to improve,
thanks.

lucy24

12:08 am on Oct 4, 2017 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Neither validator understood <RequireNone>
Double-check if there's an overall setting for which version of Apache you're validating. If there isn't--or if you've explicitly selected 2.4 and you still get an error--then, well, cross that validator off your list. The three locutions <RequireAll> <RequireAny> and <RequireNone> belong to the same mod, so it's obviously nonsense for a validator to recognize one of them but not another.

If your main purpose is denying access to the .xml files, I'd stick with the mod_rewrite approach as discussed earlier in this thread. You're not actually denying all access to the files; you're just denying them to outside requests. Remember that apache config files apply to all requests, whether external or internal. That's why mod_rewrite, in particular, has the %{THE_REQUEST} condition and also the [NS] flag.
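
A slightly tightened sketch of that ruleset, assuming the sitemap lives at /sitemap.xml in the site root (adjust the path if it does not). THE_REQUEST holds the original request line sent by the client, so the first condition only matches when .xml was part of what was actually asked for, and [NS] makes the rule skip internal subrequests.

```apache
RewriteEngine On

# the request line looks like: GET /path/file.xml HTTP/1.1
# match .xml followed by a query string or the space before the protocol
RewriteCond %{THE_REQUEST} \.xml[?\s]
# assumption: the one public .xml file is /sitemap.xml
RewriteCond %{REQUEST_URI} !^/sitemap\.xml$
# - means no substitution; F sends 403 Forbidden; NS ignores subrequests
RewriteRule \.xml$ - [F,NS]
```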

What did you previously use to control access to your .xml files? Has it been an attested problem, or are you trying to cover all bases in advance?

phranque

12:55 am on Oct 4, 2017 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Remember that apache config files apply to all requests, whether external or internal.

the context in which a directive is placed determines to which requests it applies.

http://httpd.apache.org/docs/current/mod/core.html#files
The <Files> directive limits the scope of the enclosed directives by filename. It is comparable to the <Directory> and <Location> directives. It should be matched with a </Files> directive. The directives given within this section will be applied to any object with a basename (last component of filename) matching the specified filename. <Files> sections are processed in the order they appear in the configuration file, after the <Directory> sections and .htaccess files are read, but before <Location> sections. Note that <Files> can be nested inside <Directory> sections to restrict the portion of the filesystem they apply to.


a <Files> container in a .htaccess file or in a <Directory> section of a config file will provide a similar response, whether the request is internal or external.

phranque

1:04 am on Oct 4, 2017 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



IfModules for xml

# blocks direct access to the XML files - they hold all the data!
<Files ~ "\.xml$">
<IfModule !mod_authz_core.c>
Deny from all
</IfModule>
<IfModule mod_access_compat.c>
Deny from all
</IfModule>
<IfModule mod_authz_core.c>
<IfModule !mod_access_compat.c>
Require all denied
</IfModule>
</IfModule>
</Files>

<Files sitemap.xml>
<IfModule !mod_authz_core.c>
Allow from all
</IfModule>
<IfModule mod_access_compat.c>
Allow from all
</IfModule>
<IfModule mod_authz_core.c>
<IfModule !mod_access_compat.c>
Require all granted
</IfModule>
</IfModule>
</Files>

i would do this:
- determine which relevant modules (mod_authz_core.c and mod_access_compat.c) are installed or not
- work out the <IfModule> logic to determine which access control directive would fire in each <Files> container
- delete the <IfModule> directives and the irrelevant access control directives

in the end you should end up with one of the following:
<Files ~ "\.xml$">
Deny from all
</Files>

<Files sitemap.xml>
Allow from all
</Files>

or
<Files ~ "\.xml$">
Require all denied
</Files>

<Files sitemap.xml>
Require all granted
</Files>
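
To make the second step concrete, here is a sketch of how the envelopes in the first container would resolve. It assumes Apache 2.4 with both mod_authz_core and mod_access_compat loaded, which is only a guess at a typical shared-hosting build and has not been confirmed in this thread.

```apache
<Files ~ "\.xml$">
# <IfModule !mod_authz_core.c>: skipped, because mod_authz_core is loaded
# <IfModule mod_access_compat.c>: applies, because mod_access_compat is loaded
Deny from all
# <IfModule mod_authz_core.c> wrapping <IfModule !mod_access_compat.c>:
# skipped, because the inner test fails when mod_access_compat is loaded
</Files>
```

If mod_access_compat turned out not to be loaded, only the nested pair would apply and the surviving line would be "Require all denied" instead.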

phranque

1:07 am on Oct 4, 2017 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



IfModule to handle rewrites for fancy urls

I have just added a 301 code redirect...


please post these directives in the order in which they appear in the .htaccess file.

ipco

11:54 am on Oct 4, 2017 (gmt 0)

10+ Year Member



@Lucy24
There was no selection option on the validator and I didn't know whether to trust or not - I was just clutching.
You're correct, I am just denying outside requests.
This has always been the control for access to the xml files. The htaccess came preloaded with the CMS installation but is very generic. I just wanted to clean it up using your boilerplate and add 301 redirects and some cache control.

I thought I was just heading into new territory, seems I stepped in the mire and tripped head first lol.

@phranque
# handle rewrites for fancy urls
<IfModule mod_rewrite.c>
RewriteEngine on

# Usually RewriteBase is just '/', but
# replace it with your subdirectory path
#RewriteBase /ipcodesign/

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule /?([A-Za-z0-9_-]+)/?$ index.php?id=$1 [QSA,L]
</IfModule>
#
#
# 301 code redirect
RewriteEngine On
RewriteCond %{HTTPS} !=on
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]

Please note, when I started this exercise I had everything commented out until I understood what each module did and into which order it should go and was testing one by one. Didn't get very far.

The fancy urls block was at the end of the file, then I dropped the redirect below it. It hadn't been placed yet - just parked.
To me, logic suggests redirect first then decide what happens to it after that.

I'm thinking I should maybe back out of the whole deal until I understand more and just add the redirect and cache control to what I have.

lucy24

5:00 pm on Oct 4, 2017 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Your CMS code is built around internal rewrites. That means it needs to go after any RewriteRules that create an external redirect. (“External” doesn't mean go to some other site. It just means a response is sent back to the requesting browser.)

Ordinarily you'd omit the <IfModule> lines for mod_rewrite too. But if you're using someone else's CMS, they may need to keep these lines so things get updated appropriately. Unlike certain other types of envelopes such as <Files>, the <IfModule> envelope has no effect on execution order.

ipco

7:09 pm on Oct 4, 2017 (gmt 0)

10+ Year Member



Thanks Lucy24 and Phranque for your help.

I think I am going to back out until I understand it more - which was the whole point, but I think I need to do a lot of reading first, starting this evening. It will probably help me sleep :-)

I do need to add the redirect and cache control but that will be it for just now.

Thanks for helping.

phranque

9:58 pm on Oct 4, 2017 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



assuming your canonical hostname is www.example.com i would use something like this:

RewriteEngine on

# 301 code redirect
# if the requested protocol is not HTTPS or
RewriteCond %{HTTPS} !on [OR]
# if the provided Host header value (if any) is not exactly the canonical hostname (or null)
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
# externally redirect any such noncanonical protocol or hostname requests using a 301 status code to the same requested path on the canonical protocol and hostname
RewriteRule (.*) https://www.example.com/$1 [R=301,L]

# handle rewrites for fancy urls
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule /?([A-Za-z0-9_-]+)/?$ index.php?id=$1 [QSA,L]

lucy24

3:52 am on Oct 5, 2017 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



RewriteRule /?([A-Za-z0-9_-]+)/?$ index.php?id=$1 [QSA,L]

Typo for
/index.php?id=$1 [QSA,L]
with leading slash representing the root. Personally I would say [\w-]+ at a savings of eight bytes, but we all know about individual coding styles. And since this is htaccess, we can be certain there will be no leading slash, making it
RewriteRule ^([A-Za-z0-9_-]+)/?$ /index.php?id=$1 [QSA,L]
Or was there something about subdirectories that I missed? Shouldn't matter, though, if there's no opening anchor; it's the same capture either way. (And where did the optional final / come from? Usually with rewrites you don't want to leave anything optional or suddenly you're getting Duplicate Content all over the place and search engines asking for /long-complicated-file-name/index.html.)

Now, technically you don't need the capture at all, do you? Since it's an internal rewrite, your php is perfectly capable of extracting the requested URI unaided; it hasn't gone anywhere.
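
A sketch of what the capture-free version might look like. It assumes index.php is changed to work out the page from the original request (for example from REQUEST_URI) instead of an id parameter. A stock CMS most likely still expects id, so this is not a drop-in replacement.

```apache
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# no capture and no new query string: the script reads the requested URL itself
RewriteRule ^[\w-]+/?$ /index.php [L]
```

Because the substitution adds no query string of its own, any query string on the original request is passed through unchanged, so [QSA] is not needed here.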

ipco

1:22 pm on Oct 5, 2017 (gmt 0)

10+ Year Member



Thank you Phranque, I really appreciate it.

phranque

8:46 pm on Oct 5, 2017 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



in my example i simply copied and pasted your RewriteRule without examining it.
i would use the modification lucy24 suggested.

ipco

4:40 pm on Oct 9, 2017 (gmt 0)

10+ Year Member



Thank you, I will.
Thank you both for help. Appreciated.