Please critique - multiple websites using shared assets

FINALLY getting the hang of Apache

         

JAB Creations

4:26 pm on Nov 29, 2011 (gmt 0)




I've finally seen enough examples to actually grasp Apache's syntax. However, I certainly wouldn't claim to be anywhere near as talented as some of the regulars here, so I'm asking folks to please critique my code for order (of code), performance, security and syntax in general.

The Setup
The setup is very simple: multiple websites with the same structure, each with its own content and skins/themes. To work professionally I test everything locally first, to minimize the chance that a visitor to whatever site I'm working on will encounter a problem caused by code that was committed before it was written correctly. This means the Apache .htaccess rules below have to work on both localhost and a live domain without modifying the server configuration files (I have to use shared hosting, at least for the time being).

Visually the actual site folders may appear like so...

http:// localhost/site1/
http:// localhost/site2/

The asset directories appear like so...

http:// localhost/admin/
http:// localhost/blog/
http:// localhost/forums/
http:// localhost/scripts/
http:// localhost/themes/

The rewrite rules rewrite the specific modules (e.g. admin, blog, forums) and shared assets (e.g. scripts, themes) like so...

http:// localhost/site1/blog/
http:// localhost/site2/blog/

http:// localhost/site1/scripts/
http:// localhost/site2/scripts/

Specific Modules and the CMS Module
I've built my own CMS as well as various specific modules (e.g. blog, forums, private messaging). Regardless of which module Apache rewrites to, that module handles the HTTP status codes (e.g. 200, 403, 404, etc.).

So one of the important goals was to ensure that specific modules captured requests relative to their path while all other requests are handled by the CMS module.

The image module is what I consider a mixed module: clients will be able to upload their own images, but certain specific paths get rewritten. Thankfully these two paths don't conflict, so I get shared assets in the shared images directory and clients still get their own image directory for uploading and using their own images (without those images becoming accessible to other clients).

# Shared Image Module
http:// localhost/images/

# Client-specific Images
http:// localhost/site1/images/

Exceptions
One of my goals was to retain my personal homepage that contains quick links to various things I use including links to client websites.

I also noticed that some files with similar strings in their names were being caught. I'm not entirely sure I've resolved all of those issues (e.g. scripts/admin.js was being affected by the admin directory rule until I added the condition). However, in the unlikely event that I have the word 'admin' in another directory, it may or may not affect that URL.

Security Concerns
When you share anything, I think security concerns automatically at least double. Concerns like one client uploading an undesirable image and having it appear on other clients' websites are things I've taken into consideration. As for CMS and other module content, each client has their own dedicated copy of the database. The database (and file path structures) are determined by the domain in PHP (or the sub-directory for localhost when I do interchangeable testing), so the software can work with any domain I allow it to work with while all the content remains separate. I'm very open to hearing any concerns this setup may raise.

No clients have permission to execute any of their own server-side code (e.g. PHP) and eval isn't used at all. Since I write literally all of my own software, I don't have to worry about third parties deciding to do those things on my behalf. I use exceptionally strict coding practices and log all errors (JavaScript, PHP and MySQL) as well as HTTP responses that aren't 200. I know exactly what's going on and can easily see if someone is attempting an SQL injection attack, for example, or if JavaScript is coming from a different domain, and numerous other things. I think paranoia is a good ethic for security, so if anyone has a thought on how to approach an attack against my setup I'd love to hear the details, as I want to address any and every possible security concern.

The Code
http:// localhost/.htaccess

AddHandler application/x-httpd-php .css .js
AddType text/javascript .js

RewriteEngine on
RewriteRule ^$ index.php [QSA]
RewriteCond %{REQUEST_URI} !\.(css|js|zip)$
RewriteRule .*/admin(.+) admin$1 [QSA]
RewriteRule .*/blog(.+) blog$1 [QSA]
RewriteRule .*/forums(.+) forums$1 [QSA]
RewriteRule .*/images/$ .*/images/$ [QSA]
RewriteRule .*/messages(.+) messages$1 [QSA]
RewriteRule .*/redirect\.php redirect\.php [QSA]
RewriteRule .*/scripts(.+) scripts$1 [QSA]
RewriteRule ^(index\.php|test1\.php|test2\.php|images/|scripts/|themes/|redirect\.php) - [L]
RewriteCond %{REQUEST_URI} !.*/(admin|blog|forums|images|messages)
RewriteRule !\.(css|js|zip)$ rewrite.php


What the code does...
The first two lines allow me to execute PHP inside of JavaScript and CSS files. This essentially is used for site visitor preferences.

----

On the third line (of code) I obviously turn the RewriteEngine on.

----

On the fourth line of code I create an exception for the root index (http:// localhost/). The ^ (starts with) and $ (ends with) with nothing in between equates to http:// localhost/, which is wonderfully simple, although I did unsuccessfully attempt ="" to avoid regular expressions. This line allows me to continue using my customized homepage, as it works just fine.

----

RewriteCond %{REQUEST_URI} !\.(css|js|zip)$


The fifth line of code is for exceptions to the rules that follow. I added this because scripts/admin.js was being affected (HTTP 404) by the admin rule below. I don't think this line prevents different matches though (e.g. localhost/site1/something/administrative-conduct).

----

RewriteRule .*/admin(.+) admin$1 [QSA]
RewriteRule .*/blog(.+) blog$1 [QSA]
RewriteRule .*/forums(.+) forums$1 [QSA]
RewriteRule .*/images/$ .*/images/$ [QSA]
RewriteRule .*/messages(.+) messages$1 [QSA]
RewriteRule .*/redirect\.php redirect\.php [QSA]
RewriteRule .*/scripts(.+) scripts$1 [QSA]


These lines are for specific module rewrites that are for all URLs that aren't handled by the CMS module. The .* bit dynamically matches the second directory (e.g. "site1" in localhost/site1/whatever.html) up until a forward slash and the matching directory name. I left out the ending slash and used (.+) instead so that the index and all requests inside of those directories would be rewritten to the shared directory paths (e.g. localhost/blog/).

Two things...

First, I should note again that I'm not entirely sure these rules won't match localhost/site1/example/blog-page-concerns. In other words, I'm not certain matching is strictly limited to the first directory (in that example the term blog is in the second directory for the client).

Secondly, I'm not sure (though I imagine it would be possible) whether I can merge these rules into a single rule (item1|item2|item3). I've gotten this far, though I'm not that good, or at least not yet.

----

RewriteRule ^(index\.php|test1\.php|test2\.php|images/|scripts/|themes/|redirect\.php) - [L]


These are general exceptions; I intend to keep this list as minimal as possible. I've added test files as examples. Neither the specific modules nor the CMS module will rewrite these URLs.

----

RewriteCond %{REQUEST_URI} !.*/(admin|blog|forums|images|messages)


This line essentially is an exception list for the specific modules so that the CMS module doesn't rewrite them. If I don't add a directory here then the CMS module rewrite (in the last/next line) is applied.

----

RewriteRule !\.(css|js|zip)$ rewrite.php


Everything else gets rewritten to rewrite.php which handles the CMS module. The list of file extensions is truncated intentionally.

Final Thoughts
I never thought I'd get this far with Apache, though I have. I don't think the code is perfect, which is why I've posted it here for others to critique, though I can say that at least it works! I'm sure there is redundant code in there and I could very likely make some of the rewrite rules more restrictive, so I'm open to trying that out. If I can understand why the code exists as it does, then I can usually grasp the how of the syntax involved; that has been the main issue I've had in trying to learn to write Apache syntax.

So I'm looking for any critique, no matter how minor the concerns or syntax changes may be. Any potential security concerns (they don't have to be explicit security holes) are especially important, though I also want to improve performance and eliminate redundancy wherever possible, please.

Alright, fire away please! :)

- John

lucy24

9:05 pm on Nov 29, 2011 (gmt 0)




This is the one that jumped right out at me:
RewriteRule .*/admin(.+) admin$1 [QSA]
RewriteRule .*/blog(.+) blog$1 [QSA]
RewriteRule .*/forums(.+) forums$1 [QSA]
RewriteRule .*/images/$ .*/images/$ [QSA]
RewriteRule .*/messages(.+) messages$1 [QSA]
RewriteRule .*/redirect\.php redirect\.php [QSA]
RewriteRule .*/scripts(.+) scripts$1 [QSA]


NEVER EVER start a RegEx with ".*". Here, luckily, you are not capturing, so you don't need it at all. But the wording is problematic because the Rule as written will apply both to

www.example.com//{name you're looking for}/blahblah
AND
www.example.com/folder1/folder2/folder3/{name you're looking for}/blahblah

If you want to skip over exactly one directory, say

^[^/]+/

I assume the Rule involving /images/ is a typo so let's set that aside.

And then you can consolidate most or even all of those separate Rules:

RewriteRule ^[^/]+/(admin|blog|forums|messages|scripts|redirect\.php)(.*) $1$2 [QSA]

though you might want to keep the "redirect.php" separate. Note that (.*) does not have to have any content; RegEx and mod_rewrite don't care.
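As a rough sketch of that consolidation (untested against your setup), here is one way it might look with "redirect.php" kept separate. The (/.*)?$ ending and the [L] flags are my own assumptions, not something from your original file: the ending requires the module name to be followed by a slash or by the end of the path, so that something like site1/blog-page-concerns or site1/scripts/admin.js is not mistaken for a module.

```apache
RewriteEngine on

# Skip exactly one leading directory (e.g. "site1"), then require one of the
# shared module names followed by "/" or the end of the path.
#   site1/blog/some-post      -> blog/some-post
#   site1/scripts/admin.js    -> scripts/admin.js
#   site1/blog-page-concerns  -> no match, left for the later rules
#   site1/example/blog/x      -> no match, "blog" is not directly after site1/
# Caveat: in .htaccess the rewritten path is run through the rules again, so
# this assumes no shared module contains a subdirectory named after another
# module (e.g. blog/admin/).
RewriteRule ^[^/]+/(admin|blog|forums|messages|scripts)(/.*)?$ $1$2 [L]

# redirect.php kept as its own rule, matched exactly.
RewriteRule ^[^/]+/redirect\.php$ redirect.php [L]
```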

Are you intentionally omitting the [L] flag? Do this with EXTREME CAUTION. You don't need [QSA] if you have not set a fresh query; it is default mod_rewrite behavior.
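To illustrate both points, a minimal sketch. The "news" path and the "page" parameter are hypothetical names made up for the example; rewrite.php is the CMS handler from your post.

```apache
RewriteEngine on

# No "?" in the substitution, so the incoming query string is passed through
# unchanged and [QSA] would add nothing here.
# [L] ends this pass, so the rules further down do not see the rewritten path.
RewriteRule ^[^/]+/blog(/.*)?$ blog$1 [L]

# Here the substitution sets a fresh query string, which would normally
# replace the incoming one. [QSA] appends the original query to it instead.
# ("news" and "page" are hypothetical.)
RewriteRule ^[^/]+/news/([0-9]+)$ rewrite.php?page=$1 [QSA,L]
```

Without [L] on the first rule, the rewritten path would keep flowing through every rule below it, which is where the surprises tend to come from.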

Reduce each of your explanations to 1 or 2 lines and incorporate them into the htaccess itself as #comments so you will remember what the Rule is supposed to do. Similarly, leave a blank line after each Rule, unless-- maybe-- you've got a package of Rules that all do the same thing. But always space around anything with Conditions so you remember that the Condition only applies to the first Rule it meets.
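Something along these lines, as a sketch only. The patterns are simplified stand-ins for your admin and blog rules, with the trailing (/.*)?$ and the [L] flags being my assumptions:

```apache
RewriteEngine on

# Static files must not be caught by the admin rewrite.
# This condition guards ONLY the rule directly below it.
RewriteCond %{REQUEST_URI} !\.(css|js|zip)$
RewriteRule ^[^/]+/admin(/.*)?$ admin$1 [L]

# No condition in front of this rule, so it is tried for every request.
# If the blog rule needs the same guard, repeat the RewriteCond above it.
RewriteRule ^[^/]+/blog(/.*)?$ blog$1 [L]
```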