is there such a thing as an .htaccess that's too big?


jake66

2:17 am on Jan 9, 2006 (gmt 0)

is there a limit to how much information an .htaccess file can contain before it starts slowing down the loading of the site?

jdMorgan

3:20 am on Jan 9, 2006 (gmt 0)

Yes, but it depends on how busy your site is. The only real way to tell is to test with and without some of the larger 'chunks' of code. While 20 kB might be way too big for one site, another might get away with 60 kB, or even larger.

It is good practice when writing code for .htaccess to make the conditions under which that code is activated as specific as possible. It's also good to be aware of which functions consume the most time and CPU, and to avoid them when possible. For example, a very popular thing to do is to rewrite all requests for files that don't exist to a PHP script:


# Rewrite any request for a file that doesn't exist on disk
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule .* /make_dynamicpage.php [L]

The problem with this is that every single request to the server results in a search of the filesystem to see if the requested resource exists. This is in addition to the filesystem access that will be required to serve that resource if it does in fact exist.

So it's better if you can limit the circumstances under which that filesystem check takes place. For example:


# Do the filesystem check only for requests ending in .html
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule \.html$ /make_dynamicpage.php [L]

will do the file check only if the request is for an .html page, and not for images, CSS files, external JS, or .php pages. Since most pages contain multiple images, this one little change can make a big difference in server load.

It's not obvious, but RewriteConds are not processed unless the RewriteRule pattern matches, and they are then processed in order, stopping at the first one that fails. So another way to minimize the server impact is to use multiple RewriteConds and put the one that checks whether the file exists last, so the expensive filesystem lookup runs only after the cheaper checks have passed.
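For example (the /archive/ directory here is purely for illustration), the cheap pattern test goes first and the filesystem lookup goes last:


# Cheap regex test first: leave the static /archive/ section alone
RewriteCond %{REQUEST_URI} !^/archive/
# Expensive filesystem lookup last: it only runs if the test above passed
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule \.html$ /make_dynamicpage.php [L]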

Another resource hog is the use of hostnames rather than IP addresses in access-control code. If you require your server to look up a hostname, it has to send a request to the DNS system. This stalls the user's request until your server gets an answer from DNS, and can lead to big problems if the DNS requests fail.
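For instance, using the standard Allow/Deny syntax (the addresses below are documentation examples only), the IP-based rules need no DNS traffic at all, while the commented-out hostname form would force a double-reverse DNS lookup on each request it applies to:


# IP-based: no DNS lookup required
Order Allow,Deny
Allow from all
Deny from 192.0.2.15
Deny from 203.0.113.0/24
# Hostname-based: would force a double-reverse DNS lookup per request
# Deny from some-crawler.example.com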

If you have a long list of 'bad bots', then it's good to check which ones are actually a problem on your site by reviewing your access logs or 'stats'. Most bad-bot lists contain obsolete entries -- many 'bots that haven't been around for a year or more.
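A trimmed-down block might look something like this (the User-Agent names are placeholders -- substitute whatever actually shows up in your own logs):


# Block only the 'bots that actually appear in the access logs
RewriteCond %{HTTP_USER_AGENT} ^BadBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^NastyScraper
RewriteRule .* - [F]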

Anyway, just some random thoughts on the subject.

Jim

jake66

3:42 am on Jan 9, 2006 (gmt 0)

10+ Year Member



thank you for the informative post =)
at present I do not have any URL entries (except for the banned referrer URLs, to avoid referrer spam)

I have 32 banned IPs
13 banned referrer words (sex-related, spammy sites, etc.) -- roughly like the sketch below
URL rewriting:
RewriteEngine on
RewriteBase / ... and 5 rules
hotlink protection
5 banned countries (which doesn't seem to work)
a 404 redirect from .shtml to .php
... total size: 2.53 KB
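(for reference, the referrer-word entries are along these lines -- with generic words swapped in:)


# Flag requests whose Referer contains a banned word, then deny them
SetEnvIfNoCase Referer casino spam_ref
SetEnvIfNoCase Referer pharmacy spam_ref
Order Allow,Deny
Allow from all
Deny from env=spam_ref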

It isn't a very active site compared to what most here probably have; I'd say an average of 300+ hits a day.

jdMorgan

5:45 pm on Jan 11, 2006 (gmt 0)

At 300 uniques per day, you could probably go to 250 kB without noticing anything -- not that I'd recommend it... ;)

Jim