Forum Library, Charter, Moderators: Ocean10000 & incrediBILL & phranque

Apache Web Server Forum

    
How to optimize / profile resources used by .htaccess
Need to check what is causing high httpd load
surfgatinho




msg:3999156
 10:07 am on Oct 1, 2009 (gmt 0)

One of my sites now has a fairly bloated htaccess file. The server is also reporting a constant high load.

Having gone through all the possibilities I thought it'd be good to check if the .htaccess file was causing any problems.

Any ideas how to go about doing this?

Thanks,
Chris

 

jd01




msg:3999630
 3:19 am on Oct 2, 2009 (gmt 0)

The biggest thing is probably going to be your regular expressions...

Go ahead and post a sanitized example snippet so we can get an idea of the types of expressions you are using... My guess is that if the * character shows up very often, there is a more efficient way of doing things.

You might want to have a look at the Regular Expression Tutorial in the Forum Charter (link @ the top left of the page) to get an idea of how regular expression processing works.

surfgatinho




msg:3999795
 11:12 am on Oct 2, 2009 (gmt 0)

Hi jd,

Below are a few choice snippets from what I'd guess are around 150 rules:


<FilesMatch "\.(css|js|php|htm|html?)$">
php_flag zlib.output_compression On
php_value zlib.output_compression_level 6
</FilesMatch>

RewriteEngine on
RewriteCond %{HTTP_HOST} ^example.com$
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
redirect 301 /add.php http://www.example.com/page/add

RewriteRule ^accommodation/West_xyzxyzx\.htm$ /accommodation/region/1 [L]

RewriteRule ^accommodation/special_offers\.htm$ /accommodation/last-minute-availability.htm [R=301]
RewriteRule ^accommodation/last-minute-availability\.htm$ /accommodation/special_offers/xyzxyzx [L]
RewriteRule ^accommodation/([A-Z][A-Za-z-]+)/special_offers\.htm$ /accommodation/$1/last-minute-availability.htm [R=301]
RewriteRule ^accommodation/([A-Z][A-Za-z-]+)/last-minute-availability\.htm$ /accommodation/special_offers/$1 [L]
RewriteRule ^accommodation/([A-Z][A-Za-z-]+)/last-minute-availability-([0-9]+)\.htm$ /accommodation/special_offers/$1/$2 [L]

RewriteRule ^(transport|entertainment|food|sports_and_activities|regional|arts_and_crafts|excursions|shopping|webcams|wildlife|culture|weather)/$ /listing/get_list/$1 [L]
RewriteRule ^(transport|entertainment|food|sports_and_activities|regional|arts_and_crafts|excursions|shopping|webcams|culture|weather)/([a-z_]*)/$ /listing/get_list/$1/$2 [L]

RewriteCond $1 !^(xyzxyz\.php|favicon.ico|js|images|public.*add.*php.*\.htm.*\.css|robots\.txt)
RewriteRule ^(.*)$ /xyzxyz.php/$1 [L]

note: xyzxyzx aren't mine - they are replacements for lots of X's

That gives pretty much the range of expressions used.

The other thing I was going to ask: isn't there a place to put these rules so that they are read only once, rather than every time a request is made? In a conf file somewhere?

[edited by: jdMorgan at 1:27 pm (utc) on Oct. 2, 2009]
[edit reason] Cleaned up cussword filtering by changing "xxx" to "xyz" [/edit]
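
On that last question: if you have access to the main server configuration, the same rules can usually live in a <Directory> section there, which Apache parses once at startup instead of re-reading .htaccess on every request. A minimal sketch, assuming a document root of /var/www/example (a placeholder path, not taken from this thread) and that nothing else on the site depends on .htaccess:

```apache
# In httpd.conf or the site's <VirtualHost> block (not in .htaccess)
<Directory "/var/www/example">
    # Stop Apache from looking for and parsing .htaccess files under this path
    AllowOverride None

    RewriteEngine On
    # ...rules moved over from .htaccess go here...
</Directory>
```

Rules inside <Directory> are matched the same way as in .htaccess (no leading slash in the pattern), so they should carry over as they are. The regular expressions are still evaluated on every request; what you save is the per-request file lookup and parsing. Changes only take effect after Apache is restarted or reloaded.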

g1smd




msg:3999800
 11:28 am on Oct 2, 2009 (gmt 0)

Change [R=301] to [R=301,L] unless you have a very good reason to omit the [L].

Edit your post to fix the #*$!#*$! characters. It is impossible to see where YOU placed $ and ! symbols in your code. Use a letter other than 'x'. The forum replaces a string of 'x' in text.

Put [ code ] [ /code ] tags round your example code. Remember that a BLANK line clears the code formatting in the forum.

jd01




msg:3999885
 1:04 pm on Oct 2, 2009 (gmt 0)

# Let's Start Simple...
# Here's what you have. I'll make some adjustments below.

RewriteEngine on
RewriteCond %{HTTP_HOST} ^example.com$
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
redirect 301 /add.php http://www.example.com/page/add

RewriteRule ^accommodation/West_xyzxyzx\.htm$ /accommodation/region/1 [L]

RewriteRule ^accommodation/special_offers\.htm$ /accommodation/last-minute-availability.htm [R=301]
RewriteRule ^accommodation/last-minute-availability\.htm$ /accommodation/special_offers/xyzxyzx [L]
RewriteRule ^accommodation/([A-Z][A-Za-z-]+)/special_offers\.htm$ /accommodation/$1/last-minute-availability.htm [R=301]
RewriteRule ^accommodation/([A-Z][A-Za-z-]+)/last-minute-availability\.htm$ /accommodation/special_offers/$1 [L]
RewriteRule ^accommodation/([A-Z][A-Za-z-]+)/last-minute-availability-([0-9]+)\.htm$ /accommodation/special_offers/$1/$2 [L]

RewriteRule ^(transport|entertainment|food|sports_and_activities|regional|arts_and_crafts|excursions|shopping|webcams|wildlife|culture|weather)/$ /listing/get_list/$1 [L]
RewriteRule ^(transport|entertainment|food|sports_and_activities|regional|arts_and_crafts|excursions|shopping|webcams|culture|weather)/([a-z_]*)/$ /listing/get_list/$1/$2 [L]

RewriteCond $1 !^(xyzxyz\.php|favicon.ico|js|images|public.*add.*php.*\.htm.*\.css|robots\.txt)
RewriteRule ^(.*)$ /xyzxyz.php/$1 [L]

##### ### #####

# First, Let's get the Mod_Alias out of the Mod_Rewrite

redirect 301 /add.php http://www.example.com/page/add

RewriteEngine on

# I usually use a negative match for canonicalization so that every hostname variation I don't want gets caught. I usually run with 'wildcard subdomains' on, so if someone types in wwww. or ww. I can redirect them. (It's just a courtesy... If that will not work for you for some reason, you can keep the Condition you have and still use all the other adjustments.)

# Efficiency Adjustments:
# Don't Store What's Already Stored...
# Change ^(.*)$ to .? and $1 to %{REQUEST_URI}

# We know we are going to check every request, and Apache already has the requested path in the REQUEST_URI variable, so we don't need to capture it and back-reference it. .? says 'any single character, or 0 characters'. Because the pattern is not anchored, it matches every URL: ^.?$ would only match a URL that is 1 or 0 characters long in total, but without the anchors the pattern succeeds at the very first position of any URL, empty or not, which moves us on to the condition much faster...
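
# Side by side, that adjustment looks roughly like this. It's only a sketch, using the Condition from your original file so that the pattern and substitution are the only things that change:

```apache
# Before: capture the whole path, then replay it as $1
# RewriteCond %{HTTP_HOST} ^example\.com$
# RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

# After: match as cheaply as possible and reuse the variable Apache already has
RewriteCond %{HTTP_HOST} ^example\.com$
RewriteRule .? http://www.example.com%{REQUEST_URI} [R=301,L]
```

# In an .htaccess file in the site root the two produce the same target. In an .htaccess file further down the tree they differ: $1 is relative to that directory, while %{REQUEST_URI} is always the full path.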

# Since we are going to be Redirecting some URLs anyway, those Redirects can point straight at www.example.com. Then we don't need to redirect to www.example.com first and to the right page second, because we can do it all at once. So let's move the canonicalization to the end of the Redirects. (This should also help with SEO, because right now you have 'stacked' redirects, meaning in some cases a visitor is first Redirected to www.example.com, then Redirected again to the page they were hoping to find. Search Engines will generally pass link weight through a single Redirect, but not necessarily where there are two in a row.)

# We know there are some things we just plain do not want to redirect, and it really doesn't matter too much where they come from.* IOW: Unless you are trying to rank your .js or .css files, who cares if a SE algo determines there is a duplicate of them on example.com? No one. So let's stop processing them ASAP: take that check out of the condition above the PHP rule at the end of the file, make it a rule of its own at the beginning of the file, and leave the PHP rule where it is.

# * Image files may matter, so I'll pretend they do and need to come from www.example.com for some reason, just so you can see what to do. It'll be obvious how to edit what I am going to do once you see it.

RewriteRule (ico|js|css|txt)$ - [L]

# The above rule is not start anchored with '^', but it is end anchored, so it checks whether the requested URL ends with ico, js, css or txt and, if it does, stops processing it.

RewriteRule ^(images|public) - [S=7]

# I'm not sure what public is above, so maybe you can stop processing on it right away (prior to canonicalization), and if you're not worried about your images ranking you can stop the images right away too... All you have to do to keep both from being canonicalized is change the S=7 to L. The match is start anchored but not end anchored, because we only need to know whether the requested URL begins with one of them; if it does, we skip ahead to the canonicalization rule. To stop either one independently, remove it from the current rule, make a new rule *above* the current rule, and use a last flag [L] rather than a skip [S=N] on the new rule. (If for some reason you decide to place the new rule below the current rule, MAKE SURE you change S=7 to S=8. Likewise, if you add any other rules between the S=7 rule and the canonicalization Rewrite, add the number of rules you added to the 7.)
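
# A sketch of that split, assuming you want requests under public to stop immediately while images still get canonicalized. The new [L] rule sits above the skip rule, so the S=7 count does not change:

```apache
# New rule, placed above the skip rule: stop processing public... requests right away
RewriteRule ^public - [L]

# images... requests still skip the next 7 rules and land on the canonicalization rule
RewriteRule ^images - [S=7]
```

# Only RewriteRule lines count toward the N in [S=N]; any RewriteCond lines travel with the rule they belong to.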

# Redirects should always use the full URL, including the http://, and like g1smd said, L (last) should always be included unless you know you don't need it, so let's change those.

# Since we know the next six rules all start with accommodation, let's not test for it in every rule all the way down, and let's not make every other request that has made it this far run through all six just to confirm it still doesn't start with accommodation... Let's just check for accommodation once, then move on if it's not there.

RewriteRule !^accommodation - [S=6]

# The above says if it's not accommodation at the start, skip the next 6 rules.

# Now we know that anything still here starts with accommodation, because everything else skipped past this set of rules. So we don't need to spell out accommodation in each pattern; let's just get to the next part of the pattern with [^/]{13} (not a / 13 times = accommodation in this situation).

RewriteRule ^[^/]{13}/West_xyzxyzx\.htm$ /accommodation/region/1 [L]

RewriteRule ^[^/]{13}/special_offers\.htm$ http://www.example.com/accommodation/last-minute-availability.htm [R=301,L]

RewriteRule ^[^/]{13}/last-minute-availability\.htm$ /accommodation/special_offers/xyzxyzx [L]

RewriteRule ^[^/]{13}/([A-Z][A-Za-z-]+)/special_offers\.htm$ http://www.example.com/accommodation/$1/last-minute-availability.htm [R=301,L]

RewriteRule ^[^/]{13}/([A-Z][A-Za-z-]+)/last-minute-availability\.htm$ /accommodation/special_offers/$1 [L]

RewriteRule ^[^/]{13}/([A-Z][A-Za-z-]+)/last-minute-availability-([0-9]+)\.htm$ /accommodation/special_offers/$1/$2 [L]

RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule .? http://www.example.com%{REQUEST_URI} [R=301,L]

RewriteRule ^(transport|entertainment|food|sports_and_activities|regional|arts_and_crafts|excursions|shopping|webcams|wildlife|culture|weather)/$ /listing/get_list/$1 [L]

RewriteRule ^(transport|entertainment|food|sports_and_activities|regional|arts_and_crafts|excursions|shopping|webcams|culture|weather)/([a-z_]*)/$ /listing/get_list/$1/$2 [L]

# We know we've already run all of our checks on .htm files, and we don't really need the full URL to keep them from hitting the next rewrite, so let's just check to see if they end in htm and if they do, stop processing, rather than eating up memory with .*

RewriteRule htm$ - [L]
RewriteRule add[^.]*\.php$ - [L]

# I wasn't sure what the above was, but it looked like *anything*, then the word add, then *anything* and php, which I assumed would have a preceding . to define the file type. Since there was no back-reference, rather than all the backtracking you get with .* .* in a pattern, I switched to no start anchor, then add, then anything except a . (dot) 0 or more times, then the literal dot \. followed by php at the end of the URL.

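# No end anchor here: once the rule below has run, the path is xyzxyz.php/..., and that has to stop here too or it will be rewritten again.
RewriteRule ^xyzxyz\.php - [L]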

RewriteRule .? /xyzxyz.php%{REQUEST_URI} [L]

# Basically the same as the canonicalization portion... We know we need to match anything we're still checking and we already have the REQUEST_URI stored, so let's get done with the 'does it match' & 'store it for back-reference' section as quick as we can and just reference the variable we already have stored instead.

##### ### #####

# This may not be copy and paste ready but it should get you close...
# I HIGHLY RECOMMEND TESTING on a TESTING SERVER before attempting to use it live.

# You WILL need to double check on exactly where the canonicalization rule needs to go and adjust accordingly. I don't know the site and can't test to make sure everything is getting sent through it, so make sure you test each section of the site to make sure it's in the correct location. I just went with a 'close guess' on where I thought it should go.
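
# One way to do that testing, sketched for an Apache 2.2 test server: turn on mod_rewrite's own log and watch which rules each request actually runs through. These directives go in the server or virtual-host config, because RewriteLog is not allowed in .htaccess. The log path is a placeholder; use whatever suits your test box.

```apache
# Test server only: rewrite logging slows down every request.
# Place in httpd.conf or a <VirtualHost> block, not in .htaccess.
# The log path below is a hypothetical example.
RewriteLog "/var/log/apache2/rewrite.log"
RewriteLogLevel 3

# On Apache 2.4 the two directives above no longer exist; the rough
# equivalent is a per-module log level that writes to the error log:
# LogLevel warn rewrite:trace3
```

# Request one URL from each section of the site and read the log entries for it: they show each pattern being applied and whether it matched, which makes it much easier to confirm the skip counts and the position of the canonicalization rule.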

[edited by: jdMorgan at 1:24 pm (utc) on Oct. 2, 2009]
[edit reason] Cleaned up cussword filtering by changing "xxx" to "xyz" [/edit]

jd01




msg:3999890
 1:18 pm on Oct 2, 2009 (gmt 0)

# This is what it looks like without all the comments.
# DO NOT COPY & PASTE! (SEE ABOVE)
# Made a couple of changes in the order
# from above after looking at it for a min.

RewriteEngine on
RewriteRule (ico|js|css|txt)$ - [L]
RewriteRule ^(images|public) - [S=7]
RewriteRule !^accommodation - [S=6]

RewriteRule ^[^/]{13}/West_xyzxyzx\.htm$ /accommodation/region/1 [L]
RewriteRule ^[^/]{13}/special_offers\.htm$ http://www.example.com/accommodation/last-minute-availability.htm [R=301,L]
RewriteRule ^[^/]{13}/last-minute-availability\.htm$ /accommodation/special_offers/xyzxyzx [L]
RewriteRule ^[^/]{13}/([A-Z][A-Za-z-]+)/special_offers\.htm$ http://www.example.com/accommodation/$1/last-minute-availability.htm [R=301,L]
RewriteRule ^[^/]{13}/([A-Z][A-Za-z-]+)/last-minute-availability\.htm$ /accommodation/special_offers/$1 [L]
RewriteRule ^[^/]{13}/([A-Z][A-Za-z-]+)/last-minute-availability-([0-9]+)\.htm$ /accommodation/special_offers/$1/$2 [L]

RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule .? http://www.example.com%{REQUEST_URI} [R=301,L]

RewriteRule htm$ - [L]
RewriteRule add[^.]*\.php$ - [L]
RewriteRule ^xyzxyz\.php - [L]

RewriteRule ^(transport|entertainment|food|sports_and_activities|regional|arts_and_crafts|excursions|shopping|webcams|wildlife|culture|weather)/$ /listing/get_list/$1 [L]

RewriteRule ^(transport|entertainment|food|sports_and_activities|regional|arts_and_crafts|excursions|shopping|webcams|culture|weather)/([a-z_]*)/$ /listing/get_list/$1/$2 [L]

RewriteRule .? /xyzxyz.php%{REQUEST_URI} [L]

[edited by: jdMorgan at 1:26 pm (utc) on Oct. 2, 2009]
[edit reason] Cleaned up cussword filtering by changing "xxx" to "xyz" [/edit]

surfgatinho




msg:4001477
 3:29 pm on Oct 5, 2009 (gmt 0)

jd,

Wow! Thank you so much for that. A tutorial and solution all in one.

Much appreciated,
Chris

WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved