Forum Moderators: Ocean10000 & phranque

mod rewrite and server load

     
7:16 pm on July 10, 2007 (gmt 0)

New User

10+ Year Member

joined:Mar 12, 2007
posts:26
votes: 0


Is there a rule of thumb for how badly rewrite rules impact load times? My website has been loading pretty slowly the last few weeks, and I'm wondering if the ~20 rules in my .htaccess might be the reason.
8:11 pm on July 10, 2007 (gmt 0)

Senior Member

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Mar 31, 2002
posts:25430
votes: 0


That would depend on how well-coded those rules are, and what your traffic levels are. 20 well-coded rules on a site with 1,000,000 hits per day and a fast server have a negligible impact, while 20 very-badly-written rules with ambiguous patterns and unmitigated recursion problems might bring a 10,000-hit-per-day site to its knees on a slow server.

mod_rewrite is much more efficient than PHP or any other non-native 'language' so let that be your guide.

My "average site" has anywhere between two dozen and three hundred rules on it, with no visible impact on performance at 10,000 to 100,000 hits per day, mostly on fairly-decent-but-shared virtual servers. But I try to write very tight code using very efficient patterns.

Jim

5:39 pm on July 11, 2007 (gmt 0)

New User

10+ Year Member

joined:Mar 12, 2007
posts:26
votes: 0


I read somewhere that you should stay away from character classes ([a-z0-9]) and use wildcards (.*) or negated ones ([^/]) instead, so I tried to do that wherever possible. Are there any other optimization hints I could take a look at?

I'm getting around 400,000 hits a day.

7:41 pm on July 11, 2007 (gmt 0)

Senior Member

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Mar 31, 2002
posts:25430
votes: 0


If a character-class is very long, then it may indeed be possible to express it more efficiently as a negative class.

Using the wild-card multiple times in one pattern is the biggest problem -- character classes not so much.

A really bad way to do a pattern is something like:


^(.*)/(.*)/(.*)$

because it will potentially require hundreds or even thousands of back-off-and-retry attempts to match on a long URL-path, and because it is ambiguous -- that is, it will match any URL-path with two *or more* slashes in it.
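To see the ambiguity concretely, here is a quick check in Python. Its `re` module is not the regex engine Apache uses, but greedy `.*` behaves the same way for this pattern. The sample paths and the `loose_groups` helper are made up for illustration.

```python
import re

# The loose pattern from the post, tried against hypothetical sample paths.
LOOSE = re.compile(r'^(.*)/(.*)/(.*)$')

def loose_groups(path):
    """Return the three captured groups, or None if the pattern does not match."""
    m = LOOSE.match(path)
    return m.groups() if m else None

for path in ('a/b', 'a/b/c', 'a/b/c/d', 'a/b/c/d/e'):
    print(path, '->', loose_groups(path))
```

Paths with extra slashes still match, and the extra segments end up in the first group because the first `.*` is greedy.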

Remember, ".*" is both ambiguous and 'greedy' -- On the first matching attempt, the pattern-matcher will try to put the entire input string into the first ".*" and fail because the rest of the pattern is left 'starved'. So it will move one character from the end of the string and try to match the rest into the first ".*" again. This will again fail, so it will move one more character from the end of the string and try to match the rest into the first ".*" yet again. This will repeat until the input string is apportioned sufficiently for all of the subpatterns to match.
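The back-off-and-retry process can be sketched with a toy matcher that counts how many splits each greedy `.*` tries. This is a simplified model written for illustration only (the `count_attempts` function and its token list are hypothetical). Real engines, including the one Apache uses, apply optimisations, so the absolute numbers will differ; the point is how the count grows with the length of the URL-path.

```python
def count_attempts(tokens, text):
    """Match text against tokens ('.*' or a single literal character).

    Returns (matched, attempts), where attempts counts every split
    that a greedy '.*' tries before the overall match succeeds or fails.
    """
    attempts = 0

    def match(ti, si):
        nonlocal attempts
        if ti == len(tokens):
            return si == len(text)  # implicit '$' anchor
        tok = tokens[ti]
        if tok == '.*':
            # Greedy: try the longest candidate first, then back off
            # one character at a time.
            for end in range(len(text), si - 1, -1):
                attempts += 1
                if match(ti + 1, end):
                    return True
            return False
        # Literal character.
        if si < len(text) and text[si] == tok:
            return match(ti + 1, si + 1)
        return False

    return match(0, 0), attempts

LOOSE_TOKENS = ['.*', '/', '.*', '/', '.*']  # models ^(.*)/(.*)/(.*)$

short_path = 'a/b/c'
long_path = 'products/widgets/' + 'x' * 60  # made-up long URL-path

print(count_attempts(LOOSE_TOKENS, short_path))
print(count_attempts(LOOSE_TOKENS, long_path))
```

Multiply the per-request count by the number of rules and the number of requests per day and the cost of a sloppy pattern becomes easier to picture.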

A better way to code the above pattern is:


^([^/]+)/([^/]+)/([^/]*)$

if you don't want to match local URL-paths with more than the two slashes, or

^([^/]+)/([^/]+)/(.*)$

if you want the 'extra' path info matched into the last back-reference, or

^(([^/]+/)+)([^/]+)/([^/]*)$

if you want the extra path info matched into the first back-reference.

All of these are negative-match patterns, and allow what is essentially a single left-to-right matching attempt to determine a match or no match (the last one needs only a little back-off at the end to give up its final path segments). Note that you might have to do a bit of clean-up on the $1 back-reference in the last rule; it would contain a possibly-unwanted trailing slash.
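The three tighter patterns can be compared side by side with a short Python check. Again, Python's `re` is only a stand-in for the engine Apache uses, and the names `EXACT`, `TAIL`, `HEAD` and the sample path are assumptions made for this sketch. Note that the nested parentheses in the last pattern mean it has four groups: group 1 is the whole leading part, group 2 is the inner repeated segment, and groups 3 and 4 are the last two segments.

```python
import re

# The three tighter patterns from the post, under hypothetical names.
EXACT = re.compile(r'^([^/]+)/([^/]+)/([^/]*)$')      # exactly two slashes
TAIL = re.compile(r'^([^/]+)/([^/]+)/(.*)$')          # extra path info in the last group
HEAD = re.compile(r'^(([^/]+/)+)([^/]+)/([^/]*)$')    # extra path info in the first group

def groups(pattern, path):
    """Return the captured groups, or None if the pattern does not match."""
    m = pattern.match(path)
    return m.groups() if m else None

path = 'a/b/c/d'  # made-up URL-path with one 'extra' segment

print(groups(EXACT, path))   # no match expected: too many slashes
print(groups(TAIL, path))    # extra segment goes to the last group

head = groups(HEAD, path)
print(head)
# Clean-up of the first back-reference: drop the trailing slash.
print(head[0].rstrip('/'))
```

The final line shows one way to do the clean-up mentioned above; inside mod_rewrite itself you would handle that in the rule or in the target script instead.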

You can avoid the redundant character-class [A-Za-z] by using the [NC] flag on RewriteRules and RewriteConds to make the compare case-insensitive, and then specifying only [a-z] or [A-Z].
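As a rough analogy only, Python's `re.IGNORECASE` flag plays the role that [NC] plays in mod_rewrite, which lets the idea be checked quickly. The sample strings and the `same_verdict` helper are invented for this sketch, and `re.ASCII` is added so that case folding stays within plain ASCII letters.

```python
import re

# Explicit two-range class versus a single range plus a case-insensitive flag.
# re.IGNORECASE stands in for mod_rewrite's [NC]; it is an analogy, not the same engine.
explicit = re.compile(r'^[A-Za-z]+$')
folded = re.compile(r'^[a-z]+$', re.IGNORECASE | re.ASCII)

def same_verdict(text):
    """True if both patterns agree on whether text matches."""
    return bool(explicit.match(text)) == bool(folded.match(text))

for text in ('index', 'INDEX', 'InDeX', 'page2'):
    print(text, bool(folded.match(text)), same_verdict(text))
```

Both forms accept and reject the same inputs here; the flagged version is simply shorter to write and read.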

Jim

 
