htaccess more efficient

FiRe

7:34 pm on Dec 2, 2007 (gmt 0)

I have this in my .htaccess file:

Options +FollowSymlinks
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f [NC,OR]
RewriteCond %{REQUEST_FILENAME} !-d [NC]
RewriteRule ^$ index.php
RewriteRule ^([a-zA-Z0-9-_]+)$ portfolio.php?short=$1 [L]

Basically, if you go to mysite.com/username, it uses portfolio.php?short=username, assuming it's not a file or directory (this works fine). I want to restrict the rewrite rule to a-z, 0-9, _ and - characters, so my question is: have I written the regex correctly and the file efficiently?

Also, a minor point: if I go to mysite.com/images/ (which is a directory), it shows the image list. If I go to mysite.com/images, it also shows the image list, but the URL in the browser changes to mysite.com/images/?short=images

Any ideas? It's not a huge concern, as nobody should be looking at my image directory anyway, but hey :p Thanks in advance!

jdMorgan

2:25 pm on Dec 3, 2007 (gmt 0)

It sounds to me as though both of your problems are caused by rule ordering. Put your most-specific external redirect rules first, followed by the less-specific redirects. Then put your most-specific internal rewrites, ending with your least-specific.
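
As a rough sketch of that ordering: the two redirect rules and their URL-paths below are hypothetical, since no redirects were posted, while the two rewrites are the ones discussed in this thread.

```apache
RewriteEngine on
#
# 1. Most-specific external redirect (hypothetical paths)
RewriteRule ^old-page\.html$ /new-page.html [R=301,L]
#
# 2. Less-specific external redirect (hypothetical paths)
RewriteRule ^docs/(.*)$ /help/$1 [R=301,L]
#
# 3. Most-specific internal rewrite
RewriteRule ^([a-z0-9-_]+)$ portfolio.php?short=$1 [NC,L]
#
# 4. Least-specific internal rewrite
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule .* index.php [L]
```

The point of putting redirects first is that they get to test the requested URL before any internal rewrite has changed it.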

Use the [NC] flag so that comparisons are case-insensitive. This allows you to reduce the [a-zA-Z] pattern to just [a-z] or [A-Z], which simplifies the pattern.

Normally, you don't use [OR] on negative-match patterns. In this case, using [OR] guarantees that your conditions will always match because, for example, a URL pointing to a file will always be "not a directory", and a URL pointing to a directory will always be "not a file". So your combined RewriteConds will match any REQUEST_FILENAME.
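
To make the [OR] problem concrete, here is how the two tests evaluate for each kind of request. This is only an illustration; the rule at the end is the catch-all used in this thread.

```apache
RewriteEngine on
#
# Request resolves to ...   !-f     !-d     ORed    ANDed
# an existing file          false   true    true    false
# an existing directory     true    false   true    false
# nothing on disk           true    true    true    true
#
# ORed form from the question (true for every request, so it filters nothing):
#   RewriteCond %{REQUEST_FILENAME} !-f [OR]
#   RewriteCond %{REQUEST_FILENAME} !-d
#
# ANDed form (the default when [OR] is omitted): true only when the
# request is neither an existing file nor an existing directory.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule .* index.php [L]
```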

Your RewriteRule pattern of ^$ means that only requests for "/" will be rewritten to index.php, and only if that URL does not resolve to an existing file or directory. It's doubtful that this is what you intended. I have changed the pattern to ".*" (meaning "all URL-paths"), but see the discussion below.


Options +FollowSymlinks
RewriteEngine on
#
RewriteRule ^([a-z0-9-_]+)$ portfolio.php?short=$1 [NC,L]
#
RewriteCond %{REQUEST_FILENAME} !-f [NC]
RewriteCond %{REQUEST_FILENAME} !-d [NC]
RewriteRule .* index.php

You did not show any code related to doing any external redirect, so the following is just a guess: The most likely reason that the path to your script is being 'exposed' by image requests is that you've got an external redirect following the internal rewrite to the script. See the first paragraph above.
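
Another possibility, and this is an assumption since no redirect code was posted, is that the redirect is the trailing-slash redirect Apache's mod_dir issues when a directory is requested without the final slash. If so, the query string added by the portfolio rewrite could be carried along into that redirect. One way to test the idea is to keep existing directories out of the portfolio rule:

```apache
RewriteEngine on
#
# Skip the portfolio rewrite when the request maps to a real directory
# (for example "images"), so no query string is attached to it.
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([a-z0-9-_]+)$ portfolio.php?short=$1 [NC,L]
```

This does add a directory check, but only for URL-paths that already match the narrow pattern, because the RewriteRule pattern is tested before its RewriteConds.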

Now a more general note: Using the -f -d check is in itself very inefficient. You are making two calls in a row to the filesystem for every request to the server -- the first to check for 'exists as a file' and the second to check for 'exists as a directory'. The filesystem handler undoubtedly contains hundreds of lines of code, and may even have to go read the disk, so this is a slow process.

Therefore it is a good idea to try to limit the scope of these 'exists' checks by making the RewriteRule pattern as specific as possible or by adding additional RewriteConds before the 'exists' checks. For example, you might check only those URL-paths that do not contain a period in the final URL-path-part. This would exclude image files and files like robots.txt from triggering the 'exists' checks. To do that, just add a RewriteCond:


Options +FollowSymlinks
RewriteEngine on
#
RewriteRule ^([a-z0-9-_]+)$ portfolio.php?short=$1 [NC,L]
#
RewriteCond %{REQUEST_URI} !^/([^.]+\.)*[^./]+$
RewriteCond %{REQUEST_FILENAME} !-f [NC]
RewriteCond %{REQUEST_FILENAME} !-d [NC]
RewriteRule .* index.php

I don't know what your actual page URLs and server filepaths look like, so this is just a general suggestion and might not work for you. But in general, it is a good idea to prevent unnecessary 'exists' checks from being invoked.
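
The other option mentioned above, making the RewriteRule pattern itself more specific, could be sketched as follows. The pattern is my own illustration and assumes that "real" files always have a period in their final URL-path-part (robots.txt, images/logo.png) while page URLs never do; adjust it if your URLs differ.

```apache
RewriteEngine on
#
# mod_rewrite tests the RewriteRule pattern first and evaluates the
# RewriteConds only if the pattern matches. This pattern matches only
# URL-paths whose final part contains no period, so requests such as
# robots.txt or images/logo.png never reach the -f/-d checks.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*/)?[^./]*$ index.php [L]
```

Because index.php itself contains a period, the rewritten request does not match the pattern again when the rules are re-run.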

For more information, see the documents cited in our forum charter [webmasterworld.com] and the tutorials in the Apache forum section of the WebmasterWorld library [webmasterworld.com].

Jim