review my htaccess file?

         

hans121

11:29 am on May 2, 2012 (gmt 0)

10+ Year Member



Hi,

I'm not very good with htaccess, and I was wondering if anybody would be so kind as to look through my htaccess file and suggest how to improve or tweak it a bit.
I've got some fairly simple rules which work correctly for me so far. But maybe there are better ways?

I've got some redirects for images (especially requests for images that I have recently moved to a content delivery network; the CDN pulls the images via origin.example.com). Anyway, here's the code:

Options +FollowSymLinks
RewriteEngine on

#cache headers
<FilesMatch "\.jpg$">
Header set Cache-Control "must-revalidate, max-age=2592000"
</FilesMatch>

<FilesMatch "\.(js|css|png|gif|html|ico)$">
Header set Cache-Control "must-revalidate, max-age=450000"
</FilesMatch>

#turn off default expires values
ExpiresActive Off

ServerSignature Off

#default page for directory
DirectoryIndex index.html index.php index.htm

#turn off directory listing
Options -Indexes

#def charset
AddDefaultCharset UTF-8

#redirect any index.html to the directory it is stored in
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /(.+/)?index\.html\ HTTP
RewriteRule ^(.+/)?index\.html$ http://www.example.com/$1 [R=301,L]

#ifnonwww->www
RewriteCond %{HTTP_HOST} !^www\.example\.com
RewriteCond %{HTTP_HOST} !^origin\.example\.com
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

#redirect most jpegs to new location (cdn), exclude referring addresses which include "com/cfg" and "preview"
#exclude some more files
RewriteCond %{REQUEST_URI} \.jpg$
RewriteCond %{HTTP_HOST} !^origin\.example\.com
RewriteCond %{HTTP_REFERER} !(com/cfg|preview)
RewriteCond %{REQUEST_URI} !(topline5_1|corner_right|corner_left|bottomline).jpg$
RewriteRule ^(.+)$ http://cdn.example.com/$1 [R=301,L]

#rewrite gallery requests with page offset, like "gallery/test-page_1.html" to php files
RewriteRule ^gallery/([^_]+)_([^_]+)\.html$ /gallery/index.php?cat=$1&offset=$2 [L]

#rewrite gallery requests without page-offset
RewriteRule ^gallery/([^/]+)\.html$ /gallery/index.php?cat=$1 [L]

#rewrite guestbook requests with page-offset (german and engl)
RewriteRule ^de/gaestebuch/page-([0-9]+)\.html$ /de/gaestebuch/index.php?offset=$1 [L]
RewriteRule ^guestbook/page-([0-9]+)\.html$ /guestbook/index.php?offset=$1 [L]

#only gif|jpg|png can be accessed via origin.example.com, otherwise 404
RewriteCond %{HTTP_HOST} ^origin\.example\.com
RewriteRule !\.(gif|jpg|png)$ /this_filepath_does_not_exist.html [L]

#some rewrites to php
RewriteRule ^restaurant.html$ /restaurant.php [L]
RewriteRule ^de/kueche.html$ /de/kueche.php [L]
RewriteRule ^information.html$ /information.php [L]
RewriteRule ^de/informationen.html$ /de/informationen.php [L]

#some more rewrites to dynamic pages
RewriteRule ^environmental-management\.html$ /info_subpage.php?page=environmental-management [L]
RewriteRule ^de/umweltmanagement\.html$ /de/info_subpage.php?page=umweltmanagement [L]
RewriteRule ^activities-workshops\.html$ /info_subpage.php?page=activities-workshops [L]
RewriteRule ^de/aktivitaeten-workshops\.html$ /de/info_subpage.php?page=aktivitaeten-workshops [L]
RewriteRule ^([^/]+)-info\.html$ /info_subpage.php?page=$1 [L]
RewriteRule ^de/([^/]+)-info\.html$ /de/info_subpage.php?page=$1 [L]


I've often read it's important to have the rules in the correct order. I'm not sure if I did that correctly.

Any suggestions for improvement?

thanks a lot :-)

g1smd

11:48 am on May 2, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Arrange the rules in this order: blocking access, redirect to different URL, rewrite to file.


The "only gif|jpg|png can be accessed via origin.example.com, otherwise 404" rule should be first.

The cdn redirect should be next.
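As a rough sketch of that ordering, here are the rules from the first post regrouped (only one of the file rewrites is shown). The rules are unchanged apart from escaped periods; treat it as an untested arrangement, not a drop-in file:

```apache
# 1. Blocking: on origin.example.com, anything that is not an image is
#    rewritten to a path that does not exist, so the server answers 404.
RewriteCond %{HTTP_HOST} ^origin\.example\.com
RewriteRule !\.(gif|jpg|png)$ /this_filepath_does_not_exist.html [L]

# 2. External redirects (R=301): the cdn redirect first, then the others.
RewriteCond %{REQUEST_URI} \.jpg$
RewriteCond %{HTTP_HOST} !^origin\.example\.com
RewriteCond %{HTTP_REFERER} !(com/cfg|preview)
RewriteCond %{REQUEST_URI} !(topline5_1|corner_right|corner_left|bottomline)\.jpg$
RewriteRule ^(.+)$ http://cdn.example.com/$1 [R=301,L]

RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /(.+/)?index\.html\ HTTP
RewriteRule ^(.+/)?index\.html$ http://www.example.com/$1 [R=301,L]

RewriteCond %{HTTP_HOST} !^www\.example\.com
RewriteCond %{HTTP_HOST} !^origin\.example\.com
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

# 3. Internal rewrites to files come last (remaining rules follow the same form).
RewriteRule ^restaurant\.html$ /restaurant.php [L]
```

With the redirects ahead of the rewrites, a request that needs redirecting is sent to its final URL before any internal rewrite can change the path.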


Escape all literal periods in patterns.
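For example, the four "rewrites to php" rules from the first post would look something like this. It is a sketch: only the periods in the patterns change, the targets stay as they were.

```apache
# An unescaped "." matches any single character, so ^restaurant.html$
# would also match a request such as "restaurantxhtml".
RewriteRule ^restaurant\.html$ /restaurant.php [L]
RewriteRule ^de/kueche\.html$ /de/kueche.php [L]
RewriteRule ^information\.html$ /information.php [L]
RewriteRule ^de/informationen\.html$ /de/informationen.php [L]
```

The same applies to the ".jpg$" in the last RewriteCond of the cdn block.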


([^_]+)\. means "keep reading while there is no underscore; the next character after 'not an underscore' needs to be a literal period". The group reads straight past the period, so the pattern can only match after backtracking.

You probably want ([^.]+)\. or (([^.]+\.)+) here.


([^/]+)- means "keep reading while there is no slash; the next character after 'not a slash' needs to be a hyphen". The group reads straight past the hyphen, so the pattern can only match after backtracking.

You probably want ([^-]+)- or (([^-]+-)+) or, if these are only ever in root, use (([^/-]+-)+)
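Dropped into the two -info rules from the first post, the last variant might look like the sketch below. It is untested; note that $1 now ends with a hyphen, because the hyphen sits inside the captured group.

```apache
# Each repetition of ([^/-]+-) consumes one "word-" chunk, so the group
# stops at every hyphen instead of running to the end of the URL.
# For "something-more-info.html", $1 is "something-more-".
RewriteRule ^(([^/-]+-)+)info\.html$ /info_subpage.php?page=$1 [L]
RewriteRule ^de/(([^/-]+-)+)info\.html$ /de/info_subpage.php?page=$1 [L]
```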

hans121

12:30 pm on May 2, 2012 (gmt 0)




Thanks g1smd!

My idea was to put the "only gif|jpg|png can be accessed via origin.example.com, otherwise 404" rule somewhere more towards the end of the file because it will hardly ever be invoked.

I can't really understand why you would put the rewrite first and then the redirect?


When you talk about ([^/]+)-, do you mean the following rules?

RewriteRule ^([^/]+)-info\.html$ /info_subpage.php?page=$1 [L]
RewriteRule ^de/([^/]+)-info\.html$ /de/info_subpage.php?page=$1 [L]


As I said, I basically understand rewrites and some regex, but I don't understand why this rule wouldn't work or would be flawed.

If there is, e.g., an incoming request for something-info.html, wouldn't this rule be invoked?

Why should I use (([^-]+-)+) or (([^/-]+-)+) ?

thanks :-)

g1smd

12:36 pm on May 2, 2012 (gmt 0)




"Keep reading while the next character is not a slash" will sail right past the hyphen, then have to backtrack multiple steps when it runs off the very end of the URL.

hans121

12:58 pm on May 2, 2012 (gmt 0)




hm... maybe my understanding of regex isn't as good as I thought ;-)

okay, so if I want the rule to be invoked by requests like:

something-info.html or something-more-info.html (no slashes allowed!)
what should my rewrite rule look like?

I mean I just want to match any request without slashes that ends with -info.html
and in a second rule requests that start with de/ followed by some characters ending with -info.html.

similar here:
RewriteRule ^gallery/([^_]+)_([^_]+)\.html$ /gallery/index.php?cat=$1&offset=$2 [L]
RewriteRule ^gallery/([^/]+)\.html$ /gallery/index.php?cat=$1 [L]


First rule: I want to match requests starting with gallery/ and a string without _, followed by an underscore and another string without an underscore, followed by .html at the end.

The second rule should match requests starting with gallery/ followed by any string, again followed by .html at the end.

I'm pretty confused now, because I thought my rules were okay. At least they work the way I want them to. Could you tell me the correct way, please?

thanks a lot.

g1smd

1:08 pm on May 2, 2012 (gmt 0)




([^_]+)_([^_.]+)\.


([^/.]+)\.


or as previous post.
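In the two gallery rules from the first post, those patterns would sit roughly like this (a sketch; the targets are unchanged):

```apache
# With page offset: "gallery/test-page_1.html" gives cat=test-page, offset=1.
# Excluding "." from the second group makes it stop at the period.
RewriteRule ^gallery/([^_]+)_([^_.]+)\.html$ /gallery/index.php?cat=$1&offset=$2 [L]

# Without page offset: excluding "/" and "." makes the group stop at the period.
RewriteRule ^gallery/([^/.]+)\.html$ /gallery/index.php?cat=$1 [L]
```

The rule with the offset has to stay first, because the second pattern also matches names that contain an underscore.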

hans121

2:02 pm on May 2, 2012 (gmt 0)




thanks :-) works like a charm.

just one more question:

In the (([^-]+-)+) or (([^/-]+-)+) case, I also get the last hyphen in $1, right?
I'd need requests like:

some-thing-info.html or
some-thing-more-than-this.html or even longer to invoke the rule.

and I'd need the $1 to be the part before -info.html (without the last hyphen).

What would be the best way to accomplish this?

thanks!

g1smd

2:08 pm on May 2, 2012 (gmt 0)




A request for /this-info.html should also invoke the rule. The pattern requires one or more hyphens.

The RegEx pattern captures the final hyphen. There's no easy way to avoid that.

There's no problem with the hyphen being in $1. Strip it off in the first line of your PHP script.

hans121

2:16 pm on May 2, 2012 (gmt 0)




Okay, if there's no easy way with regex, I'll do it in PHP.

OK, thanks g1smd for all your help. You saved me lots of time :-)

g1smd

2:24 pm on May 2, 2012 (gmt 0)




Do the matching for the right URL in the htaccess file. Constrain the pattern to reject most invalid requests.

Extract the exact bit of the URL that you need and sanity check that bit for validity somewhere in the PHP.
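One way to constrain the -info patterns, as a sketch only: it assumes page names consist of lowercase ASCII letters and digits joined by hyphens, which is a guess about this site rather than something stated in the thread.

```apache
# Assumption: page names use only a-z, 0-9 and hyphens. Anything else
# fails to match here and falls through to the normal 404 handling.
# $1 still carries the trailing hyphen, to be stripped and checked in PHP.
RewriteRule ^(([a-z0-9]+-)+)info\.html$ /info_subpage.php?page=$1 [L]
RewriteRule ^de/(([a-z0-9]+-)+)info\.html$ /de/info_subpage.php?page=$1 [L]
```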