Newbie here - please grade my first .htaccess file


lucy24 - 3:03 am on May 20, 2012 (gmt 0)


This code is courtesy of jdmorgan, who posted it here

The problem with using code from jdmorgan is that he really, truly, deeply understands mod_rewrite. So if you cut-and-paste and then change one thing to fit your own installation, the whole thing is likely to fall apart. It's like, uhm, trying to achieve a flawless chicken tikka masala when you can just about, on a good day, manage to boil an egg without setting off the smoke alarm. (Someone will come along presently with a better analogy.)

Disclaimer: I'm not touching the gzip, deflate, expiration etc etc parts, for the rock-solid reason that I don't understand this stuff.

:: looking vaguely around for the people who do understand it, because I know they exist ::

I speak pretty fluent RegEx, and that's where it ends. But lemme see what the rewrites turn into if I give them a closer look.

[S=1]

The [S=1] flag means "if this rule matches, skip the next one." Leave that kind of thing for jdmorgan. If the Rule is in the right location, a simple [L] will do. If it is in the wrong location, move it. Within each module, rules execute in order, straight down the page.

Now, line by line:

RewriteCond $1 ^(index\.php)?$

Wrong. If the request is for an explicitly named "index.php", redirect to www.example.com/ alone. Only if the "index.php" is the result of a Rewrite do you keep it.
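
One common way to tell the two cases apart is to test %{THE_REQUEST}, which holds the original request line from the browser and is not changed by internal rewrites. What follows is a minimal sketch, not a drop-in fix. It assumes the site answers at www.example.com and that the rule sits in the root htaccess, above any internal rewrites.

```apache
# Redirect only when the visitor literally asked for index.php.
# THE_REQUEST looks like "GET /index.php HTTP/1.1"; a request that
# merely got rewritten to index.php will not match this condition.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.php[?\ ]
RewriteRule ^(([^/]+/)*)index\.php$ http://www.example.com/$1 [R=301,L]
```

The captured $1 is the directory part, if any, so a request for /blog/index.php would be sent to /blog/ and a request for /index.php to the bare root.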

RewriteCond $1 \.(gif|jpg|css|js|ico)$

Wherever possible, rules that are constrained to a particular filetype should say so in the Rule itself. Otherwise, your server has to evaluate the Conditions for every single request it ever receives.
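
As a sketch of what that looks like, assuming the intent of the Condition was simply to let static files through untouched (the original file doesn't show enough to be sure):

```apache
# The pattern itself restricts this rule to static file types, so
# nothing further is evaluated for any other kind of request.
# The "-" substitution means "leave the URL exactly as it is".
RewriteRule \.(gif|jpg|css|js|ico)$ - [L]
```

Placed ahead of the catch-all rewrite, this stops processing for images, stylesheets and scripts before any RewriteCond has to be checked.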

RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d

Boilerplate htaccess files always, always include this pair of lines. They are rarely appropriate. If the request is getting rewritten to a php page, let the php figure out if the user ought to be somewhere else. That's assuming there's a query string involved; the Conditions don't say.

Now, what if they ask for a file or directory that doesn't really exist? The Rule makes no distinction between pseudo-files (ones that have no physical existence but are created by the program) and ones that are simply garbage. The net result is that the user can request

www.example.com/anyoldcraphere.html

and the Rewrite will dutifully send them to index.php. This is OK if index.php then evaluates the request, establishes that there ain't no such page, and returns a 404. But if it simply carries on regardless, then you have Infinite URL Space. This is something that search engines will occasionally check for by requesting a garbage filename.

RewriteRule . /index.php [L]

Wrong. It should say .* because if the user requests the sitename by itself, then the request will have no content. In logs and mod_alias, a request for www.example.com/ will come through as a single slash. But in htaccess, mod_rewrite strips the leading slash after the domain name, so that leaves ... nothing.
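
A rough side-by-side of the two patterns, assuming the rules live in the htaccess at the site root. The RewriteCond guard on index.php is an assumption added for this sketch, not something taken from the original file; it is there so the already-rewritten request is left alone on the next pass.

```apache
# "." needs at least one character, so the empty path left over from
# a request for the bare domain is NOT matched by this version:
# RewriteRule . /index.php [L]

# ".*" matches the empty path as well as everything else.
# Added guard (assumption): do not touch a request that is already
# pointing at /index.php.
RewriteCond %{REQUEST_URI} !^/index\.php$
RewriteRule .* /index.php [L]
```

With no other Conditions this catches every request, static files included, so any filetype exclusions would have to come before it.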

The basic pattern for the universal rewrite goes

RewriteRule ^(([^/]+/)*([^/]+\.html)?)$ http://www.example.com/index.php?$1 [L]

meaning: If the user requests a page or directory, secretly change the request into a query string and rewrite to the index.php page where it will all be dealt with.
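
To make the pattern easier to read, here it is again with the pieces annotated and a few sample requests. This variant points at the local path /index.php rather than the full URL; that is a change made for this sketch, on the assumption that a silent internal rewrite is always what's wanted. (A full URL is handled internally only when its hostname matches the current server. Otherwise mod_rewrite sends an external redirect.)

```apache
# ^(                   start of capture $1
#   ([^/]+/)*          zero or more directory levels: "dir/", "dir/sub/"
#   ([^/]+\.html)?     optionally, a filename ending in .html
# )$
#
# Sample paths as mod_rewrite sees them in a root htaccess:
#   ""                 matches, $1 is empty (request for the bare domain)
#   "about.html"       matches
#   "dir/sub/"         matches
#   "dir/page.html"    matches
#   "logo.gif"         no match
#   "index.php"        no match, so the rewritten request is left alone
RewriteRule ^(([^/]+/)*([^/]+\.html)?)$ /index.php?$1 [L]
```

Everything captured in $1 arrives at index.php as the query string, so a request for dir/page.html ends up as /index.php?dir/page.html.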

:: pause here to wait for other half of answer to come along later ::


Thread source: http://www.webmasterworld.com/apache/4454595.htm