Basically it's a site with static pages in the webroot, and a blog under a subdirectory. But I noticed that rewrite rules in the subdirectory were either conflicting or not being obeyed from the document root.
Let's start with this in /blog/.htaccess, which is necessary for a WordPress blog:
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /blog
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /blog/index.php [L]
</IfModule>
Now the problem is that the site has multiple old domains, which all need to be consolidated with a 301 to a single domain. I'm also taking advantage of that to trim off the www.
/.htaccess
RewriteCond %{HTTP_HOST} !^example\.com$
RewriteRule ^(.*)$ http://example.com/$1 [QSA,L,R=301]
Last but not least we would like to remove all instances of index.php or index.html from any URL request that doesn't need it.
/.htaccess
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.(htm(l)?|php)\ HTTP/
RewriteRule ^(([^/]+/)*)index\.(htm(l)?|php)$ [%{HTTP_HOST}...] [R=301,L]
The problem is that these last two rules in the document root .htaccess don't affect the blog. If I try to move the blog's rules into the document root, there are other conflicts, such as other .htaccess rules not being obeyed, e.g.:
/images/.htaccess
ErrorDocument 404 /images/404.jpg
To make it even more complicated, some old URLs on the blog have changed dates and need to be redirected to the new dates. This works only some of the time, depending on whether the rules above are present, removed, or moved to /blog or to /
RewriteCond %{REQUEST_URI} ^/blog/2005/03/02/old-entry-name(.*) [NC]
RewriteRule ^(.*)$ http://example.com/blog/2005/03/13/new-entry-name%1 [R=301,L,QSA]
As a huge bonus, I would like to make sure any URL under /blog/ ends with a trailing slash (/). The slash is optional on WordPress blogs, and that causes duplicate content in search engines. This feature is the least of my worries compared to the above requirements.
I definitely need some expert help trying to merge all these, figure out the proper order for the rules, and try to make them behave all together. I've done quite a bit of trial and error but the problem is it's a live, active site and I really don't want to mess with the visitors too much via my mistakes.
Thanks for any assistance!
[edited by: amznVibe at 9:19 am (utc) on May 16, 2007]
Several factors may be important here:
First, if subdirectories are to be subject to higher-level directories' mod_rewrite rules, then RewriteOptions inherit must be set -- See mod_rewrite documentation.
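As a minimal sketch of that first point, assuming the blog keeps its own /blog/.htaccess and should also pick up the rules from the document root, the option would be set in the subdirectory file. The WordPress rules are the ones quoted in the question; how the inherited rules interact with them is something to verify on a test server rather than take from this sketch.

```apache
# /blog/.htaccess (sketch, not a tested configuration)
RewriteEngine On

# Ask mod_rewrite to also apply the rules defined in the parent
# directory's .htaccess. Inherited rules are processed after the
# rules in this file, so test the combined result carefully.
RewriteOptions inherit

RewriteBase /blog
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /blog/index.php [L]
```

Note that inherited rules are evaluated in the subdirectory's context, so patterns written for the document root may not match the same URL-paths there.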
Rule order is important: Do blanket access restrictions (i.e. block IP and user-agents) first, then do per-page (or more properly, per-URL) access restrictions, per-URL external redirects, domain (canonicalization) redirects, per-URL internal rewrites, and finally, any default (or 'catch-all') internal rewrites.
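The ordering above can be sketched as a single document-root .htaccess, assuming the blog's rules are consolidated into the root file instead of living in /blog/.htaccess. The user-agent name, the /private/ path, and the /about rewrite are hypothetical placeholders; the redirects reuse the examples from the question.

```apache
# /.htaccess (illustrative ordering only)
RewriteEngine On

# 1. Blanket access restrictions (hypothetical user-agent)
RewriteCond %{HTTP_USER_AGENT} ^BadBot [NC]
RewriteRule ^ - [F]

# 2. Per-URL access restrictions (hypothetical path)
RewriteRule ^private/ - [F]

# 3. Per-URL external redirects
RewriteRule ^blog/2005/03/02/old-entry-name(.*)$ http://example.com/blog/2005/03/13/new-entry-name$1 [R=301,L]

# 4. Domain canonicalization redirect
RewriteCond %{HTTP_HOST} !^example\.com$
RewriteRule ^(.*)$ http://example.com/$1 [R=301,L]

# 5. Per-URL internal rewrites (hypothetical)
RewriteRule ^about$ /about.html [L]

# 6. Default (catch-all) internal rewrite for the blog
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^blog/ /blog/index.php [L]
```

Because the per-URL redirect in step 3 already points at the canonical hostname, a request for an old URL on an old domain is corrected in one redirect instead of two.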
Do not mix the use of mod_alias redirects and mod_rewrite rules unless you have tested the execution order of these modules and are sure that it won't break the 'order recommendations' stated above. To be clear, each Apache module parses ("scans") your .htaccess file looking for directives that it understands, executing those and ignoring the rest -- leaving them for subsequently-invoked modules to handle. So on some servers, mod_alias Redirect directives will be executed first, while on others, the mod_rewrite directives will be executed first. This is determined by the reverse LoadModule list order on Apache 1.x, and by an internal priority scheme on Apache 2.x.
This is of particular concern if you use multiple servers --for example: development, test, and production servers-- or if you contemplate changing hosts at any time in the future.
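One way to sidestep the module-order question is to express every redirect with mod_rewrite alone. A small sketch, with /old-page.html and /new-page.html as hypothetical paths:

```apache
# Instead of a mod_alias directive such as:
#   Redirect 301 /old-page.html http://example.com/new-page.html
# the same redirect can be written with mod_rewrite, so that its
# position relative to the other rewrite rules is explicit.
# (Document-root .htaccess; both paths are hypothetical.)
RewriteEngine On
RewriteRule ^old-page\.html$ http://example.com/new-page.html [R=301,L]
```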
A general comment: Don't add functions (code) to an existing problematic situation. Get what you already have working first, then add new code and re-test. "Divide and conquer" is a good approach, so don't exacerbate a problem by adding new, unknown factors.
Moving code from .htaccess to httpd.conf and/or conf.d requires changes to the code. URL-paths in .htaccess are relative to the directory in which the code resides -- In other words, the path to the current directory is stripped before RewriteRule directives in that current directory can examine the URL-path. In contrast, URL-paths in httpd.conf and conf.d must be fully specified, relative to the hostname.
So, for example, this rule in /dir/.htaccess:
RewriteRule ^images/logo\.gif$ http://www.example.com/dir2/images/logo.gif [R=301,L]
becomes this in httpd.conf or conf.d, where the pattern must include the leading /dir/:
RewriteRule ^/dir/images/logo\.gif$ http://www.example.com/dir2/images/logo.gif [R=301,L]
Additionally, in httpd.conf or conf.d, you can take advantage of <Directory> and <Location> containers (and others) to limit the scope of groups of rules to improve performance.
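As a hedged sketch of the container approach, assuming the site's files live under a hypothetical /var/www/example path: rules placed in a <Directory> container are treated as per-directory rules, so the pattern is written relative to that directory, as it would be in the directory's own .htaccess.

```apache
# httpd.conf (sketch; the filesystem path is an assumption)
<Directory "/var/www/example/dir">
    RewriteEngine On
    # Per-directory context: the /dir/ prefix has already been
    # stripped, so the pattern starts at "images/".
    RewriteRule ^images/logo\.gif$ http://www.example.com/dir2/images/logo.gif [R=301,L]
</Directory>
```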
The choice of whether to use <IfModule> should be an informed one. If you use "<IfModule mod_rewrite.c>" and the server does not have mod_rewrite enabled, then the rules will be skipped. No error messages will be generated or logged; the code will fail silently. Although IfModule appears in many mod_rewrite examples found on the Web, be sure that is what you want.
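To illustrate the trade-off, here is the unwrapped form, using a hypothetical redirect. Without the <IfModule> container, a server that lacks mod_rewrite rejects the unknown directives and reports the problem in its error log instead of quietly ignoring the rules.

```apache
# No <IfModule mod_rewrite.c> wrapper: if mod_rewrite is missing,
# the request fails with a server error and the cause is logged,
# which makes the misconfiguration visible immediately.
# (Paths are hypothetical.)
RewriteEngine On
RewriteRule ^old-page\.html$ http://example.com/new-page.html [R=301,L]
```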
Jim
[edited by: jdMorgan at 4:26 pm (utc) on May 16, 2007]
Trying to make sure any virtual url (not physical file, mapped to wordpress) ends in a trailing slash.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !..+$
RewriteCond %{REQUEST_URI} !/$
RewriteRule (.*) $1/ [R=301,L,QSA]
Why doesn't that work?
Tried it before and after the wordpress rewrite.
Any other suggested techniques?
[edited by: amznVibe at 9:20 am (utc) on May 17, 2007]
Not sure what the original author's intent was, but I think I can live without it since anything that fails the physical file test should end with a trailing slash. I already remove index.php.
If the code is intended to do what I believe it does, I'd write it like this:
# If requested URL does not contain a literal period in the final path-part or end with a slash
RewriteCond %{REQUEST_URI} !(\.[^/]+|/)$
# and if requested URL does not exist as an actual file
RewriteCond %{REQUEST_FILENAME} !-f
# externally redirect to add a slash
RewriteRule (.*) http://www.example.com/$1/ [R=301,L]
[QSA] is not needed, as the original query string will be retained by default.
To prevent problems with conflicts between your configured ServerName and your actual preferred "canonical" domain name, always specify a full URL when doing external redirects as shown.
Always put filesystem and reverse-DNS check RewriteConds last -- No use wasting the (considerable) time and energy to perform them if the other conditions are not true.
If a pattern in this thread shows a broken pipe "¦", replace it with a solid pipe "|" before use; posting on this forum modifies the pipe characters.
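If the rule lives in /blog/.htaccess and the preferred hostname is the non-www example.com from the question, a possible adaptation looks like the sketch below. It assumes the rule is placed before the WordPress catch-all rewrite. Because the /blog/ prefix is stripped in that per-directory context, it has to be added back in the redirect target.

```apache
# /blog/.htaccess (sketch; place above the WordPress catch-all rule)
# If the requested URL does not contain a period in the final
# path-part and does not already end with a slash
RewriteCond %{REQUEST_URI} !(\.[^/]+|/)$
# and does not exist as an actual file
RewriteCond %{REQUEST_FILENAME} !-f
# externally redirect to add the slash, restoring the /blog/ prefix
RewriteRule (.*) http://example.com/blog/$1/ [R=301,L]
```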
Jim