Usually this wouldn't be a problem, but since my directories are all fake mod_rewrite ones, it throws up a 404 if the trailing '/' isn't there.
I was able to fix this by adding duplicate rules that rewrite the unslashed versions to the slashed versions, but it seems ridiculous that I even had to do that. There are *no* links anywhere on my site to the unslashed versions, and the fact that Yahoo was able to find and index the pages makes it apparent that they're displaying incorrect URLs.
This only breaks pages that are virtual folders, while all other pages ending with .htm work fine.
Could the length of the URL be a factor? Some of mine are quite long, and maybe it wants to save space by leaving it off?
Is this a common occurrence? Am I just a special case?
When using mod_rewrite to create static-looking URLs, I much prefer creating URLs with a file extension (usually .htm or .html) to avoid any chance of confusion. You can also create URLs with no extension, but no trailing slash either.
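As a rough illustration of the extension-based approach, a rule along these lines could map a static-looking .html URL to a script. The names page.php and page are placeholders borrowed from the example rules in this thread, not a known setup, and the rule assumes it lives in the .htaccess file at the document root.

```apache
RewriteEngine On

# Hypothetical mapping: /widgets.html -> /page.php?page=widgets
# [^/.]+ matches one path segment that contains no slash and no dot,
# so there is no slashed/unslashed variant of the URL to worry about.
RewriteRule ^([^/.]+)\.html$ /page.php?page=$1 [L]
```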
You should not need duplicate rules. Just use the rule for the slashed version, and put a "?" after the slash in the RewriteRule pattern.
For example, the two rules
RewriteRule ^subdir/([^/]+)/$ /page.php?page=$1 [L]
RewriteRule ^subdir/(.+)$ /page.php?page=$1 [L]
can be replaced by the single rule
RewriteRule ^subdir/([^/]+)/?$ /page.php?page=$1 [L]
Jim
The "?" makes the trailing slash optional in the regular-expression pattern, so one rule matches the URL with or without the trailing slash and the rewrite occurs either way. This eliminates the need for duplicate rules.
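To make the effect of the optional slash concrete, here is the single rule again with the request paths it would match listed in comments. This is only a sketch that assumes the rule sits in the .htaccess file at the document root; subdir, page.php, and widgets are placeholder names.

```apache
RewriteEngine On

# "/?" means "zero or one slash", so both request paths below match:
#   subdir/widgets   -> /page.php?page=widgets
#   subdir/widgets/  -> /page.php?page=widgets
# ([^/]+) captures exactly one path segment, so the slash is never
# part of $1 in either case.
RewriteRule ^subdir/([^/]+)/?$ /page.php?page=$1 [L]
```

Both forms are served by an internal rewrite, so the address shown in the browser does not change.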
But wouldn't that mean that if one person linked to the unslashed version and another linked to the slashed version, the search engines might think these are two separate pages with duplicate content?
Right now I do a 301 from unslashed to slashed. The slashed version is a rewrite to the .php file. This way if a spider follows the unslashed it gets redirected to the slashed, so it knows the original was wrong.
Here's what my .htaccess looks like:
RewriteRule ^(.*)/$ http://www.domain.com/index.php?folder=$1 [L]
RewriteRule ^(.*)/(.*\.html)$ http://www.domain.com/index.php?folder=$1&page=$2 [L]
RewriteRule ^(.*)$ http://www.domain.com/$1/ [R=301,L]
--------------
So if the URL is correct the first time, it works perfectly. If it's missing the slash and doesn't end in .html, it gets sent through the last rule, which adds the slash and sends a 301.
Are there any problems with the method I am using?
Negated character classes such as [^/]+ are more efficient than the .* catch-all, so the file should use fewer processor resources by switching to the following rules.
One or more characters that are not slashes, followed by a slash at the end of the line:
RewriteRule ([^/]+)/$ http://www.domain.com/index.php?folder=$1 [L]
One or more characters that are not slashes, followed by a slash, followed by one or more characters that are not dots, followed by .html at the end of the line:
RewriteRule ([^/]+)/([^.]+\.html)$ http://www.domain.com/index.php?folder=$1&page=$2 [L]
Any request that does not contain a dot and does not end in a slash (the pattern starts with ^ so that it tests the whole path rather than just its tail):
RewriteRule ^([^.]+[^/])$ http://www.domain.com/$1/ [R=301,L]
Justin
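For reference, the redirect-then-rewrite idea discussed in this thread could be put together roughly as follows. This is a sketch under several assumptions, not a tested drop-in: it assumes the rules live in the .htaccess file at the document root, that index.php accepts the folder and page parameters shown earlier, and that no real directories share names with the virtual folders. It also uses path-only substitutions and fully anchored patterns, which is a choice made for this sketch and differs slightly from the rules posted above.

```apache
RewriteEngine On

# Slashed virtual folder (may be nested, contains no dot):
# internal rewrite, the visitor's URL stays the same.
#   products/widgets/ -> /index.php?folder=products/widgets
RewriteRule ^([^.]+)/$ /index.php?folder=$1 [L]

# Virtual folder followed by a .html page: internal rewrite.
#   products/list.html -> /index.php?folder=products&page=list.html
RewriteRule ^([^.]+)/([^/.]+\.html)$ /index.php?folder=$1&page=$2 [L]

# No dot anywhere and no trailing slash: send a 301 to the slashed URL,
# so spiders learn that the slashed form is the canonical one.
#   products/widgets -> 301 to /products/widgets/
# The rewritten target index.php contains a dot, so it never matches
# here and cannot trigger a redirect loop.
RewriteRule ^([^.]*[^/.])$ /$1/ [R=301,L]
```

Because only the last rule carries the R=301 flag, the first two rules never change the address the visitor sees, while any unslashed request is answered with a redirect before it can be indexed as a second copy of the page.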