For example:
http://www.example.com/index.php?cName=toys-plush-animal-sets
needs to be changed to
http://www.example.com/index.php?cName=toys-plush-animals
and
http://www.example.com/product_info.php?pName=standing-hippo-40&cName=plush-animal-sets-safari-animal-sets
needs to be changed to
http://www.example.com/product_info.php?pName=standing-hippo-40&cName=plush-animals-safari-animal-singles
and
http://www.example.com/product_info.php?pName=standing-elephant-with-sound-set&cName=plush-animal-sets-safari-animal-sets
needs to be changed to
http://www.example.com/product_info.php?pName=standing-elephant-with-sound-set&cName=plush-animals-safari-animal-sets
There are other specific changes I need to apply, but they follow the same general pattern.
Note that, in the second example above, there are two phrases that need to be changed.
I'm aware that I need to use RewriteCond to test the query string, but it feels as though I've tried every combination of RewriteCond and RewriteRule code possible and not yet hit on the right solution.
I'm trying to apply these rewrites via my .htaccess file.
Any help gratefully received!
Thanks,
Mark.
[edited by: jdMorgan at 10:35 pm (utc) on Dec. 11, 2006]
[edit reason] example.com [/edit]
Thanks for your reply.
I must not have tried every possibility after all, because while looking for my best attempt so far, I came up with the following, which seems to work:
# Change the Category Name of plush-animal-sets to plush-animals
RewriteCond %{QUERY_STRING} (.*)plush-animal-sets(.*)
RewriteRule .? %{REQUEST_URI}?%1plush-animals%2 [R]
The new pages aren't live yet, so it does generate a 404 error, but I wanted to make sure I could do this rewrite before I change the name of the product categories in my shopping cart system.
Once it's working fine, should I change the R flag to R=301?
Thanks again,
Mark.
Oops!
I forgot to include sample inputs and outputs to my previous reply.
The original URL I entered was:
http://www.example.com/index.php?cName=toys-plush-animal-sets
and the URL it ended up at was:
http://www.example.com/resources/error404.php?url=http://www.example.com/index.php&cName=toys-plush-animals
The first part of the target URL is my custom 404 error page, but the url= parameter looks like it's the correct destination, once I rename the plush-animal-sets category to plush-animals.
Best wishes,
Mark.
[edited by: jdMorgan at 10:36 pm (utc) on Dec. 11, 2006]
[edit reason] example.com [/edit]
# Change the Category Name of "plush-animal-sets" to "plush-animals"
RewriteCond %{QUERY_STRING} ^(([^&]+&)*)cName=plush-animal-sets(&.+)?$
RewriteRule (.*) http://www.example.com/$1?%1cName=plush-animals%3 [R=301,L]
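To make the back-references in this rule easier to follow, here is the same pair of directives with comments. The sample query string (including the page=2 parameter) is made up for illustration, and the RewriteEngine line is only needed if it isn't already present in the .htaccess file.

```apache
# Assumes mod_rewrite is available; skip this line if it's already in the file.
RewriteEngine On

# Hypothetical request: /index.php?pName=x&cName=plush-animal-sets&page=2
#   %1 = "pName=x&"  all name/value pairs before cName (empty if cName is first)
#   %2 = "pName=x&"  the last pair matched by the inner group (not used below)
#   %3 = "&page=2"   anything after the cName value (empty if cName is last)
RewriteCond %{QUERY_STRING} ^(([^&]+&)*)cName=plush-animal-sets(&.+)?$
# $1 is the requested path as seen from this .htaccess file, e.g. "index.php"
RewriteRule (.*) http://www.example.com/$1?%1cName=plush-animals%3 [R=301,L]
# Resulting redirect:
#   http://www.example.com/index.php?pName=x&cName=plush-animals&page=2
```

As written, the condition matches only when the cName value is exactly plush-animal-sets; a value such as toys-plush-animal-sets would not match.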
Jim
[edited by: jdMorgan at 10:48 pm (utc) on Dec. 11, 2006]
Thanks for your advice.
I tried it out, and it seems to work fine.
However, sometimes the cName parameter will be the first parameter in the query string, and sometimes it will be after the first parameter.
For example:
http://www.example.com/product_info.php?pName=standing-elephant-with-sound-set&cName=plush-animal-sets-safari-animal-sets
Here, the first parameter is "pName", so the "cName=" will be preceded by an "&", which I believe is what your code looks for.
But in this example:
http://www.example.com/index.php?cName=plush_stuffed-toys
(which needs to be changed to
http://www.example.com/index.php?cName=toys-small-plush-toys)
the "cName=" is the first parameter and therefore preceded by a "?" instead.
What is the best way to amend your code to cater for this?
Also, the text I'm looking for won't always occur directly after "cName=": sometimes there will be other characters between "cName=" and the text I need to change, and sometimes there will be other characters after that text but before the next parameter or the end of the query string.
I think I will always know what text would precede / follow the text I'm looking to change, so I suppose the simplest solution, even though it would mean adding more rules than my more generic but inefficient solution might require, would be to specify exactly what I need in both the search and replace strings?
Finally, having renamed my Plush Animal Sets category to Plush Animals in my shopping cart, the code I posted earlier doesn't work (it generates a 404) because it's changing the "?" in my substitution URL to an "&", which it doesn't do with your solution, and I'm not sure why that is or what is different.
Thanks again,
Mark.
the "cName=" is the first parameter and therefore preceded by a "?" instead. What is the best way to amend your code to cater for this?
Did you test the code I posted? The "as many as you like, including zero" clause on my first subpattern should allow for any number of name/value pairs to precede "cName=xyz" without any trouble. If you're sure that cName is and always will be the first parameter, then you can dispense with that bit of the pattern altogether, and just use
# Change the Category Name of "plush-animal-sets" to "plush-animals"
RewriteCond %{QUERY_STRING} ^cName=plush-animal-sets(&.+)?$
RewriteRule (.*) http://www.example.com/$1?cName=plush-animals%1 [R=301,L]
You can write one rule per name/value pair that needs to change, or if there are similarities between some or all of the changes, you can take advantage of them to create a smaller number of rules to do all of them. But since I'm not familiar with your site and parameter-naming conventions, I have no idea what shortcuts you might be able to use. Only you can decide or discover them. So indeed, that really is the hard part.
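As one possible way to exploit such a similarity: if the old text can appear anywhere inside the cName value, a single rule could capture whatever surrounds it. This is only a sketch, assuming the old phrase occurs at most once per cName value and that the .htaccess file sits in the document root; test it against real URLs before relying on it.

```apache
# Sketch: replace "plush-animal-sets" wherever it occurs inside the cName value.
#   %1 = name/value pairs before cName
#   %3 = text before the phrase, within the cName value
#   %4 = text after the phrase, within the cName value
#   %5 = name/value pairs after cName
RewriteCond %{QUERY_STRING} ^(([^&]+&)*)cName=([^&]*)plush-animal-sets([^&]*)(&.+)?$
RewriteRule (.*) http://www.example.com/$1?%1cName=%3plush-animals%4%5 [R=301,L]
```

With this rule, cName=toys-plush-animal-sets would redirect to cName=toys-plush-animals, and cName=plush-animal-sets-safari-animal-sets to cName=plush-animals-safari-animal-sets. The new value no longer contains the old phrase, so the rule should not match again after the redirect.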
*"Greedy" is a commonly-used description of the ".*" and ".+" patterns, because each will match as many characters as possible. I use the word "promiscuous" because both patterns will also match *any* characters, often leading to unexpected results, a quick example of which is "(.*)/?" where the $1 back-reference will always contain the trailing slash if present in the request, because ".*" is greedier than "/?" and will always consume the trailing slash. Using multiple (.*) patterns can also lead to ambiguity as to exactly which kinds of URLs will be matched -- it's a common source of functional rewriting problems. Use of multiple ".*" subpatterns in one pattern also causes huge processing inefficiencies, since the matching routine often has to "loop and back off" many, many times to find a match. In short, avoiding the use of ".*" whenever possible is a good practice.
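A small illustration of the trailing-slash point, in the same .htaccess syntax. The products/ URL prefix here is hypothetical and not part of the site discussed above; product_info.php and pName are borrowed from the earlier examples only to give the rule a target, and the .htaccess file is assumed to sit in the document root.

```apache
# Greedy version (shown commented out): for a request of
# "products/standing-hippo-40/", $1 is "standing-hippo-40/" with the slash,
# because ".*" consumes the slash before "/?" gets a chance to match it.
# RewriteRule ^products/(.*)/?$ /product_info.php?pName=$1 [L]

# Tighter version: "[^/]+" cannot match a slash, so for the same request
# $1 is "standing-hippo-40" and the optional trailing slash is left out.
RewriteRule ^products/([^/]+)/?$ /product_info.php?pName=$1 [L]
```

The negated character class also says exactly which characters are acceptable, which removes the ambiguity and most of the backtracking described above.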
Jim
Thanks for your reply.
I have to confess that I only tested it on the one URL, as I didn't want to rename all of my other category names before testing it out on the first one.
However, I understand what you are saying, having studied your code once again in more detail.
The cName parameter will not always be the first one, so I'll have to stick with your first solution.
Having worked in IT for over 25 years before giving up the corporate life to work from home, I also fully understand your comment about the hard part being to define the precise problem. And that's causing me some hard thinking in this case.
There is one more problem I'm encountering, however.
After the rewrite rule that you kindly gave me, I have a rule to trap 404 errors and redirect people to my own page, and it's a bit of RewriteRule I got from a website somewhere:
# Redirect 404 errors to custom error page
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^/?(.*)$ /resources/error404.php?url=$1 [L,QSA,R]
What I've found is that if I leave the "L" flag off my previous rules (as it's possible that, even after changing part of the query string, another part of the URL may still be wrong), then the 404 rule is being triggered, in spite of the fact that the target page, after the rewrite, is valid.
For example, the following URL:
http://www.example.com/product_info.php?pName=rolling-horse-brown-26&cName=plush-animal-sets-farm_domestic-sets
gets changed to:
http://www.example.com/product_info.php?pName=rolling-horse-brown-26&cName=plush-animals-farm_domestic-sets
using another rule I created earlier today, based on the one you supplied me.
This new target page exists, and if I add the "L" flag to the rewrite rule that does this change, then my browser takes me to the correct page.
However, if I remove the "L" flag, it generates a 404 error instead, presumably because it's triggering the 404 rule.
I would have thought that the %{REQUEST_FILENAME} should find the file, even though this is meant to be the full filesystem path.
(I know you can also use the ErrorDocument 404 command, but I've tried using that before, and it doesn't seem to trap all 404 errors, for some reason. What I found, on a couple of my sites, is that if you try to visit a (page in a) directory that doesn't exist, it works fine, but if you try to visit a file / page that doesn't exist in a directory that does exist, then it still presents the visitor with the host's default 404 page, not my custom one.)
The simple solution for now would be to use the "L" flag on my rules, but I'd be interested in knowing why the 404 rule is being triggered for pages that apparently exist, how I might resolve this, and why the ErrorDocument command doesn't always seem to work.
Sorry for the long post and the continued requests for help, and thanks for your patience and help,
Mark.
I don't know why you're getting the 404 problem -- what does your server error log have to say about it?
Using the ErrorDocument 404 /resources/error404.php method, are you getting the default server 404 error document, or the default 403 error document? I'd expect the latter.
ErrorDocument works as designed, in that "/" is defined by default not as a file, but as the "index" -- the auto-generated "table of contents," if you will -- of each directory. So, in normal circumstances, it always exists, and you cannot get a 404 on requests for it.
However, that's a simplified answer, applying only to your particular problem. In the wider view, DirectoryIndex, ErrorDocument, and Options +/-Indexes all come into play, along with mod_dir functions, in determining what happens if a "/" URL is requested.
I'd avoid using that 404-handler-rewrite approach, because it makes two (additional and redundant) filesystem searches in addition to the built-in one used by the default Apache missing-file handling. On a busy site, it could really slow down the server.
In addition, since it generates a Redirect response, you're sending a 302-Found, not a 404-Not Found response to the client, and that's very, very bad if you care about search rankings...
If you're not already doing so, use the "Live HTTP Headers" extension for Firefox, and take a look at your response headers for "404" errors -- I think you'll find they're 302s. :(
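For comparison, a minimal sketch of the ErrorDocument approach, assuming the error page lives at /resources/error404.php as in the rule quoted above. The key detail is to give a local path rather than a full http:// URL, since a full URL makes Apache send a redirect and the client never sees the 404 status.

```apache
# Serve the custom page internally; the response status stays 404.
ErrorDocument 404 /resources/error404.php

# Avoid this form: a full URL triggers an external redirect,
# so the client receives a 302 instead of a 404.
# ErrorDocument 404 http://www.example.com/resources/error404.php
```

With the internal form, Apache should make the originally requested path available to the error page in the REDIRECT_URL environment variable, so the url= parameter would not be needed.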
Jim