I've got that part working, but the problem lies in getting the pages that are already indexed by Google switched to the new format using 301 redirects.
It seems that the RewriteRules could be rewriting each other, since the old URLs need to be 301 redirected to the new URLs, but the new URLs are being rewritten to the old URLs.
It's like this:
I need index.php?p=* to permanently redirect to post*, while post* URLs are rewritten to index.php?p=*.
I've tried various permutations of these rules (different order, with/without [L], etc.):
RewriteRule ^(.*)index\.php\?p\=(.*)$ $1post$2 [R=301]
RewriteRule ^(.*)post(.*)$ $1index.php?p=$2 [L]
The post* to index.php?p=* rewriting works, but I can't get the other rule to send a 301 (I've checked the headers). It seems like Apache sees these two rules as contradictory and throws one out, though they should both be valid and [L] should prevent any looping.
I hope I understand what you're trying to do...
If so, the first problem is that the query string is not available to be tested in a RewriteRule, so add a RewriteCond to test it and put the parameters into backreference %1. Then use the RewriteRule to look only for the index.php page URL, and build the destination URL using both $1 and %1, thus:
RewriteCond %{QUERY_STRING} ^p=(.*)$
RewriteRule ^(.*)index\.php$ http://www.yourdomain.com/$1post%1 [R=301,L]
RewriteRule ^(.*)post(.*)$ /$1index.php?p=$2 [L]
HTH,
Jim
<corrected> Added "http://domain_name/" to 301 Rule </corrected>
From your code I'm getting looping requests for [mydomain.com...]
Adding "?" to the end of the RewriteRule's substitution URL prevents the original query string from being appended to the redirect target:
RewriteCond %{QUERY_STRING} ^p=(.*)$
RewriteRule ^(.*)index\.php$ [yourdomain.com...] [R=301,L]
But things still seem to loop when /post10 is requested. Any ideas?
I've found this page [fluidthoughts.com] which uses the following example for something similar:
RewriteCond %{QUERY_STRING} id=([^&;]*)
RewriteRule ^/$ [%{SERVER_NAME}...] [R]
RewriteRule ^/([^\/]*)/?$ /index.php?id=$1 [L]
That redirects a querystring to a directory and then rewrites it back to the querystring. I haven't been able to adapt it for my use, but if it works, it might be a good reference.
BTW, I'd really like to get this right before I try any more possibilities. I get the feeling that if I set off another infinite loop, my web host is going to wish a slow and painful death on me. ;)
Taking that into consideration, I can't imagine how to prevent looping given what I need to do. I'm thinking something like this could work:
1 RewriteCond %{REQUEST_URI} !index\.php
2 RewriteRule ^(.*)post(.*)$ $1index.php?p=$2 [L]
3
4 RewriteCond %{REQUEST_URI} !post[0-9]
5 RewriteCond %{QUERY_STRING} ^p=(.*)$
6 RewriteRule ^(.*)index\.php$ http://www.yourdomain.com/$1post%1? [R=301,L]
But then again, what happens when you click on /post* and it's rewritten to index.php?p=* on line 2? Wouldn't the .htaccess rules then be applied to the rewritten URL, which would match the RewriteConds on lines 4 & 5? Then you're back in a loop, unless REQUEST_URI has somehow stayed constant through this process.
I think it's simply a matter of the order of your rules. The external 301 redirect must be first, and it must have an [L] flag. The second rule must also have an [L] flag, but must not be an external (R=301 or R=302) redirect.
The code should not loop if the [R=301,L] Rule is processed first, and the internal redirect is placed after that.
If the [R=301,L] rule is processed first, then the client browser is redirected to the new URL, and processing stops for that request.
Then the client returns, this time with the URL pattern matching the second rule, which is an internal rewrite only. So in this case, after the internal rewrite is done no further rewriting takes place, either.
You must not have any other RewriteRules which invoke a 301 redirect for URLs matching the output of the second rule. This includes additional .htaccess files in subdirectories, script outputs, etc. If you do, then you will indeed get an infinite loop.
You could set this test up with some dummy files and URLs so that testing doesn't affect your live pages.
I'm not intimately familiar with your site, so please cite what URL is input and what that URL is rewritten to in each case - I can't tell whether you are telling me the input URL which failed, or what the RewriteRule output was when it failed - and in this case in particular, that can be very confusing!
Try the rules in this exact order. Also, remove (comment out) any other RewriteRules you may have which might affect requests for "index.php". That should sort out the looping, and then let's see about other problems.
RewriteCond %{QUERY_STRING} ^p=(.*)$
RewriteRule ^(.*)index\.php$ http://www.yourdomain.com/$1post%1 [R=301,L]
RewriteRule ^(.*)post(.*)$ /$1index.php?p=$2 [L]
Jim
So
[mydomain.com...]
301 redirects correctly & as intended to
[mydomain.com...]
but
[mydomain.com...]
also 301 redirects to
[mydomain.com...]
If I knew more about how .htaccess/Apache worked, I think I could figure this out for myself, but there just isn't much documentation on how this all fits together. For example, after a redirect, is the new URL completely reprocessed through .htaccess? Or does an internal rewrite change the REQUEST_URI variable?
It makes sense to me that the redirect needs to occur before the internal rewrite, but changing the order of the corresponding lines doesn't seem to affect their function.
There are a fairly limited number of the querystring URLs in the Google index, so I could do more hard-coded redirects. Logically, they shouldn't be any different, though.
Another option is that I could have basically another index.php file, completely identical, but just with a different filename that I'd use for the internal rewrite (/post* to /index2.php?p=*). I'm sure this would work, so maybe I'll just do that. At this point, it's more like something I need to conquer than a search for the most practical solution, but it may not be worth the time to figure this mess out.
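A minimal sketch of that second-file workaround, reusing the rules already posted in this thread. It assumes the rules live in the .htaccess file of the site root, that index2.php is a hypothetical identical copy of index.php, and that www.yourdomain.com is the same placeholder domain used above:

```apache
RewriteEngine On

# Old query-string URLs get a permanent external redirect to /postN.
# The trailing "?" keeps the original query string off the target URL.
RewriteCond %{QUERY_STRING} ^p=(.*)$
RewriteRule ^(.*)index\.php$ http://www.yourdomain.com/$1post%1? [R=301,L]

# /postN is served internally by the copy, index2.php. The pattern of
# the redirect rule above ends in "index\.php", so it does not match
# "index2.php" if the rules are run again on the rewritten URL.
RewriteRule ^(.*)post(.*)$ /$1index2.php?p=$2 [L]
```

The point of the second filename is only that the output of the internal rewrite can no longer match the pattern of the 301 rule, which is the mutual exclusion the two rules need.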
If I knew more about how .htaccess/Apache worked, I think I could figure this out for myself, but there just isn't much documentation on how this all fits together. For example, after a redirect, is the new URL completely reprocessed through .htaccess? Or does an internal rewrite change the REQUEST_URI variable?
After a 301 or 302 redirect which includes the [L] flag, rewriting is terminated, and the 30x response is sent back to the client (browser).
If there is no [L] flag, then subsequent RewriteRules will be processed if their RewriteConds are met.
In the case of a server-internal rewrite, only the REQUEST_URI is changed - the client is NOT notified, and again rewriting will continue in the absence of an [L] flag.
I can't for the life of me figure out how a request for [yourdomain.com...] is matching the first rule, unless the whole requested URL is [yourdomain.com...]
If that's the case, you'll need to add another RewriteCond to block the loop:
RewriteCond %{REQUEST_URI} !post/index\.php
RewriteCond %{QUERY_STRING} ^p=(.*)$
RewriteRule ^(.*)index\.php$ http://www.yourdomain.com/$1post%1 [R=301,L]
RewriteRule ^(.*)post(.*)$ /$1index.php?p=$2 [L]
Then there is another thing... mod_rewrite works only between the receipt of a request and the serving of a resource. It cannot be used to rewrite URLs which are output from a script, unless that script is returning those URLs to the client with a 301 or 302 redirect header - in which case mod_rewrite will see those as the incoming URLs of new requests. In that case, the script and the mod_rewrite code are going to loop unless you make changes to create mutual exclusion.
Again, please be as specific as possible about the requested URLs and their querystrings - I suspect the devil is in the details there.
I hope this makes sense!
Jim
I just can't figure it out either! It's really just as simple as I'm describing it...no fancy URL-writing scripts, complicated query strings, etc. I've stared at the damn thing for hours myself and can't make sense of it, so I think for now I'll stick with the renamed file solution. It's not terribly elegant, but it's completely transparent to users, so I can't complain too much.
Again, thanks for all your help and if I do ever figure it out, I'll be sure to let you know how. ;)
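For anyone landing on this thread later, one possible explanation, assuming the rules sit in a per-directory .htaccess file rather than in the main server config: the mod_rewrite documentation notes that [L] only ends the current pass through the rules, and that an internally rewritten URL in per-directory context is re-injected and run through the rules again. That would account for /post10 being rewritten to index.php?p=10 and then matching the 301 rule. A commonly used guard is to test %{THE_REQUEST}, which holds the original request line sent by the client and is not changed by internal rewrites. The sketch below is untested against the poster's site; www.yourdomain.com is the thread's placeholder, and restricting post IDs to digits is an assumption:

```apache
RewriteEngine On

# Redirect only when the client itself asked for index.php?p=N.
# THE_REQUEST looks like "GET /index.php?p=10 HTTP/1.1" and keeps that
# value even after an internal rewrite has changed the URL.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s[^?\s]*index\.php\?p=([^&\s]+)
RewriteRule ^(.*)index\.php$ http://www.yourdomain.com/$1post%1? [R=301,L]

# Internal rewrite: /post10 is served by index.php?p=10.
# Assumes numeric post IDs; widen the character class if yours differ.
RewriteRule ^(.*)post([0-9]+)$ /$1index.php?p=$2 [L]
```

With this condition, a request for /post10 is rewritten to index.php?p=10, and when the rules run again the first rule's pattern matches but its condition does not, because the original request line still reads /post10. That should avoid the need for a second copy of index.php.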