ReWrite URLs with specific query string to static page

How to make a specific URL with a dynamic query string redirect to a static page

         

jwlinux

5:59 am on Jul 6, 2011 (gmt 0)

10+ Year Member



I'm about to launch a totally new version of a website (same URL, completely redesigned and recoded site files/content).

I want to make sure that users with bookmarks, as well as Google, get redirected (301) when they hit the old URLs and that they don't get a 404.

Easy enough for the old static pages.

However, I'm having a bad time with the dynamic URLs.

I want to take certain specific old dynamic URLs and redirect each of them to a specific new static page. It's funny, I can find tons of examples of doing this all dynamically, where something like ?id=2&other=3 gets redirected to /2/3 for example, or even where ?id=1234 gets redirected to /my-1234/page.html and the numbers (the values of the variables) get transferred automatically. But I can hardly find any examples that redirect a specific query string parameter (or combination) to a specific static page.

So first, I want to clearly dictate which query strings get sent to which static page. I don't want any pattern matching going on.

Second, I don't want the old query string passed on to the new URL.

I'll give you an example

If someone tries to hit:
www.site.com/materials/detail.php?id=3&catid=2

I want them redirected with a 301 to:
www.site.com/super-shop/bigstuff/siding.php

And I DON'T want the query string passed on.

I found just one example that almost works. It works as long as the old URL contains only one file name and NO leading directory. I found it here: [webmasterworld.com...]

The original example was:

RewriteCond %{QUERY_STRING} ^id=2$
RewriteRule ^services\.php$ /services/all? [R=301,L]


I modified it to match my own. I decided to start simple and use only ONE of the query string parameters:

RewriteCond %{QUERY_STRING} ^id=3$
RewriteRule ^detail\.php$ /super-shop/bigstuff/siding.php? [R=301,L]

This behaves exactly like I want it to EXCEPT that the original old URL I'm trying to match is not www.site.com/detail.php, it's www.site.com/materials/detail.php

As soon as I try to add the old directory back in, this all goes south. I've tried a number of variations, such as the following:


RewriteCond %{QUERY_STRING} ^id=3$
RewriteRule ^materials/detail\.php$ /super-shop/bigstuff/siding.php? [R=301,L]

And:

RewriteBase /materials
RewriteCond %{QUERY_STRING} ^id=3$
RewriteRule ^detail\.php$ /super-shop/bigstuff/siding.php? [R=301,L]

These do not work. For one, they start passing the query string along to the "new" URL, even though I did not remove the ending "?" (from what I've read, it's the "?" that prevents the query string from getting passed along to the new URL). Also, they cause /materials to get prepended to the NEW URL and bigstuff/siding.php to be REMOVED from the new URL so that instead of:

www.site.com/super-shop/bigstuff/siding.php

it tries to hit:

www.site.com/materials/super-shop/?id=3

Again, this works as long as I don't try to match /materials in the URL. (but of course, that is exactly what I need).

And I haven't even begun to deal with the second parameter of the query string yet. Again, the complete original URL is like this:


www.site.com/materials/detail.php?id=3&catid=2

And I need to direct them to, for example:

www.site.com/super-shop/bigstuff/siding.php

Then I'll need to make other, additional pairs, like this:


www.site.com/materials/detail.php?id=1&catid=6
www.site.com/super-shop/otherbigstuff/roof.php


Make sense?

Can someone please tell me how to do this correctly?

Thank you.

lucy24

7:40 am on Jul 6, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



RewriteCond %{QUERY_STRING} ^id=3$
RewriteRule ^materials/detail\.php$ /super-shop/bigstuff/siding.php? [R=301,L]

And:

RewriteBase /materials
RewriteCond %{QUERY_STRING} ^id=3$
RewriteRule ^detail\.php$ /super-shop/bigstuff/siding.php? [R=301,L]

These do not work. For one, they start passing the query string along to the "new" URL, even though I did not remove the ending "?" (from what I've read, it's the "?" that prevents the query string from getting passed along to the new URL). Also, they cause /materials to get prepended to the NEW URL and bigstuff/siding.php to be REMOVED from the new URL so that instead of:

www.site.com/super-shop/bigstuff/siding.php

it tries to hit:

www.site.com/materials/super-shop/?id=3

Well, you can't blame the poor stupid computer for doing exactly what you're telling it to. RewriteBase is usually more trouble than it's worth unless you are very fluent in Apache; just do it rule by rule. If your server behaves the way it's supposed to (mine doesn't, which saves a lot of typing), any Redirect has to give the complete URL on the "target" side:

http://www.example.com/super-shop/bigstuff/siding.php?

Have to admit I have no idea where bigstuff/ is sneaking off to.

:: looking vaguely around for g1 ::

Again, the complete original URL is like this:

www.site.com/materials/detail.php?id=3&catid=2

And I need to direct them to, for example:

www.site.com/super-shop/bigstuff/siding.php

Then I'll need to make other, additional pairs, like this:

www.site.com/materials/detail.php?id=1&catid=6
www.site.com/super-shop/otherbigstuff/roof.php

Ouch. How many of these pairs are there? It would definitely be easier if you could grab bits and pieces of the query string and feed them directly into the target, as in

RewriteCond %{QUERY_STRING} ^id=(\d+)&catid=(\d+)$
RewriteRule ^materials/detail\.php$ http://www.example.com/super-shop/new%1/new%2.php? [R=301,L]

(I'm just making that up.) If you have mountains of these old urls you may be looking at a RewriteMap.
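For a couple dozen fixed pairs, a RewriteMap lookup could look roughly like the sketch below. It assumes access to the server or virtual-host configuration (RewriteMap is not permitted in .htaccess). The map name oldquery and the file path /etc/apache2/detail-redirects.txt are made up for illustration; the two map entries are the pairs given earlier in the thread.

```apache
# Map file (assumed path: /etc/apache2/detail-redirects.txt), one "key value" pair per line:
#   id=3&catid=2 /super-shop/bigstuff/siding.php
#   id=1&catid=6 /super-shop/otherbigstuff/roof.php

RewriteEngine On
RewriteMap oldquery "txt:/etc/apache2/detail-redirects.txt"

# Capture the exact old query string as %1
RewriteCond %{QUERY_STRING} ^(id=\d+&catid=\d+)$
# Look it up; an unknown key yields an empty string, so this condition fails
RewriteCond ${oldquery:%1} ^(/.+)$
# %1 now holds the mapped path; the trailing "?" drops the old query string
RewriteRule ^/materials/detail\.php$ %1? [R=301,L]
```

Query strings that have no entry in the map fall through untouched, so only the explicitly listed pairs are redirected and no values are copied from the old URL into the new one.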

Apache says, helpfully,
Always try to understand what a particular ruleset really does before you use it. This avoids many problems.

Yah.

jwlinux

1:58 pm on Jul 6, 2011 (gmt 0)

10+ Year Member



I have about two dozen pairs. It won't be possible to use the new%1/new%2.php type of path. I don't mind setting up a few dozen directives.

I'll try the full path as you suggested and post back with the results.

Thank you.

jwlinux

4:17 pm on Jul 6, 2011 (gmt 0)

10+ Year Member




OK, I partly shot myself in the foot working too late last night. Now that I can think better, I see that I got carried away obfuscating my example URLs (in case you're curious why I'm doing it at all - I'm testing on a dev site and I don't want there to be even the smallest possibility that a search engine will find the dev URL, or the keywords in the subpath). Also, for the first time, I've noticed that both with and without the full URL I get this error in the browser:

"Safari can’t open the page.
Too many redirects occurred trying to open “http://www.site.com/materials/super-shop/?id=3”. This might occur if you open a page that is redirected to open another page which then is redirected to open the original page."

I don't get any errors in the apache logs.

Now that it's morning, it dawns on me that the word "materials" is in both the old and new URLs. I have a feeling the pattern is getting into some kind of loop. I've tried adding a leading slash to the path that's supposed to be matched (^/materials/detail\.php$).

I've tried turning "RewriteBase /" on and off.

And I've tried using the full URL with these previously mentioned combinations:

RewriteCond %{QUERY_STRING} ^id=3$
RewriteRule ^materials/detail\.php$ [site.com...] [R=301,L]

And still I get this in the URL:

[site.com...]

Any ideas how to make sure it doesn't loop?

Are there any debugging tools for rewrite rules? I'm surprised I'm not getting errors in the error.log.

Thanks.

jwlinux

5:01 pm on Jul 6, 2011 (gmt 0)

10+ Year Member



I've also tried removing the path materials/ from the target (destination) path, like this:

RewriteRule ^materials/detail\.php$ /siding.php? [R=301,L]

(and yes, /siding.php does exist on the disk)

I still have the same problem. Prepending the materials/ directory is what breaks it.

jwlinux

5:08 pm on Jul 6, 2011 (gmt 0)

10+ Year Member



So in other words I guess having materials/ in the destination path wasn't part of the problem after all.

lucy24

7:49 pm on Jul 6, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I see that I got carried away obfuscating my example URLs (in case you're curious why I'm doing it at all - I'm testing on a dev site and I don't want there to be even the smallest possibility that a search engine will find the dev URL, or the keywords in the subpath).

Actually you should obfuscate the urls. Easiest is to use example.com. The object is to prevent anything containing the "http" element from being auto-converted into an active link. You want people to see what you typed, not go to the page :)

"Safari can’t open the page.
Too many redirects occurred trying to open “http://www.site.com/materials/super-shop/?id=3”. This might occur if you open a page that is redirected to open another page which then is redirected to open the original page."

I don't get any errors in the apache logs.

That's because your browser kindly stopped it in time. A redirect itself isn't an error, but if you have so many that it hits the server limit (Apache's recommended default is ten, which is at least eight more than most people need), then you'd be getting a 500 of some kind.

jwlinux

9:43 pm on Jul 6, 2011 (gmt 0)

10+ Year Member



OK so now that it's obvious I only know enough to be dangerous, I'll just ask again in case anyone knows the answer:

How do I make this old URL/page:

www.site.com/materials/detail.php?id=3&catid=2

redirect with a 301 to this URL/page:

www.site.com/super-shop/materials/siding.php

?

Thank you

g1smd

9:58 pm on Jul 6, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



For the redirect from parameters to SEF (search-engine-friendly) URLs, you must test THE_REQUEST with a preceding RewriteCond in order to be sure you redirect only direct client requests.

If you only test QUERY_STRING you will redirect the previously rewritten request, and the redirected request will then again be rewritten in an infinite loop.
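A minimal sketch of that idea, assuming the rules live in an .htaccess file at the document root and using www.example.com as a stand-in hostname. The two pairs are the ones given earlier in the thread; the [A-Z]{3,9} method pattern is the form recommended later in the thread.

```apache
RewriteEngine On

# THE_REQUEST is the raw request line sent by the client, for example:
#   GET /materials/detail.php?id=3&catid=2 HTTP/1.1
# It does not change after an internal rewrite, so only direct client requests match.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /materials/detail\.php\?id=3&catid=2\ HTTP/
# In .htaccess the pattern has no leading slash; the trailing "?" drops the old query string
RewriteRule ^materials/detail\.php$ http://www.example.com/super-shop/bigstuff/siding.php? [R=301,L]

# One condition/rule pair per old URL
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /materials/detail\.php\?id=1&catid=6\ HTTP/
RewriteRule ^materials/detail\.php$ http://www.example.com/super-shop/otherbigstuff/roof.php? [R=301,L]
```

Because each condition spells out the whole query string, a request with only id=3 or with the parameters in a different order is left alone rather than redirected.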

There are many examples of the correct code for this as it is a question that has been asked several times per week ever since the forum first opened its doors.

However, if you have a LOT of URLs to redirect, then you should instead REWRITE (that's rewrite, not redirect) those requests to a special script. The special PHP script holds the new URLs in an array and the old URL parameters are the array keys. The PHP script then issues the 301 redirect to the new URL using PHP's header() function (BEWARE: a Location header sent without an explicit status code defaults to a 302 redirect).
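The Apache side of that approach might be sketched as follows, assuming an .htaccess file at the document root. The script name /redirect-old-urls.php is hypothetical, and the PHP lookup script itself is not shown here.

```apache
RewriteEngine On

# Only direct client requests for the old dynamic URL, with any numeric id/catid pair
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /materials/detail\.php\?id=\d+&catid=\d+\ HTTP/
# Internal rewrite (no R flag): the original query string is passed to the script unchanged
RewriteRule ^materials/detail\.php$ /redirect-old-urls.php [L]
```

The script would then read id and catid from its query string, look up the new URL, and send the 301 itself.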

There's a post with most of the code in, just a few months back.

jwlinux

10:20 pm on Jul 6, 2011 (gmt 0)

10+ Year Member



Thanks I'll look for it. I must not have been searching for the right keywords.

jwlinux

1:12 am on Jul 7, 2011 (gmt 0)

10+ Year Member



(Note: I removed http:// from all the URLs so they wouldn't get converted into clickable links)

Ok, the computer is winning this time...

Based on some other posts that looked similar to what I wanted (I'm sure there's thousands but I'm having a hard time finding ones that are exactly what I'm looking for) I _thought_ I had it working with the following directives:

RewriteCond %{THE_REQUEST} ^[A-Z]+\ /materials/detail\.php\?id=3\ HTTP
RewriteRule ^/materials/detail\.php$ www.site.com/super-shop/materials/siding.php? [R=301,L]

I played with it for about 20 minutes and I was sure it worked.

So I expanded it to include both query parameters:

RewriteCond %{THE_REQUEST} ^[A-Z]+\ /materials/detail\.php\?id=3&catid=2\ HTTP
RewriteRule ^/materials/detail\.php$ www.site.com/super-shop/materials/siding.php? [R=301,L]

The first time I tested that, it looked like it worked. Then, I thought that it was also erroneously working when I hit www.site.com/materials/detail.php?id=3 (without the second parameter). I was about to post about that, when I tried it again and correctly got a 404. I chalked it up to browser cache. Then I played with it for another 15 mins or so, and it all seemed to work perfectly correct, redirecting me to the right page. Yay.

Next -- feeling very happy that I'd finally solved the problem -- I proceeded to make 11 copies of that pair of directives and modify them for all 11 old URLs I needed to match and redirect. I tested the very first modification, which was id=1&catid=1 and was surprised when it didn't work. This time, I was sent to [site.com...] which is a whole new problem - now it's chopping off the last directory and file, and keeping (passing) the query string. I tested the previous id=3&catid=2 and it was still working. I checked my modified directives, tested the URLs to make sure they really existed on disk, recreated the id=1&catid=1 directive, and tested -- still the same. So I tested the old id=3&catid=2 which by this time had been working great for about two hours and to my horror, it was now not working either! It's sending me to www.site.com/super-shop/?id=3&catid=2.

I've reverted all my changes and even gone all the way back to the directive that only handled one query parameter, and still it is sending me to /super-shop/?id=3.

Does anyone have any ideas what on earth I did wrong? The thing I can't get over is that it worked perfectly for about two hours then suddenly quit.

Does mod_rewrite get cached in any way?

Do my directives even look correct?

jwlinux

2:54 am on Jul 7, 2011 (gmt 0)

10+ Year Member



If there is a particular post with code examples that you have in mind, would you mind posting its subject line or URL?

jwlinux

3:43 am on Jul 7, 2011 (gmt 0)

10+ Year Member



Now it seems to be working again. Very strange ...

jwlinux

4:04 am on Jul 7, 2011 (gmt 0)

10+ Year Member



OK so for the record, this is what seems to work correctly:

RewriteEngine on

RewriteBase /


RewriteCond %{THE_REQUEST} ^[A-Z]+\ /materials/detail\.php\?id=3&catid=2\ HTTP/
RewriteRule ^materials/detail\.php$ [site.com...] [R=301,L]

There are a few simple one-to-one static page rewrites and redirects in-between RewriteBase and the RewriteCond shown above.

lucy24

4:11 am on Jul 7, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Does mod_rewrite get cached in any way?

Not in the way you mean, but there are two everyday caches, and one of them is out of your control. Of course you have already done the basic stuff like emptying your browser cache and trying a different browser. (You do have at least five browsers, don't you?)

But further up the road is your ISP. Not the one hosting your www site, the one that you yourself connect with. They tend to be very hush-hush about exactly what sort of remote caching they use. Sometimes a request for a reload will bring up a brand-new version of the page; sometimes they'll grudgingly check the HEAD (of either the page you asked for or the page you got, but not both); sometimes no power on earth can make an ISP give you a fresh copy until they're good and ready. Instead they'll just cough up the same one they had in their cache. If they won't say, you can probably figure out some way to find out by experiment how they work.

If your public library has walk-up terminals where you can race through as much as possible in 15 minutes, use them. You can be tolerably sure nobody else has been to your site lately. (If you were a big site that everyone visits nonstop, you would have an Apache Geek on staff and you would never need to think about this stuff yourself ;))

Oh, and THE_REQUEST needs to be as exact as possible, as in:

[A-Z]+\ /stuffhere\.extension\ HTTP/1\.[01]

g1smd

7:04 pm on Jul 7, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Replace
[A-Z]+\
with
[A-Z]{3,9}\


You can end the pattern with HTTP/

The easiest way to ease your pain is to set an expires header with the current date and time to stop caching of responses.
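One way this might be done while testing is sketched below. It swaps in a Cache-Control header set through mod_headers rather than a literal Expires date, and it assumes mod_headers is enabled and that the block is removed (or relaxed) before launch.

```apache
<IfModule mod_headers.c>
    # "always" also applies the header to non-2xx responses such as 301 redirects
    Header always set Cache-Control "no-store, max-age=0"
</IfModule>
```

With responses marked as non-cacheable, a changed rule should show up on the next request instead of a stale redirect being replayed from the browser cache.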

Use example.com in the forum to stop auto-linking of URLs. [google.com...]