Apache Web Server Forum

    
redirect and remove query string and other symbols
using htaccess to remove query string and redirect to new location
nisiwi

5+ Year Member



 
Msg#: 4441900 posted 11:37 am on Apr 17, 2012 (gmt 0)

I realise there have been dozens of similar questions on this topic and other forums across the web.

In fact I've spent the last 2 days trying every combination I can find to remove the query string before moving on to the other symbols, and I just can't get rid of the question mark.

Basically I'm trying to rewrite the following:

http://test.com/this-section/?Tag=other+section

To;

http://test.com/this-section/other-section

This is as far as I got;

RewriteCond %{QUERY_STRING} ^Tag=(.*)$ [NC]
RewriteRule (.*) ?%1 [R=301,L]


However when I test it (live) and using;

htaccess.madewithlove.be

I get the following output;

http://test.com/this-section/?other+section

Any ideas how I get rid of that "?" question mark? Then I can move on to replacing the "+" with "-".

Thanks for helping with this very frustrating, and probably simple problem.

 

nisiwi

5+ Year Member



 
Msg#: 4441900 posted 2:14 pm on Apr 17, 2012 (gmt 0)

Oh yes, I forgot to mention that I'm using the "?" in the rewrite-rule as apparently it's how you get rid of the original query string.

It's at the front, however if I place it at the end of the rewrite rule I get the whole query string added onto the URL once again.

lucy24

WebmasterWorld Senior Member, Top Contributor of All Time, Top Contributor of the Month



 
Msg#: 4441900 posted 8:42 pm on Apr 17, 2012 (gmt 0)

Basically I'm trying to rewrite the following:

http://www.example.com/this-section/?Tag=other+section
To;
http://www.example.com/this-section/other-section

<snip>

RewriteCond %{QUERY_STRING} ^Tag=(.*)$ [NC]
RewriteRule (.*) ?%1 [R=301,L]

<snip>

I get the following output;

http://www.example.com/this-section/?other+section


Your rule says:
If the first item in the query string begins with "Tag=", then capture the whole requested URI and everything in the query string after the "Tag=" part. Throw away the URI itself, and redirect (not rewrite) to http://www.example.com, making the remainder of the old query string into a new query string.

In other words, your Rule is doing exactly what it was written to do. What's the problem?

nisiwi

5+ Year Member



 
Msg#: 4441900 posted 8:55 pm on Apr 17, 2012 (gmt 0)

Thanks for responding lucy24.

The problem is that it does not do what I want.

Mainly, I don't want the "?" in the output/redirected URL.

I do want to;

a) match those urls with the Tag= query string

b) redirect those URLS to a new URL that does not have the "?Tag="

So, how would I rewrite these rules to do that ?

Lastly, I then want to replace / substitute all occurrences of "+" with "-".

Thanks for looking at this.

lucy24

WebmasterWorld Senior Member, Top Contributor of All Time, Top Contributor of the Month



 
Msg#: 4441900 posted 2:09 am on Apr 18, 2012 (gmt 0)

if I place it at the end of the rewrite rule I get the whole query string added onto the URL once again.

There's something you're not telling us, because that is the exact opposite of what is supposed to happen when a RewriteRule (of any kind) has a trailing ? in the target.

I then want to replace / substitute all occurrences of "+" with "-"

Do you mean all the places where a space in the original query was changed in transit into a + sign? You can do it in htaccess if there is a very small number of them: make successive Rules for, say, 4 plusses, 3 plusses and so on. But if there is potentially any number of them-- for example if there's a free-response section in the query string-- you will have to let php deal with it.


In theory this kind of open-ended replace can be done with the [N] flag. But you will notice that Apache itself says "Use with extreme caution." I am not entirely certain whether [N] means "run this single Rule over and over until everything rinses clean" or "start mod_rewrite over again from the top". Either way, I'm not touching it with a barge pole. And you don't see a lot of other people recommending it either. I suspect it is the Apache equivalent of "If you have to ask, you can't afford it."
nisiwi

5+ Year Member



 
Msg#: 4441900 posted 1:17 pm on Apr 18, 2012 (gmt 0)

Thanks again lucy24.

If I'm not telling anything, it's not because I'm hiding anything.

It's damn frustrating and I can't understand why it's not working.

Here are the other main htaccess rules above that one;

Options +FollowSymlinks
RewriteEngine On

Below that I have the standard wordpress and some caching rules.

OK on the recursive passes: I've got an example that does something similar to change the symbols, as it only ever goes to 2, max 3.

However, until I get this query string thing sorted I can't move to the next stage.

Any ideas what it could be ?

nisiwi

5+ Year Member



 
Msg#: 4441900 posted 1:25 pm on Apr 18, 2012 (gmt 0)

Oh yes, I've tested just the rules as I have them without all the other stuff I'm using on:

htaccess.madewithlove.be

And it shows the exact same problem output.

So there are no other rules interfering.

Somehow this rule is not quite correct.

lucy24

WebmasterWorld Senior Member, Top Contributor of All Time, Top Contributor of the Month



 
Msg#: 4441900 posted 9:52 pm on Apr 18, 2012 (gmt 0)

This is the sticking point:
if I place it at the end of the rewrite rule I get the whole query string added onto the URL once again.


Let's see the exact wording of the RewriteRule that has a ? at the end. I can't help but suspect that a [QSA] flag sneaked into the rule, because I can't think of anything else that would make the query not go away.
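To illustrate the difference being described, here is a minimal sketch. The paths /old-page and /new-page and the from=old parameter are hypothetical, invented only for this example. By default, a target that carries its own query string replaces the original one; [QSA] combines the two instead.

```apache
RewriteEngine On

# Default behaviour: the target's query string replaces the original.
#   /old-page?Tag=x  redirects to  /new-page?from=old
RewriteRule ^old-page$ /new-page?from=old [R=301,L]

# With QSA the original query string is appended to the new one.
#   /old-page?Tag=x  redirects to  /new-page?from=old&Tag=x
# (Shown commented out as an alternative to the rule above.)
# RewriteRule ^old-page$ /new-page?from=old [QSA,R=301,L]
```

So if a stray [QSA] is present, the old query string would be expected to come back even though the target was written to replace it.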

nisiwi

5+ Year Member



 
Msg#: 4441900 posted 10:09 pm on Apr 18, 2012 (gmt 0)

If you place the question mark on the end (as you should do) and run just these rules in a simulator such as;

[htaccess.madewithlove.be...]

You get the query string attached once again. In fact the URL looks exactly the same as the original.

Here's what I used:

RewriteCond %{QUERY_STRING} ^Tag=(.*)$ [NC]
RewriteRule (.*) %1? [R=301,L]

If this was correct it should work in the simulator but it doesn't ?

I've probably done something wrong and I'm just blind to it.

Again, appreciate another pair of eyes on this.
Thanks again lucy24.

nisiwi

5+ Year Member



 
Msg#: 4441900 posted 10:10 pm on Apr 18, 2012 (gmt 0)

I also tried with the same result:

RewriteCond %{QUERY_STRING} ^Tag=(.+)$ [NC]
RewriteRule (.*) %1? [R=301,L]

nisiwi

5+ Year Member



 
Msg#: 4441900 posted 10:13 pm on Apr 18, 2012 (gmt 0)

Oh! weird.

I just ran this again on the live site and... it's working?

RewriteCond %{QUERY_STRING} ^Tag=(.*)$ [NC]
RewriteRule (.*) %1? [R=301,L]

Not sure what I've done...maybe it was a typo somewhere.
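For reference, a hedged sketch of the same idea written out with an explicit path and a full target URL. It assumes the .htaccess file sits in the document root and that www.example.com stands in for the real host; the explicit ^(this-section)/$ pattern is an assumption added for the example, not the rule posted above.

```apache
RewriteEngine On

# Trailing "?": the target ends with an empty query string, so the original
# "?Tag=..." is not re-appended. %1 is the value captured by the RewriteCond.
RewriteCond %{QUERY_STRING} ^Tag=(.+)$ [NC]
RewriteRule ^(this-section)/$ http://www.example.com/$1/%1? [R=301,L]

# Apache 2.4 and later: the QSD flag discards the original query string,
# so the trailing "?" is no longer needed. Use one form, not both.
# RewriteCond %{QUERY_STRING} ^Tag=(.+)$ [NC]
# RewriteRule ^(this-section)/$ http://www.example.com/$1/%1 [QSD,R=301,L]
```

With either form, a request for /this-section/?Tag=other+section should redirect to /this-section/other+section, leaving only the "+" signs to deal with.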

nisiwi

5+ Year Member



 
Msg#: 4441900 posted 10:34 pm on Apr 18, 2012 (gmt 0)

It's all working perfectly now. For the "+" to "-" rewrite I used:

RewriteRule ^(section1)/(section2)/(.*)\+(.*)\+(.*)\+(.*)$ $1/$2/$3-$4-$5-$6 [R=301,L]
RewriteRule ^(section1)/(section2)/(.*)\+(.*)\+(.*)$ $1/$2/$3-$4-$5 [R=301,L]
RewriteRule ^(section1)/(section2)/(.*)\+(.*)$ $1/$2/$3-$4 [R=301,L]

Thanks for helping me relook at this lucy24.

nisiwi

5+ Year Member



 
Msg#: 4441900 posted 10:37 pm on Apr 18, 2012 (gmt 0)

Ok, I tested this in the simulator and it doesn't bloomin work as expected!

Live site works as expected. Don't rely on simulators that don't work ;-)

lucy24

WebmasterWorld Senior Member, Top Contributor of All Time, Top Contributor of the Month



 
Msg#: 4441900 posted 7:06 am on Apr 19, 2012 (gmt 0)

Oh! How infuriating. The whole point of an emulator is to reduce your chances of blowing up the real site, not to increase your desire to blow your brains out.

Maybe you just need a different (emu|simu)lator. I've been using MAMP (the basic free version) for a while and it works nicely. Apart from having to hunt down a utility that makes dot-files visible, because I have to put "real" htaccess files in all those directories. There's also a WAMP and, I assume, a XAMPP or LAMP or something along those lines.

:: sitting on hands to avoid commenting on
(.*)\+(.*)\+(.*)\+(.*)
at least just yet ::

nisiwi

5+ Year Member



 
Msg#: 4441900 posted 7:32 am on Apr 19, 2012 (gmt 0)

I'm no htaccess expert so I chose the shorthand (lazy) method of replacing those "+"s, which may not be the most appropriate (but worked).

The alternative might be to use (for 3 +'s):

([^+]+)\+([^+]+)\+([^+]+)\+([^.]+)

Would that be a more reliable option then lucy24 ?

nisiwi

5+ Year Member



 
Msg#: 4441900 posted 7:34 am on Apr 19, 2012 (gmt 0)

Sorry, should be:

([^+]+)\+([^+]+)\+([^+]+)\+(.*)

nisiwi

5+ Year Member



 
Msg#: 4441900 posted 7:38 am on Apr 19, 2012 (gmt 0)

nope, that didn't work for me

lucy24

WebmasterWorld Senior Member, Top Contributor of All Time, Top Contributor of the Month



 
Msg#: 4441900 posted 8:18 am on Apr 19, 2012 (gmt 0)

The alternative might be to use (for 3 +'s):

([^+]+)\+([^+]+)\+([^+]+)\+([^+]+)$

Yes, that's what you should be doing. For two reasons:

One, to keep mod_rewrite from having to backtrack over and over and over again each time it discovers it was supposed to capture another plus.

And two, you want to be sure you're stopping at each and every plus.

Can you be confident that there will never be more than eight of them? Take a deep breath now:

RewriteRule ^(section1/section2/[^+]*)\++([^+]+)\++([^+]+)\++([^+]+)\++([^+]+)\++([^+]+)\++([^+]+)\++([^+]+)\++([^+]+)$ http://www.example.com/$1-$2-$3-$4-$5-$6-$7-$8-$9 [R=301,L]

Everything up to the first plus can go in a single capture. That's $1. That gives you up to eight more plus-and-not-plus groups. And then repeat with similar Rules for seven down to one.

Note that I sneakily changed \+ to \++ because if you have more than one of them, I assume you want to collapse them into a single -.

The very first not-plus can be [^+]* in case the next filename starts with plusses. But after that, you don't want
\+[^+]*\+

because that would allow
++

to become
--

instead of collapsing to
-
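The "similar Rules for seven down to one" follow the same shape. As a sketch only, here are the two smallest cases, using the same section1/section2 placeholders and www.example.com host as the rule above; the real directory names and host are assumptions to be filled in.

```apache
RewriteEngine On

# Two runs of plus signs: word+word+word becomes word-word-word
RewriteRule ^(section1/section2/[^+]*)\++([^+]+)\++([^+]+)$ http://www.example.com/$1-$2-$3 [R=301,L]

# One run of plus signs: word+word becomes word-word
RewriteRule ^(section1/section2/[^+]*)\++([^+]+)$ http://www.example.com/$1-$2 [R=301,L]
```

Because every capture except the first excludes "+" and the pattern is anchored at both ends, each rule can only match a URL with exactly that many runs of plus signs, so the two rules should not step on each other.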

nisiwi

5+ Year Member



 
Msg#: 4441900 posted 9:53 am on Apr 19, 2012 (gmt 0)

excellent insight lucy24, thank you

ah, yes, much more efficient, I see

the structure/occurrence of the "+" in the urls is very strict/uniform.

1 maybe 2 in total and only ever 1 separating each word

However I see your example and will give it a whirl.

Thanks again!

nisiwi

5+ Year Member



 
Msg#: 4441900 posted 10:03 am on Apr 19, 2012 (gmt 0)

One small change in the redirected URL

/$1-$2 should be /$1/$2 to maintain the correct output.

Works well. Thank you lucy24.

g1smd

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 4441900 posted 1:28 pm on Apr 23, 2012 (gmt 0)

Your redirect target should include protocol and domain name. Many of the previous example code snippets did not have this.
