Redirecting a link with 'script.cgi?parameters'

         

davebach

3:14 pm on Dec 8, 2004 (gmt 0)

10+ Year Member



I see similar problems posted, but I'm not sure how to tackle this.

I have been getting some traffic through a link at a popular site, but the link was posted incorrectly and I can't get the person to change it. The link points to a script on my site, but one of the parameter names after the ? in the URL contains some stray encoded markup: %3Cbr%20/%3E to be exact, which, depending on the browser, also seems to be sent as '<br />'.

This causes an error on my site. How can I redirect the bad link to the good link? Maybe just a filter to remove the bad code, or a full redirect to the correct dynamic link. I'd also settle for a redirect to a static page.

I am already using mod_rewrite for some other stuff, so it is enabled.

Thanks,
Dave

jdMorgan

3:56 pm on Dec 8, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Dave,

The server should resolve the encoded characters before mod_rewrite sees the URL, so you should be able to redirect the URLs containing "<br />" to whatever URL you like.


RewriteRule ^([^<]*)<br\ />(.*)$ /$1$2 [R=301,L]

(Note that the space in the regex pattern is 'escaped' by preceding it with a "\" character.)

If the RewriteRule is seeing the encoded URL, then try:

RewriteRule ^([^%]*)\%3Cbr\%20/\%3E(.*)$ /$1$2 [R=301,L]

In both cases, this code should "strip out" the bad characters, and redirect to the cleaned-up URL.

If this doesn't work, it would be helpful if you could post some lines from your server access log showing a few of these requests so we can look at them.

Jim

davebach

5:13 pm on Dec 8, 2004 (gmt 0)

10+ Year Member



Thanks, but this doesn't seem to be working. Looking in my raw access logs, I see the script referenced three different ways:

www.mysite.com/shop/amazon.cgi?input_item=12345&in<br />put_search_type=AsinSearch&input_templates=2

www.mysite.com/shop/amazon.cgi?input_item=12345&in<br%20/>put_search_type=AsinSearch&input_templates=2

www.mysite.com/shop/amazon.cgi?input_item=12345&in%3Cbr%20/%3Eput_search_type=AsinSearch&input_templates=2

So I added this to the .htaccess file along with what you told me:

RewriteRule ^([^<]*)<br\%20/>(.*)$ /$1$2 [R=301,L]

I also swapped out /$1$2 for another URL, but nothing seems to grab that url.

The .htaccess file was in the public_html/ directory, but I also tried it in the /shop/ directory. The first line of the file is "RewriteEngine on".

Dave

jdMorgan

8:00 pm on Dec 8, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



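Ah OK, that clarifies things. The corrupted part is in the query string, not the URL path. RewriteRule patterns only see the path, so the query string has to be tested with RewriteCond: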

RewriteCond %{QUERY_STRING} ^in<br\ />(.*)$ [OR]
RewriteCond %{QUERY_STRING} ^in<br\%20/>(.*)$ [OR]
RewriteCond %{QUERY_STRING} ^in\%3Cbr\%20/\%3E(.*)$
RewriteRule ^shop\.amazon\.cgi$ /shop.amazon.cgi?in%1 [R=301,L]

Jim

davebach

9:33 pm on Dec 8, 2004 (gmt 0)

10+ Year Member



Thanks, but I can't get that to work. I know the RewriteEngine is on and working, but the conditions don't seem to be catching it. And even if they did, would that rule work? I'm trying to understand the extra escaped periods/dots.

Dave

jdMorgan

10:04 pm on Dec 8, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



No, you're right. It won't work, because I got in a hurry and left off a good part of your query string... Sorry! Try this:

RewriteCond %{QUERY_STRING} ^([^&]+)&in<br\ />(.*)$ [OR]
RewriteCond %{QUERY_STRING} ^([^&]+)&in<br\%20/>(.*)$ [OR]
RewriteCond %{QUERY_STRING} ^([^&]+)&in\%3Cbr\%20/\%3E(.*)$
RewriteRule ^shop\.amazon\.cgi$ /shop.amazon.cgi?$1&in%2 [R=301,L]

In order to be taken as literals, the following characters must be escaped by preceding them with a backslash:
^ $ ? + * . ( ) { } [ ] | \

otherwise, they will have special meanings as regular-expression operators.
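
To illustrate the escaped dots that were asked about, here is a small sketch of the difference. It assumes the shop/amazon.cgi path from the log lines above and an .htaccess file in the document root; the two rules are for illustration only and are not meant to be installed as-is:

```apache
# Unescaped: the "." matches any single character, so this pattern
# would also match a path such as "shop/amazon-cgi".
# The "-" substitution means "leave the URL unchanged".
RewriteRule ^shop/amazon.cgi$ - [L]

# Escaped: "\." matches only a literal dot, so only
# "shop/amazon.cgi" matches.
RewriteRule ^shop/amazon\.cgi$ - [L]
```

In practice the unescaped form usually still works, since a real dot is also "any character"; escaping just keeps the pattern from matching more than intended.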

Jim

davebach

11:27 pm on Dec 8, 2004 (gmt 0)

10+ Year Member



Thanks Jim! I modified the RewriteRule to add back in the input_item that got stripped out. For anyone else who might need to do something like this, this was the final rule based on the original example URL:

RewriteRule ^shop/amazon\.cgi$ /shop/amazon.cgi?input_item=12345$1&in%2 [R=301,L]

Thanks again,
Dave
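
A note for anyone adapting this: in mod_rewrite, $1 refers to a group captured by the RewriteRule pattern, while %1 and %2 refer to groups captured by the last matched RewriteCond. The rule pattern in this thread has no groups, so $1 expands to nothing, which would explain why input_item had to be put back by hand. A minimal sketch of a more general version follows. It assumes the same /shop/amazon.cgi path, an .htaccess file in the document root, and the three corrupted forms from the access log above, and it has not been tested on the server discussed here:

```apache
RewriteEngine on

# Match the three corrupted forms seen in the access log.
# %1 captures everything before "&in"; %2 captures everything
# after the stray <br /> markup.
RewriteCond %{QUERY_STRING} ^([^&]+)&in<br\ />(.*)$ [OR]
RewriteCond %{QUERY_STRING} ^([^&]+)&in<br\%20/>(.*)$ [OR]
RewriteCond %{QUERY_STRING} ^([^&]+)&in\%3Cbr\%20/\%3E(.*)$

# %1 and %2 are RewriteCond backreferences, so the original
# input_item value is carried over instead of being hard-coded.
RewriteRule ^shop/amazon\.cgi$ /shop/amazon.cgi?%1&in%2 [R=301,L]
```

Because the redirected query string no longer contains the stray markup, none of the conditions match on the second request, so the redirect should not loop.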