
Apache Web Server Forum

    
Redirecting url with special characters
phpmaven

10+ Year Member



 
Msg#: 4108380 posted 4:15 pm on Apr 1, 2010 (gmt 0)

I realize that this has been dealt with in quite a few threads, but I've tried all of the examples that I've seen, and I just can't get it to work.

I'm trying to redirect the following url:
http://www.example.com/> Blue Widgets</a>

Which is being seen by Apache as:
http://www.example.com/%3E%20Blue%20Widgets%3C/a%3E

I've tried all of the examples I've seen of escaping the url, but it still just 404s on me.

Any guidance would be greatly appreciated.

Mark

 

g1smd

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 4108380 posted 4:46 pm on Apr 1, 2010 (gmt 0)

Have you got just one of these or a whole bunch?

I worked on a site last year that had lots of duff incoming links with extraneous spaces and punctuation in the URL.

For that, we created a landing page and then simply redirected any URL request with a space, bracket, comma, or % sign in it to that page - the worry being malicious incoming 'bad' links with 'bad' words in them that we might otherwise have 'corrected' to point at a real content page, and hence associate it with the unwanted 'bad' words.

We sacrificed the unindexed landing page for that. That page had links to major site sections, a few featured products, a button to report linking problems, and so on.

WebmasterTools was also great for finding duff incoming links from other sites. Do make sure you check both the www and non-www WMT reports for your site.
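The catch-all redirect described above could be sketched roughly as follows. This is only an illustration, not the code used on that site: the landing-page path /bad-link-landing-page is a hypothetical name, the exact character list is an assumption based on "space, bracket, comma, or % sign", and the 301 status is a guess, since the post only says the requests were "redirected".

```apache
RewriteEngine On

# THE_REQUEST holds the raw request line, e.g. "GET /%3E%20Blue HTTP/1.1",
# so a space in a bad link arrives as %20 and is caught by the % test.
# [^?\ ]* keeps the test inside the URL-path, so query strings are ignored.
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /[^?\ ]*[%,()\[\]<>]
# Hypothetical landing page; substitute the real path on your site.
RewriteRule ^ http://www.example.com/bad-link-landing-page [R=301,L]
```

Because the landing-page URL itself contains none of the tested characters, the redirected request does not match the condition again. Note that this also catches legitimate URLs containing percent-encoded characters, so it only suits sites whose real URLs never use them.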

phpmaven

10+ Year Member



 
Msg#: 4108380 posted 4:55 pm on Apr 1, 2010 (gmt 0)

Actually WebmasterTools is where I discovered it. It's just one incoming link, but I would still like to benefit from the "link juice" and not just 404 it. I certainly don't want to set up any rules that would just 301 any wacky URL to my home page.

phpmaven

10+ Year Member



 
Msg#: 4108380 posted 5:48 pm on Apr 2, 2010 (gmt 0)

I tried the following and it just 404s:

RewriteRule ^>\ Blue\ Widgets</a>
RewriteRule ^\%3E\%20Blue\%20Widgets\%3C/a\%3E

And various other combinations and I can't get it to work.

I would appreciate a bit of guidance.

Thanks,

Mark

jdMorgan

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 4108380 posted 6:47 pm on Apr 2, 2010 (gmt 0)

You need to look at your raw server access log and see exactly what URL-path is being requested. It's evident from the rules that you posted that you aren't quite sure, and for mod_rewrite code, you need to be very sure...

If Apache isn't decoding this URL-path as expected, then the more-complex

RewriteCond %{THE_REQUEST} ^[A-Z]+\ /\%(25)*3[Ee]\%(25)*20Blue\%(25)*20Widgets[^\ ]*\ HTTP/
RewriteRule Blue http://www.example.com/path-to-actual-blue-widget-page [R=301,L]

should work.

Note that the un-anchored RewriteRule pattern "Blue" is used to reduce the number of requests for which the RewriteCond must be processed. If this RewriteRule pattern doesn't match, the RewriteCond won't even be evaluated. I used only "Blue" so that we can be sure that it will match, regardless of whether the surrounding characters are decoded or remain URL-encoded.

The "(25)*" subpatterns appearing in the RewriteCond pattern allow for multiply-encoded characters. For example, all of %20, %2520, %25252520, and %25252525252520 will be decoded by Apache to a single space.

I also didn't exactly-match the 'tail' of the malformed URL-path in the RewriteCond pattern, matching it with the generic "zero or more characters, anything but a space" subpattern following "Widgets" and preceding the " HTTP/" at the end.

Once you get something working, you can go ahead and make the patterns more-specific if you like, to improve performance slightly.

Jim
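As an illustration of the "more-specific" patterns Jim mentions, a tightened variant might look like the sketch below. It assumes the raw request is exactly the encoded URL quoted in the first post (/%3E%20Blue%20Widgets%3C/a%3E) and simply extends Jim's condition to match the tail explicitly. Check it against your own access log before relying on it.

```apache
# Same redirect, but with the tail matched exactly: %3C/a%3E followed by " HTTP/".
# Each (25)* still tolerates multiply-encoded characters, and [Cc] / [Ee]
# accept either case of the hex digits.
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /\%(25)*3[Ee]\%(25)*20Blue\%(25)*20Widgets\%(25)*3[Cc]/a\%(25)*3[Ee]\ HTTP/
RewriteRule Blue http://www.example.com/path-to-actual-blue-widget-page [R=301,L]
```

With the tail spelled out, a request such as /%3E%20Blue%20Widgets-and-more would no longer be redirected, which may or may not be what you want for a single known bad link.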

phpmaven

10+ Year Member



 
Msg#: 4108380 posted 9:13 pm on Apr 2, 2010 (gmt 0)

Thank you Jim,
As usual, you are "da' man" when it comes to anything Apache.

Mark

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved