Forum Moderators: Ocean10000 & incrediBILL & phranque

Redirecting url with special characters

4:15 pm on Apr 1, 2010 (gmt 0)

10+ Year Member

I realize that this has been dealt with in quite a few threads, but I've tried all of the examples that I've seen, and I just can't get it to work.

I'm trying to redirect the following url:
http://www.example.com/> Blue Widgets</a>

Which is being seen by Apache as:

I've tried all of the examples I've seen of escaping the url, but it still just 404s on me.

Any guidance would be greatly appreciated.

4:46 pm on Apr 1, 2010 (gmt 0)

g1smd (WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member, Top Contributor of the Month)

Have you got just one of these or a whole bunch?

I worked on a site last year that had lots of duff incoming links with extraneous spaces and punctuation in the URL.

For that, we created a landing page and simply redirected any URL request containing a space, bracket, comma, or % sign to that page. The worry was malicious incoming 'bad' links with 'bad' words in them: had we 'corrected' those to point at a real content page, that page could have become associated with the unwanted 'bad' words.

We sacrificed the unindexed landing page for that. That page had links to major site sections, a few featured products, a button to report linking problems, and so on.

WebmasterTools was also great for finding duff incoming links from other sites. Do make sure you check both the www and non-www WMT reports for your site.

4:55 pm on Apr 1, 2010 (gmt 0)

10+ Year Member

Actually, WebmasterTools is where I discovered it. It's just one incoming link, but I would still like to benefit from the "link juice" and not just 404 it. I certainly don't want to set up any rules that would just 301 any wacky URL to my home page.

5:48 pm on Apr 2, 2010 (gmt 0)

10+ Year Member

I tried the following and it just 404s:

RewriteRule ^>\ Blue\ Widgets</a>
RewriteRule ^\%3E\%20Blue\%20Widgets\%3C/a\%3E

And various other combinations and I can't get it to work.

I would appreciate a bit of guidance.


6:47 pm on Apr 2, 2010 (gmt 0)

jdmorgan (WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member)

You need to look at your raw server access log and see exactly what URL-path is being requested. It's evident from the rules that you posted that you aren't quite sure, and for mod_rewrite code, you need to be very sure...
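One way to pull the exact request line out of a raw access log is sketched below. It assumes Apache's common or combined log format, where the request line is the first double-quoted field. The two sample log lines, their IP addresses, and the encoded form of the bad URL are made up for illustration; the real log may show a different encoding, which is exactly what needs checking.

```python
import re

# In Apache's common/combined log formats, the request line
# ("METHOD path HTTP/version") is the first double-quoted field.
REQUEST_FIELD = re.compile(r'"([A-Z]+ \S+ HTTP/[\d.]+)"')


def requests_mentioning(lines, needle):
    """Return the raw request lines whose text contains `needle`."""
    found = []
    for line in lines:
        m = REQUEST_FIELD.search(line)
        if m and needle in m.group(1):
            found.append(m.group(1))
    return found


# Hypothetical sample data, NOT taken from the poster's server.
sample_log = [
    '192.0.2.1 - - [02/Apr/2010:17:40:01 +0000] '
    '"GET /%3E%20Blue%20Widgets%3C/a%3E HTTP/1.1" 404 512 "-" "ExampleBot/1.0"',
    '192.0.2.2 - - [02/Apr/2010:17:41:09 +0000] '
    '"GET /index.html HTTP/1.1" 200 1024 "-" "ExampleBrowser/1.0"',
]

hits = requests_mentioning(sample_log, "Widgets")
for hit in hits:
    print(hit)
```

Searching on a plain word such as "Widgets" sidesteps the question of how the surrounding punctuation was encoded, which is the unknown being investigated.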

If Apache isn't decoding this URL-path as expected, then the more-complex

RewriteCond %{THE_REQUEST} ^[A-Z]+\ /\%(25)*3[Ee]\%(25)*20Blue\%(25)*20Widgets[^\ ]*\ HTTP/
RewriteRule Blue http://www.example.com/path-to-actual-blue-widget-page [R=301,L]

should work.

Note that the un-anchored RewriteRule pattern "Blue" is used to reduce the number of requests for which the RewriteCond must be processed. If this RewriteRule pattern doesn't match, the RewriteCond won't even be evaluated. I used only "Blue" so that we can be sure that it will match, without concern for whether the surrounding characters are decoded or remain as URL-encoded entities.

The "(25)*" subpatterns appearing in the RewriteCond pattern allow for multiply-encoded characters. For example, all of %20, %2520, %25252520, and %25252525252520 reduce to a single space once fully URL-decoded, because each %25 is itself an encoded % sign.
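To see why multiply-encoded forms pile up "25"s, here is a small sketch using Python's standard `urllib.parse.unquote`. The helper name `decode_fully` and the `max_passes` limit are illustrative choices, not anything Apache provides; the sketch only shows how the encoding nests.

```python
from urllib.parse import unquote


def decode_fully(text, max_passes=10):
    """URL-decode repeatedly until the text stops changing.

    Returns the final text and the number of passes that changed it.
    """
    passes = 0
    while passes < max_passes:
        decoded = unquote(text)
        if decoded == text:
            break
        text = decoded
        passes += 1
    return text, passes


results = {s: decode_fully(s) for s in ["%20", "%2520", "%25252520"]}
for encoded, (decoded, passes) in results.items():
    print(f"{encoded!r} -> {decoded!r} after {passes} pass(es)")
```

Each extra "25" is one more layer of encoding, so a pattern that tolerates any number of them covers however many times the link was re-encoded on its way to the server.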

I also didn't exactly-match the 'tail' of the malformed URL-path in the RewriteCond pattern, matching it with the generic "zero or more characters, anything but a space" subpattern following "Widgets" and preceding the " HTTP/" at the end.
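Before putting the rule on a live server, the RewriteCond pattern can be sanity-checked offline. The sketch below runs the same regular expression through Python's `re` module. Apache uses PCRE rather than Python's engine, but this pattern uses only syntax the two treat alike. The request lines are hypothetical guesses at what a client might send, not values from the actual log.

```python
import re

# The RewriteCond pattern from the post above, copied unchanged.
PATTERN = re.compile(
    r"^[A-Z]+\ /\%(25)*3[Ee]\%(25)*20Blue\%(25)*20Widgets[^\ ]*\ HTTP/"
)


def request_line(depth, tail="3C/a"):
    """Build a hypothetical request line with each special character
    percent-encoded `depth` times (depth=1 gives %3E and %20)."""
    pct = "%" + "25" * (depth - 1)
    path = f"/{pct}3E{pct}20Blue{pct}20Widgets{pct}{tail}{pct}3E"
    return f"GET {path} HTTP/1.1"


# Singly, doubly, and triply encoded requests should all match.
matches = {d: bool(PATTERN.search(request_line(d))) for d in (1, 2, 3)}

# The loose "anything but a space" tail also accepts no tail at all.
no_tail_match = bool(PATTERN.search("GET /%3E%20Blue%20Widgets HTTP/1.1"))

# A completely unencoded request line does not match, because the
# pattern requires the leading % of an encoded ">".
literal_match = bool(PATTERN.search("GET /> Blue Widgets</a> HTTP/1.1"))

print(matches, no_tail_match, literal_match)
```

If the access log shows a form that this check rejects, the pattern needs adjusting before the RewriteRule can fire.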

Once you get something working, you can go ahead and make the patterns more-specific if you like, to improve performance slightly.

9:13 pm on Apr 2, 2010 (gmt 0)

10+ Year Member

Thank you Jim,
As usual, you are "da' man" when it comes to anything Apache.

