Translating encoded characters into 'real' characters

What can be done to translate requests for %3F and %3D into 'real' chars?

         

mvl22

9:45 pm on Dec 10, 2002 (gmt 0)

10+ Year Member



Recently, I've been receiving a number of requests on several of my sites (all running Apache 1.3) for URLs ending with

%3Fview%3Dprintable
e.g.
/page.html%3Fview%3Dprintable

These should in fact be requests for
/page.html?view=printable

Is there any way to get Apache to decode the encoded characters to their 'real' equivalents, to eliminate this source of 404s?

I'm familiar with redirects in Apache, to some extent, as my sites have plenty of
Redirect permanent /foo.html /bar.html
stuff (strictly speaking that directive belongs to mod_alias rather than mod_rewrite).

amoore

10:27 pm on Dec 10, 2002 (gmt 0)

10+ Year Member


I suppose you have several options. The one I would most recommend is to find out where these URLs are coming from and fix the source of the problem. That just makes more sense to me than treating the symptoms.

If that's impossible for you for some reason, you can change the URLs that are requested in a variety of ways. I'd probably write a small PerlHandler with a regular expression like this in it:

s/%([0-9A-F][0-9A-F])/pack("C",hex($1))/ige

and run all my requests through that before letting them be handled in the regular way.
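
To make the idea concrete, here is a minimal sketch of the same kind of repair, written in Python purely for illustration rather than as a mod_perl handler. It is deliberately narrower than the substitution above: it only restores %3F and %3D, the two escapes mentioned in the question, and leaves every other escape alone. The function name repair_request_target is hypothetical and not part of Apache or any library.

```python
import re


def repair_request_target(raw_target: str) -> str:
    """Restore an encoded '?' and '=' in a request target (illustrative only)."""
    # A literal '?' means the client already sent a real query string,
    # so there is nothing to repair.
    if "?" in raw_target:
        return raw_target

    # Split at the first encoded '?' (%3F, matched case-insensitively).
    parts = re.split(r"%3f", raw_target, maxsplit=1, flags=re.IGNORECASE)
    if len(parts) == 1:
        return raw_target  # no encoded '?' present; leave the target untouched

    path, query = parts
    # Restore encoded '=' (%3D) in the query part only; other escapes
    # such as %20 are intentionally left as they are.
    query = re.sub(r"%3d", "=", query, flags=re.IGNORECASE)
    return f"{path}?{query}"


if __name__ == "__main__":
    print(repair_request_target("/page.html%3Fview%3Dprintable"))
```

The repaired value could then be used as the target of a redirect, or handed on for normal processing. How that is wired into the server depends on your setup and is not shown here.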

You could put something like that in /page.html, but I'm not sure what language it's written in. That is probably not a good solution since you'd have to do it in each of your pages.

I can't think of a way to do it just using mod_rewrite, but someone may find one.

Hope it helps.

pendanticist

10:53 pm on Dec 10, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'm on Apache too, and I'm seeing similar 404s from Metacarta and FAST-WebCrawler.

[webmasterworld.com...]

Pendanticist.