Apache Web Server Forum

    
Disable urlencode
can i disable or decode user lands?
abushahin




msg:4441105
 7:53 pm on Apr 15, 2012 (gmt 0)

Hi Guys,
I'm having problems with some urls that are being sent encoded, i.e. the symbol = is encoded and looks like %3D, and likewise all other symbols such as & etc.
Now I think my host has set it up so that it doesn't decode, and it returns 404 although if decoded the page is there.
Is there a way to overcome this by decoding or something?
Any help much appreciated!

 

lucy24




msg:4441150
 11:00 pm on Apr 15, 2012 (gmt 0)

Before we start: Please say you're talking about things in the query string, not in the body of the URL.

abushahin




msg:4441170
 12:00 am on Apr 16, 2012 (gmt 0)

Yes it is in the query string, so id=9 would look like id%3D9
It's causing many pages to be 404

lucy24




msg:4441260
 6:16 am on Apr 16, 2012 (gmt 0)

Now I think my host has set it up so that it doesn't decode, and it returns 404 although if decoded the page is there.

Is your host bonkers? Non-alphanumerics are always encoded in transit. And then they need to get disencoded at the far end.

What happens to your query strings when they arrive in your domain? If it's your own php script, all you'd have to do is tweak a few lines. But something tells me you're dealing with a CMS that expects to receive its queries in a particular form.

Option 1 of course is to have a chat with your host and confirm that it's their doing. I kinda think there's something you can set in htaccess, but can't find it :( Special characters going out from your system are not a problem; there's a simple flag in mod_rewrite. It's the incoming characters that are troublesome.

Now, how many different characters are involved? If it's nothing but ? and & and = you might manage to deal with it in mod_rewrite. Otherwise you will need to get friendly with php. Do you ever meet the specific character %25? That's when things get really messy, because it means some characters have been encoded twice.
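To make the %25 point concrete, here is a small php sketch. It only uses the built-in rawurlencode() and rawurldecode(); the helper name decode_fully() and the cap of five rounds are made up for illustration, not anything from this thread.

```php
<?php
// Hypothetical helper: keep decoding until the string stops changing,
// with a cap on rounds so a hostile input cannot loop for long.
function decode_fully(string $s, int $max_rounds = 5): string
{
    for ($i = 0; $i < $max_rounds; $i++) {
        $decoded = rawurldecode($s);
        if ($decoded === $s) {
            break; // nothing left to decode
        }
        $s = $decoded;
    }
    return $s;
}

// "?" encoded once is %3F; encode that again and the % itself becomes %25.
$once  = rawurlencode('?');   // %3F
$twice = rawurlencode($once); // %253F
```

Be careful with this kind of repeated decoding: if a value legitimately contains a literal percent sign, decoding more often than it was encoded will mangle it.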

abushahin




msg:4441308
 7:34 am on Apr 16, 2012 (gmt 0)

No it's not a cms, I just need to convert =, ?, & and that will be fine. Any ideas?
I will get in touch with the hosts and see what they can do

lucy24




msg:4441372
 9:54 am on Apr 16, 2012 (gmt 0)

Well, it really depends how many of them there are. (Er, I mean how many escaped characters, not how many hosts.) If each request comes with a string of seventeen queries

?ab=1&cd=2&ef=3&gh=4&ij=5

and so on and so on, you can give up right now. Or rather, start learning php ;)

But if your query strings are short, you can do it in htaccess alone...

:: massive deleting here as I realize that Apache doesn't know it's a query string when it starts with ... and further digression for severe hysterics as I find that Character Viewer has chosen this of all moments to play dead, or could it possibly be last night's Software Update? ... %3F ::

Have I got this right? Requests are landing in your server as, let's say,

/directory/index.php%3Fname%3DSmith%26rank%3Dcorporal%26serial_number%3D12345

Like that? So you can't even use the Query String function of mod_rewrite, because it doesn't know there is a query string.
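The same thing can be seen from the php side. This is only a sketch using the example request above and the built-in parse_url(), rawurldecode() and parse_str(); the variable names are invented for illustration.

```php
<?php
// The example request from the post above (illustrative only).
$raw = '/directory/index.php%3Fname%3DSmith%26rank%3Dcorporal%26serial_number%3D12345';

// Encoded: there is no literal "?", so no query component is found.
$query_before = parse_url($raw, PHP_URL_QUERY);

// Decoded: the "?" is back, so the path and the query can be split apart.
$decoded     = rawurldecode($raw);
$path_after  = parse_url($decoded, PHP_URL_PATH);
$query_after = parse_url($decoded, PHP_URL_QUERY);

// Turn the recovered query string into a name => value array.
parse_str((string) $query_after, $params);
```

Decoding the whole string in one go assumes that no value contains an encoded & or = of its own; if one does, it will be mistaken for a separator.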

Do you want to redirect or rewrite? It properly ought to be a redirect-- except that you then risk having everything get re-escaped the moment it steps outside the door. You can append the [NE] flag, but I'm not sure how long it remains in effect.

If there are no more than, say, three queries, you do it like this (all one line):

RewriteRule ^([^%]*)%3F([^%]+)%3D([^%]*)%26([^%]+)%3D([^%]*)%26([^%]+)%3D([^%]*)$ http://www.example.com/$1?$2=$3&$4=$5&$6=$7 [R=301,L,NE]

The magic word obviously is [^%] meaning "don't capture anything that has been escaped". Here I have toggled between + and * assuming that a query will always have a name, but its value may be empty. And you know better than I do what comes before the queries. If it's always the same thing, spell it out.

Technically you could handle as many as four queries this way, since Regular Expressions let you juggle up to nine captures. List them from longest to shortest for safety's sake.

enigma1




msg:4441457
 1:28 pm on Apr 16, 2012 (gmt 0)

No it's not a cms

What's your code that retrieves the URI?

abushahin




msg:4441502
 2:52 pm on Apr 16, 2012 (gmt 0)

?countries=Australia&type=1&course=2&uni=3

and others

?uid=183&countries=United+Kingdom&uni=0&type=1&course=0

wilderness




msg:4441505
 3:00 pm on Apr 16, 2012 (gmt 0)

Those are strings.

Code implies something like the following (copied from another thread):

RewriteRule ^Libraries/Emulation/NES/(([^/]+/)*)([^/.]+\.(html?|zip))$

lucy24




msg:4441519
 3:32 pm on Apr 16, 2012 (gmt 0)

Start learning php ;)

abushahin




msg:4441545
 4:27 pm on Apr 16, 2012 (gmt 0)

Currently there isn't any code in the htaccess. I can't control what comes from other websites, i.e. the query strings.

enigma1




msg:4441551
 4:36 pm on Apr 16, 2012 (gmt 0)

I can't control what comes from other websites, i.e. the query strings.

Yes you can. So do you have some static html pages, or are you using a server language like php to generate the pages?

abushahin




msg:4441569
 5:14 pm on Apr 16, 2012 (gmt 0)

It's php I'm using, thanks for the replies

enigma1




msg:4441582
 5:34 pm on Apr 16, 2012 (gmt 0)

If you're using php, set up the scripts to use the $_GET array to get the parameters passed. If for some reason you need the query exactly as passed, but decoded:

$url_query = rawurldecode($_SERVER['REQUEST_URI']);

then the $url_query will contain the proper string

Since the uri is incorrect the 404 handler will be invoked so you need to add a custom error handler in the htaccess. The custom error handler will contain the decoding part and then you need to output the right headers depending what you want to do:

in .htaccess:

ErrorDocument 404 /error_handler.php

And in error_handler.php which is in the root of the domain along with the other php scripts:

$url_query = rawurldecode($_SERVER['REQUEST_URI']);
$parts_array = explode('?', $url_query);
$script = basename($parts_array[0]);
if( is_empty($script) ) $script = 'index.php';

if( is_file($script) ) {
    header("HTTP/1.1 200");
    require($script);
} else {
    header("HTTP/1.1 404");
    echo 'Page not found';
}
exit();
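One thing the handler above does not do is hand the decoded parameters to the script it includes, so $_GET may still be empty there. A hedged sketch of how the split could be done, assuming REQUEST_URI still holds the original encoded request when the handler runs. The function name split_encoded_uri() is invented for illustration and is not part of php.

```php
<?php
// Hypothetical helper: split a percent-encoded request URI into the
// script name and its parameters.
function split_encoded_uri(string $request_uri): array
{
    $decoded = rawurldecode($request_uri);

    // Split on the first "?" only, in case a value contains one.
    $parts  = explode('?', $decoded, 2);
    $script = basename($parts[0]);
    if ($script === '') {
        $script = 'index.php'; // same fallback as the handler above
    }

    // parse_str() fills $params with name => value pairs.
    $params = [];
    if (isset($parts[1])) {
        parse_str($parts[1], $params);
    }
    return [$script, $params];
}

// In the 404 handler one might then do (not tested on a live server):
// [$script, $params] = split_encoded_uri($_SERVER['REQUEST_URI']);
// $_GET = $params;
```

Because basename() strips any directory part, this only ever points at a file in the handler's own directory, which matches what the original handler does.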

abushahin




msg:4441597
 6:37 pm on Apr 16, 2012 (gmt 0)

Many thanks, I will try that out!

enigma1




msg:4441952
 1:27 pm on Apr 17, 2012 (gmt 0)

I had an error in the code for the empty check. It should read

if( empty($script) ) $script = 'index.php';

abushahin




msg:4441998
 2:44 pm on Apr 17, 2012 (gmt 0)

Thanks for that, but I tried as you said and it didn't work. Is there a way to decode using htaccess, or on the page with php? I.e. on the page the query lands on, can I extract the values, including the symbols = & etc, with rawurldecode($_SERVER['REQUEST_URI']);

enigma1




msg:4442106
 6:22 pm on Apr 17, 2012 (gmt 0)

That's exactly what the code does. Which part didn't work? If the handler is not invoked because the script is there, you could use the $_GET array to process the parameters. Otherwise, if there is a 404 error, the error_handler.php named in the .htaccess is invoked.

lucy24




msg:4442168
 8:57 pm on Apr 17, 2012 (gmt 0)

Will the php recognize the query as a query even if the question mark itself is encoded as %3F ? I think that's where the problem started. Once you can pull out the query string, you're home free.

Luckily there's only one of them, even if there are dozens of individual queries. So at worst you'd have to pass the original request through htaccess once, convert that single %3F to ? and then send it along to php.

abushahin




msg:4442216
 11:41 pm on Apr 17, 2012 (gmt 0)

enigma1, I'm not sure why it's not working. Is there a way to convert %3D to = (and %26 to the & sign) with htaccess? Or is that a limitation?

phranque




msg:4442252
 4:52 am on Apr 18, 2012 (gmt 0)

Question about %3F and %3D embedded in inbound links:
http://www.webmasterworld.com/apache/4138119.htm

lucy24




msg:4442289
 7:02 am on Apr 18, 2012 (gmt 0)

He used the [N] flag!
He used the [N] flag!
He used the [N] flag!

Just hours ago I was telling someone a few doors down that you could, in theory, use the [N] flag but I would never ever dare.

And that's why jdMorgan is jdMorgan.

Now, what if you do the no-brainer version?

RewriteCond %{REQUEST_URI} ^([^%]*)%3F(.*) [NC]
RewriteRule %3F http://www.example.com/%1?%2 [R=301,NE]

If the original URL contains a percent-encoded question mark, go to Condition to make sure the ? is the first percent-encoded thing in the URL. If it is, everything before the ? becomes URL and everything after the ? becomes query. Kick it out the door (full Redirect). When it comes back, it will be recognized as a query string and php will deal with it.

You could possibly send it straight to php at this point, but it may depend on what order things are expected to happen in.

I put all the captures in the Condition because most URLs will not contain a %3F at all, so let's not make the server do any work it doesn't have to.

abushahin




msg:4442497
 2:08 pm on Apr 18, 2012 (gmt 0)

Thank you Lucy24. Is there a way to incorporate %3D as well as %26 in the htaccess? I'm rubbish at this. :(

lucy24




msg:4442673
 10:03 pm on Apr 18, 2012 (gmt 0)

You can do anything in htaccess so long as it only occurs once. But if you're looping around and around to pick up every last ? and = and & then you have to use [N] as in jdMorgan's example.

These Forums are not real big on smilies, but
[cosgan.de...]
about sums it up.


If you have query strings, you have a function that deals with the query string. You only need to decode that one leading ? so the query will be recognized as a query.

For php-- or any other programming language-- it's trivial to say "replace all occurrences of A with B". Heck, you can even do it in javascript :) Don't beat your brains out trying to do something in htaccess if you already have another way to deal with it.
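As a rough illustration of "replace all occurrences of A with B" in php, here is a sketch that puts only the three separators back and leaves every other encoded character alone. The function name restore_separators() is made up for this example; str_ireplace() is the real built-in doing the work.

```php
<?php
// Swap only the three encoded separators back to their literal form.
// str_ireplace() is case-insensitive, so %3f and %3F are both handled.
function restore_separators(string $uri): string
{
    return str_ireplace(['%3F', '%3D', '%26'], ['?', '=', '&'], $uri);
}
```

This assumes that none of the values is supposed to contain an encoded ?, = or & of its own; if one does, it gets converted along with the real separators.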

Leosghost




msg:4442705
 11:48 pm on Apr 18, 2012 (gmt 0)

Don't beat your brains out trying to do something in htaccess if you already have another way to deal with it.

Ought to be "pinned" somewhere hereabouts ..or on a "post it" on every monitor.. :)

abushahin




msg:4443052
 4:25 pm on Apr 19, 2012 (gmt 0)

thanks guys for all the input.

abushahin




msg:4443823
 9:55 pm on Apr 21, 2012 (gmt 0)

RewriteCond %{REQUEST_URI} ^([^%]*)%3F(.*) [NC]
RewriteRule %3F http://www.example.com/%1?%2 [R=301,NE]


That didn't work btw!

phranque




msg:4443869
 3:18 am on Apr 22, 2012 (gmt 0)

what response did you get? any clues in the server log files?
