Forum Moderators: coopster

safe characters

         

smallcompany

3:21 am on May 7, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Recently, I asked this under HTTP and got nowhere. Now I have more data after doing some testing in PHP, so the question should fit better here this time.

I run an affiliate-based business, and I use PHP scripting to put together the data that I pass to the sales report (like the SID in CJ).

The data is composed of things like the search query, the referring URL, my own data from PPC, and so on.

The end result is like this ("v" for variable):

$total= "$v1-$v2-$v3-$v4";

When this shows up in reports and I download it into Excel, I use the hyphen sign (minus) to convert text to columns so I get it nicely divided to do sorting, analysis, etc.

The problem is that "-" is sometimes part of one of the variables. I can't use the plus sign (+), as that one already represents a space in search queries.
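To make the problem concrete, here is a minimal sketch with made-up sample values (the search query, domain, and campaign strings are assumptions, not real report data). It shows how a hyphen inside one value produces an extra column when the string is split:

```php
<?php
// Hypothetical sample values; $v2 contains the delimiter itself.
$v1 = 'blue widgets';
$v2 = 'www.my-site.com';
$v3 = 'ppc';
$v4 = 'ad42';

$total = "$v1-$v2-$v3-$v4";

// Splitting on '-' mimics Excel's text-to-columns step.
$columns = explode('-', $total);

// Four values went in, but the split yields more pieces than that,
// because the hyphen in the domain name is treated as a delimiter too.
echo count($columns), "\n";
print_r($columns);
```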

I tried something like % or $, but they get URL-encoded (% becomes %25, and so on), which I don't like.

Then I thought I would use two characters like QQ (unlikely to show up in any of the variables) but to my surprise, I got the last variable only at my end.

Why did the other variables get cut off? I don't understand: regular characters get passed with no trouble, yet my "QQ" messed it up?
I'm really curious to know this.

Finally, is there anything else besides + or - that I could use? I guess I'll have to use something like "--" or "+-" unless I learn something new there like I've learned with QQ.

Thanks

TheMadScientist

3:35 am on May 7, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



When I need to 'delimit' something I usually use 'almost guaranteed to be unique' and go with something like: |.| or !.!

Not too many 'regular' occurrences of those I've run into.
Not sure on your QQ issue, but hope it gives you some ideas.

smallcompany

4:34 am on May 7, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Thanks.

I'm trying to use a single character only. As I said, I already tried characters other than "-" and "+" and got them URL-encoded.

I guess I could use something like "-.-" which would make it unique and would not get encoded.

I still wonder about why QQ ruined the complete variable.

And I also wonder if there is a reference that clearly lists what else besides +, -, and . does not get encoded.

Thanks

TheMadScientist

5:11 am on May 7, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Not sure of the location of a list, but to find out you could:

echo urlencode('Special Chars Here');
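As a rough sketch of that idea, the loop below probes a handful of candidate delimiters and keeps the ones that urlencode() returns unchanged. The candidate list is only an illustrative assumption, and this tests PHP's own urlencode(), not whatever encoding a partner's platform applies further down the line:

```php
<?php
// Candidate delimiter characters to probe (illustrative, not exhaustive).
$candidates = str_split('-_.~|^!*+$%,;:');

$unchanged = [];
foreach ($candidates as $char) {
    // A character passes if urlencode() hands it back untouched.
    if (urlencode($char) === $char) {
        $unchanged[] = $char;
    }
}

// Print the survivors, then show how each rejected character is encoded.
echo 'Unchanged: ', implode(' ', $unchanged), "\n";
foreach (array_diff($candidates, $unchanged) as $char) {
    echo $char, ' => ', urlencode($char), "\n";
}
```

Note that rawurlencode() follows slightly different rules (it leaves the tilde alone, for instance), so the result depends on which function, or which third-party system, does the encoding.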

IntegrityWebDev

2:24 pm on May 7, 2010 (gmt 0)

10+ Year Member



what about tilde? ~

rocknbil

6:49 pm on May 7, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



$v2 = 'some-var';

// replace any hyphen inside the value with an odd character that is NOT your delimiter

$v2 = str_replace('-','^',$v2); // or |, or something else

$total= "$v1-$v2-$v3-$v4";

// on output, always put it back after splitting

$row = explode('-',$total);

for ($i=0;$i<count($row);$i++) { $row[$i] = str_replace('^','-',$row[$i]); }

Homegrown solution, but will work.
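The same idea can be wrapped in a pair of helpers so every field is handled the same way. This is only a sketch: join_fields() and split_fields() are hypothetical names, the sample values are made up, and it assumes the placeholder character (here ^) never occurs in the real data and is not altered in transit. If a partner platform percent-encodes the caret, a different placeholder would be needed.

```php
<?php
// Hypothetical helper: swap the delimiter inside each field for a
// placeholder, then join the fields with the delimiter.
function join_fields(array $fields, $delimiter = '-', $placeholder = '^')
{
    $escaped = [];
    foreach ($fields as $field) {
        $escaped[] = str_replace($delimiter, $placeholder, $field);
    }
    return implode($delimiter, $escaped);
}

// Hypothetical helper: split on the delimiter, then put the original
// delimiter characters back inside each field.
function split_fields($total, $delimiter = '-', $placeholder = '^')
{
    $restored = [];
    foreach (explode($delimiter, $total) as $field) {
        $restored[] = str_replace($placeholder, $delimiter, $field);
    }
    return $restored;
}

// Made-up sample data; the second value contains a hyphen.
$fields = ['blue widgets', 'www.my-site.com', 'ppc', 'ad42'];

$total = join_fields($fields);
$row   = split_fields($total);

echo $total, "\n";
print_r($row);
```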

smallcompany

10:00 pm on May 7, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Thanks.

I did more testing and found that I can't really control what happens once my variables go to my partner's site.
What is more, since they run their business on a few platforms, in some cases characters like ~ or | go through like a charm, while in others they get encoded into %7E and %7C respectively (depending on which platform the sale goes through).

So I can stick with something like -.- or use rocknbil's idea.

I guess there are no more bulletproof characters other than -, +, and .
+ is for spaces coming from search engines, and . and - are for domain names.

Many thanks to all.

TheMadScientist

10:42 pm on May 7, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You could probably add _ to your list of characters that make it through without an issue. :)
Glad you got something figured out anyway, even if it's not ideal.

smallcompany

2:30 am on May 9, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You could probably add _ to your list of characters that make it through without an issue.


Underscore did the same thing QQ had done: only the last variable would go through.
I don't get that.

No matter how many systems my variables are going through, they're just a string like this:

something1-something2-something3 and so on.

"Something" can be anything and it goes through with no problem, but if I replace those hyphens with any regular character, I get the last variable only.

(:o scratch, scratch

tangor

3:43 am on May 9, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Depending on the data collected, I have used TAB, SPACE, PIPE, CARET, and TILDE as delimiters.

TheMadScientist

6:16 am on May 9, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



(:o scratch, scratch

No kidding.

Would you mind posting the code causing the issue and a couple of sample strings that don't work with it? Maybe one of us will be able to see something 'goofy' contributing to the issue...

coopster

1:18 pm on May 11, 2010 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



The real issue is that you are attempting to separate a value once again within a name/value pair. The data coming in the value part of this pair varies and may contain unexpected characters. To separate it properly, you should break this particular value up into its own proper name/value pairs. In this manner the query string will ALWAYS provide you the probable/expected variables.

Knowing this, why not further define the QUERY_STRING?
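One way to read that suggestion, sketched under assumptions: give each piece of data its own name and let PHP build and parse the query string, so no hand-picked delimiter is needed. The parameter names (q, ref, ppc) and the sample values are hypothetical, and whether a partner's tracking field accepts a full query string like this is not something the thread confirms.

```php
<?php
// Hypothetical field names and made-up values.
$fields = [
    'q'   => 'blue-widgets for sale',
    'ref' => 'www.example-site.com',
    'ppc' => 'campaign_7',
];

// http_build_query() URL-encodes each value and joins the pairs with '&'.
$queryString = http_build_query($fields);
echo $queryString, "\n";

// On the receiving side, parse_str() decodes the pairs back into an array.
parse_str($queryString, $decoded);
print_r($decoded);
```

Hyphens, dots, and spaces inside a value no longer matter here, because the pair boundaries are carried by & and =, which are encoded whenever they appear inside a value.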