Different possible ways of separating records

         

lindajames

1:47 pm on Jul 26, 2003 (gmt 0)

10+ Year Member



Hi,

I have a plain text file that contains some data separated with commas, like this:

username, password, emailaddress

I just realised that I cannot use commas to separate the data, as some of my records contain commas in the data itself. What other method of separation would be best?

any suggestions would be appreciated.

Cheers
Linda

bunltd

1:55 pm on Jul 26, 2003 (gmt 0)

10+ Year Member



Have you tried using the pipe character? ¦ (it won't show accurately here in the forum; it's the key above the backslash). It doesn't occur in ordinary data as often as a comma does, which makes it a good candidate for a delimiter.

LisaB

Damian

1:59 pm on Jul 26, 2003 (gmt 0)

10+ Year Member



You could use pipe characters
name¦password¦email

Whatever separator you choose, you should set your script to either disallow or escape the separator character in your actual data.

I think that's why Brett changes the regular pipe character to the split pipe characters you see on this board when regular pipes are intended... looks like he stores some data in flat-file format with a pipe as the separator.

lindajames

2:09 pm on Jul 26, 2003 (gmt 0)

10+ Year Member



My code has been working fine with the commas. Will using the pipe characters affect my code in any way?

lindajames

2:42 pm on Jul 26, 2003 (gmt 0)

10+ Year Member



How about using a : instead of a pipe? I was told that a : is better to use.

Damian

3:09 pm on Jul 26, 2003 (gmt 0)

10+ Year Member



It doesn't matter which you choose, as long as the separator character is not also a character in your data.

The question is how to keep the separator from occurring in your data, not which character you choose. Before you save the data you can either remove the separator character when it does occur in the data, change it to something else, or escape it.

Then when you retrieve the data, you have either taken the separator out already, or you change it back to what it should be (you know how you changed it when you saved it), or you just unescape it.

Then there's Brett's method, which changes a pipe to a split pipe and does not change it back when retrieved again.... :)

If you want to use the escape-unescape method, you should write your script so that it treats your separator character as a separator only if it's not preceded by a backslash "\".
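
For what it's worth, here is a minimal sketch of that escape-unescape idea in Perl. It assumes the pipe as the separator and one record per line; the helper names (escape_field, join_record, split_record) are made up for illustration. One detail beyond the description above: the backslash itself is escaped too, so a field that ends in a backslash cannot be mistaken for an escaped separator.

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration; the separator is assumed to be "|".

# Put a backslash in front of every backslash and every pipe in one field.
sub escape_field {
    my ($field) = @_;
    $field =~ s/([\\|])/\\$1/g;
    return $field;
}

# Build one line of the flat file from a list of raw field values.
sub join_record {
    return join '|', map { escape_field($_) } @_;
}

# Split a line back into raw values, undoing the escaping on the way.
sub split_record {
    my ($line) = @_;
    my @fields = ('');
    while ($line =~ /\G(?:\\(.)|(\|)|([^\\|]+))/gs) {
        if    (defined $1) { $fields[-1] .= $1 }   # escaped character: keep it literally
        elsif (defined $2) { push @fields, '' }    # bare pipe: start the next field
        else               { $fields[-1] .= $3 }   # run of ordinary characters
    }
    return @fields;
}

# Round trip with a password that contains both a pipe and a backslash.
my $line = join_record('linda', 'pa|ss\\word', 'linda@example.com');
my ($user, $password, $email) = split_record($line);
```

A plain split on "pipe not preceded by a backslash" would also work for most data; the character-by-character loop is only there to cope with escaped backslashes sitting right before a separator.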

dkubb

3:59 pm on Jul 26, 2003 (gmt 0)

10+ Year Member



As a separator, I'd probably use the null character, because it's the least likely to appear within a name or email address:

my $line = join "\0", $username, $password, $emailaddress;

($username, $password, $emailaddress) = split /\0/, $line;

This is how I'd do it quick and dirty. If I had to read the file with other programs like Excel, or I was doing this in production code, I'd probably go with a quote-comma format and use Text::CSV or preferably Text::CSV_XS to handle escaping commas within the data (excuse the formatting, I can't seem to properly indent the code):

use Text::CSV_XS;

#...

my $csv = Text::CSV_XS->new({
 eol => "\n",
 binary => 1,
});

my $line = $csv->combine($username, $password, $emailaddress)
 ? $csv->string
 : die "Could not create a line: ", $csv->error_input;

# to retrieve the information from the line
($username, $password, $emailaddress) = $csv->parse($line)
 ? $csv->fields
 : die "Could not parse the line: ", $csv->error_input;

This library handles everything for you. You don't have to worry when you write the data out if you'll be able to parse it out again.
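
To make the quote-comma idea concrete, here is a rough hand-rolled sketch of what such a line looks like. This is not how Text::CSV_XS is implemented, and quote_fields and parse_quoted are hypothetical names; it assumes every field is wrapped in double quotes, that a double quote inside a field is written twice, and that fields contain no newlines. For real data the library is still the safer choice.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap each field in double quotes, doubling any
# double quote inside it, and join the results with commas.
sub quote_fields {
    my @quoted;
    for my $field (@_) {
        (my $copy = $field) =~ s/"/""/g;
        push @quoted, qq{"$copy"};
    }
    return join ',', @quoted;
}

# Hypothetical helper: read back a line in which every field is quoted.
sub parse_quoted {
    my ($line) = @_;
    my @fields;
    while ($line =~ /\G"((?:[^"]|"")*)"(?:,|\z)/g) {
        (my $field = $1) =~ s/""/"/g;   # undo the doubled quotes
        push @fields, $field;
    }
    return @fields;
}

# The comma and the quote inside the password survive the round trip.
my $csv_line = quote_fields('linda', 'pa,ss"word', 'linda@example.com');
my ($username, $password, $emailaddress) = parse_quoted($csv_line);
```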

lindajames

5:07 pm on Jul 26, 2003 (gmt 0)

10+ Year Member



At the moment I am separating using commas like this:

while(<FILE> ) {
chomp;
my ($user, $password, $email) = split(/,/, $_);

How would that code need to change if I wanted to separate it using pipes?

I tried:
while(<FILE> ) {
chomp;
my ($user| $password| $email) = split(/|/| $_);

but no luck, guess I'm doing it wrong.

any suggestions would be appreciated.

cheers
linda

Damian

5:16 pm on Jul 26, 2003 (gmt 0)

10+ Year Member




my ($user| $password| $email) = split(/\|/, $_);

The pipe needs to be escaped (I think); the comma is part of the function call, not the separator.

vincevincevince

5:26 pm on Jul 26, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The real issue is a lack of escaping. By all means use comma-separated values, but just ensure you escape the commas which aren't divisions.

lindajames

5:42 pm on Jul 26, 2003 (gmt 0)

10+ Year Member



Damian, I tried that but it didn't work. Then I tried this:

my ($user, $password, $email) = split(/\|/, $_);

and that seemed to work. I can't figure out why it works like that.

Damian

5:51 pm on Jul 26, 2003 (gmt 0)

10+ Year Member



Sorry Linda, my mistake.
Glad you figured it out nonetheless! You have the correct notation to read data separated by pipes.
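
As for why that version works: the commas in the variable list and after the pattern are ordinary Perl syntax (list and argument separators), so they stay commas no matter what the data uses. Only the pattern between the slashes names the separator, and there the pipe has to be escaped because an unescaped pipe means "or" in a regular expression. A small sketch, with made-up sample lines standing in for the file:

```perl
use strict;
use warnings;

# Hypothetical sample lines standing in for the contents of the data file.
my @lines = (
    "linda|secret|linda\@example.com\n",
    "bob|hunter2|bob\@example.com\n",
);

my @records;
for my $line (@lines) {
    chomp $line;
    # Commas separate the variables and the arguments to split;
    # the escaped pipe inside the pattern is the data separator.
    my ($user, $password, $email) = split /\|/, $line;
    push @records, { user => $user, password => $password, email => $email };
}

# Without the backslash the pattern is "nothing or nothing", which matches
# between every pair of characters, so the string is cut into single characters.
my @pieces = split /|/, 'a|b';
```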