Forum Moderators: coopster
What i have to accomplish is to basically reformat one CSV file into another.
Here is an example of the structure of the CSV data im dealing with
value1,value2,value3a¦value3b¦value3c¦value4c¦,value4
So as you can see the third field has multiple values delimited by the ¦ symbol
What i need to do is to basically replace the ¦ symbol with commas , so that each data item is in its own field
i.e
value1,value2,value3a,value3b,value3c,value4c,value4
Then eventually remove all other data so i am just left with a CSV file of the values in the third field.
i.e
value3a,value3b,value3c,value4c
So i would like to try it in php , but might use perl depending on ease of implementation though i am more comfortable with PHP.
Anyone have any ideas or suggestions for an approach to this problem?
Thanks in advance
Karl
Anyone have any ideas or suggestions for an approach to this problem?
I am sure we can think of something ;)
your order of events seems fairly straight forward
1. open the source file for read
2. open destination file for write
3. get row from source
4. reformat
5. write to destination
6. repeat
7. close files
seems simple enough
1. open the source file for read
$sfp = fopen [ca.php.net]('/path/to/source.csv','r');
2. open destination file for write
$dfp = fopen('/path/to/destination.csv','w');
3. get row from source - we'll put this into a while loop to continuously get rows until we have none left
while ($row = fgetcsv [ca.php.net]($sfp,10000,",","")) {
}
4. reformat - since fgetcsv splits it up for us we are looking for the third element of the $row array
$goodstuff = str_replace [ca.php.net]("¦",",",$row[2]);
$goodstuff .= "\n";
we may need to intialize $goodstuff inside the loop to be safe
Note:Step 4 can be any type of reformatting that needs to be performed on the file contents. In this particular case we are removing fields but this would be the step where any reconstruction can be made.
5. write to destination
fwrite [ca.php.net]($dfp,$goodstuff);
6. repeat- the while loop takes care of this step as well
7. close files
fclose [ca.php.net]($sfp);
fclose($dfp);
so what does that leave us with?
<?
$sfp = fopen('/path/to/source.csv','r');
$dfp = fopen('/path/to/destination.csv','w');
while ($row = fgetcsv($sfp,10000,",","")) {
$goodstuff = "";
$goodstuff = str_replace("¦",",",$row[2]);
$goodstuff .= "\n";
fwrite($dfp,$goodstuff);
}
fclose($sfp);
fclose($dfp);
?>
seems pretty straight forward, didn't test it obviously ;)
WebmasterWorld breaks pipes so ¦ must be replaced with an unbroken pipe
[edited by: jatar_k at 7:57 pm (utc) on Feb. 6, 2005]
How about using the command-line:
awk -F[,] '{print $3}' csvfile.txt ¦ sed 's/¦/,/g' > newfile.txt
(in English: split each row on ','; keep the third field; replace '¦'s with ','s)
Of course if your data has '"'s in it you might need to tweak the command a bit
;)
Your suggestion works the way i described the problem, so many thanks for that.
But, I had another look at the source data, and the data is not delimited by ¦
It looks like each piece of data i want is actually on a new line.
What it looks like is one field with many lines within that field.It's hard to accurately describe, so i can email a screenshot if required.
I tried using '\n' instead of '¦' but that did not work,.
Any advice appreciated.
The Hard drive failed on my laptop, so I've been out of action for a while.
Yes there is a pattern to the data.
I have been messing about and have the data items in their own row.
i.e.
Value1
Value2
Value3
basically i want every 13 or so rows to be in one row delimited by commas.
Value1,Value2,Value3
There is a repeated value after every record so maybe i could just use fgets to replace newlines \n with commas and then to move on to a new row when a certain value is encountered?
the repeat value crops up like this in the file im working with
Value1
Value2
Value3
REPEAT VAR
Value4
Value5
Value6
REPEAT VAR
so hopefully i could end up with
Value1,Value2,Value3,REPEAT VAR
Value4,Value5,Value6,REPEAT VAR
What do you think?