homepage Welcome to WebmasterWorld Guest from
register, free tools, login, search, pro membership, help, library, announcements, recent posts, open posts,
Become a Pro Member
Home / Forums Index / Code, Content, and Presentation / PHP Server Side Scripting
Forum Library, Charter, Moderators: coopster & jatar k

PHP Server Side Scripting Forum

Basics of extracting data from CSV files
and reformatting their content

 5:27 pm on Dec 18, 2003 (gmt 0)

Hi Everyone

What i have to accomplish is to basically reformat one CSV file into another.

Here is an example of the structure of the CSV data im dealing with


So as you can see the third field has multiple values delimited by the ¦ symbol

What i need to do is to basically replace the ¦ symbol with commas , so that each data item is in its own field



Then eventually remove all other data so i am just left with a CSV file of the values in the third field.



So i would like to try it in php , but might use perl depending on ease of implementation though i am more comfortable with PHP.

Anyone have any ideas or suggestions for an approach to this problem?

Thanks in advance




 8:04 pm on Dec 18, 2003 (gmt 0)

Welcome to WebmasterWorld beatsie,

Anyone have any ideas or suggestions for an approach to this problem?

I am sure we can think of something ;)

your order of events seems fairly straight forward

1. open the source file for read
2. open destination file for write
3. get row from source
4. reformat
5. write to destination
6. repeat
7. close files

seems simple enough

1. open the source file for read
$sfp = fopen [ca.php.net]('/path/to/source.csv','r');

2. open destination file for write
$dfp = fopen('/path/to/destination.csv','w');

3. get row from source - we'll put this into a while loop to continuously get rows until we have none left
while ($row = fgetcsv [ca.php.net]($sfp,10000,",","")) {

4. reformat - since fgetcsv splits it up for us we are looking for the third element of the $row array

$goodstuff = str_replace [ca.php.net]("¦",",",$row[2]);
$goodstuff .= "\n";

we may need to intialize $goodstuff inside the loop to be safe

Note:Step 4 can be any type of reformatting that needs to be performed on the file contents. In this particular case we are removing fields but this would be the step where any reconstruction can be made.

5. write to destination
fwrite [ca.php.net]($dfp,$goodstuff);

6. repeat- the while loop takes care of this step as well

7. close files
fclose [ca.php.net]($sfp);

so what does that leave us with?

$sfp = fopen('/path/to/source.csv','r');
$dfp = fopen('/path/to/destination.csv','w');
while ($row = fgetcsv($sfp,10000,",","")) {
$goodstuff = "";
$goodstuff = str_replace("¦",",",$row[2]);
$goodstuff .= "\n";

seems pretty straight forward, didn't test it obviously ;)

WebmasterWorld breaks pipes so ¦ must be replaced with an unbroken pipe

[edited by: jatar_k at 7:57 pm (utc) on Feb. 6, 2005]


 8:49 pm on Dec 18, 2003 (gmt 0)

If I read it write you want to _lose_ fields 1,2,4 and turn field 3 into a comma-separated list.

How about using the command-line:

awk -F[,] '{print $3}' csvfile.txt ¦ sed 's/¦/,/g' > newfile.txt

(in English: split each row on ','; keep the third field; replace '¦'s with ','s)

Of course if your data has '"'s in it you might need to tweak the command a bit


 12:49 pm on Dec 19, 2003 (gmt 0)

Wow, thanks for the prompt replys jatar_k & dcrombie

Will try both suggstions out and let you know how i get on.

Thanks again


 3:23 pm on Jan 9, 2004 (gmt 0)

Ok i just tried your suggestion for the moment jatar_k.

Your suggestion works the way i described the problem, so many thanks for that.
But, I had another look at the source data, and the data is not delimited by ¦

It looks like each piece of data i want is actually on a new line.

What it looks like is one field with many lines within that field.It's hard to accurately describe, so i can email a screenshot if required.

I tried using '\n' instead of '¦' but that did not work,.

Any advice appreciated.


 7:21 pm on Jan 10, 2004 (gmt 0)

You could maybe use simple old fopen and fgets then maybe. The thing about automating tasks is too truly understand the data that you are trying to parse or work with.

Is there a pattern to the data that you can see?


 11:48 am on Jan 27, 2004 (gmt 0)

Hey Jatar

The Hard drive failed on my laptop, so I've been out of action for a while.

Yes there is a pattern to the data.

I have been messing about and have the data items in their own row.



basically i want every 13 or so rows to be in one row delimited by commas.


There is a repeated value after every record so maybe i could just use fgets to replace newlines \n with commas and then to move on to a new row when a certain value is encountered?

the repeat value crops up like this in the file im working with


so hopefully i could end up with

Value1,Value2,Value3,REPEAT VAR
Value4,Value5,Value6,REPEAT VAR

What do you think?


 8:27 pm on Jan 27, 2004 (gmt 0)

sounds like exactly what you need.

Is the repeated var the same as the one right before it?

ie. in this sceanario


REPEAT_VAR1 == Value3
REPEAT_VAR2 == Value6


 9:21 pm on Jan 27, 2004 (gmt 0)

The reason replacing '¦' with '\n' didn't work is that control characters need to be interpreted. So, you need to use "\n" not '\n'. Try it out...


 4:44 pm on Feb 3, 2004 (gmt 0)

Hey again guys

Managed to get the data into the form i wanted using a combination of PHP scripting, An Excel manipulation utility and just good old search and replace.

What a nightmare :)

Thanks for all your help.


Global Options:
 top home search open messages active posts  

Home / Forums Index / Code, Content, and Presentation / PHP Server Side Scripting
rss feed

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
Home ¦ Free Tools ¦ Terms of Service ¦ Privacy Policy ¦ Report Problem ¦ About ¦ Library ¦ Newsletter
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved