homepage Welcome to WebmasterWorld Guest from 54.211.235.255
register, free tools, login, search, pro membership, help, library, announcements, recent posts, open posts,
Become a Pro Member
Home / Forums Index / Code, Content, and Presentation / Databases
Forum Library, Charter, Moderator: open

Databases Forum

    
Exporting HTML as csv using PHPMyAdmin
mr_nabo




msg:4282636
 6:55 pm on Mar 16, 2011 (gmt 0)

Hi,

I'm trying to export some data using PHPMyAdmin which has some HTML in the 'introtext' and 'fulltext' columns, but when I open the file in Excel, it's formatted incorrectly (i.e. there are lots more rows and the data in the 'introtext' cell has been broken into many cells in Excel.

Here's the SQL command I'm using in PHPMyAdmin:

SELECT `id` , `title` , `alias` , `introtext` , `fulltext`
FROM `jos_content`


I'm assuming this is an issue with the fact there is HTML in the database, so wondering what best practice for exporting HTML data is?

Thanks for any pointers,

mn

 

g1smd




msg:4282662
 8:02 pm on Mar 16, 2011 (gmt 0)

Use an unambiguous column separator.

mr_nabo




msg:4282859
 8:37 am on Mar 17, 2011 (gmt 0)

Hi g1smd,

Thanks for getting back to me. Here are the settings I used, but it still seems to mess up the .csv file:

Export : CSV
Fields terminated by : #
Fields enclosed by : "
Fields escaped by : \
Lines terminated by : AUTO
Replace NULL by : NULL


I'm assuming the 'Fields terminated by' is the column separator you're referring to?m That's what I changed from ',' to '#'

astupidname




msg:4282865
 9:09 am on Mar 17, 2011 (gmt 0)

It sounds to me actually like a problem with line feeds maybe? Are you certain the .csv file is formatted correctly? Or if you open it up are there errant line breaks in the introtext or fulltext columns?
Just a guess on my part, perhaps there were line feeds in the html data. Maybe google "mysql remove line feeds" or such.

mr_nabo




msg:4282872
 9:44 am on Mar 17, 2011 (gmt 0)

Hi,

I think you've hit the nail on the head as there are indeed errant line breaks in the .csv file. Is the only way around it to manually remove line breaks?

THanks

astupidname




msg:4282880
 10:33 am on Mar 17, 2011 (gmt 0)

Hi,
"manually remove line breaks" that would be a "no" I would not do it manually (unless it's a fairly small file you're dealing with). You might want to look at and try some of the suggestions here [stackoverflow.com] which I found by googling as I had suggested. Some of them may work to remove the line breaks from the data as it is output from the database. Other than that, you could instead run a php script to pull the info from db then remove the line feeds from each column which is a string and then fputcsv() each row in to a .csv file. Don't have time at the moment, but if you need I could whip up a quick php script example a little later on.

mr_nabo




msg:4282934
 1:21 pm on Mar 17, 2011 (gmt 0)

Hi,

Thanks for all your help on this, much appreciated (and apologies for not reading your post more closely regarding Googling - I had assumed line breaks and line feeds were different).

I've been googling for ways to write a php script like you mentioned, but I've got a bit lost in it all so if you could whip up a quick script, I'd be massively appreciative.

Thanks again

astupidname




msg:4284515
 10:47 am on Mar 20, 2011 (gmt 0)

Hi again @mr_nabo, my apologies as "a little later on" did not happen due to life issues (busy). But I'm here now that I finally have some free time again, and below is a php example for you to try out and alter as needed, see comments within code:

<?php
//>

//&$arr must be an array
//$findRegEx should be a regular expression, what to find in strings in the array
//$replacement should be a string, what items found in the array will be replaced with
function replaceInArray(&$arr, $findRegEx, $replacement) {
foreach ($arr as $k => $v) {
if (is_string($arr[$k])) {
$arr[$k] = preg_replace($findRegEx, $replacement, $arr[$k]);
}
}
}

//connect to mysql database, note you need to insert your password below
$sql = mysql_connect("localhost", "YOUR_PASSWORD_HERE");
if (!sql) {
die('Unable To Connect!'); //adjust username, password as needed...
}
//select database, note you need to enter your database name below
mysql_select_db('NAME_OF_DATABASE_TO_SELECT_HERE', $sql);

$result = mysql_query('SELECT id,title,alias,introtext,fulltext FROM jos_content', $sql);

//open existing or create new file named 'info_csv.csv'
//the 'w' flag will allow to overwrite any file with the given name if it already exists!
$file = fopen('info_csv.csv', 'w');

while($row = mysql_fetch_array($result, MYSQL_ASSOC)) {
//The second argument being passed to replaceInArray below is a regular expression
//which says to find line feed characters which may or may not be surrounded by one or more single spaces
//I'm unsure whether you will want to replace line feeds with '<br>' or a single blank space ' '
//so adjust the third argument in the call to replaceInArray below to suit needs
replaceInArray($row, '/\s*[\r\n]+\s*/', ' ');
fputcsv($file, $row);
}

fclose($file);

?>

If anybody's wondering, how to post formatted code on webmasterworld [webmasterworld.com]
Do not copy formatted code on webmasterworld from IE, use other browser such as Firefox.

Global Options:
 top home search open messages active posts  
 

Home / Forums Index / Code, Content, and Presentation / Databases
rss feed

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
Home ¦ Free Tools ¦ Terms of Service ¦ Privacy Policy ¦ Report Problem ¦ About ¦ Library ¦ Newsletter
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved