Welcome to WebmasterWorld Guest from 54.158.167.59

Forum Moderators: open

Message Too Old, No Replies

Exporting HTML as csv using PHPMyAdmin

   
6:55 pm on Mar 16, 2011 (gmt 0)

5+ Year Member



Hi,

I'm trying to export some data using PHPMyAdmin which has some HTML in the 'introtext' and 'fulltext' columns, but when I open the file in Excel, it's formatted incorrectly (i.e. there are lots more rows and the data in the 'introtext' cell has been broken into many cells in Excel.

Here's the SQL command I'm using in PHPMyAdmin:

SELECT `id` , `title` , `alias` , `introtext` , `fulltext`
FROM `jos_content`


I'm assuming this is an issue with the fact there is HTML in the database, so wondering what best practice for exporting HTML data is?

Thanks for any pointers,

mn
8:02 pm on Mar 16, 2011 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Use an unambiguous column separator.
8:37 am on Mar 17, 2011 (gmt 0)

5+ Year Member



Hi g1smd,

Thanks for getting back to me. Here are the settings I used, but it still seems to mess up the .csv file:

Export : CSV
Fields terminated by : #
Fields enclosed by : "
Fields escaped by : \
Lines terminated by : AUTO
Replace NULL by : NULL


I'm assuming the 'Fields terminated by' is the column separator you're referring to?m That's what I changed from ',' to '#'
9:09 am on Mar 17, 2011 (gmt 0)

5+ Year Member



It sounds to me actually like a problem with line feeds maybe? Are you certain the .csv file is formatted correctly? Or if you open it up are there errant line breaks in the introtext or fulltext columns?
Just a guess on my part, perhaps there were line feeds in the html data. Maybe google "mysql remove line feeds" or such.
9:44 am on Mar 17, 2011 (gmt 0)

5+ Year Member



Hi,

I think you've hit the nail on the head as there are indeed errant line breaks in the .csv file. Is the only way around it to manually remove line breaks?

THanks
10:33 am on Mar 17, 2011 (gmt 0)

5+ Year Member



Hi,
"manually remove line breaks" that would be a "no" I would not do it manually (unless it's a fairly small file you're dealing with). You might want to look at and try some of the suggestions here [stackoverflow.com] which I found by googling as I had suggested. Some of them may work to remove the line breaks from the data as it is output from the database. Other than that, you could instead run a php script to pull the info from db then remove the line feeds from each column which is a string and then fputcsv() each row in to a .csv file. Don't have time at the moment, but if you need I could whip up a quick php script example a little later on.
1:21 pm on Mar 17, 2011 (gmt 0)

5+ Year Member



Hi,

Thanks for all your help on this, much appreciated (and apologies for not reading your post more closely regarding Googling - I had assumed line breaks and line feeds were different).

I've been googling for ways to write a php script like you mentioned, but I've got a bit lost in it all so if you could whip up a quick script, I'd be massively appreciative.

Thanks again
10:47 am on Mar 20, 2011 (gmt 0)

5+ Year Member



Hi again @mr_nabo, my apologies as "a little later on" did not happen due to life issues (busy). But I'm here now that I finally have some free time again, and below is a php example for you to try out and alter as needed, see comments within code:

<?php
//>

//&$arr must be an array
//$findRegEx should be a regular expression, what to find in strings in the array
//$replacement should be a string, what items found in the array will be replaced with
function replaceInArray(&$arr, $findRegEx, $replacement) {
foreach ($arr as $k => $v) {
if (is_string($arr[$k])) {
$arr[$k] = preg_replace($findRegEx, $replacement, $arr[$k]);
}
}
}

//connect to mysql database, note you need to insert your password below
$sql = mysql_connect("localhost", "YOUR_PASSWORD_HERE");
if (!sql) {
die('Unable To Connect!'); //adjust username, password as needed...
}
//select database, note you need to enter your database name below
mysql_select_db('NAME_OF_DATABASE_TO_SELECT_HERE', $sql);

$result = mysql_query('SELECT id,title,alias,introtext,fulltext FROM jos_content', $sql);

//open existing or create new file named 'info_csv.csv'
//the 'w' flag will allow to overwrite any file with the given name if it already exists!
$file = fopen('info_csv.csv', 'w');

while($row = mysql_fetch_array($result, MYSQL_ASSOC)) {
//The second argument being passed to replaceInArray below is a regular expression
//which says to find line feed characters which may or may not be surrounded by one or more single spaces
//I'm unsure whether you will want to replace line feeds with '<br>' or a single blank space ' '
//so adjust the third argument in the call to replaceInArray below to suit needs
replaceInArray($row, '/\s*[\r\n]+\s*/', ' ');
fputcsv($file, $row);
}

fclose($file);

?>

If anybody's wondering, how to post formatted code on webmasterworld [webmasterworld.com]
Do not copy formatted code on webmasterworld from IE, use other browser such as Firefox.