
Massive MySQL database issue

MySQL, massive database

     

ninenote

5:26 am on Oct 20, 2009 (gmt 0)

5+ Year Member



Hi guys

I've got a problem dealing with a massive database in MySQL (currently it has more than 50K records with 20 fields, and it's continually growing).

The requirement is to build a web-based report generator that allows staff to query records based on specific field values. However, when the query result is larger than 30K records, processing becomes quite slow, especially when I use PHP to export the data to CSV format. That export takes around 10 minutes, which could be affecting system performance. (The database is still collecting data in real time.)

The only solution I could think of is to copy the database to a local server and let staff use a database client to generate reports locally. However, the data would not be real-time and I would have to import it manually.

I don't have much experience with databases. Any help or pointers would be much appreciated.

tangor

7:37 am on Oct 20, 2009 (gmt 0)

WebmasterWorld Senior Member tangor is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



Is there a reason to use PHP between the MySQL database and analysis? My experience is the closer you get to the data, the faster the outcomes.
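
For example (just a sketch -- it assumes the MySQL user has the FILE privilege, the server can write to that path, and "records" is a placeholder table name), MySQL can write the CSV itself so the rows never pass through PHP at all:

// Sketch only: MySQL writes the CSV directly on the server; PHP just issues
// the statement. Requires the FILE privilege; table name and path are placeholders.
$sql = "SELECT * FROM records
        INTO OUTFILE '/tmp/export.csv'
        FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'
        LINES TERMINATED BY '\\n'";
mysql_query($sql) or die(mysql_error());

The file ends up on the database server's disk rather than in the browser, so it suits scheduled or ad-hoc dumps more than a click-to-download report.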

Frank_Rizzo

9:41 am on Oct 20, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



To be honest, that is a relatively small database. MySQL / PHP should be able to process 50K records in milliseconds, or at worst a second or two.

Your problem could be:

lack of memory / swapping to disk
record locking
inefficient code

What else is running on this box? What OS? How much RAM? What type of disks? Is the box itself old hardware?

RAM is dirt cheap these days. Consider adding more but make sure you tune the my.cnf file.

Run some MySQL performance tuning checks on it. They will tell you what to optimise in the my.cnf file.
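
As a rough example of what I mean (a sketch only -- these are standard MySQL status counters, but the threshold is just a rule of thumb and the connection details are up to you), you can check how the MyISAM key cache is doing before touching my.cnf:

// Sketch: MyISAM key cache hit rate. A high miss rate suggests raising
// key_buffer_size in my.cnf. Assumes an open mysql_connect() connection.
$status = array();
$res = mysql_query("SHOW GLOBAL STATUS LIKE 'Key_read%'");
while ($row = mysql_fetch_assoc($res)) {
    $status[$row['Variable_name']] = (float) $row['Value'];
}

if (!empty($status['Key_read_requests'])) {
    $miss = $status['Key_reads'] / $status['Key_read_requests'];
    echo "Key cache miss rate: " . round($miss * 100, 2) . "%\n";
    // Rule of thumb only: much above ~1% and the key buffer is probably too small.
}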

LifeinAsia

4:03 pm on Oct 20, 2009 (gmt 0)

WebmasterWorld Administrator lifeinasia is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



Make sure your table is properly indexed! If your table is not indexed (or improperly indexed), that's probably your major bottleneck.
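
For example (table and column names below are placeholders -- use whatever fields the reports actually filter on):

// Sketch: index the column the report filters on, then use EXPLAIN to confirm
// the query actually uses it. 'records' and 'status' are placeholder names.
mysql_query("ALTER TABLE records ADD INDEX idx_status (status)") or die(mysql_error());

$res = mysql_query("EXPLAIN SELECT * FROM records WHERE status = 'open'");
while ($row = mysql_fetch_assoc($res)) {
    print_r($row); // 'key' should show idx_status and 'rows' should be far below 50K
}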

rocknbil

5:03 pm on Oct 20, 2009 (gmt 0)

WebmasterWorld Senior Member rocknbil is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Agreed: indexing. And 50K is very small, really. Drop some benchmarks in your code and find out where it's slowing down; it has to be in how the data is being selected from within the program.

I'd check the programming first, then go add the indexes. Reason being: if the tables aren't indexed and you add indexes to the fields, it will speed things up and give you the impression it's fixed. When the table hits 100K in a few weeks, the problem will resurface.
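
Something as simple as this will do for the benchmarks (a sketch -- the query is a placeholder); it tells you whether the time goes into MySQL or into the PHP that chews on the result:

// Sketch: time the query and the result-processing loop separately.
$t0  = microtime(true);
$res = mysql_query("SELECT * FROM records WHERE status = 'open'"); // placeholder query
$t1  = microtime(true);

while ($row = mysql_fetch_assoc($res)) {
    // build the report / CSV row here
}
$t2 = microtime(true);

printf("query: %.3f s, processing: %.3f s\n", $t1 - $t0, $t2 - $t1);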

ninenote

9:46 pm on Oct 20, 2009 (gmt 0)

5+ Year Member



tangor, Frank_Rizzo, LifeinAsia, rocknbil:

Thank you very much for your kind help. I'm investigating and will get back to you soon.

ninenote

2:39 am on Oct 21, 2009 (gmt 0)

5+ Year Member



A million thanks, guys! The query is now a hundred times quicker!

After banging my head against this for ages with my limited skills, your advice really got me out of this hole.

Database
-----------
The problem was that the primary key and indexes on one reference table were not set.

After the indexes were set, the query became amazingly quick: only 0.86 sec from the command line (more than 10 minutes before).

I use the MyISAM engine, so I am not able to define foreign keys. I'm not sure whether there are any risks in switching the engine to InnoDB, or whether there is any real advantage to defining foreign keys.

In addition, I still calculate AGE from the date of birth inside the query (and return it as an extra column).
I'm not sure if this is good practice or not.
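
Roughly, the query looks something like this (a simplified sketch, not my exact query; table and column names are placeholders):

// Simplified sketch: age is derived in SQL from the date-of-birth column.
$sql = "SELECT r.*,
               TIMESTAMPDIFF(YEAR, r.date_of_birth, CURDATE()) AS age
        FROM records r
        WHERE r.status = 'open'";
$res = mysql_query($sql);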

By the way, the response time on the mysql command line is now perfectly acceptable.

Coding
----------
I still have a problem with exporting CSV using PHP.
There are two main loops that may affect performance.

The first loop goes through every record returned by mysql_query (40K).
The second loop goes through every column of each record to build the CSV row.

Here is my code; $arrs is the MySQL result set as an array of rows:

function getCSV($arrs, $filename = "export.csv")
{
    $csv_terminated = "\n";  // row terminator (new line)
    $csv_separator  = ",";   // column separator
    $csv_enclosed   = '"';   // field enclosure (quotation mark)
    $csv_escaped    = "\\";  // escape character for enclosures
    $csv_content    = "";

    if (empty($arrs)) return; // nothing to export

    $columns = array_keys($arrs[0]);

    // Header row: fewer than 20 columns, so this loop is not the bottleneck
    foreach ($columns as $col) {
        $csv_content .= $csv_enclosed . str_replace($csv_enclosed, $csv_escaped . $csv_enclosed, stripslashes($col)) . $csv_enclosed . $csv_separator;
    }
    $csv_content = substr($csv_content, 0, -1); // drop the trailing separator
    $csv_content .= $csv_terminated;            // end of header row

    foreach ($arrs as $arr) {          // first loop: every record
        foreach ($columns as $col) {   // second loop: every column of the record
            $csv_content .= $csv_enclosed . str_replace($csv_enclosed, $csv_escaped . $csv_enclosed, stripslashes($arr[$col])) . $csv_enclosed . $csv_separator;
        }
        $csv_content = substr($csv_content, 0, -1); // drop the trailing separator
        $csv_content .= $csv_terminated;            // end of row
    }

    // send the CSV to the browser as a download
    header("Cache-Control: must-revalidate, post-check=0, pre-check=0");
    header("Content-Length: " . strlen($csv_content));
    header("Content-Type: text/x-csv");
    header("Content-Disposition: attachment; filename=$filename");
    echo $csv_content;
    exit;
}

Server
-------------------
I think the server is fine. BTW, here is the server status for reference:
top - 13:24:25 up 20 days, 12:46, 1 user, load average: 0.00, 0.00, 0.00
Tasks: 120 total, 1 running, 118 sleeping, 0 stopped, 1 zombie
Cpu(s): 0.6%us, 0.4%sy, 0.0%ni, 98.2%id, 0.2%wa, 0.3%hi, 0.3%si, 0.0%st
Mem: 4055304k total, 3222268k used, 833036k free, 263384k buffers
Swap: 524280k total, 0k used, 524280k free, 2396792k cached

roldar

6:08 am on Dec 7, 2009 (gmt 0)

10+ Year Member



I would see if you could avoid storing, appending to and iterating over the $csv_content variable so much. I'd take a look at turning on output buffering using ob_start() at the beginning of the function, printing out each line of the CSV as soon as it's ready, and then calling ob_end_flush() at the end of the function.

i.e.

Instead of


foreach($a as $b):
$output .= ...
endforeach;
echo $output;

Try:

ob_start();
foreach($a as $b):
echo ...;
endforeach;
ob_end_flush();
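
And if you can pass the MySQL result resource straight into the export function, you can go one step further and let fputcsv() handle the quoting while writing each row directly to the output. Just a sketch (note it sends no Content-Length header, since the full string is never built, and fputcsv() doubles quotes rather than backslash-escaping them):

// Sketch: stream the CSV row by row instead of building one huge string.
// $result is a mysql_query() resource; the filename is a placeholder.
function streamCSV($result, $filename = "export.csv")
{
    header("Content-Type: text/x-csv");
    header("Content-Disposition: attachment; filename=$filename");

    $out = fopen('php://output', 'w'); // write straight to the response body

    $row = mysql_fetch_assoc($result);
    if ($row === false) {
        fclose($out);
        exit;
    }

    fputcsv($out, array_keys($row)); // header row
    do {
        fputcsv($out, $row);         // one data row at a time
    } while ($row = mysql_fetch_assoc($result));

    fclose($out);
    exit;
}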

ninenote

1:45 pm on Feb 21, 2010 (gmt 0)

5+ Year Member



Thank you so much roldar :)
 
