
Databases Forum

    
Massive MySQL database issue
MySQL, massive database
ninenote




msg:4009856
 5:26 am on Oct 20, 2009 (gmt 0)

Hi guys

I've got a problem dealing with a massive database in MySQL (currently it has more than 50K records with 20 fields, and it's continually growing).

The requirement is to build a web-based report generator that allows staff to query records based on specific field values. However, when the query result is larger than 30K records, processing becomes quite slow, especially when I use PHP to export the result to CSV format: that process takes around 10 minutes, which could be affecting system performance. (The database is still collecting data in real time.)

The only solution I could think of is to import the database to a local server and let staff use a database client to generate reports locally. However, the data would not be real-time, and I would need to import it manually.

I don't have a lot of experience in the database area. Your help or pointers would be much appreciated.

 

tangor




msg:4009899
 7:37 am on Oct 20, 2009 (gmt 0)

Is there a reason to use PHP between the MySQL database and analysis? My experience is the closer you get to the data, the faster the outcomes.

Frank_Rizzo




msg:4009947
 9:41 am on Oct 20, 2009 (gmt 0)

To be honest, that is a relatively small database. MySQL / PHP should be able to process 50K records in milliseconds, or at worst no more than a second or two.

Your problem could be:

lack of memory / swapping to disk
record locking
inefficient code

What else is going on in this box? What OS? How much RAM? What type of disks? Is the overall box old technology?

RAM is dirt cheap these days. Consider adding more but make sure you tune the my.cnf file.

Run some MySQL performance tuning checks on it. They will tell you what to optimise in the my.cnf file.
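
For illustration only, this is the kind of thing those checks point at in my.cnf. The setting names below are real MyISAM-era options, but treat the values as placeholders, not recommendations for your box:

[mysqld]
key_buffer_size  = 256M   # MyISAM index cache, usually the biggest single win for MyISAM
table_cache      = 512    # number of open table handles
query_cache_size = 32M    # caches identical SELECTs, handy on a read-heavy report box
sort_buffer_size = 2M     # per-connection sort buffer
tmp_table_size   = 64M    # size of in-memory temp tables before they spill to disk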

LifeinAsia




msg:4010150
 4:03 pm on Oct 20, 2009 (gmt 0)

Make sure your table is properly indexed! If your table is not indexed (or improperly indexed), that's probably your major bottleneck.
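
For example, something along these lines, with placeholder table and column names (index whatever fields staff actually filter on), and EXPLAIN to confirm the index is being used:

// Placeholder names throughout: substitute your real table and the fields staff query on.
// Assumes an existing connection via mysql_connect() / mysql_select_db().
mysql_query("ALTER TABLE records ADD INDEX idx_status (status)") or die(mysql_error());

// EXPLAIN shows whether a query can use the index: check the 'key' and 'rows' columns.
$res = mysql_query("EXPLAIN SELECT * FROM records WHERE status = 'open'");
while ($row = mysql_fetch_assoc($res)) {
    print_r($row);
}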

rocknbil




msg:4010199
 5:03 pm on Oct 20, 2009 (gmt 0)

Agreed: indexing. And 50K is very small, really. Drop some benchmarks in your code and find out where it's slowing down; it's got to be how the data is being selected from within the programming.

I'd check the programming first, then go add the indexes. Reason being, if the tables aren't indexed and you add indexes to the fields, it will speed things up, giving you the impression it's fixed. When it gets to 100K in a few weeks, the problem will resurface.
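
Something like this is usually enough to see where the time goes (just a sketch; $sql stands in for whatever query the report runs):

// Rough timing sketch: wrap the suspect sections and compare the numbers.
$t0 = microtime(true);
$result = mysql_query($sql);   // the query itself
$t1 = microtime(true);
// ... build the CSV / HTML output from $result ...
$t2 = microtime(true);
error_log(sprintf("query: %.3fs, formatting: %.3fs", $t1 - $t0, $t2 - $t1));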

ninenote




msg:4010358
 9:46 pm on Oct 20, 2009 (gmt 0)

tangor, Frank_Rizzo, LifeinAsia, rocknbil:

Thank you very much for your kind help. I'm investigating and will get back to you soon.

ninenote




msg:4010531
 2:39 am on Oct 21, 2009 (gmt 0)

A million thanks, guys! The query is now a hundred times quicker!

After banging my head against this for a long time with my limited skills, your advice really helped me out of this hole.

Database
-----------
The problem was that the PK and indexes on one reference table were not set.

After the indexes were set, the query is amazingly quick: it now takes only 0.86 sec from the command line (more than 10 minutes before).

I use the MyISAM engine, so I am not able to define FKs. I'm not sure whether there are any risks in switching the engine to InnoDB, or whether there is any advantage in defining FKs.

In addition, I still calculate AGE from the date of birth in the query itself (and put it in another field). I'm not sure if this is good practice or not.
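
It's something along these lines (simplified, with made-up column names):

// Simplified version of the report query; dob, age and staff_records are made-up names.
// age is derived at query time instead of being stored in the table.
$sql = "SELECT staff_id, name, TIMESTAMPDIFF(YEAR, dob, CURDATE()) AS age
        FROM staff_records";
$res = mysql_query($sql);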

By the way, the result on the mysql command line is now really acceptable.

Coding
----------
I still have a problem with exporting CSV using PHP.
There are 2 main loops that may affect performance.

The first loop goes through every record from mysql_query (40K); the second loop goes through every column of each record to convert it to CSV format.

Here is my code; $arrs is the mysql result set as an array:

function getCSV($arrs, $filename = "export.csv")
{
    $csv_terminated = "\n";  // row terminator (new line)
    $csv_separator  = ",";   // column separator
    $csv_enclosed   = '"';   // field quoting character
    $csv_escaped    = "\\";  // escape for the quoting character
    $csv_content    = "";

    $columns = array_keys($arrs[0]);

    // Header row: fewer than 20 columns, so I think this part is not an issue
    foreach ($columns as $col)
        $csv_content .= $csv_enclosed . str_replace($csv_enclosed, $csv_escaped . $csv_enclosed, stripslashes($col)) . $csv_enclosed . $csv_separator; // build csv cell, separator at the end

    $csv_content = trim(substr($csv_content, 0, -1)); // remove last separator, then trim whitespace
    $csv_content .= $csv_terminated; // new line

    foreach ($arrs as $arr): // first loop: every record
        foreach ($columns as $col) // second loop: every column of the record
            $csv_content .= $csv_enclosed . str_replace($csv_enclosed, $csv_escaped . $csv_enclosed, stripslashes($arr[$col])) . $csv_enclosed . $csv_separator;

        $csv_content = trim(substr($csv_content, 0, -1));
        $csv_content .= $csv_terminated; // new line
    endforeach;

    // export
    header("Cache-Control: must-revalidate, post-check=0, pre-check=0");
    header("Content-Length: " . strlen($csv_content));
    header("Content-type: text/x-csv");
    header("Content-Disposition: attachment; filename=$filename");
    echo $csv_content;
    exit;
}

Server
-------------------
I think the server is fine. BTW, here is the server reference from top:
top - 13:24:25 up 20 days, 12:46, 1 user, load average: 0.00, 0.00, 0.00
Tasks: 120 total, 1 running, 118 sleeping, 0 stopped, 1 zombie
Cpu(s): 0.6%us, 0.4%sy, 0.0%ni, 98.2%id, 0.2%wa, 0.3%hi, 0.3%si, 0.0%st
Mem: 4055304k total, 3222268k used, 833036k free, 263384k buffers
Swap: 524280k total, 0k used, 524280k free, 2396792k cached

roldar




msg:4038532
 6:08 am on Dec 7, 2009 (gmt 0)

I would see if you could avoid storing, appending to and iterating over the $csv_content variable so much. I'd take a look at turning on output buffering using ob_start() at the beginning of the function, printing out each line of the CSV as soon as it's ready, and then calling ob_end_flush() at the end of the function.

i.e.

Instead of


foreach($a as $b):
$output .= ...
endforeach;
echo $output;

Try:

ob_start();
foreach($a as $b):
echo ...;
endforeach;
ob_end_flush();
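
Applied to your getCSV(), that might look roughly like this. This is only a sketch, not tested against your data, and it also swaps the manual quoting for fputcsv() on php://output, which handles the escaping for you (the Content-Length header goes away because the length is no longer known up front):

function getCSV($arrs, $filename = "export.csv")
{
    header("Cache-Control: must-revalidate, post-check=0, pre-check=0");
    header("Content-type: text/x-csv");
    header("Content-Disposition: attachment; filename=$filename");

    ob_start();
    $out = fopen('php://output', 'w');   // write rows straight to the response

    fputcsv($out, array_keys($arrs[0])); // header row

    foreach ($arrs as $arr) {
        fputcsv($out, $arr);             // one row per record, quoting/escaping handled by fputcsv
    }

    fclose($out);
    ob_end_flush();
    exit;
}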

ninenote




msg:4084285
 1:45 pm on Feb 21, 2010 (gmt 0)

Thank you so much roldar :)
