I'm currently iterating through the lines of one CSV file and checking whether each line exists in a second CSV file; if it does not, I store the line in an array so I can prune the "mis-matched" items out of my system.
My Problem: This works fine with smaller CSVs, but now that both files are over 85,000 lines, this single script is using 70-85% CPU and taking a tremendous amount of time to finish. I suspect the in_array() call inside the loop is the culprit, since it scans the entire feed array for every local line (in the worst case roughly 85,000 × 85,000 ≈ 7.2 billion string comparisons). I am wondering if there is a better way of going about this to make it more efficient.
My Code:

//Two CSV files
$csv = "data.csv";
$csv_local = "local_data.csv";

//Parsing CSV data into arrays
$feed_info = parseData($csv);
$local_info = parseData($csv_local);
//Parsing Function
function parseData($csv_file) {
    $file_pointer = fopen($csv_file, "r");
    $array = array();
    //fgets() returns false at EOF, so compare explicitly rather than
    //relying on truthiness (a lone "0" line would otherwise end the loop)
    while (($line = fgets($file_pointer)) !== false) {
        $array[] = trim($line);
    }
    fclose($file_pointer);
    return $array;
}
//Store mis-matched lines in an array
if (count($feed_info) > 1 && count($local_info) > 1) {
    $mis_match_array = array();
    foreach ($local_info as $info) {
        if (!in_array($info, $feed_info)) {
            $mis_match_array[] = $info;
        }
    }
}
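The closest I've gotten to an alternative is loading the feed file into a hash "set" (lines as array keys), so each membership check becomes a constant-time isset() instead of a linear in_array() scan. Here's a rough, untested sketch of that idea (parseDataAsSet is just a name I made up):

//Untested sketch: store each trimmed line as an array KEY so lookups
//are hash checks rather than full scans of the array
function parseDataAsSet($csv_file) {
    $set = array();
    $file_pointer = fopen($csv_file, "r");
    while (($line = fgets($file_pointer)) !== false) {
        $set[trim($line)] = true;
    }
    fclose($file_pointer);
    return $set;
}

$feed_set = parseDataAsSet($csv);
$mis_match_array = array();
foreach ($local_info as $info) {
    if (!isset($feed_set[$info])) {
        $mis_match_array[] = $info;
    }
}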
Beyond that sketch, I can't think of a better, less resource-intensive way of going about this - any thoughts?
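One more thought: would PHP's built-in array_diff() effectively do the same comparison in a single call? Something like the line below - though I haven't profiled it against the loop, and I'm not sure how it behaves at this scale:

//Returns the entries of $local_info that are not present in $feed_info
$mis_match_array = array_diff($local_info, $feed_info);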
Thanks!