
Forum Moderators: coopster & jatar k


Iterating through a CSV to find lines different from another CSV

     
11:34 am on Oct 1, 2012 (gmt 0)

Junior Member

joined:June 16, 2011
posts: 79
votes: 0


I'm currently iterating through the lines of one CSV file, checking whether each line exists in a second CSV file. If it does not, I store the line in an array so I can prune the "mis-matched" items out of my system.

My Problem: This works fine with smaller CSVs, but now that I am working with two CSVs of over 85,000 lines each, this single script spikes the CPU to 70-85% and takes a tremendous amount of time to finish. Is there a better way of going about this to make it more efficient?

My Code:

//Two CSV files
$csv = "data.csv";
$csv_local = "local_data.csv";

//Parsing CSV data into arrays
$feed_info = parseData($csv);
$local_info = parseData($csv_local);

//Parsing function: read each line of the file into an array
function parseData($csv_file) {
    $file_pointer = fopen($csv_file, "r");

    $array = array();
    while ($line = fgets($file_pointer)) {
        $array[] = trim($line);
    }
    fclose($file_pointer); //release the file handle when done

    return $array;
}



//Store mis-matched lines in an array
if (count($feed_info) > 1 && count($local_info) > 1) {

    $mis_match_array = array();

    foreach ($local_info as $info) {
        if (!in_array($info, $feed_info)) {
            $mis_match_array[] = $info;
        }
    }
}


I can't think of a better, less resource-intensive way of going about this - any thoughts?

Thanks!
7:19 pm on Oct 14, 2012 (gmt 0)

Administrator

coopster

joined:July 31, 2003
posts:12533
votes: 0


Maybe array_diff and/or array_intersect?
[php.net...]
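A minimal sketch of this suggestion (the sample data here is made up, standing in for the parsed CSV arrays): array_diff() returns the values from its first array that are not present in the second, which is exactly the "mis-matched lines" set the original foreach/in_array loop builds.

```php
<?php
//Sample data standing in for the parsed CSV lines
$feed_info  = array("a,1", "b,2", "c,3");
$local_info = array("a,1", "d,4");

//Lines in local_data.csv that do not appear in data.csv
$mis_match_array = array_diff($local_info, $feed_info);

//array_diff() preserves the original keys, so reindex if needed
$mis_match_array = array_values($mis_match_array);
```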
9:24 pm on Oct 14, 2012 (gmt 0)

Senior Member

swa66

joined:Aug 7, 2003
posts:4783
votes: 0


I guess memory usage is going to cause your system to thrash.

You could try loading just one file, and parse the other line by line without loading it all into memory; that should cut your memory usage roughly in half.
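A sketch of this streaming approach (the function name is mine, not from the thread; the file names follow the original post): only data.csv is held in memory, while local_data.csv is read one line at a time.

```php
<?php
//Load only the reference file into memory, then stream the second file
//line by line instead of reading it into an array first.
function streamMismatches($feed_file, $local_file) {
    //Load the reference file once
    $feed_info = array();
    $fp = fopen($feed_file, "r");
    while (($line = fgets($fp)) !== false) {
        $feed_info[] = trim($line);
    }
    fclose($fp);

    //Stream the second file; never hold it in memory whole
    $mis_match = array();
    $fp = fopen($local_file, "r");
    while (($line = fgets($fp)) !== false) {
        $line = trim($line);
        if ($line !== "" && !in_array($line, $feed_info)) {
            $mis_match[] = $line;
        }
    }
    fclose($fp);

    return $mis_match;
}
```

Note that this halves memory but still uses in_array(), so the CPU cost stays quadratic; it combines well with the key-lookup idea below in the thread.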
4:36 pm on Oct 16, 2012 (gmt 0)

Senior Member

penders

joined:July 3, 2006
posts: 3123
votes: 0


I would certainly look at the array functions, as coopster suggests.

Another alternative is to store just one file, "data.csv", as the keys of an array (not the values), then step through "local_data.csv" line by line (don't read it into memory in its entirety) and check for each line's presence in the array using isset() - this is much more efficient than using in_array().
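A sketch of the key-lookup idea with sample data standing in for the real files: array_flip() turns the lines into array keys, and isset() does a constant-time hash lookup instead of in_array()'s linear scan over 85,000 entries.

```php
<?php
//Sample data standing in for the parsed CSV lines
$feed_info  = array("a,1", "b,2", "c,3");
$local_info = array("a,1", "d,4");

//Flip the reference lines into keys; the values (original indexes)
//are irrelevant - only key existence matters
$feed_keys = array_flip($feed_info);

$mis_match_array = array();
foreach ($local_info as $line) {
    //isset() on a key is O(1), versus in_array()'s O(n) scan
    if (!isset($feed_keys[$line])) {
        $mis_match_array[] = $line;
    }
}
```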
4:58 pm on Oct 19, 2012 (gmt 0)

Junior Member

joined:June 16, 2011
posts: 79
votes: 0


Okay, awesome. Thank you all for your ideas and input. I'm going to give it a go later today and see how it improves.