Deleting lines from top of a large open text file

Is there a way to do this without rewriting the whole file?

         

lars stecken

8:44 am on May 7, 2005 (gmt 0)

10+ Year Member



Hi all,

I am splitting up one huge text file into several smaller files to prepare them for database processing. So far I have tested it with files up to 100 MB and everything works fine.

Now here's my question: To optimize my script, I would like to incrementally remove from the original file all lines that have already been copied to a smaller file.
This would speed up reading the original file in each loop.

Unfortunately, I couldn't find a way to remove/delete lines from an open file. Of course I could read the whole file, delete the lines and then write it back. But for files bigger than 1 GB this doesn't seem to be a good idea...

Can anybody help? Alternatively I would also be thankful for a suggestion about how to efficiently process huge text files.

Greets,

lars_stecken
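
One way to approach the "efficiently process huge text files" part of the question is to read the source file once, sequentially, and never reopen or rescan it. An open file handle remembers its position, so each loop iteration continues where the previous one stopped and nothing has to be deleted from the original. The following is only a minimal sketch under assumptions: the function name `split_file`, the `part_00000.txt` naming scheme, and the `lines_per_part` default are hypothetical and not taken from the poster's script.

```python
import os
from itertools import islice


def split_file(src_path, out_dir, lines_per_part=100_000):
    """Split src_path into numbered part files in a single sequential pass.

    Hypothetical helper: names and defaults are illustrative assumptions.
    """
    os.makedirs(out_dir, exist_ok=True)
    part_paths = []
    # Binary mode copies the bytes unchanged and avoids encoding questions.
    with open(src_path, "rb") as src:
        while True:
            # islice pulls at most lines_per_part lines from the open handle;
            # the next call resumes at the current file position.
            batch = list(islice(src, lines_per_part))
            if not batch:
                break  # end of file reached
            part_path = os.path.join(out_dir, f"part_{len(part_paths):05d}.txt")
            with open(part_path, "wb") as part:
                part.writelines(batch)
            part_paths.append(part_path)
    return part_paths
```

Only one batch of lines is held in memory at a time, so the size of the source file should not matter much; `lines_per_part` can be lowered if individual lines are very long.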

coopster

12:34 pm on May 7, 2005 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



A flat text file does not lend itself kindly to "updates" in the manner you are describing, as an update to a file is ... well, an update to a file ;) It is not a relational database with columns and rows.

You'll find moderate discussion [webmasterworld.com] on the forum about this topic, but not much detail. You can flock the file, process, and write it back out entirely or move your information to a database management system. Outside of that, I look forward to hearing any other alternatives.
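
The "flock the file, process, and write it back out" route could look roughly like the sketch below. It is not an out-of-the-box solution: everything after the removed lines is still rewritten, only in place and in fixed-size chunks, so the whole file never has to fit in memory. The function name `drop_leading_lines` and the `chunk_size` default are hypothetical, and the example assumes a Unix-like system, since Python's `fcntl` module is not available on Windows.

```python
import fcntl  # Unix-only; assumption: the script runs on a Unix-like system


def drop_leading_lines(path, n_lines, chunk_size=1024 * 1024):
    """Remove the first n_lines lines of path by shifting the rest forward.

    Hypothetical helper: the remainder of the file is still rewritten,
    but in place and chunk by chunk.
    """
    with open(path, "r+b") as f:
        # Exclusive advisory lock so cooperating processes stay out.
        fcntl.flock(f, fcntl.LOCK_EX)
        try:
            # Skip the lines that are to be removed.
            for _ in range(n_lines):
                if not f.readline():
                    break  # file has fewer than n_lines lines
            read_pos = f.tell()
            write_pos = 0
            # Copy the remainder toward the start of the file. write_pos
            # never passes read_pos, so unread data is not overwritten.
            while True:
                f.seek(read_pos)
                chunk = f.read(chunk_size)
                if not chunk:
                    break
                read_pos += len(chunk)
                f.seek(write_pos)
                f.write(chunk)
                write_pos += len(chunk)
            # Cut off the stale tail left behind by the shift.
            f.truncate(write_pos)
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
```

Note that `flock` locks are advisory: they only protect against other processes that also request the lock. If the script is interrupted in the middle of the copy loop the file is left in an inconsistent state, so writing the remainder to a new file and renaming it afterwards may be the safer variant when disk space allows.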

lars stecken

1:00 pm on May 7, 2005 (gmt 0)

10+ Year Member



Coopster, thank you for your note.
I was actually pretty sure there is no out-of-the-box solution for this problem. What I am doing is exactly what you are suggesting: transferring the data into a database.

Since I cannot change the format (file) in which the data is delivered to me, I have no choice but to cope with these huge files :-(

I'll read up on the thread you pointed me to anyway.

Thanks again,
lars_stecken