incrediBILL - 5:36 pm on Jun 20, 2012 (gmt 0)
That depends. Would a search have to open thousands of files if you have 6K smaller files?
The file open operation is actually one of the most expensive (time-consuming) functions in the operating system, so a single file has a very significant advantage over many files on that count alone.
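If you want to measure the open cost on your own box rather than take my word for it, here is a rough Python sketch. The file names, counts and sizes are made up for illustration, and the OS file cache will affect the timings, so treat any numbers as a rough indication only.

```python
import os
import time

def build_fixture(directory, count=200, size=170):
    """Write `count` small files plus one combined file holding the same bytes.

    The names part_N.txt and combined.txt are hypothetical, for this test only.
    """
    payload = b"x" * size
    small_paths = []
    for i in range(count):
        path = os.path.join(directory, f"part_{i}.txt")
        with open(path, "wb") as f:
            f.write(payload)
        small_paths.append(path)
    combined = os.path.join(directory, "combined.txt")
    with open(combined, "wb") as f:
        f.write(payload * count)
    return small_paths, combined

def read_many(paths):
    """Open and read every small file; return the total bytes read."""
    total = 0
    for path in paths:
        with open(path, "rb") as f:  # one open/close per file
            total += len(f.read())
    return total

def read_one(path):
    """Open the combined file once and read it all; return the bytes read."""
    with open(path, "rb") as f:
        return len(f.read())

def timed(fn, arg):
    """Run fn(arg) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(arg)
    return result, time.perf_counter() - start
```

Build the fixture in a scratch directory, then compare `timed(read_many, small_paths)` against `timed(read_one, combined)`. Both read the same number of bytes, so most of the difference should come from the extra opens and closes.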
When you open a very large file, the key to speedy processing is the size of the cache buffer your code uses, because disk hardware tends to read 10K or more at a time. If you take all the data the disk controller reads from a track while it has it, you save multiple accesses to the drive, and processing gets really speedy.
Compare that to reading line by line with the programming language's gets() function, which uses its own default buffering and doesn't take the actual hardware's performance capabilities into account, and it's going to be much slower.
I could go on, but you probably get the idea: a custom read function, or at least specifying the right amount of data to cache per read so that HD head movement is minimized, can make such a search screaming fast.
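To give a rough idea of what a custom read function could look like, here is a minimal Python sketch. The name `chunked_find` and the 64K default are my own assumptions, not magic numbers; the right chunk size depends on your drive and OS, so tune it by testing. It reads the file in big blocks with the language's own buffering switched off, and keeps a small overlap so a match that straddles two blocks isn't missed.

```python
def chunked_find(path, needle, chunk_size=64 * 1024):
    """Return the byte offset of the first occurrence of `needle`, or -1.

    `chunk_size` is an assumed starting point, not a measured optimum.
    """
    if not needle:
        return 0
    overlap = len(needle) - 1   # bytes carried over between blocks
    tail = b""                  # end of the previous window
    offset = 0                  # file offset where `tail` begins
    # buffering=0 turns off Python's own buffer so each read is one big request
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                return -1       # end of file, nothing found
            window = tail + chunk
            pos = window.find(needle)
            if pos != -1:
                return offset + pos
            # keep just enough bytes to catch a match split across blocks
            tail = window[-overlap:] if overlap else b""
            offset += len(window) - len(tail)
```

Something like `chunked_find("data.txt", b"widget")` (hypothetical file and search term) gives you the byte position of the first hit, and you can try a few different values of `chunk_size` to see what your hardware likes.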
A 1MB file shouldn't really challenge modern hardware for speed either in reading the whole thing from the disk in one shot or in searching through it in memory.
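For a file that small, the one-shot approach can be as simple as the following Python sketch. The function name `matching_lines` is just for illustration, and it assumes the data is line oriented and fits comfortably in memory.

```python
def matching_lines(path, term):
    """Read the whole file in one shot and return the lines containing `term`."""
    with open(path, "rb") as f:
        data = f.read()          # one open, one read, everything in memory
    # the search itself now runs entirely in RAM
    return [line for line in data.splitlines() if term in line]
```

Both `term` and the returned lines are bytes here, so decode them if you need text.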
If possible, move it to a RAM disk.