Hi webmasters,
I'm now rewriting my framework and CMS in PHP. I will use a flat-file database system, just as I did in my previous framework (written in Perl). Why? I will explain that later; for now, let's stick to the question.
As with any other system, the "database" (here, a flat-file system I designed) will see traffic concentrate on certain files, which may grow to about 5 MB or 10 MB, and perhaps to about 20 MB on a data-intensive system. Most of the access will be reads, but some of it will involve writes. I want to pick the best and safest method for this, balancing speed with safety when writing to the file (reading is usually not a problem).
Option #1. file_get_contents
It is said to be the best way to read files in PHP. However, various documentation says it is memory intensive and loads the whole file at once. Other documentation says you can specify an offset, and that the whole thing may work better than expected because PHP may use memory-mapping techniques if the current OS supports them. The problem is, I don't like the idea of loading the whole file at once (imagine a 20 MB file with lots of requests per second), and I don't like not knowing what sort of OS-level optimizations are in place, as the system may move from server A to server B. I value independence a lot, so I want the system to be as flexible and neutral as possible.
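On the offset point: file_get_contents() accepts an offset and a length as its fourth and fifth arguments, so it does not have to pull in the whole file. A minimal sketch, assuming a hypothetical helper name (read_slice) and a made-up record layout used only for the demo:

```php
<?php
// Hypothetical helper (not part of any framework): read $length bytes
// starting at byte $offset, without loading the rest of the file.
function read_slice(string $path, int $offset, int $length): string
{
    // Arguments 2 and 3 (use_include_path, context) stay at their defaults.
    $data = file_get_contents($path, false, null, $offset, $length);
    if ($data === false) {
        throw new RuntimeException("Could not read $path");
    }
    return $data;
}

// Demo on a throwaway temp file; the "id=..|name=.." layout is an assumption.
$path = tempnam(sys_get_temp_dir(), 'ffdb');
file_put_contents($path, "id=1|name=alpha\nid=2|name=beta\n");

// The first record is 16 bytes including its newline, so the second
// record starts at offset 16 and is 14 bytes long.
$slice = read_slice($path, 16, 14);
unlink($path);
```

This only helps when the byte position of the wanted data is already known, for example from a separate index.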
Option #2. fopen() // fread() <-- this is what I'm using right now
It's my personal choice at the moment, and it works pretty well in my tests. This option gives me fine control over opening a file for reading only or for writing, and it seems the safest; even the append mode comes in handy. But I have to load the entire file into memory. There is a limit to how far I can optimize my data structure; beyond that, I'm in the realm of the function's limitations and the effects of large amounts of data. I can load just the beginning of a file, but I see no option to load the middle or the end (or to load it progressively).
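For what it's worth, fseek() seems to cover the middle-or-end case: it moves the file pointer, and fread() then reads only the requested number of bytes from there. A minimal sketch, assuming a hypothetical helper name (read_range) and a toy file for the demo:

```php
<?php
// Hypothetical helper: read up to $length bytes starting at $offset.
// A negative $offset is counted back from the end of the file.
function read_range(string $path, int $offset, int $length): string
{
    if ($length < 1) {
        return '';
    }
    $fh = fopen($path, 'rb');
    if ($fh === false) {
        throw new RuntimeException("Could not open $path");
    }
    // SEEK_SET is an absolute position; SEEK_END is relative to the end.
    $whence = $offset < 0 ? SEEK_END : SEEK_SET;
    if (fseek($fh, $offset, $whence) !== 0) {
        fclose($fh);
        throw new RuntimeException("Could not seek in $path");
    }
    $data = fread($fh, $length);
    fclose($fh);
    return $data === false ? '' : $data;
}

// Demo: three 5-byte lines (4 letters plus a newline each).
$path = tempnam(sys_get_temp_dir(), 'ffdb');
file_put_contents($path, "AAAA\nBBBB\nCCCC\n");

$middle = read_range($path, 5, 4);   // bytes 5..8
$tail   = read_range($path, -5, 4);  // starts 5 bytes before the end
unlink($path);
```

Calling fread() repeatedly on the same handle would also read the file progressively, one chunk at a time.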
Option #3. fgets() <-- I can read line by line (I'm looking at this as my possibly best option)
This lets me open large files without loading them entirely into memory, and move up and down while searching them, which is quite handy for avoiding excessive memory use and keeping my scripts within safe limits.
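A minimal sketch of that line-by-line search with fgets(), assuming a hypothetical helper name (find_line) and a pipe-delimited record layout that is only an illustration:

```php
<?php
// Hypothetical helper: return the first line that starts with $prefix,
// or null if no line matches. Only one line is held in memory at a time.
function find_line(string $path, string $prefix): ?string
{
    $fh = fopen($path, 'rb');
    if ($fh === false) {
        throw new RuntimeException("Could not open $path");
    }
    $found = null;
    while (($line = fgets($fh)) !== false) {
        if (strpos($line, $prefix) === 0) {
            $found = rtrim($line, "\r\n"); // strip the line ending
            break;                         // stop reading once matched
        }
    }
    fclose($fh);
    return $found;
}

// Demo on a throwaway temp file; the "id|name" layout is an assumption.
$path = tempnam(sys_get_temp_dir(), 'ffdb');
file_put_contents($path, "1|alpha\n2|beta\n3|gamma\n");

$hit  = find_line($path, '2|');
$miss = find_line($path, '9|');
unlink($path);
```

Moving "up and down" would rely on ftell() to remember a position and fseek() to return to it.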
Option #4. SplFileObject
I've read that it can read line by line without loading the whole file into memory; many users report that it's fast and reliable, and that it uses very few resources.
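A minimal sketch of line-by-line iteration with SplFileObject, assuming a hypothetical generator name (iterate_records) and the same illustrative pipe-delimited layout:

```php
<?php
// Hypothetical generator: yield each record as an array of fields,
// reading the file one line at a time.
function iterate_records(string $path): Generator
{
    $file = new SplFileObject($path, 'r');
    // Strip line endings and skip blank lines while iterating.
    $file->setFlags(
        SplFileObject::DROP_NEW_LINE
        | SplFileObject::READ_AHEAD
        | SplFileObject::SKIP_EMPTY
    );
    foreach ($file as $line) {
        if ($line === '' || $line === false) {
            continue; // defensive guard in case an empty line slips through
        }
        yield explode('|', $line);
    }
}

// Demo on a throwaway temp file; the "id|name" layout is an assumption.
$path = tempnam(sys_get_temp_dir(), 'ffdb');
file_put_contents($path, "1|alpha\n2|beta\n3|gamma\n");

$names = [];
foreach (iterate_records($path) as [$id, $name]) {
    $names[] = $name;
}
unlink($path);
```

Because the generator yields as it reads, the caller can stop early without touching the rest of the file.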
- - - - - - - - - -
Why? details?
I created my previous framework and CMS (built on top of that framework) in Perl, and it's fast and reliable. It comfortably handled sites with 7,000-9,000 daily visits and recurrent page reads. The same system handled site A with 7-9K daily uniques, site B with 5K, site C with 3K, site D with about 3K, and a temporary site E with 1,500. All of those websites were hosted on the same server and never experienced any problems.
Why flat-file databases? Because if well designed (along with the right code), they can be really fast, and I absolutely love how easily they let me just zip everything and move it to another server without any extra configuration or modification.
But what about collisions? I was very careful with the code to avoid them, and I never experienced a single one.
What about speed? Damn fast, and I also added cache optimization.
I know the risks of flat-file databases, and I avoid processing data while a file is open as much as I can. I open the file, read it, and process everything I need; when a write is required, the file is opened again and the write is done without wasting time.
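In PHP, the short open-write-close step could be guarded with flock() so that two requests do not write at the same moment. A minimal sketch, assuming a hypothetical helper name (append_record) and an illustrative record layout; note that flock() is advisory, so it only protects against other code that also takes the lock:

```php
<?php
// Hypothetical helper: append one record while holding an exclusive lock.
function append_record(string $path, string $record): void
{
    $fh = fopen($path, 'ab'); // append mode; creates the file if missing
    if ($fh === false) {
        throw new RuntimeException("Could not open $path");
    }
    if (!flock($fh, LOCK_EX)) { // blocks until the lock is granted
        fclose($fh);
        throw new RuntimeException("Could not lock $path");
    }
    $ok = fwrite($fh, $record . "\n");
    fflush($fh);           // push buffered data out before unlocking
    flock($fh, LOCK_UN);
    fclose($fh);
    if ($ok === false) {
        throw new RuntimeException("Could not write to $path");
    }
}

// Demo on a throwaway temp file; the "id|name" layout is an assumption.
$path = tempnam(sys_get_temp_dir(), 'ffdb');
append_record($path, '1|alpha');
append_record($path, '2|beta');
$contents = file_get_contents($path);
unlink($path);
```

For simple appends, file_put_contents() with the FILE_APPEND | LOCK_EX flags does much the same in one call.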
I want to achieve this in PHP, but I'm not as experienced with it as I am with Perl. Why not continue with Perl? Many new servers don't come with standard configurations, so every time my system is moved, chaos appears and I need to advise the sysadmin on what to do. They have been quite positive and helpful, but I don't want to deal with that anymore. Besides, Perl has some limitations and bugs with dates; some appear on some servers, others on other servers. I'm tired of that, and I had to write a lot of code to avoid using modules and keep their use to a minimum. Many such things are standard in PHP. I love Perl, but it's time to move on. My first Perl CMS was written around 2007, and the one I'm currently using (framework and CMS) is from 2010.
Naturally, I'm studying my options to code this new PHP framework and CMS to exceed my current needs, anticipating possible scenarios with lots of traffic.
I will appreciate your comments and corrections if any. Thanks in advance.