What I'm wondering is whether I should save my content in XML format and parse the XML documents for display via PHP, or keep it in a MySQL database and write PHP to produce the XML formatting when necessary. Or is there another option?
If you have a million records and you are not dealing with them sequentially, there is no doubt that a database is the right way to go. XML is missing some key features that a database supports: locking (necessary for multi-user writes to the database), security, indexes, random access, and not having to parse text.
So if you have so much data that you cannot contain the entire (parsed) XML file in memory, then a database is the right way to go. On the other hand, for a small amount of data, the entire (parsed) file can be contained in memory, at which point you have most of the advantages of a database without the maintenance problems of databases.
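To make the small-data case concrete, here is a minimal sketch of parsing a whole XML document into memory once and then querying it. It assumes PHP's SimpleXML extension is available; the `stats`/`player` element names, the attributes, and the `goals_for_team` helper are made up for illustration and are not from any real schema.

```php
<?php
// Hypothetical stats document; element and attribute names are illustrative only.
$xml = <<<XML
<stats>
  <player name="Smith" team="Reds" goals="12"/>
  <player name="Jones" team="Blues" goals="7"/>
  <player name="Lee" team="Reds" goals="3"/>
</stats>
XML;

// Parse the entire document into an in-memory tree once.
$doc = simplexml_load_string($xml);
if ($doc === false) {
    throw new RuntimeException('Could not parse XML');
}

// With the tree in memory, XPath plus a loop stands in for a simple query.
function goals_for_team(SimpleXMLElement $doc, string $team): int
{
    $total = 0;
    foreach ($doc->xpath('/stats/player') as $player) {
        if ((string) $player['team'] === $team) {
            $total += (int) $player['goals'];
        }
    }
    return $total;
}

echo goals_for_team($doc, 'Reds'), "\n";
```

This only stays cheap while the file is small, since every request that runs it parses the whole document again.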
Databases have the drawback that they can be expensive. They have to be bought, properly installed, maintained, backed up, and administered. For a large database, there are Database Administrators (DBAs) whose full-time job is to manage the database. Since XML is just text, it doesn't require as much maintenance.
XML can be more memory intensive on the server. You typically load (and parse) the XML file for each hit on a page. The parsed document consumes memory until the page has been sent to the end user, at which point it is released. If you get a flood of people hitting your web site all at once (such as if your site gets mentioned on the WebmasterWorld home page), then serving all those pages might consume all the memory on the server, at which point some people won't be able to get on the site. Memory intensive operations don't scale well. If you go this route, max out the memory on your web server. Memory's cheap.
So it boils down to: if you have a small amount of traffic at a steady pace and aren't doing multi-user writes to the data, then XML works great.
We're talking about relatively short articles and sports stats. To begin with, I don't expect the traffic level to be that high. I plan to start with a shared hosting arrangement and monitor traffic.
I read your post re: XSLT, but from what I gather I will still need to use PHP and the expat library functions to parse the XML docs. I had another question as I pondered this. For the SEs to find all of the content, I assume I will need one unique PHP file for each XML document (e.g. article001.php parses article001.xml and article002.php parses article002.xml). Or is there an easier way to make the content SE friendly?
Your help is much appreciated.
G.
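On the one-PHP-file-per-document question, a common alternative is a single script that picks the XML file from a URL parameter, so article.php?id=article001 and article.php?id=article002 are separate URLs that can each be linked and crawled. The following is only a rough sketch under assumptions: the `articles` directory, the `<title>` and `<body>` element names, and the `article_path` and `render_article` helpers are all hypothetical, and it uses SimpleXML rather than the expat functions.

```php
<?php
// article.php: one script for every article, e.g. article.php?id=article001
// The "articles" directory and the <title>/<body> elements are assumptions.

function article_path(string $id, string $dir): ?string
{
    // Accept only letters, digits, hyphen and underscore, so the id
    // cannot be used to reach files outside the articles directory.
    if (!preg_match('/^[A-Za-z0-9_-]+$/', $id)) {
        return null;
    }
    $path = $dir . '/' . $id . '.xml';
    return is_file($path) ? $path : null;
}

function render_article(string $path): string
{
    $doc = simplexml_load_file($path);
    if ($doc === false) {
        return '<p>Article could not be read.</p>';
    }
    // Escape the text so article content cannot inject markup.
    $title = htmlspecialchars((string) $doc->title);
    $body  = htmlspecialchars((string) $doc->body);
    return "<h1>$title</h1>\n<p>$body</p>";
}

// Treat a missing or non-string id as empty, which falls through to the 404.
$id = (isset($_GET['id']) && is_string($_GET['id'])) ? $_GET['id'] : '';
$path = article_path($id, __DIR__ . '/articles');

if ($path === null) {
    http_response_code(404);
    echo '<p>Article not found.</p>';
} else {
    echo render_article($path);
}
```

Each article then only needs its XML file dropped into the directory and an ordinary link pointing at its URL. If cleaner addresses are wanted, server-side URL rewriting can map something like /articles/article001 onto the same script, though that depends on what the shared host allows.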