Forum Moderators: phranque


mysql versus "included file" based content management system?

Which is more scalable and efficient?


crowthercm

4:45 pm on Jul 31, 2003 (gmt 0)

10+ Year Member



Hello,

I am writing a content management system that will let me develop new websites more quickly and efficiently. This is new to me, and I'm a little worried about how well it will hold up in the long run.

I am trying to decide between using an included file that contains associative arrays and using MySQL databases. Every page on my site is built from the same template, which contains no page-specific information, just CSS-positioning structure and PHP variables (i.e. <? echo $keyword?>, <? echo $description?>, etc.). Each page contains a unique key variable that I then use to fill in the correct content from a central location.

Currently, I simply include an associative array from an .inc file; the array holds the specific information for every page on the site.
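For reference, a minimal sketch of this include-file approach (the file layout, page keys, and content below are made up for illustration, and written in modern PHP syntax):

```php
<?php
// content.inc equivalent: one associative array holding every page's content,
// keyed by the page's unique key variable.
$pages = [
    'home'  => ['title' => 'Welcome',  'keyword' => 'widgets', 'description' => 'Widget home page'],
    'about' => ['title' => 'About Us', 'keyword' => 'company', 'description' => 'Who we are'],
];

// Template logic: every page sets its key, then pulls its content from the array.
function render_head(array $pages, string $page_key): string
{
    $p = $pages[$page_key];
    return "<title>{$p['title']}</title>\n"
         . "<meta name=\"keywords\" content=\"{$p['keyword']}\">\n"
         . "<meta name=\"description\" content=\"{$p['description']}\">";
}

echo render_head($pages, 'about'), "\n";
```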

Would it be better to use an SQL table to hold the relevant content? It would certainly make maintenance much easier. Assuming a site starts getting hit with large amounts of traffic, which system will hold up better? That is, I need to choose between 1000s of simultaneous requests for the same file and the equivalent number of simultaneous MySQL queries.

Thanks,
Chris

DaveAtIFG

3:14 pm on Aug 2, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Your question is way over my head! :) Anyone? Help!

crowthercm

5:04 pm on Aug 2, 2003 (gmt 0)

10+ Year Member



Actually, maybe I can answer a little of my own question now; maybe it will be useful to someone else as well.

I wanted to build a templated website so that I can make all the changes from one central location. Namely, instead of modifying individual html or php files, I wanted to include everything through an administration file. That way I can change shared resources (i.e. link tables, copyright, etc.) across the whole site while inserting dynamic content where applicable. It's basically just content management tailored to my own needs.

What I was concerned about was how to do this most efficiently. In the interim since I made that post, I've learned that database-driven sites are generally preferred over file-driven ones. They are more popular, so support is more widely available; they're easier to maintain than messing with arrays; and they can handle heavy loads when run on a proper server.

In terms of what has been suggested to me, I was told to start with MySQL. Queries only take about 0.1 seconds, so lower-traffic sites should handle this easily. What kills MySQL in the long run is opening and closing the connection to the database (which can take up to 1 second). As a result, the most efficient use of SQL is through persistent connections, where an open connection is kept around and reused across requests instead of being reopened every time.
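In modern PHP this is what PDO's persistent-connection flag does. A sketch, where the host, database name, and credentials are placeholders rather than values from this thread:

```php
<?php
// Sketch of a persistent MySQL connection via PDO.
// With ATTR_PERSISTENT, PHP hands back an already-open connection left over
// from a previous request instead of paying the connect cost again.
function db_options(): array
{
    return [
        PDO::ATTR_PERSISTENT => true,                    // reuse across requests
        PDO::ATTR_ERRMODE    => PDO::ERRMODE_EXCEPTION,  // fail loudly on errors
    ];
}

function connect_db(): PDO
{
    // Placeholder credentials -- swap in your own.
    return new PDO('mysql:host=localhost;dbname=site', 'user', 'secret', db_options());
}
```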

"Connection pooling" was also mentioned as a way to improve efficiency, but I haven't had time to look into it yet. As I understand it, it means keeping a pool of already-open connections that requests check out and return, so nothing has to reconnect from scratch.

Last point: once the site outgrows MySQL, which of course is my ultimate goal, you can move to an Oracle or SQL Server solution.

killroy

7:59 pm on Aug 2, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hmm, I've been using MySQL-driven CMSs very heavily across 1000s of pages. I've never used persistent connections, since my own scripting environment doesn't support them.

From the profiling I've done, opening the database connection has never taken longer than about 0.05 seconds (50 ms), so it's never been a problem.

It sounds like you're using a TCP/IP connection to a remote server. Try running the webserver on the same machine as the MySQL server and using a local pipe (Unix socket) for the connection.
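The difference shows up in the connection string. A sketch of the two DSN forms in PDO (the socket path and database name are examples, not taken from this thread):

```php
<?php
// Build a MySQL DSN for either TCP or a local Unix-socket ("pipe") connection.
function mysql_dsn(string $db, ?string $socket = null): string
{
    return $socket !== null
        ? "mysql:unix_socket={$socket};dbname={$db}"  // local socket: no TCP stack overhead
        : "mysql:host=127.0.0.1;dbname={$db}";        // TCP, even when on the same box
}

echo mysql_dsn('site', '/var/run/mysqld/mysqld.sock'), "\n";
```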

SN

crowthercm

5:33 pm on Aug 3, 2003 (gmt 0)

10+ Year Member



Actually, I haven't had problems using MySQL in the past either. One of my sites serves about 3000 pages from the same database, but traffic to the site is very low. I'm still quite new to development, but what I've quickly learned is that the planning step is by far the most important for me. I've wasted a lot of time redesigning or modifying sites because I missed something early on that I shouldn't have. This site and its userbase are fast becoming one of my best tools for getting things right from the beginning (:

webdevsf

6:08 pm on Aug 3, 2003 (gmt 0)

10+ Year Member



You are wise to be thinking about this now. Code is ALWAYS faster than the db. Use a publishing solution that does not rely on a db for dynamic content generation if possible. If anything, use a db as a repository, and then publish semi-static pages to your site. But don't have your users hit your db every time they load a page.

Somebody will tell you about using a cache to avoid hitting the db, but that just adds an extra layer to your system. Publishing static pages will save you time and money in the long term. It will also give you more flexibility in your layout and page design, as well as your UI.
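A minimal sketch of that publish step: a script (run from cron or whenever content changes) renders each row of content to a plain HTML file, so visitors never touch the database. Here a hard-coded array stands in for the SELECT over a content table, and the output directory is just the system temp dir; both are assumptions for illustration:

```php
<?php
// Stand-in for "SELECT page_key, title FROM content" -- rows are made up.
$rows = [
    ['key' => 'home',  'title' => 'Welcome'],
    ['key' => 'about', 'title' => 'About Us'],
];

// Render every row to a static file; the webserver then serves these directly.
$outdir = sys_get_temp_dir();
foreach ($rows as $row) {
    $html = "<html><head><title>{$row['title']}</title></head><body></body></html>";
    file_put_contents("{$outdir}/{$row['key']}.html", $html);
}

echo file_get_contents("{$outdir}/about.html"), "\n";
```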

killroy

6:51 pm on Aug 3, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



webdevsf: I'm not following you at all. How does the choice of data repository influence design choices in any way?

SN

webdevsf

7:32 pm on Aug 3, 2003 (gmt 0)

10+ Year Member



Sorry if I'm not being clear.

The point is: dbs are slow; web pages and scripting languages are fast. Any solution that runs a SQL statement, or a stored proc, every time a user hits a page is not as scalable as one that doesn't hit the db at all.

crowthercm

7:40 pm on Aug 3, 2003 (gmt 0)

10+ Year Member



Hmm, that's food for thought. I was just about finished with the template, using a single straight MySQL query for every page on the site.

Do you know where I can find any statistics comparing the speed of in-code lookups versus DB queries? For instance, how long would it take to run through an associative array of 1000 items versus running the equivalent SELECT statement in SQL?

It wouldn't take too much more programming to simply dump the SQL results into the file that holds the relevant page information. Is it really worth doing, though? What are the largest database-backed sites doing, e.g. eBay and others?

webdevsf

8:02 pm on Aug 3, 2003 (gmt 0)

10+ Year Member



The array is much faster. Write a script that accesses your db 100,000 times and one that doesn't; the latter is probably 100x faster.

Additionally, you are adding more resource usage. An associative array is almost just a big blob of memory, which is about as fast as you can get. A db is an enormous resource with lots of complicated data structures; it may use network bandwidth (if on a separate box) and is a single point of failure for distributed websites.
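The array half of that comparison is easy to time yourself; the database half needs a live server, so it isn't measured here. A sketch, sized to match the 1000-item / 100,000-hit numbers from the thread:

```php
<?php
// Build a 1000-entry associative array, then time 100,000 lookups in it.
$pages = [];
for ($i = 0; $i < 1000; $i++) {
    $pages["page{$i}"] = "content for page {$i}";
}

$start = microtime(true);
for ($i = 0; $i < 100000; $i++) {
    $hit = $pages['page' . ($i % 1000)];  // pure hash lookup, no I/O
}
$elapsed = microtime(true) - $start;

printf("100,000 array lookups: %.4f seconds\n", $elapsed);
```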

The way most big sites work is: personalization stuff lives in a db of some kind, highly optimized and often requiring lots of programmers. Common stuff, such as catalog listings, message boards, etc., is usually spit out into static pages by a cron job, message queue, or equivalent.

Sites that require real-time updates, such as E*Trade, build hugely complicated db/caching schemes to make sure they don't give up too much performance.

Rule of thumb: dbs are slow. On sites that must scale, don't use them unless you have to. What works for 1000 visitors a day won't work for 100,000.

crowthercm

8:08 pm on Aug 3, 2003 (gmt 0)

10+ Year Member



Great information (: Thanks a bunch!