
Page request bottleneck?

1 file or multiple files best?


rfontaine

8:29 pm on Apr 1, 2003 (gmt 0)

10+ Year Member



Hi,

I am no systems admin guru but I do know enough to be dangerous...

In respect to the speed of individual page requests on an Apache Server, all other things being equal -

If there are, say, 1000 web page requests at the same time, is there a difference between all of those requests being for just one page, say food.php, and being spread around over many different pages: 250 requests for cookies.php, 200 for icecream.php, 500 for candy.php, etc.?

The reason I ask is that through scripting I can parse the URI of each page request and then run every request through just one file to serve up the content. But would that cause a bottleneck of sorts as a thousand page requests attempt to use the same file?
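To illustrate, something like this is what I have in mind (route() is a made-up name, and the page names are just my examples):

```php
<?php
// A rough sketch of the single-file idea: one script inspects the
// request URI and decides which page to serve.
function route($uri) {
    // Strip the path and the .php extension out of the request URI...
    $page = basename(parse_url($uri, PHP_URL_PATH), '.php');
    // ...and only accept pages we actually know about.
    $known = array('cookies', 'icecream', 'candy');
    return in_array($page, $known) ? $page : 'cookies'; // default page
}

// Every request would funnel through this one file,
// e.g. route($_SERVER['REQUEST_URI'])
echo route('/icecream.php'); // prints "icecream"
```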

Thank you in advance for your input

Critter

8:32 pm on Apr 1, 2003 (gmt 0)

10+ Year Member



Won't make a difference *unless* there's some sort of blocking behavior (database access, file access, semaphores, mutex, blah blah blah) in any particular page.

Peter

rfontaine

8:33 pm on Apr 1, 2003 (gmt 0)

10+ Year Member



Yes, there are calls to the database and also a number of PHP includes.

Dolemite

8:37 pm on Apr 1, 2003 (gmt 0)

10+ Year Member



I believe you'd be limited by the number of simultaneous database connections allowed, which would be a server-wide limit (or perhaps server-wide by virtual server), not relating to the number of files.

[edited by: Dolemite at 8:41 pm (utc) on April 1, 2003]

BigDave

8:40 pm on Apr 1, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It is more of a PHP issue than an Apache issue. To answer your question: everything else being equal, a single file is better, since it will be in the cache. But things are never equal.

Multiple smaller files are better than one huge file. If you have a forum, you would likely get better performance from addmessage.php, listmessages.php, readmessage.php, and deletemessage.php than from one large message.php that you pass the operation you want it to perform.

Rhadamanthus

9:30 pm on Apr 1, 2003 (gmt 0)

10+ Year Member



If you're doing database lookups, this is going to be your bottleneck, end of story.

A page view will basically go through the following process:

1) client (browser) requests page
2) server locates page
3) server loads page
4) server sends data back to client

However, this is simplistic. If you have static content, then step 3 will probably go away because the page will be cached in RAM already. But if you have dynamic content, then step 3 expands into the following process:

3a) load the page from disk (if it's not in the cache)
3b) parse any scripts on the page
3b1) load any data from the database
3c) generate the actual HTML to send

All of these steps add extra overhead to the page request, but step 3b1 is far and away the largest, because it in turn encapsulates a series of events very much like the original set: request data, locate it, read it from a file, send it back.

It's really hard to accurately profile a page request and determine how much time you're spending in each step, but I'd hazard a (very educated) guess that you're spending more time in step 3b1 than in everything else combined, and if you want to make your server more efficient, this is the part to hit.
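If you want a rough number anyway, a crude sketch might look like this (timed() is a made-up helper, the closures just stand in for real work, and it assumes microtime(true) as in PHP 5+):

```php
<?php
// A crude way to see where the time goes: wrap each phase in a timer.
function timed($fn) {
    $start = microtime(true);
    $result = $fn();
    return array($result, microtime(true) - $start);
}

// Phase 3b1: pretend database fetch; phase 3c: pretend HTML generation.
list($rows, $dbSecs)  = timed(function () { return array('row1', 'row2'); });
list($html, $genSecs) = timed(function () { return '<ul><li>row1</li></ul>'; });

printf("3b1 (db): %.4f s, 3c (render): %.4f s\n", $dbSecs, $genSecs);
```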

There are a couple of approaches you can take to optimizing your data access. Unfortunately, the more gain you get, the more work is involved. I'll start with the easiest first.

1) Optimize your database. The MySQL documentation includes a section on this, and it'll help, but not a lot. It's fairly easy to do, though, and shouldn't upset any of your scripts.
2) Optimize your data access. Modify your SQL queries so that they only request the data you actually need, and nothing else. Make sure that the fields you're sorting/searching on are indexed (this kind of falls under step 1 and step 2). Don't request the same data more than once in a script - pass it around in memory instead.
3) Cache your data. If possible, load your data once into a static variable and then pass it around between page requests. This will entirely eliminate step 3b1, making your page loads much faster right off the bat. Unfortunately, some of the most popular web scripting languages (including PHP) don't have any native support for this, which is why I'm porting my web site to ASP.NET (which has REALLY GOOD support for this). But there may be a way to do it there if you're creative/industrious enough.
4) Cache your entire page output. Depending upon your content, this could either be the easiest or the most difficult approach. If your dynamic data consists of something that never changes, or only changes when *YOU* change it, this is pretty easy - just run a script to generate each page every time you change the database, and save the output as html files that you actually serve to the user. Unfortunately, if you serve pages that include user-modified data (user forums, or just user account info on the page) then you're simply going to have too many versions of the page to get away with caching them all unless you have very few users or lots and lots and lots (and more lots) of RAM. But if you can do it, this approach will turn all of your dynamic content into static content, which can be easily cached by your web server and effectively eliminate phase three of the page fetch altogether.
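A minimal sketch of approach 4 (the directory, page name, and markup here are all made up):

```php
<?php
// Save a fully rendered page as static HTML whenever the data changes.
function publish_page($dir, $name, $html) {
    if (!is_dir($dir)) {
        mkdir($dir, 0755, true);
    }
    // The web server then serves this file directly - no PHP, no database.
    file_put_contents($dir . '/' . $name . '.html', $html);
}

// Run this from whatever script updates the database:
publish_page('/tmp/static-cache', 'candy', '<html><body>candy page</body></html>');
```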

Splitting one large page into smaller pages may help your pages load quicker if doing so ensures that dynamic data isn't loaded when it isn't actually needed. If it doesn't, it's probably not going to make any noticeable difference one way or another.

le_gber

9:33 pm on Apr 1, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



like 250 requests for cookies.php, 200 for icecream.php, and 500 for candy.php

That's not 1000 ... ;)

rfontaine

9:53 pm on Apr 1, 2003 (gmt 0)

10+ Year Member



like 250 requests for cookies.php, 200 for icecream.php, and 500 for candy.php

Yes - but you forgot about the "ETC." I added at the end...

BigDave

9:58 pm on Apr 1, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



3) Cache your data. If possible, load your data once into a static variable and then pass it around between page requests. This will entirely eliminate step 3b1, making your page loads much faster right off the bat. Unfortunately, some of the most popular web scripting languages (including PHP) don't have any native support for this, which is why I'm porting my web site to ASP.NET (which has REALLY GOOD support for this). But there may be a way to do it there if you're creative/industrious enough.

I actually store some of the information in session variables to avoid having to do lookups on more than just a few pages. It won't solve everyone's problems, but if you are looking up the same few pieces of data for every page, just shove it in a session variable.
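A generic sketch of that pattern (cached_lookup() is a made-up helper, not a built-in; in a real page the cache array would be $_SESSION after session_start()):

```php
<?php
// Only run the expensive lookup when the cache doesn't already hold it.
function cached_lookup(&$cache, $key, $loader) {
    if (!isset($cache[$key])) {
        $cache[$key] = $loader(); // the database is hit only this once
    }
    return $cache[$key];
}

// e.g. $prefs = cached_lookup($_SESSION, 'user_prefs', $fetchUserPrefs);
```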

4) Cache your entire page output. Depending upon your content, this could either be the easiest or the most difficult approach. If your dynamic data consists of something that never changes, or only changes when *YOU* change it, this is pretty easy - just run a script to generate each page every time you change the database, and save the output as html files that you actually serve to the user. Unfortunately, if you serve pages that include user-modified data (user forums, or just user account info on the page) then you're simply going to have too many versions of the page to get away with caching them all unless you have very few users or lots and lots and lots (and more lots) of RAM. But if you can do it, this approach will turn all of your dynamic content into static content, which can be easily cached by your web server and effectively eliminate phase three of the page fetch altogether.

Or you can implement support for If-Modified-Since and just touch a file when the database changes. Checking the date on a file is a rather fast operation compared to opening a database. Though this will only help when one client repeatedly requests the same page.
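A rough sketch of the freshness check (is_fresh() is a made-up name; the sentinel path and header plumbing in the comments are illustrative):

```php
<?php
// The client's cached copy is still good if it's at least as new as the
// last time the sentinel file was touch()ed by a database update.
function is_fresh($ifModifiedSince, $dbMtime) {
    return $ifModifiedSince !== null && strtotime($ifModifiedSince) >= $dbMtime;
}

// In the page itself, something like:
// $mtime = filemtime('/path/to/db-changed-sentinel');
// if (is_fresh(isset($_SERVER['HTTP_IF_MODIFIED_SINCE'])
//         ? $_SERVER['HTTP_IF_MODIFIED_SINCE'] : null, $mtime)) {
//     header('HTTP/1.0 304 Not Modified');
//     exit; // no page build, no database connection
// }
```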

Rhadamanthus

10:05 pm on Apr 1, 2003 (gmt 0)

10+ Year Member



Or you can implement support for if-modified-since, and just touch a file when the database changes. checking dates on a file is a rather fast operation compared to opening a database. Though this will only help you when it comes to one client repeatedly requesting the same page.

Thanks, BigDave. This is another really good approach - if it's compatible with a given set of dynamic data. If you're displaying multiple sets of dynamic data, though, it can be problematic. For example, WebmasterWorld displays dynamic forums, but it also displays user information on each page. With thousands of users, if it cached each page for each user, the cache would quickly grow to be too much data. But if it displayed just the actual forum content and not the user info, it would be very manageable, and your solution would work very well. Even so, it would take a bit of time, effort, and knowledge to implement, but it would probably be well worth it.

The best solution for any given site is going to vary greatly depending upon exactly what kind(s) of data you're getting from the database.

BigDave

10:19 pm on Apr 1, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



No argument here. IMS works well for my links pages, which are kept in a DB and change rarely, but which some people use fairly often. But I only use it when displaying those pages to people who are not logged in. There are other pages I could implement it on, but they will be even more work.

There is no doubt that there is extra hassle involved. I just did it to make sure that I *could* do it, before I *needed* to do it.

Rhadamanthus

10:42 pm on Apr 1, 2003 (gmt 0)

10+ Year Member



Sorry - I hope I didn't imply that you hadn't thought of that. After reading your post and re-reading mine, I wasn't sure I had made it clear enough what kinds of situations could cause that kind of solution not to work, and why.

Like I said, under the right circumstances, it's a *very* good solution.

BigDave

11:14 pm on Apr 1, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Oh, I understood that and no offence was taken. It's that old limitation of the written word being easy to misinterpret.

There are lots of different options depending on what your data set is. Of the five different types of data I am storing in the database, the pages that use that information could easily use four different methods of optimizing the storage to speed up accesses. One data set is so dynamic and so rarely used that there's no point in ever trying to cache it anywhere.

One thing that is rarely thought of, and therefore under-used, is using simple files instead of databases when locks and indexes are not an issue. Some people are so in love with databases that they don't realize how much overhead they carry for simple data sets.
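For example, a small read-mostly list can live in a serialized file (the helper names, path, and data here are made up):

```php
<?php
// Flat-file alternative for a small, read-mostly data set with no
// locking or indexing needs: one write, one read, no database at all.
function save_list($file, $data) {
    file_put_contents($file, serialize($data));
}

function load_list($file) {
    return unserialize(file_get_contents($file));
}

save_list('/tmp/flavours.dat', array('cookies', 'icecream', 'candy'));
$flavours = load_list('/tmp/flavours.dat'); // no connection, no SQL parse
```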