Forum Moderators: phranque


References on server performance for serving mid-size files (1 MB)?

How many requests can I serve before performance degrades?


MichaelBluejay

9:59 am on Sep 11, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I have a client who wants to add a 1 MB static HTML file to the site. (It's not chunkable, trust me.) Up until now my experience has been with only small files, so my first impulse was to be wary of lots of bots tying up the server by downloading it. (Yeah, I know I need to exclude bots from the site; that's beyond my programming ability at present, but I'll get to it eventually.) On second thought, I realized that lots of sites serve lots of audio and video all the time, so maybe I shouldn't sweat a single 1 MB file.

I tried to find info on server performance for mid-size files but came up blank. How much CPU and memory is consumed by a single request for a 1 MB file? My thinking was that any file that takes the server more than a second to send means the CPU is tied up for more than a second, which I don't like, but maybe it takes such a small fraction of the CPU's attention that I needn't worry about it.

Perhaps a good specific question is: how often can I serve a new 1 MB file before performance degrades, on an otherwise healthy P4/1Mb server that normally has a load average of about 0.1? And does anyone know of any good references on server performance for newbies? Thanks much.
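If nobody has numbers handy, I suppose I could benchmark it myself with something like ApacheBench (the URL below is just a placeholder):

ab -n 100 -c 10 http://www.example.com/bigfile.html

That should at least show requests per second and transfer rate under a bit of concurrency.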

g1smd

11:24 am on Sep 11, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You can use:

User-agent: *
Disallow: /the/path/to/that/file.ext

in a robots.txt file to discourage bots from touching it if you don't want them in there.
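The file has to live at the root of the host (e.g. example.com/robots.txt), and of course only well-behaved bots will honour it.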

MichaelBluejay

1:10 pm on Sep 11, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Thanks, but that's not my question. Blocking bots through robots.txt isn't feasible because most bots are bad bots. Please, let's not sidetrack this thread into a discussion about bot-blocking. What I'm looking to find out are the things I asked about server performance. Thanks.

SeanW

3:00 am on Sep 16, 2007 (gmt 0)

10+ Year Member



Serving static files, especially from local disk, isn't that big of a deal, because Apache doesn't buffer the whole file in memory (I think by default it uses mmap(), which keeps memory overhead low). Also look at the EnableSendfile directive, which makes it even faster (just stay away from NFS).
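If you want to be explicit about it, something like this in httpd.conf should do. Both directives exist in Apache 2.x and are normally on by default in 2.2, so treat this as a sketch rather than a required change:

EnableMMAP On
EnableSendfile On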

mod_status might be able to give you an idea of the CPU time spent servicing a request; the inverse of that is roughly the maximum number of requests per second you can handle before getting backed up on the CPU.
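A minimal sketch for turning that on, assuming mod_status is compiled in and Apache 2.2-style access control:

ExtendedStatus On
<Location /server-status>
    SetHandler server-status
    Order deny,allow
    Deny from all
    Allow from 127.0.0.1
</Location>

To make the arithmetic concrete: if a request costs, say, 5 ms of CPU (an illustrative number, not a measurement), you top out somewhere around 200 requests/second on one CPU.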

My only concern would be keeping enough httpd children around to service the other requests. If it takes 30 seconds to download the file and you have 10 people/second downloading the file, that means you need MaxClients >= 300 just to serve those files, let alone the other requests. As much as I love Apache, lighttpd or a reverse proxy like squid would be better suited for the larger volume cases.
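If you stay on Apache prefork, that back-of-the-envelope figure (concurrent connections = arrival rate × transfer time, so 10/s × 30 s = 300) maps directly onto the config. A sketch, with the numbers purely illustrative:

<IfModule prefork.c>
    ServerLimit 300
    MaxClients 300
</IfModule>

Note that ServerLimit has to be raised alongside MaxClients once you go past the compiled-in default of 256, and every extra child costs memory, which is exactly why the lighter-weight servers win at this game.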

I wrote a series on LAMP tuning over at ibm.com/developerWorks/linux that goes over some of the basics of Apache tuning, though it's mostly targeted at dynamic content.

Sean