Forum Moderators: phranque
I'm also more interested in the technical aspects of your server performance, Markus.
Any threads you wanted to start on that subject I'm sure would be well supported....
TJ
[edited by: Woz at 10:46 pm (utc) on Mar. 17, 2006]
[edited by: tedster at 4:39 pm (utc) on May 28, 2007]
What I mean to say is: Did you design the system from day one to eventually handle these loads or did things gradually (or suddenly) get big and you realized this kind of load handling would be required?
What you have done is incredible (and I do believe it); congratulations. If you knew it would be this big from the start, that would be even more impressive.
Some posts have already inquired about your database software. I understand if you don't want to let us know, but maybe you can tell us whether it's open source or not.
nerd
P.S. I'm really impressed by the performance of your hardware/software setup. But that COULD possibly be handled by a couple of "VAXes behind the curtain" — you should definitely write a book about marketing. I'd buy 10 copies right away.
As for designing the system to handle that much load from the start: impossible. As you add more data, the way the database behaves changes (I'm not saying what I'm using). Also, I constantly added new features every week or two, making it more resource intensive. I just spent all my time trying to scale the DB on only one server and stick to using only one web server. Constant optimization until I reached the level I'm at now. The law of large numbers works on my side now; I will only run into issues at 1.6x my current traffic.
I think you can accomplish the same thing in nearly any language except PHP, which does not scale at all when it comes to heavy loads.
[public.yahoo.com...]
Scaling up, especially via good code, is far more cost-effective and maintainable than scaling out.
And then in July I was up to a few thousand, and AdSense came out. The day I added it to my site is the day I knew I was going to be a mega site. The main reason was that I was growing at a consistent, accelerating rate; I could plot my growth on a graph and predict exactly where I would be in five months. I knew the only way to succeed was to fly under the radar.
I went and blocked Alexa toolbar users from being able to sign up, as well as comScore users.
Not only did you do something amazing, you showed great vision on many fronts.
Congratulations!
markus007: I went and blocked alexa users from being able to signup, as well as comscore
This might be trivial to some, but how is this done? How do you know (detect) if a visitor is using the Alexa toolbar or comScore (or the Google toolbar, Yahoo, etc.)?
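Markus never explained his method, so the following is only a plausible sketch of the general technique: flag signups whose user-agent string or reverse-DNS hostname matches known measurement-panel signatures. The token and suffix lists below are invented placeholders, not real signatures — real ones would have to be collected by inspecting live traffic.

```python
import socket

# Hypothetical signatures for illustration only; these are NOT real values.
PANEL_UA_TOKENS = ["Alexa Toolbar"]          # substrings to look for in the user-agent
PANEL_DNS_SUFFIXES = [".comscore.example"]   # reverse-DNS suffixes of panel proxies

def looks_like_panelist(user_agent, remote_ip, resolver=socket.gethostbyaddr):
    """Return True if a request appears to come from a measurement panel:
    either the user-agent advertises a toolbar, or the visitor's IP
    reverse-resolves to a known panel-proxy hostname."""
    if any(token.lower() in user_agent.lower() for token in PANEL_UA_TOKENS):
        return True
    try:
        hostname = resolver(remote_ip)[0]
    except OSError:
        return False  # no reverse DNS record; assume a normal visitor
    return any(hostname.endswith(suffix) for suffix in PANEL_DNS_SUFFIXES)
```

A site would call this during signup and silently reject (or divert) flagged visitors, which is presumably how staying "under the radar" of ranking services worked.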
markus007: I think you can accomplish the same thing in nearly any language except PHP which does not scale at all when it comes to heavy loads.
...
PHP ...can not scale to tens of millions of pageviews on a single machine.
Why is this? What specifically prevents code written in PHP from scaling (vertically)? I am new to PHP, so I am very curious.
As far as PHP not being able to handle the same RPS as other scripting languages: it really boils down to what you are doing with the language and what your data access layer looks like.
Also, there's a huge community developing open-source projects in PHP, and with that comes a bunch of great tools us MSFT folks never get to see :(
Chip-
I think you are using a page data cache on your site; you can do the same with PHP to make dynamic pages semi-static and radically reduce load.
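The "page data cache" technique mentioned here is simple in outline: write the rendered HTML to disk and keep serving that copy until a TTL expires, so the expensive rendering only happens once per interval. A minimal sketch in Python (cache location, TTL, and the render callback are all illustrative assumptions):

```python
import os, time, hashlib, tempfile

CACHE_DIR = tempfile.gettempdir()
TTL = 300  # seconds a cached page stays "semi-static" before re-rendering

def cached_page(url, render, ttl=TTL):
    """Serve a rendered page from disk if it is fresh, else re-render it.

    `render` stands in for whatever expensive function builds the dynamic
    page (DB queries, templating, etc.).
    """
    path = os.path.join(CACHE_DIR, hashlib.md5(url.encode()).hexdigest() + ".html")
    if os.path.exists(path) and time.time() - os.path.getmtime(path) < ttl:
        with open(path) as f:
            return f.read()          # cache hit: no DB work at all
    html = render(url)               # cache miss: do the expensive work once
    with open(path, "w") as f:
        f.write(html)
    return html
```

The same pattern works in any language; in PHP it is typically done with `ob_start()` output buffering plus a file write, or with a shared-memory cache in front of the page layer.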
Some large scale PHP presentations:
[talks.php.net...] (doesn't work right in my Firefox)
[public.yahoo.com...]
[public.yahoo.com...]
ASP.NET is compiled code, so if you are going to compare to that, you'd take the most critical parts of the site, write them as C CGIs (which are even faster than ASP.NET), and easily integrate them into PHP.
On a different note, what I am curious about is whether your customer-uploaded images are stored statically or inserted into an SQL database. I never understand programmers who convert images and store them in a database; it adds incredible overhead and virtually no advantage. I suspect yours are static?
In any case, I believe gaiaonline is one of the top 10 largest forums in the world, and it's PHP. In 2004 they claimed to have 70 million messages, 9,000 simultaneous users, and 750,000 registered users, on four dual-core servers. I'm limited in what I can link to on there, but try here [big-boards.com...]
Ah, I just found this from October 2005:
We now can easily support 30K simultaneous users signed in with no problem. After a few more optimizations we should be able to double and triple our load, without even adding any more hardware. We currently have about 300M posts on the site, with a few million more being added every day.
But of course they don't qualify anymore as a "one person high traffic site", so I digress.
freeflight2: but if the DB corrupts, you've got virtually all images down and a single point of failure in any case. What about unique filenames based on some kind of directory substructure, which is even more easily mirrored (copy only files that are new or changed)? Honestly, I've never worked with anything more than a dual-server setup, so I don't have real-world experience with anything massive. But it still perplexes me why someone would make it that much more complicated, unless it's an absolute ton of tiny images.
As for PHP dying under big loads, I think it had to do with it being unable to run well with multiple threads.
[simon.incutio.com...]
As for storing images in a database, I wouldn't dream of it. Creating a terabyte database is not something I'm keen on doing, and even then, streaming the images from a database would max out my I/O damn fast. I would need a huge SAN with hundreds of drives.
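The "unique filenames based on a directory substructure" idea raised above is usually done by hashing the image id and using the first hex characters as nested directory names: files spread evenly, no single directory grows huge, and an rsync-style mirror only copies new or changed files. A sketch (the root path, `.jpg` extension, and nesting depth are assumptions for illustration):

```python
import hashlib, os

def image_path(image_id, root="/var/www/images", depth=2, width=2):
    """Map an image id to a nested static-file path, e.g.
    12345 -> /var/www/images/82/7c/12345.jpg

    Hashing the id spreads files evenly across directories, which keeps
    directory listings fast and makes incremental mirroring cheap.
    """
    digest = hashlib.md5(str(image_id).encode()).hexdigest()
    parts = [digest[i * width:(i + 1) * width] for i in range(depth)]
    return os.path.join(root, *parts, "%s.jpg" % image_id)
```

The mapping is deterministic, so the web tier can compute the URL for any image without touching the database at all — the opposite of the blob-in-SQL approach being criticized here.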
Bruce
I know Alexa numbers are very distorted. Even so, I cannot understand why there was a huge spike in reach a week or so ago, which reversed very quickly.
I wondered whether the publicity had had an effect on traffic, but that does not seem likely given how busy the site is anyway.
As for Alexa, I only lifted the ban on Thursday; the surge was from an extra 120 toolbar users from Slashdot on the Monday. My Alexa reach rank should move from 400 to 1,200 within four weeks.
I run quite a large Harry Potter fan site which does 8-10 million pageviews monthly. I *did* write every line of the code from scratch, including the forums, membership database, news aggregation and content administration, visitor tracking, and much more.
With good caching, the site blazes even with 3,000+ concurrent users. It runs on 3 servers: one for mail/external apps, one for DB (Microsoft SQL Server), and one for ColdFusion and IIS.
I'm not just patting myself on the back here; I'm adding my voice to the fact that you can run a very high-traffic site on a very reasonably priced hardware setup. My site started in my apartment on a Pentium III 450 MHz in 2002 with 1,000 visitors a month. Now, at over 1.5 million visitors a month, all it took was a relocation to a datacenter ($300/mo, instead of $289/mo SDSL at my apartment) and two new servers (both Dell PowerEdges, for a total of less than $4,000).
I couldn't be happier with the hardware, and there is so much room left to grow on it that I am not at all worried for the next several years (until the Potter franchise is all but done in 2010 - by then, I'll be looking for "other work" anyways). ;)
As for huge server farms: web servers are usually not the issue. You can run 100 web servers for $10k-$20k/mo (Ticketmaster has some 1,000+ Perl web servers), while 4-5 top-of-the-line DB servers might cost as much as that (especially if you have to license Oracle or similar).
I have a site that does about 80,000 dynamic page views a day, and there's no way I could run gzip compression on them on the fly without totally taking out the processors (one dual-proc web server hitting one dual-proc DB server). I know IIS 6 has gzip embedded in its core, but this is something I've researched high and low, and I even had an MSFT employee try to help, but no dice. I even lowered the compression level to 3 and it still totally taxed the procs.
I know each of our web servers does a minimum of 250k dynamic pages every day (1-2 inserts and 4-20 selects on every page) with no problems. Just dual Xeon 2.8s with 2 GB of RAM. We use XCompress instead of IIS gzip — maybe you could try that; I believe they have a trial version. We also use a custom caching solution which caches about 50% of each page (more if it's mostly static content). We then update each page every 18 hours or so automatically. Pack that with an image server, and we figure we can double (or more) the amount of traffic we're sending through each server. We're running Win2k3 and 99% classic ASP.
S
As for not being able to gzip 80,000 dynamic requests a day: don't forget, I'm on old hardware :)
Chip-
You have done an excellent job. I would like to summarise the lessons you have shared with us.
a) If you are skilful enough, write your own code. Then, after you have written it, test it and optimise it. When it gets too slow, find the bottleneck, rewrite it and re-optimise it. Tune it to minimise user wait.
b) When you have the money, invest in RAM (I noted that you hinted at huge RAM). Funny that no one ever mentions that option, which is the first thing I try to do with extra cash.
c) When you can, invest in purchasing good links, and work on developing others.
d) Invest time in developing content.
e) Invest your time in being stealthy. Interesting, but this is only effective for the person who really wants to make a solid effort.
The only thing I really did not see (or you are hiding it) is the redundancy issue. Knowing how to create five-nines machines back in 2000 to 2003, some of that might still be valid. (For those who don't know, five nines means 99.999% uptime; mission-critical is six nines, and we would laugh because I came up with "nuke six-nines", a six-way mirror of six-nines machines placed in different geopolitical areas.) Anyway, I would mention that a simple round robin would offload a ton of CPU and/or I/O, and if I recall correctly, MS Server 2000 has a failover round-robin system that kicks in when CPU, I/O, or heat reaches the failover trigger point.
I prefer round robin over anything else in your case. When you have a mirrored system with optimised code that releases the I/O and CPU quickly, you will never have a CPU/I-O bog. (An I/O-CPU bog is when you have repeated calls to an intensive process; you can end up with a few people running those routines on the same server (#*$! luck, but it happens) and the system slows down.) When you have someone consistently optimising the code to release the CPU or I/O, you will rarely hit the bog. The question now is how to mirror the drive system to replicate the content with little to no lag, and to do it cheaply.
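The round-robin-with-failover idea described above can be sketched in a few lines: rotate through the mirrors and skip any that fail a health check, so a down (or overheating) server is transparently removed from rotation. The health-check callback here is a placeholder assumption; a real one would probe CPU, I/O, or a heartbeat endpoint:

```python
import itertools

class RoundRobinPool:
    """Minimal round-robin balancer with failover: cycle through the
    servers, skipping any the health check reports as down."""

    def __init__(self, servers, is_healthy=lambda s: True):
        self.servers = list(servers)
        self.is_healthy = is_healthy          # placeholder health probe
        self._cycle = itertools.cycle(self.servers)

    def next_server(self):
        # Try each server at most once per call; raise if all are down.
        for _ in range(len(self.servers)):
            server = next(self._cycle)
            if self.is_healthy(server):
                return server
        raise RuntimeError("no healthy servers available")
```

For example, with mirrors `["a", "b", "c"]` and `"b"` failing its health check, successive calls return `a, c, a, c, ...` — exactly the failover behaviour being described for the MS round-robin system.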
One question does come into play, though: how the heck do you do your backups?