
How come these forums are *so* fast? :)

Not like I'm complaining, but how?

         

picophd

2:25 pm on Nov 4, 2004 (gmt 0)

10+ Year Member



Greetings,

First of all, thanks a million to everybody working on this portal. This place is amazing. I'm so happy I found it.

I had posted the same CSS question in like 4 other forums, and I received my first answer HERE - even though I posted in those other forums *hours* before posting here.

Second, another thing that's making me fall in love with WebmasterWorld is the speed. Gosh, these forums are super fast. I frequent many other forums in other places, but I've never found something faster than this. So may I know the secret, please? I know it's not like you can teach me to develop the same thing by replying to this post, but I mean what technologies exactly are used to produce such results? :)

Thank you, everyone.

mattglet

6:22 pm on Nov 4, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Awesome database indexing/structure, and super efficient code. That's the key to any fast site.

freeflight2

4:02 pm on Nov 6, 2004 (gmt 0)

10+ Year Member



no unnecessary junk (icons etc.) and mod_gzip helps a lot, too.

picophd

4:29 pm on Nov 6, 2004 (gmt 0)

10+ Year Member



mod_gzip?

freeflight2

8:27 pm on Nov 6, 2004 (gmt 0)

10+ Year Member



Modern browsers such as IE, Firefox, etc. accept compressed pages - a compressed (gzipped) Google home page is only a couple hundred bytes and even fits into a single TCP packet, which makes a huge difference... even on a 100k HTML page you often save 60-80%.
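The kind of savings described above is easy to demonstrate with any gzip implementation; here is a minimal sketch in Python (the sample HTML is made up for illustration, not an actual WebmasterWorld page):

```python
import gzip

# Hypothetical, repetitive HTML standing in for a typical forum page.
html = ("<tr><td class='post'>How come these forums are so fast?</td></tr>\n"
        * 2000).encode("utf-8")

compressed = gzip.compress(html)
ratio = 100 * (1 - len(compressed) / len(html))
print(f"{len(html)} bytes -> {len(compressed)} bytes ({ratio:.0f}% saved)")
```

Highly repetitive markup like this compresses far better than the 60-80% quoted above; real pages land somewhere in between.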

picophd

12:09 am on Nov 7, 2004 (gmt 0)

10+ Year Member



I searched for mod_gzip online. Is it only available for Apache servers? Are there any alternative technologies for SQL servers, for instance?

ThomasB

1:31 am on Nov 7, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Just browse the forums and post in the right one and I'm pretty sure somebody will know. If it's a MS SQL Server you might try [webmasterworld.com...]

submitx

1:38 am on Nov 7, 2004 (gmt 0)

10+ Year Member



Last time I spoke to Brett (the creator of WW), he mentioned WW is an entirely flat-file, text-based database, which would translate to slow. That is why there is no search functionality. So I don't know how you find it to be fast. Maybe you are referring to responses being fast... in which case it has to do with having many members.

freeflight2

1:39 am on Nov 8, 2004 (gmt 0)

10+ Year Member



Well, to give you an impression of what's possible: one of my sites does 150M dynamic PHP pageviews/month, all database-driven (10M+ posts, 800k members), on 6 dual-Xeon boxes. Wikipedia serves 1B+ pages/month on a dozen+ servers (some 64-bit with 8-16G RAM); Wikipedia is really impressive, since every word of every article has to be analyzed for possible references (at least once). The key (for them and other high-traffic sites) is caching: users not logged in (about 80%) all get the same page directly from server RAM. The most effective SQL solution for 'forum-like' sites is currently MySQL/InnoDB; MSSQL/Oracle is slower (but also much more powerful / feature-rich).
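A minimal sketch of the caching idea described above - all logged-out visitors get the same rendered page straight from memory - might look like this (render_page and the TTL value are hypothetical stand-ins, not any real site's code):

```python
import time

page_cache = {}    # url -> (rendered_html, timestamp), held in RAM
CACHE_TTL = 60     # seconds to keep serving the same rendered page

def render_page(url):
    # Hypothetical slow path: pretend this hits the database and templates.
    return f"<html>content for {url}</html>"

def get_page(url, logged_in):
    if logged_in:
        return render_page(url)          # personalized pages are never cached
    now = time.time()
    cached = page_cache.get(url)
    if cached and now - cached[1] < CACHE_TTL:
        return cached[0]                 # served straight from memory
    html = render_page(url)
    page_cache[url] = (html, now)
    return html
```

With ~80% of traffic anonymous, most requests never touch the database at all, which is the whole point.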

picophd

4:02 am on Nov 8, 2004 (gmt 0)

10+ Year Member



Ohh, I always loved accuracy, details, and numbers! :)

Thank you, freeflight.

Could you also tell me about this "compressed pages" thing? Is it only for high-end websites with heavy traffic, or do normal websites use it sometimes too? How expensive is it, or how technically hard to administer, for a forum-oriented website without a big budget?

Brett_Tabke

4:14 am on Nov 8, 2004 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



> what technologies exactly are used to produce such results?

FlatFiles. They still smoke the doors off every db system out there when the record location is known (e.g. from the URL). They suck at searching, but excel at high-speed retrieval.
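The flat-file approach Brett describes can be sketched roughly like this - the directory layout and naming here are hypothetical, not WebmasterWorld's actual scheme; the point is that when the URL encodes the record's location, a "lookup" is one open and one read, with no index traversal at all:

```python
import os
import tempfile

# Hypothetical storage root for the sketch.
DATA_DIR = tempfile.mkdtemp()

def save_thread(forum, thread_id, text):
    path = os.path.join(DATA_DIR, forum)
    os.makedirs(path, exist_ok=True)
    with open(os.path.join(path, f"{thread_id}.txt"), "w") as f:
        f.write(text)

def load_thread(forum, thread_id):
    # A URL like /forum3/26241.htm maps straight to one file:
    # a single open + read, no database lookup needed.
    with open(os.path.join(DATA_DIR, forum, f"{thread_id}.txt")) as f:
        return f.read()
```

The trade-off is exactly the one discussed in this thread: retrieval by known key is as fast as the filesystem, but searching across records means scanning files.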

picophd

4:19 am on Nov 8, 2004 (gmt 0)

10+ Year Member



Would you know why WebmasterWorld went for the choice that compromises search capabilities for the sake of superior speed of delivery? I would just automatically assume that many people want to find answers to questions that were already asked and answered before they post the same questions again.

Brett_Tabke

12:53 pm on Nov 8, 2004 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Let the web's great search engines (Yahoo, Google, MSN) do the searching and we will stick to the community building.

To paraphrase the great American talk show host who had a long career in the middle of massive competition: "Do one thing and do it better than anyone else. Don't worry about what the other guys are doing - take care of what you do best and everything else will take care of itself." - Johnny Carson 1969.

freeflight2

6:10 pm on Nov 8, 2004 (gmt 0)

10+ Year Member



> FlatFiles. They still smoke the doors off every db system
Berkeley DB is fast, yes... but nowadays, with modern dual CPUs, the bottleneck is usually disk I/O: at 500 SQL queries/sec, the main DB server here, with 6 SCSI disks, is 60% idle while the disks are at 85% utilization. Most (if not all) high-traffic sites run into exactly the same issue (and after that, the internal network is next).
SQL servers have a lot of advantages even for simple queries:
- replication: write to one master, read from n slaves or even have a cluster solution do everything for you
- "hot backups": create a consistent snapshot of all data
- have DB automatically repair tables after crash
- query caching: get results directly from RAM... some big sites serve 80%+ of all pages directly from server memory.
- scalable, highly available: have multiple webservers read from several SQL servers... if one of them goes down, no big deal. Also, it's more efficient to have one server dedicated to SQL (data retrieval) and another for Java, Perl, PHP or whatever, to utilize the CPU caches as much as possible.
- querylog / profiling
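The replication point above - write to one master, read from n slaves - is often implemented as a thin routing layer in the application. A hedged sketch (the connection objects are hypothetical stand-ins for real DB handles):

```python
import random

class ReplicatedDB:
    """Route writes to the master and spread reads over replicas."""

    def __init__(self, master, slaves):
        self.master = master
        self.slaves = slaves

    def execute(self, sql, *params):
        first_word = sql.lstrip().split()[0].upper()
        if first_word in ("SELECT", "SHOW"):
            conn = random.choice(self.slaves)   # any replica can serve reads
        else:
            conn = self.master                  # all writes hit the master
        return conn.run(sql, params)
```

If a replica goes down for backup or maintenance, reads simply route elsewhere, which is the availability argument made in this post.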

> tell me about this "compressed pages" thing?
http://sourceforge.net/projects/mod-gzip/

you can also do it 'by hand' - in a mod_perl environment it would look like this (when the client sends 'Accept-Encoding: gzip'):

use Compress::Zlib;

$html = Compress::Zlib::memGzip($html);
$r->header_out('Content-Encoding', 'gzip');
$r->header_out('Content-Length', length($html));
$r->send_http_header();
$r->print($html);

Easy!

Brett_Tabke

6:57 pm on Nov 8, 2004 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



SQL/MySQL is slow (imho). We would not be able to run this system as it is under SQL - it would smoke the box.

The bottleneck on a good db has little to do with disk i/o speed and everything to do with the efficiency of the lookup algo.

I stick to flat files because file systems are the oldest and most efficient database system available - we let it do the work.

The slowest part of the system here is the actual perl that runs the show. C or ideally ML is the only way to go faster.

freeflight2

7:42 pm on Nov 8, 2004 (gmt 0)

10+ Year Member



Hmmm, I don't know about WebmasterWorld's internals, but SQL tables with indexes take only 1 or 2 disk seeks per lookup if the data/index is not in memory, and almost no CPU at all... sorting/grouping takes most of the CPU (*).
MySQL can execute a couple hundred, or perhaps even 1000, "give me all posts for thread_id=x" queries per second - even on 50-100M rows.
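The indexed lookup being described can be tried directly with SQLite; the schema and row counts below are made up for illustration, but the mechanism - the index narrows a "give me all posts for thread_id=x" query to just the matching rows - is the same one the post is claiming for MySQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY,"
             " thread_id INTEGER, body TEXT)")
conn.execute("CREATE INDEX idx_thread ON posts(thread_id)")

# 10,000 hypothetical posts spread over 100 threads.
conn.executemany("INSERT INTO posts (thread_id, body) VALUES (?, ?)",
                 [(n % 100, f"post {n}") for n in range(10_000)])

# The index takes this straight to thread 42's rows.
rows = conn.execute("SELECT body FROM posts WHERE thread_id = ?",
                    (42,)).fetchall()
print(len(rows))  # 100 posts in thread 42, found via the index
```

Without the index, the same query would scan all 10,000 rows; with it, only the 100 matching rows are touched.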

perl: precompiled Perl (e.g. mod_perl) / precompiled PHP is amazingly fast... I don't think C would be that much faster (unless you don't need Perl's great features such as hashes/regexes etc.). Per dedicated Perl/PHP dual-Xeon 2.8 box, I am able to serve 65-70 fully dynamic + gzipped SQL page requests per second here (both mod_perl and PHP, with much more 'junk' (userpics, friends etc.) than WebmasterWorld has). High-traffic forums do about the same; Wikipedia a little less per box, but overall even more thanks to intelligent Squid proxy caching.

(*) actually, that might have been the issue here, since you have to sort through many old threads on the index pages

adni18

5:01 pm on Nov 25, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I don't get it.

Brett_Tabke

2:05 am on Nov 26, 2004 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



> SQL tables with indexes take only 1 or 2 disk seeks

That is *AFTER* it finds the db to begin with (which may or may not be cached).

With flat files - there is 1 read and that's it.

freeflight2

5:51 am on Nov 26, 2004 (gmt 0)

10+ Year Member



well, I hope you are going to release BestDB some day ;) it's probably faster than any SQL DB, I agree.
The nice thing about SQL is also that you can distribute replicating instances over several datacenters; if a DB server goes down (for backup/maintenance/profiling), it's no big deal. Also, I've noticed that WebmasterWorld often 'hangs'(?) for a couple of seconds after posting/editing an entry. A setup of master/slave DBs, with all writes going to the master and all reads from the slave(s), usually does not run into locking issues/delays.