Welcome to WebmasterWorld Guest from 184.108.40.206
I have an XML file (DMOZ download/content.rdf.u8 file) and PHP code that does a relatively raw import into two tables. First table is two columns no indexes , second table is 5 columns and no indexes. I parse through the XML file and when I get data that I can insert I kick off an insert statement and continue. This is the test functionality that I am trying to verify/believe.
I have an UltraII Sun box with 192 meg of ram and a scsi 25 Gig external disk drive with two 166 mhz processors. No other traffic or activity on the machine beyond my test.
I run my php code pointing it to the mysql database and inserting 200,000 rows in the 5 field table and 14,000 rows in the 2 field table.
Completetion time in average of 9 seconds.
(same as above)
I run my php code pointing it to the postgresql database and inserting 200,000 rows in the 5 field table and 14,000 rows in the 2 field table.
Completetion time in average of 23 seconds.
From my extremely simple monitoring techniques (top, mpstat, iostat ) I was able to determine that both processes utilitize one processer 100%. MySql uses manly user processing time and Postgres uses system process time mostly. I don't know if there is a different between the two in priority or anything but the processor was at 100% the whole time and none of it was IO.
My Question, from other peoples experience. Is MySql truely that much better? I know Postgresql has followed SQL standards almost to the T and they have transaction control which is an extremely big plus but speed is were the money is. If it takes me a little longer to code the same functionality in MySql, I only have to do that once, I will be inserting this data every week and queries will hopefully be going on 24/7 if I do this right. I have also read other studies that show MySql crawling after about 10 years and postgresql staying steady at 100. Again another plus if I needed the functionality.
Any thoughts, ideas, experiences can only help. Thanks.
Not to disprove your theory though. I really do appreciate the feedback. I started searching the net again, and again, and again...... I found some hints that brought postgresql insert time down to 13 seconds. For anybody using Postgresql. These are the settings I changed in the postgresql.conf file in your data directory.
tcpip_socket = true
port = 5432
hostname_lookup = false
show_source_port = false
shared_buffers = 256
sort_mem = 512
vacuum_mem = 8192
wal_files = 32
commit_delay = 50000
commit_siblings = 50
fsync = false
enable_seqscan = true
enable_indexscan = true
enable_tidscan = true
enable_sort = true
enable_nestloop = true
enable_mergejoin = true
enable_hashjoin = true
effective_cache_size = 1000
geqo_pool_size = 1024
log_timestamp = true
Since I was extremely close to the automated uninstall button I didn't really bother reading were I was at or what these changes accomplished. Do not take me posting these as "best practice" settings. I know a couple of them are extremely dangerous and will be changed back if I do get a production site going. The fsync = false option in particular is probably what gave me the most for my mileage. It also will destroy a database if I don't properly do a shut down on the database server specificaly. Just rebooting without a database shutdown could corrupt the database without a proper fsync. YOU HAVE BEEN WARNED.
Visit the Transaction Performace Council at [tpc.org...]
The speed of a database is really in the hands of a good DBA.
TeddyBare69: another consideration, holding all other parts equal, is the database driver for postgres is simply not as well written as the one for MySQL in PHP. You should try the exact same test and take php out of the loop, just write a big sql statement and redirect it into the respective DB's and see what the results are like.
Though you are on the mark re:stored procedures. It also lacks triggers, referential integrity. With the release of 4.0, the UNION statement is now supported and they've laid the foundation for stored procedures in upcoming releases.
As for the development environment, MySQL has gained its popularity as a web based db and pretty much any web language can access and use MySQL though PHP seems to be the tool of choice. So you get to choose the development environment which offers you the most robust tools for the job you're trying to accomplish. I've never used it with anything but PHP and Cold Fusion. I didn't even know it had it's own dev environment.
But, you'll grant me that for an open source db, the cost and support are phenominal! And if it keeps growing at it's current rate, MySQL will be a serious contender against the current big boys.
MySQL has its strengths, and I believe in the upcoming years (I will not say months because it took them forever to release 4.0), it will be a contender. Shoot, I even run MySQL for small databases and for clients. But if I were developing a database intensive product in a high traffic site I would never consider it, it would not stand water against the big boys.
You are correct on the replication I see. I am impressed it has that support.
However, at the same time it is nice to have transaction support. I know the MySQL with InnoDB can do transactions, but I don't consider InnoDB has a mature/complete product, and I still have not tested it on a production site. I know it is possible to "simulate" transactions using table locks in MySQL, but I think transaction is really useful in application development. Imagine in one of the heavy-weight subroutine, 20 levels down the call-stack, and 10 different database tables have been updated, then there is an uncaught exception... It is just much easier to do a rollback then redo all the changes.
I have not tried PostgreSQL recently, especially all the exciting new updates in 7.x. However, I have been using Interbase/Firebird in some of the newer projects, and it is also a great, mature and reliable ACID-compliant database.