PHP and MySQL/PostgreSQL
Help validating simple test results
This is mainly a database question, but I believe the results are being affected by PHP's support for PostgreSQL, or the lack thereof.
I have an XML file (the DMOZ download/content.rdf.u8 file) and PHP code that does a relatively raw import into two tables. The first table has two columns and no indexes; the second has five columns and no indexes. I parse through the XML file, and whenever I have a complete record I fire off an INSERT statement and continue. This is the test whose results I am trying to verify/believe.
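Roughly, the import loop works like this. A minimal Python/sqlite3 sketch of the same row-at-a-time pattern (the real code is PHP against MySQL/PostgreSQL, and the table layouts here are just assumptions for illustration):

```python
import sqlite3

# Stand-ins for the two DMOZ tables: one with 2 columns, one with 5, no indexes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE topics (topic_id INTEGER, name TEXT)")
conn.execute("""CREATE TABLE links
                (topic_id INTEGER, url TEXT, title TEXT,
                 description TEXT, added TEXT)""")

def import_row(table, row):
    """One INSERT per parsed record, fired as soon as the parser
    has a complete row -- the same pattern the PHP test uses."""
    placeholders = ",".join("?" * len(row))
    conn.execute(f"INSERT INTO {table} VALUES ({placeholders})", row)

# The XML parser would call import_row() as it walks content.rdf.u8.
import_row("topics", (1, "Computers"))
import_row("links", (1, "http://example.com", "Example", "A site", "2002-01-01"))
conn.commit()
```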
I have a Sun Ultra II box with two 166 MHz processors, 192 MB of RAM, and a 25 GB external SCSI disk drive. There is no other traffic or activity on the machine beyond my test.
I run my PHP code against the MySQL database, inserting 200,000 rows into the 5-field table and 14,000 rows into the 2-field table.
Completion time averages 9 seconds.
(same as above)
I run my PHP code against the PostgreSQL database, inserting 200,000 rows into the 5-field table and 14,000 rows into the 2-field table.
Completion time averages 23 seconds.
From my extremely simple monitoring (top, mpstat, iostat) I was able to determine that both processes utilize one processor at 100%. MySQL spends mostly user time, while PostgreSQL spends mostly system time. I don't know if there is a difference in priority or anything, but the processor was at 100% the whole time and none of it was I/O wait.
My question, from other people's experience: is MySQL truly that much better? I know PostgreSQL follows the SQL standard almost to a T, and it has transaction control, which is an extremely big plus, but speed is where the money is. If it takes me a little longer to code the same functionality in MySQL, I only have to do that once; I will be inserting this data every week, and queries will hopefully be running 24/7 if I do this right. I have also read studies showing MySQL slowing to a crawl at around 10 concurrent users while PostgreSQL stays steady at 100. Again, another plus if I need that capacity.
Any thoughts, ideas, experiences can only help. Thanks.
Yes, MySQL is that much faster. I switched from Oracle to MySQL because of the speed issues. MySQL is the fastest database out there. I have used many, but it all comes back to MySQL.
Rock on MySQL!
I just love your enthusiasm. I guess I asked for that by bringing such a topic to a forum. I do agree with you, though: MySQL is really good.
Not to dismiss your point, though; I really do appreciate the feedback. I started searching the net again, and again, and again... and I found some hints that brought the PostgreSQL insert time down to 13 seconds. For anybody using PostgreSQL, these are the settings I changed in the postgresql.conf file in the data directory.
tcpip_socket = true
port = 5432
hostname_lookup = false
show_source_port = false
shared_buffers = 256
sort_mem = 512
vacuum_mem = 8192
wal_files = 32
commit_delay = 50000
commit_siblings = 50
fsync = false
enable_seqscan = true
enable_indexscan = true
enable_tidscan = true
enable_sort = true
enable_nestloop = true
enable_mergejoin = true
enable_hashjoin = true
effective_cache_size = 1000
geqo_pool_size = 1024
log_timestamp = true
Since I was extremely close to the automated uninstall button, I didn't really bother reading up on what each of these changes accomplishes. Do not take this post as "best practice" settings. I know a couple of them are extremely dangerous and will be changed back if I ever get a production site going. The fsync = false option in particular is probably what bought me the most mileage. It will also destroy a database if I don't do a proper shutdown of the database server; just rebooting without shutting the database down first could corrupt it, since nothing was ever fsynced to disk. YOU HAVE BEEN WARNED.
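A safer alternative to fsync = false, for what it's worth, is wrapping the whole batch of INSERTs in a single transaction: without that, PostgreSQL treats every standalone INSERT as its own transaction and flushes it to disk individually. A sketch of the idea in Python/sqlite3 (the table name and row data are made up; the pattern applies to any ACID database):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.isolation_level = None  # manage transactions explicitly
conn.execute("CREATE TABLE links (topic_id INTEGER, url TEXT)")

rows = [(i, f"http://example.com/{i}") for i in range(1000)]

# One BEGIN/COMMIT around the whole batch: the server only has to
# flush to disk once at COMMIT instead of once per INSERT.
conn.execute("BEGIN")
for row in rows:
    conn.execute("INSERT INTO links VALUES (?, ?)", row)
conn.execute("COMMIT")
```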
I think it is going a little overboard to claim MySQL is the fastest database. It might be the fastest for small databases, but it stands no chance against Microsoft SQL Server or Oracle in medium to large databases. It also lacks many key features of high-availability environments, such as replication and failover. And from a development standpoint, it lacks views, stored procedures, and any kind of bulk-loading feature.
Visit the Transaction Processing Performance Council at [tpc.org...]
The speed of a database is really in the hands of a good DBA.
TeddyBare69: another consideration, all other things being equal, is that the PostgreSQL driver in PHP is simply not as well written as the one for MySQL. You should try the exact same test with PHP taken out of the loop: just write out one big SQL file and redirect it into the respective DBs, and see what the results look like.
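Generating such a file is easy enough from any script. A rough Python sketch (file name, table, and rows are made up; once written, the file could be fed straight to the command-line clients, e.g. `mysql dbname < bulk.sql` or `psql dbname -f bulk.sql`, with no PHP driver in the path):

```python
# Write one big file of INSERT statements, one per line.
rows = [(i, f"http://example.com/{i}") for i in range(3)]

def quote(v):
    # Minimal SQL quoting for the sketch; real data needs proper escaping.
    if isinstance(v, int):
        return str(v)
    return "'" + str(v).replace("'", "''") + "'"

with open("bulk.sql", "w") as f:
    for row in rows:
        values = ", ".join(quote(v) for v in row)
        f.write(f"INSERT INTO links VALUES ({values});\n")
```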
That's not entirely true from what I've read. MySQL does support replication:
Though you are on the mark re: stored procedures. It also lacks triggers and referential integrity. With the release of 4.0, the UNION statement is now supported, and they've laid the foundation for stored procedures in upcoming releases.
As for the development environment, MySQL has gained its popularity as a web-backend database, and pretty much any web language can access and use it, though PHP seems to be the tool of choice. So you get to choose the development environment that offers you the most robust tools for the job you're trying to accomplish. I've never used it with anything but PHP and ColdFusion; I didn't even know it had its own dev environment.
But you'll grant me that for an open-source database, the cost and support are phenomenal! And if it keeps growing at its current rate, MySQL will be a serious contender against the current big boys.
That's not what I meant by development environment. I meant development in the sense of actual programmers coding against it and producing database-intensive applications.
MySQL has its strengths, and I believe in the coming years (I will not say months, because it took them forever to release 4.0) it will be a contender. Shoot, I even run MySQL for small databases and for clients. But if I were developing a database-intensive product for a high-traffic site, I would never consider it; it would not hold water against the big boys.
You are correct on the replication I see. I am impressed it has that support.
I too have experience with both databases (MySQL and PostgreSQL). A while ago I needed to analyze some data in an Excel spreadsheet, so I dumped it to CSV and then imported it into the database. It was around 100,000 rows with 30-40 fields per row, and some fields were big varchars. In MySQL the import took around 30 seconds (using the mysql client); in PostgreSQL 7.0 (the latest back then), using the psql client, it took nearly 5 minutes to complete! MySQL is simply the fastest RDBMS out there for raw insertion speed, as well as for simple SELECT statements. It is also quite light on memory usage and fast to connect, since it uses a thread per client instead of PostgreSQL's process per client. I myself use MySQL on my sites that need a simple backend database, and most PHP apps out there use MySQL by default.
However, at the same time it is nice to have transaction support. I know MySQL with InnoDB can do transactions, but I don't consider InnoDB a mature/complete product yet, and I still have not tested it on a production site. I know it is possible to "simulate" transactions using table locks in MySQL, but I think real transactions are genuinely useful in application development. Imagine being in a heavyweight subroutine, 20 levels down the call stack, with 10 different database tables already updated, when an uncaught exception hits... It is just much easier to do a rollback than to undo all the changes by hand.
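The rollback scenario above, sketched in Python with sqlite3 (the table and balances are made up; any ACID-compliant database behaves the same way):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('a', 100), ('b', 0)")
conn.commit()

try:
    with conn:  # opens a transaction; commits on success, rolls back on error
        conn.execute("UPDATE accounts SET balance = balance - 50 WHERE name='a'")
        conn.execute("UPDATE accounts SET balance = balance + 50 WHERE name='b'")
        raise RuntimeError("uncaught exception 20 levels down the call stack")
except RuntimeError:
    pass  # every table touched in the transaction was rolled back automatically
```

After the exception, both UPDATEs are undone in one shot; without transactions, the application would have to track and reverse each change itself.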
I have not tried PostgreSQL recently, especially with all the exciting new updates in 7.x. However, I have been using InterBase/Firebird in some of my newer projects, and it is also a great, mature, and reliable ACID-compliant database.