Page is a not externally linkable
decibel - 8:11 pm on Oct 10, 2005 (gmt 0)
PostgreSQL uses Multi-Version Concurrency Control instead of locking to provide ACIDity. This means that every tuple (row) in a table has data that indicates what transactions are allowed to 'see' that tuple. So when you delete a tuple (and update is actually the same as insert/delete), the tuple isn't immediately removed. Instead, it's marked "deleted as of transaction #so-and-so". Of course, the downside to this is now you have a tuple in the database that will eventually be doing nothing but taking up space. This is where vacuum comes in. It scans through the table and when it runs across a deleted tuple, it checks to see if any transactions currently running can still "see" that tuple. If none can, it will remove the tuple, reclaiming the disk space. Of course, that's a high-level overview. When it comes to vacuuming, your best bet is to use pg_autovacuum (versions before 8.1), or to enable the built-in autovacuum in 8.1. This will handle 90+% of users in a painless manner. Note that pg_autovacuum will exit if the database is stopped, so you might want to use the scripts at [cvs.distributed.net...] You can also throttle vacuum so that it doesn't clobber your server while it's running. Something else to consider is that the out-of-box configuration for PostgreSQL is *very* conservative. It's meant so that you've got the best possible odds of being able to start the database, even if you're on an old 486. Of course this also means that there's some big performance gains to be had from tweaking a few parameters. -- [edited by: volatilegx at 9:22 pm (utc) on Oct. 10, 2005]
Before I go into what this probably means for MySQL, I'll answer some of the PostgreSQL questions:
Jim Nasby, Sr. Engineering Consultant
[edit reason] fixed sig [/edit]