Page is a not externally linkable
linear - 2:37 am on Sep 21, 2005 (gmt 0)
You say: The database guy hears: You say: The database guy hears: You say: The database guy hears: You say: Database guy hears: ...and so on. The main thing is that you need to be able to describe patterns of usage, and service level expectations. The database individual should be recommending specifics based on those requirements. Anyone who's been at this a while will have fallen into a lot of traps and be eager to avoid them in the future.
One thing I want to throw in that I haven't seen come up yet: you should be prepared to discuss the usage patterns with the database guy. Some examples may help.
"This system is primarily driven by commercial transactions. It needs to handle an average load of 20 transactions per minute, and may peak to ten times that volume. The users are expecting instantaneous confirmation of transactions, so response time is a consideration."
"The usage pattern is biased toward INSERTs. INSERT performance under load will be a critical performance metric."
"This system is mainly for content management. A team of six authors will add articles on a relatively infrequent basis using a set of screens built for the purpose. Catalog pages and articles will be built out of the database dynamically."
"Usage biased toward SELECT statements. INSERT will be comparatively rare and not performance-critical since it's not public-facing."
"We import between ten and twenty thousand customer records each night from a different system. That system finishes its batch job between midnight and oh two hundred most nights. Users expect the system to be up by oh seven hundred. Once a quarter we update all 1.2 million customer records with a different batch feed. We can run that job over a weekend, but we have to lock users out of seeing their balances until it's done."
"Batch loading performance is normally important but not critical, and may be a problem for the quarterly loads. This has specific index implications for the tables holding the relevant data."
"We need to keep a five year history of all transactions for regulatory reasons. We aren't allowed to aggregate those. The users expect to be able to query aginst history with reasonable performance, but it isn't something they do daily."
"At least one table is going to grow absurdly large. Without careful attention to the history queries, performance is headed for the dumpster. Backups and restores will take forever."