Forum Moderators: open
Here's the problem I'm facing: I have a table that has about 20 selects per second. Every hour, about 5 vendors (it's a e-commerce website) uploads additions to their inventory, inserting about 5.000 records each one and taking some precious 2-3 seconds while I get 40-60 queries sitting there and waiting to be run. (so, the next second I have 60-80 queries trying to run at the same time)
The 20 queries/second themselves are already a very intense load. I'd say if it was only them, we'd be at 60% of what we can expect to get from this server we're in. But when the inserts happen, our query time really climbs from 50 ms average to more than 2 seconds sometimes. And the mess takes many seconds to be solved by itself.
We're talking about a 3 million row CD/DVD inventory table, with many fulltext indexes (one for artist, other for album name, other for description and another one for all of them). So our inserts certainly are paying the price for that. But the fact is that we really have to keep all these indexes.
Engine: MyISAM
Version: 10
Row_format: Dynamic
Rows: 2853361
Avg_row_length: 244
Data_length: 696384596
Max_data_length: 281474976710655
Index_length: 876121088
Data_free: 0
+----------+------------+----------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+
¦ Table ¦ Non_unique ¦ Key_name ¦ Seq_in_index ¦ Column_name ¦ Collation ¦ Cardinality ¦ Sub_part ¦ Packed ¦ Null ¦ Index_type ¦ Comment ¦
+----------+------------+----------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+
¦ inventory ¦ 0 ¦ PRIMARY ¦ 1 ¦ inv_id ¦ A ¦ 2853363 ¦ NULL ¦ NULL ¦ ¦ BTREE ¦ ¦
¦ inventory ¦ 1 ¦ index_user_id ¦ 1 ¦ user_id ¦ A ¦ 5907 ¦ NULL ¦ NULL ¦ YES ¦ BTREE ¦ ¦
¦ inventory ¦ 1 ¦ index_genre ¦ 1 ¦ genre ¦ A ¦ 75 ¦ NULL ¦ NULL ¦ YES ¦ BTREE ¦ ¦
¦ inventory ¦ 1 ¦ index_ativo ¦ 1 ¦ ativo ¦ A ¦ 3 ¦ NULL ¦ NULL ¦ YES ¦ BTREE ¦ ¦
¦ inventory ¦ 1 ¦ index_label ¦ 1 ¦ label ¦ A ¦ 45291 ¦ 5 ¦ NULL ¦ YES ¦ BTREE ¦ ¦
¦ inventory ¦ 1 ¦ index_inv_id_user_id ¦ 1 ¦ user_id ¦ A ¦ 5907 ¦ NULL ¦ NULL ¦ YES ¦ BTREE ¦ ¦
¦ inventory ¦ 1 ¦ index_inv_id_user_id ¦ 2 ¦ inv_id ¦ A ¦ 2853363 ¦ NULL ¦ NULL ¦ ¦ BTREE ¦ ¦
¦ inventory ¦ 1 ¦ index_author ¦ 1 ¦ author ¦ NULL ¦ 1 ¦ NULL ¦ NULL ¦ ¦ FULLTEXT ¦ ¦
¦ inventory ¦ 1 ¦ index_title ¦ 1 ¦ title ¦ NULL ¦ 1 ¦ NULL ¦ NULL ¦ YES ¦ FULLTEXT ¦ ¦
¦ inventory ¦ 1 ¦ index_authoretitle ¦ 1 ¦ author ¦ NULL ¦ 1 ¦ NULL ¦ NULL ¦ ¦ FULLTEXT ¦ ¦
¦ inventory ¦ 1 ¦ index_authoretitle ¦ 2 ¦ title ¦ NULL ¦ 1 ¦ NULL ¦ NULL ¦ YES ¦ FULLTEXT ¦ ¦
¦ inventory ¦ 1 ¦ index_descr ¦ 1 ¦ descr ¦ NULL ¦ 1 ¦ NULL ¦ NULL ¦ YES ¦ FULLTEXT ¦ ¦
+----------+------------+----------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+
I've tried LOAD DATA INFILE, but the insert didn't get much faster.
I am now thinking about some approach that deals with some temporary tables. For example: having the data inserted to a temporary table INVENTORY_TEMP and them doing only one insert per hour on the production 'inventory' table.
This way I'd tell my vendors: "Hey, ok, your inventory table was read, but the data will only show up at the searches at the next hour, ok?"
But I've done some testing, and the
INSERT INTO INVENTORY SELECT * FROM INVENTORY_TEMP;
doesn't get any faster than the sum of the inserts I would have avoided, though. (I thought that transfering from one table to another could somehow be faster, with mysql somehow merging the already built indexes. But it seems it doesn't do that)
I know INNODB's row-level lock would help not to make readers wait untill the 2 second write is done. But the fact is that I need to keep the fulltext indexes - and innodb doesn't support them, as you know. (Then I thought: "Mmm, what if I had one MYISAM table for the searches (reads) and other INNODB for the writes. But, despite the initial feeling, I couldn't go much further with the concept. Ended up thinking it was a stupid irrelevant idea I had.)
Any ideas, folks? Has anyone ran into this problem and found a good strategy?
Thanks a lot
Phoenix
PS: I'm running mysql 5 on RHEL 5.
0) [only necessary first time] Create a table called inventory_staging which is a copy of inventory
1) Add new info to inventory_staging
2) Copy inventory_staging to inventory_staging_bk
3) SWAP the two table names inventory and inventory_staging (see docs for howto)
4) DROP table inventory_staging
5) RENAME inventory_staging_bk to inventory_staging
This is just a theory but it's worth a shot and I think it should work as long as the RENAME op is as fast as I think it is.
[dev.mysql.com...]
[p.s. the docs do indicate that the move op is atomic!]
We had a similar problem, although it was ongoing- stat logging frequently caused slowdowns and timeouts all day long. Finally going to replication pretty much eliminated the issue overnight.
LifeinAsia might be onto a more solid solution though...
I've set up some new info at my logs, to take a closer look at how many records my vendors are inserting each time. And:
Good news: surprise, they are inserting way less than I thought.
Bad news: a one-row insert query is taking 300ms to complete!
Here's the log: (nrecs, query time, date)
+------+------+---------------------+
¦ recs ¦ elaps¦ log ¦
+------+------+---------------------+
¦ 2 ¦ 0.36 ¦ 2008-06-30 19:26:24 ¦
¦ 72 ¦ 0.44 ¦ 2008-06-30 19:33:44 ¦
¦ 1 ¦ 0.33 ¦ 2008-06-30 19:35:18 ¦
¦ 41 ¦ 0.43 ¦ 2008-06-30 19:36:51 ¦
¦ 5 ¦ 0.32 ¦ 2008-06-30 19:37:12 ¦
¦ 1 ¦ 0.33 ¦ 2008-06-30 19:45:43 ¦
¦ 200 ¦ 0.54 ¦ 2008-06-30 19:50:28 ¦
¦ 44 ¦ 0.35 ¦ 2008-06-30 19:52:39 ¦
¦ 3 ¦ 0.39 ¦ 2008-06-30 20:08:31 ¦
¦ 313 ¦ 1.29 ¦ 2008-06-30 20:13:03 ¦
¦ 2 ¦ 0.86 ¦ 2008-06-30 20:21:29 ¦
¦ 163 ¦ 1.88 ¦ 2008-06-30 20:29:30 ¦
¦ 50 ¦ 0.44 ¦ 2008-06-30 20:44:52 ¦
¦ 46 ¦ 0.48 ¦ 2008-06-30 21:39:25 ¦
¦ 0 ¦ 0.32 ¦ 2008-06-30 21:42:16 ¦
¦ 60 ¦ 0.39 ¦ 2008-06-30 21:42:41 ¦
¦ 27 ¦ 0.35 ¦ 2008-06-30 21:58:54 ¦
¦ 19 ¦ 0.36 ¦ 2008-06-30 22:39:27 ¦
¦ 259 ¦ 0.81 ¦ 2008-06-30 22:59:50 ¦
¦ 46 ¦ 0.36 ¦ 2008-06-30 23:18:24 ¦
¦ 30 ¦ 0.39 ¦ 2008-06-30 23:35:14 ¦
¦ 36 ¦ 0.82 ¦ 2008-07-01 00:25:49 ¦
¦ 79 ¦ 0.75 ¦ 2008-07-01 01:52:30 ¦
¦ 200 ¦ 0.39 ¦ 2008-07-01 08:26:34 ¦
¦ 120 ¦ 0.35 ¦ 2008-07-01 08:27:06 ¦
¦ 80 ¦ 0.35 ¦ 2008-07-01 08:37:46 ¦
¦ 10 ¦ 0.31 ¦ 2008-07-01 08:46:28 ¦
¦ 7 ¦ 0.33 ¦ 2008-07-01 08:52:15 ¦
¦ 80 ¦ 0.34 ¦ 2008-07-01 08:57:19 ¦
¦ 1 ¦ 0.31 ¦ 2008-07-01 09:09:01 ¦
¦ 59 ¦ 0.41 ¦ 2008-07-01 09:20:54 ¦
¦ 45 ¦ 0.37 ¦ 2008-07-01 09:21:38 ¦
¦ 61 ¦ 0.75 ¦ 2008-07-01 10:28:57 ¦
¦ 1 ¦ 0.32 ¦ 2008-07-01 10:36:16 ¦
¦ 2 ¦ 0.31 ¦ 2008-07-01 10:53:30 ¦
¦ 21 ¦ 0.33 ¦ 2008-07-01 10:54:33 ¦
¦ 30 ¦ 0.33 ¦ 2008-07-01 10:56:15 ¦
¦ 29 ¦ 0.35 ¦ 2008-07-01 11:25:05 ¦
¦ 117 ¦ 0.51 ¦ 2008-07-01 11:31:13 ¦
¦ 50 ¦ 0.48 ¦ 2008-07-01 11:45:31 ¦
¦ 7 ¦ 0.32 ¦ 2008-07-01 11:50:43 ¦
¦ 60 ¦ 0.44 ¦ 2008-07-01 11:52:33 ¦
¦ 30 ¦ 0.45 ¦ 2008-07-01 11:56:13 ¦
¦ 29 ¦ 0.40 ¦ 2008-07-01 11:58:30 ¦
¦ 3 ¦ 0.38 ¦ 2008-07-01 12:04:46 ¦
¦ 37 ¦ 0.34 ¦ 2008-07-01 12:18:04 ¦
¦ 0 ¦ 0.42 ¦ 2008-07-01 12:22:57 ¦
¦ 91 ¦ 0.33 ¦ 2008-07-01 12:24:11 ¦
¦ 31 ¦ 0.57 ¦ 2008-07-01 12:24:25 ¦
¦ 30 ¦ 0.35 ¦ 2008-07-01 12:39:04 ¦
¦ 10 ¦ 1.11 ¦ 2008-07-01 13:05:45 ¦
¦ 59 ¦ 0.36 ¦ 2008-07-01 13:16:16 ¦
¦ 200 ¦ 0.48 ¦ 2008-07-01 13:17:29 ¦
¦ 319 ¦ 1.98 ¦ 2008-07-01 13:27:33 ¦
¦ 30 ¦ 0.33 ¦ 2008-07-01 13:40:23 ¦
¦ 52 ¦ 0.34 ¦ 2008-07-01 13:47:01 ¦
¦ 301 ¦ 1.91 ¦ 2008-07-01 14:12:06 ¦
¦ 90 ¦ 0.85 ¦ 2008-07-01 14:13:14 ¦
¦ 10 ¦ 0.30 ¦ 2008-07-01 14:20:11 ¦
¦ 66 ¦ 0.34 ¦ 2008-07-01 14:20:16 ¦
¦ 19 ¦ 0.30 ¦ 2008-07-01 14:20:39 ¦
¦ 20 ¦ 0.43 ¦ 2008-07-01 14:33:11 ¦
¦ 240 ¦ 0.59 ¦ 2008-07-01 14:36:32 ¦
¦ 36 ¦ 0.33 ¦ 2008-07-01 14:44:37 ¦
¦ 175 ¦ 1.48 ¦ 2008-07-01 14:54:02 ¦
¦ 46 ¦ 1.75 ¦ 2008-07-01 15:13:10 ¦
¦ 35 ¦ 0.57 ¦ 2008-07-01 15:17:39 ¦
¦ 28 ¦ 0.32 ¦ 2008-07-01 15:20:09 ¦
¦ 109 ¦ 0.56 ¦ 2008-07-01 15:29:27 ¦
¦ 30 ¦ 0.47 ¦ 2008-07-01 15:34:22 ¦
¦ 25 ¦ 0.48 ¦ 2008-07-01 15:35:27 ¦
¦ 2 ¦ 0.33 ¦ 2008-07-01 15:44:28 ¦
¦ 1 ¦ 0.33 ¦ 2008-07-01 15:46:11 ¦
¦ 21 ¦ 0.37 ¦ 2008-07-01 15:52:04 ¦
¦ 119 ¦ 0.57 ¦ 2008-07-01 16:03:50 ¦
¦ 53 ¦ 0.37 ¦ 2008-07-01 16:48:14 ¦
¦ 30 ¦ 8.80 ¦ 2008-07-01 17:02:03 ¦
¦ 415 ¦ 4.72 ¦ 2008-07-01 17:05:02 ¦
¦ 10 ¦ 0.43 ¦ 2008-07-01 17:08:52 ¦
¦ 43 ¦ 0.44 ¦ 2008-07-01 17:13:32 ¦
+------+------+---------------------+
Is this normal? I mean, like we were discussing, I was thinking about some table swapping technique. But the point is: is it a really legitimate problem this one? If I had the insert demand I thought I had (5 thousand rows several times a day) ok, but do I have to use a technique like that because of so few rows inserts?
Is there something I shall do?
Thanks a lot folks!
Phoenix
PS: By the way:
I'm on a 2 Quad-core processor machine, with 16Gb ram, and RAID10 disks
And here's my my.cnf.
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
#old_passwords=1
skip-locking
key_buffer = 1024M
max_allowed_packet = 40M
table_cache = 2048
sort_buffer_size = 512M
read_buffer_size = 256M
read_rnd_buffer_size = 256M
myisam_sort_buffer_size = 256M
thread_cache_size = 16
thread_concurrency = 8
query_cache_size = 256M
tmp_table_size=64M
back_log = 100
max_connect_errors = 10000
join_buffer_size=64M
open-files = 20000
bulk_insert_buffer_size=128M
concurrent_insert=2
interactive_timeout = 600
wait_timeout = 600
max_connections=500
log-slow-queries=/var/lib/mysql/slow.log
long_query_time=1
max_heap_table_size=200M
ft_stopword_file=/etc/stopwords_pt.txt
ft_min_word_len=3
server-id = 1
¦ 1 ¦ 0.33 ¦ 2008-06-30 19:45:43 ¦
¦ 200 ¦ 0.54 ¦ 2008-06-30 19:50:28 ¦
I'm not sure what's causing this but it could be due to your indexes.
We're talking about a 3 million row CD/DVD inventory table, with many fulltext indexes
Perhaps you can increase the number of records added per insert since it seems that the re-index is the heaviest operation. Or you could use delayed key writing:
For MyISAM tables, another strategy for reducing index flushing is to use the DELAYED_KEY_WRITE table option. With this option, data rows are written to the data file immediately as usual, but the key cache is flushed only occasionally rather than after each insert. To use delayed index flushing on a server-wide basis, start mysqld with the --delay-key-write option.
#2 - I heard about DELAYED_KEY_WRITE option, but I didn't find out at the docs how to include it at the my.cnf. This is an command line option, right? Also, another thing that worries me about this is that updates are delayed too, and this can be bad, as we're a online store, and products may be sold twice with this delay.
#2 - More info on my.cnf and how to use command line type args in my.cnf
[dev.mysql.com...]