Forum Moderators: open

Message Too Old, No Replies

Help me choose a database

What database for large table with extremely frequent inserts/deletes?

         

Jonatan

1:15 pm on Mar 19, 2007 (gmt 0)

10+ Year Member



Hi,

I would like to have some tips on what database to use for an application that I'm currently working on.

I'm going to have a table where the data is quite temporary and changes a lot (from one day to another the rows may be totally different). The row will have the following fields: one primary id, three varchar fields of about 200 bytes each (they will be of about 50 bytes each on average), two other integers and three more char fields of about 20 bytes each. The 200 bytes varchar fields should be indexed and searchable, and one of the 20 bytes fields (a hash value) should be an index as well for fast access. Because of the temporary nature of the data in the table, it's not critical if data is lost or cleared in case of a crash.

I've estimated that with about 10k clients there will be about 10 million rows in the table (about 2-3 GB) at any given moment. With this traffic, there will be about 7 inserts and deletes each second with an average of 1000 rows inserted and deleted on each insert/delete. In other words 7000 rows will be inserted and deleted every second. At the same time there will be about 150 selects every second. The

This seems to be a quite heavy load, and therefore I've been thinking of using an in-memory database, since the data is not critical. But using an in-memory database raises a problem with the size of the database since I will need a server with a LOT of internal memory (the database can't be larger than the total amount of physical memory). An in-memory database would probably solve all processing performance problems, but the internal memory will set the limit. I'm not sure but I'm guessing that if using, for example MySQL, the limit will be the processing time of the inserts/deletes.

So I guess what I'm hoping for now, is to get input on what kind of database I could use that can handle this amount of traffic. How would MySQL perform with that kind of traffic? Anyone who has experience of some similar application?

Thanks a lot in advance!
/Jonatan

TheNige

1:10 am on Mar 20, 2007 (gmt 0)

10+ Year Member



Can you give any background on what this table will actually be storing and why you would be inserting/deleting 7000 rows per second?

Jonatan

11:08 am on Mar 20, 2007 (gmt 0)

10+ Year Member



I miscalculated in my previous message. There will be 0.7 inserts/deletes every second and each of those will insert or delete 1000 rows each. So there will be 700 rows inserted and deleted every second.

Since the application is not yet released I would prefer not to explain why so many rows need to be inserted/deleted every second, but I can assure you that it IS needed. And for my questions I really don't see any need to explain it either.

Thanks,
/Jonatan

TheNige

8:31 pm on Mar 20, 2007 (gmt 0)

10+ Year Member



Sorry for asking.

I'd recommend downloading different DBMS and trying them out by stress testing them yourself.

Jonatan

8:24 pm on Mar 21, 2007 (gmt 0)

10+ Year Member



Oh, no problem. I would like to do that, but I was hoping that someone knew if, for example, what I want to do is almost impossible with a database stored on the harddrive (like MySQL) and that I have to go for an in-memory database. Or maybe someone knows that this kind of load will be no problem at all?

If I should stress-test it myself I need to create a database table that contains data that is similar to the production case, and then I need to write a program/script that can "stress" the database with a lot of queries. I was hoping that I could avoid this, but maybe this is what I need to do.

Thanks again,
/Jonatan

txbakers

10:54 pm on Mar 21, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



mySQL can handle that without a problem, provided you handle the connection objects correctly.

Jonatan

11:44 am on Mar 22, 2007 (gmt 0)

10+ Year Member



What's the correct way to handle them? Would persistent connections in PHP (mysql_pconnect()) be a good way?

Also, what would you say are the hardware requirements for a server running MySQL with that kind of load?

Thanks a lot for the input!

/Jonatan

Jonatan

2:59 pm on Mar 25, 2007 (gmt 0)

10+ Year Member



Hi,

We have decided that we will probably go with a MySQL database that has a large in-memory (heap) table.

Anyone has an idea on what kind of server would be needed for the kind of load I described earlier using a MySQL in-memory (heap) table?

We are currently looking into purchasing a server that has two dual core AMD Opteron 270 (2 ghz, 64-bit) and 4 gb ram. Does anyone know if this configuration would be sufficient?

txbakers:
Any tip on how to "handle the connection objects correctly"?

Thanks for the help so far!

/Jonatan

[edited by: Jonatan at 3:02 pm (utc) on Mar. 25, 2007]

justageek

2:11 pm on Mar 26, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



We are currently looking into purchasing a server that has two dual core AMD Opteron 270 (2 ghz, 64-bit) and 4 gb ram. Does anyone know if this configuration would be sufficient?

That might be an overkill althought not knowing exactly what you're doing makes it tough to answer. I can say that on a setup close to what you have listed I can almost always get at least 100k questions per second in memory and sometimes I run as high as 272K per second.

With memory tables the important factors are how much memory, the memory speed and the cpu speed. So you'd get better performance if you used 1066 memory versus 667 or 533.

JAG

Clark

1:19 pm on Mar 27, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Just curious, will you have a problem when your machine reboots? Because naturally with an in-memory table, you lose the entire database in a crash.

justageek

3:32 pm on Mar 27, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You could run into trouble if you are not careful. But most times memory tables are tables than can be lost without causing that much pain.

For example, I have a small script that only loads about 10 Gig of data and less than 200 million rows at a time. I then process that data in the db and output my final results to a disk based table. It only takes about a minute or so to do what I need to do so if I have a db go down for whatever reason I only have to reload that last batch and move on.

In other words...the loss of data just means about a 2 minute loss in processing time. Nice tradeoff from disk based tables that would take me days to do the same processing.

JAG

Jonatan

5:29 pm on Mar 30, 2007 (gmt 0)

10+ Year Member



Hi,

Thanks a lot for the input, I really appreciate it!

It would be a little bit more painful for us to loose the data, it would probably take 30-60 minutes to gather the data again but if it doesn't happen more than maximum a couple of times a month we will be okay.

/Jonatan