|Selecting the top 10 results by count in MySQL|
Is this an efficient way of going about it?
I'm working with a rapidly growing dataset, and being somewhat of a beginner in MySQL, I'm concerned about making early mistakes :)
I have a MySQL database with several million rows, from which I want to select the top 10 occurrences of a particular item, by how often that item occurs in the database. So, I'm using the syntax below:
SELECT count(my_item), my_item
FROM data WHERE item_category='example_category'
GROUP BY my_item
ORDER BY count(my_item) DESC LIMIT 10;
Now this isn't slow, but it can take a good few seconds to complete when my_item is present in many thousands of rows. I can live with this, but I just want to get some advice on whether I'm doing this in an efficient way :)
What's the query time without LIMIT?
Count() is slowing the query time down when there is no index, do you have one on the table?
Also try explain + query to see what's going on.
Thanks for the reply :)
To follow your steps:
I don't really understand how I should be interpreting EXPLAIN, but it gives me the results below:
Extra Using where; Using temporary; Using filesort
I guess the problem is I need to index some of the affected columns?
Not a problem,
Ok so you don't have an index on my_item. In order to shorten the query time index columns used in WHERE and ORDER BY. If you are using phpMyAdmin, it's very easy to add an index, just few clicks...
It would be great if you created a backup of the table before making any changes, in case if something goes left.
I tried adding an index to first my_item then item_category, but that actually seems to be slower!?
Perhaps I just have too big a database for this server to handle quickly - I just want to make sure I'm not doing something basic incorrectly. Still, could be a pretty hefty hardware upgrade to take it down from 20 seconds to something more palatable.
It would seem to me that an index on `item_category` would be helpful.
|MySQL database with several million rows, from which I want to select the top 10 occurrences of a particular item |
One reaches a certain point where doing queries like this in real-time just isn't efficient anymore. Run a cronjob as often as you expect the rankings to change and store the results in a separate summary table.
I'm facing the same problems. Have you set innodb_file_per_table in your my.cnf?
When you say "a MySQL database with several million rows", you mean a table, or a set of tables? May be time to shard your big table(s) too? Would you post the table(s) structure(s)?
I'm a bit late, but...
I'm not sure how well the
ORDER BY clause works when you use a function, or what kind of optimizations or caching are done, but something worth trying might be to give the result of
COUNT() a name and then
ORDER BY that name.
SELECT COUNT( `my_item` ) AS `item_count` [...] ORDER BY `item_count` [...]