Welcome to WebmasterWorld Guest from 54.221.131.67

Forum Moderators: open

MySQL: grouping by column, but give 3 results per column

     
1:51 am on Jul 28, 2017 (gmt 0)

Preferred Member

Top Contributors Of The Month

joined:Mar 15, 2013
posts: 571
votes: 47


I have a message board, and the "subjects" table looks like this:

id
category
lastmodified
subject
firstname
lastname
size
shortdesc

Column "category" is a VARCHAR(15). There are currently 7 different potential categories, but it could grow to be more so I stuck with a VARCHAR (for now) instead of an ENUM.

What I'd like to do is select the 3 most recent rows (sorted by "lastmodified") for each "category".

Currently, the only way I know to do that is with several different queries:

1. SELECT DISTINCT category FROM subjects;

2. Loop through results, then for each category:

while (list($category) = mysql_fetch_row($sth) {
$sth_list = sprintf("SELECT * FROM subjects WHERE category='%s' ORDER BY lastmodified DESC LIMIT 3",
mysqli_real_escape_string($dbh, $category));

...
}

But that would be a staggering 8 queries on the page... more, as new categories are added!

Can you guys suggest a way to do this in a single, faster query?
9:31 am on July 28, 2017 (gmt 0)

Senior Member

WebmasterWorld Senior Member topr8 is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Apr 19, 2002
posts:3356
votes: 39


maybe you could run a query that creates a temporary table populated by looping through each category and getting the 3 most recently modified id's, then select all the rows from that table ordered by category ... from php you'd only be making one query, although the database itself would be running several of course.

another strategy could be to have a table most_recent_3 which you initially populate with the most recent 3 subjects in each category, this table would literally only need to have one column id_ref because when calling it you would join it to the subjects table. then when ever someone updates a row in the subjects table or adds a new row, then also run a query that deletes the oldest of the 3 in that category from themost_recent-3 table and adds the new or updated id to the most_recent_3 table.

hope that makes sense.

it all depends how often people are updating or inserting new subjects and how oftenthey are just reading whats there.
12:37 pm on July 28, 2017 (gmt 0)

Senior Member

WebmasterWorld Senior Member topr8 is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Apr 19, 2002
posts:3356
votes: 39


i was just having a coffee and thought that union could be your friend here ...


(SELECT * FROM subjects WHERE category='category_1' ORDER BY lastmodified DESC LIMIT 3)
UNION
(SELECT * FROM subjects WHERE category='category_2' ORDER BY lastmodified DESC LIMIT 3)
UNION
(SELECT * FROM subjects WHERE category='category_3' ORDER BY lastmodified DESC LIMIT 3);

and so on with a select for each of your categories


personally i still like the seperate table idea as it is similar to caching the result in a way, so that it is a very quick simple query for any user just viewing the listing.
7:37 pm on July 28, 2017 (gmt 0)

Preferred Member

Top Contributors Of The Month

joined:Mar 15, 2013
posts: 571
votes: 47


I totally understand your idea with the dedicated table, and it's not a bad one... it's not much different than how I created the "subjects" table in the first place, instead of reading "posts" to get the subject data.

I just now tested the UNION idea, though, and a table with 200,000 subjects gave me the results I needed in 0.0018s! So that might be the winner; faster to implement, fast to process, and clean code :-) Thanks for the suggestion!