Welcome to WebmasterWorld Guest from 54.226.209.162

Forum Moderators: open

Message Too Old, No Replies

How to group by one field and order by another?

"group by" is overriding my "order by"

     
5:30 pm on Oct 3, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Feb 12, 2003
posts: 1199
votes: 0


I'm trying to pull the most recent posts out of a table, but have each topic represented only once.

This gets the posts, but takes more than one from each topic:
SELECT topic_id,posted,poster,message FROM posts ORDER BY posted DESC LIMIT 20


Adding a GROUP BY makes sure each thread is represented only once, but then I no longer get the most recent post from each topic:
SELECT topic_id,posted,poster,message FROM posts GROUP BY topic_id ORDER BY posted DESC LIMIT 20


I tried to put "ORDER BY" before "GROUP BY", but that just gives me a syntax error:
SELECT topic_id,posted,poster,message FROM posts ORDER BY posted DESC LIMIT 20[ GROUP BY topic_id


How should I write my query?
4:15 pm on Oct 18, 2012 (gmt 0)

Junior Member

10+ Year Member

joined:Apr 16, 2005
posts: 113
votes: 0


Google for "sql top 1 in each group" - there should be loads of articles - I'd post a link but not sure if I am allowed to?
4:49 pm on Oct 18, 2012 (gmt 0)

Moderator from US 

WebmasterWorld Administrator lifeinasia is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Dec 10, 2005
posts:5589
votes: 27


You can do it in a 2-step process:
1) Find the max posted date for each topic-
SELECT topic_id, MAX(posted) AS LastPost
FROM posts
GROUP BY topic_id

2) Match the posts by topic_id and LastPost

The following should work on some systems:
SELECT p.topic_id,p.posted,p.poster,p.message
FROM posts p INNER JOIN (SELECT topic_id, MAX(posted) AS LastPost FROM posts GROUP BY topic_id) m ON (p.topic_id=m.topic_id AND p.posted=m.LastPost)
5:56 am on Oct 19, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Feb 12, 2003
posts: 1199
votes: 0


Thanks for the responses. In the meantime, I figured out another way to do it:
SELECT topic_id,posted,poster,message FROM posts temp1 WHERE
posted=(SELECT MAX(temp2.posted) FROM posts temp2 WHERE temp1.topic_id = temp2.topic_id)
AND poster in (SELECT poster FROM posts)
ORDER BY posted DESC LIMIT 20
3:53 pm on Oct 19, 2012 (gmt 0)

Moderator from US 

WebmasterWorld Administrator lifeinasia is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Dec 10, 2005
posts:5589
votes: 27


A few problems with that method...

You're making 3 calls to the posts table:
1) SELECT poster FROM posts (which has to scan the entire table and keep the results in memory)
2) SELECT MAX(temp2.posted) FROM posts
3) SELECT topic_id,posted,poster,message FROM posts temp1

I see #1 as a complete waste because it pulls everything (including duplicate poster values). Plus it doesn't do anything to match the actual poster with the posts identified in #2.

In fact, the "AND poster in (SELECT poster FROM posts)" clause is basically the same as saying "AND 1=1" in this case.

Try your query with and without "AND poster in (SELECT poster FROM posts)" and I suspect you'll get the same results (except that it should run faster without it).

Otherwise, the logic is basically the same as what I posted- not sure which is more efficient. (And depending on your DBMS, it may rewrite both into a more efficient 3rd version under the hood. :) )
11:45 am on Oct 21, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Feb 12, 2003
posts: 1199
votes: 0


Thanks again for the help. I left something out of my initial post to keep it simple, but I guess I should be more specific: I also need to pull only posts where the poster's very first post to the whole forum was made more than 24 hours ago. This ensures that I don't pull any spam posts which I haven't yet reviewed. So, the AND isn't actually completely useless. Here's my actual code, which uses a Perl variable for the time:
SELECT topic_id,posted,poster,message FROM posts temp1 WHERE
posted=(SELECT MAX(temp2.posted) FROM posts temp2 WHERE temp1.topic_id = temp2.topic_id)
AND poster in (SELECT poster FROM posts WHERE posted < $twentyFourHoursAgo)
ORDER BY posted DESC LIMIT 20

With that in mind, does my query look okay to you now?
 

Join The Conversation

Moderators and Top Contributors

Hot Threads This Week

Featured Threads

Free SEO Tools

Hire Expert Members