Welcome to WebmasterWorld Guest from 34.231.247.139

Forum Moderators: coopster & jatar k

Message Too Old, No Replies

MySQL ORDER BY with GROUP BY

     
9:43 pm on Aug 5, 2005 (gmt 0)

Full Member

10+ Year Member

joined:Nov 19, 2003
posts:291
votes: 0


Hi all,

I have a table that contains duplicate data.

I am trying to pull all of the most recent rows from the table

SELECT * FROM `data` ORDER BY 'timestamp' GROUP BY 'name'

This query produces SQL erros. How can I pull out the "latest version" of all the names in the table?

12:53 am on Aug 6, 2005 (gmt 0)

Administrator

WebmasterWorld Administrator coopster is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:July 31, 2003
posts:12555
votes: 3


SELECT name instead of "*":
SELECT name FROM `data` ORDER BY 'timestamp' GROUP BY 'name'

You are probably experiencing what is known as "The Single-Value Rule" [webmasterworld.com]. There is an explanation in the link provided.

8:19 am on Aug 6, 2005 (gmt 0)

Junior Member

10+ Year Member

joined:July 23, 2005
posts:44
votes: 0


I had this problem before, and people in this forum helped solve the problem.
Basically you can create an id which auto-increments, then with this query:

SELECT name FROM data ORDER BY id DESC;

This will query name from the highest id to lowest.

I've never tried timestamp. I guess using timestamp to extract information is a good idea too.

Opiston

11:55 pm on Aug 21, 2005 (gmt 0)

Full Member

10+ Year Member

joined:Nov 19, 2003
posts:291
votes: 0


Still running into errors


SELECT url, 1, 2, 3, .................., 13, timestamp FROM `data` ORDER BY 'timestamp' GROUP BY 'url'. You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'GROUP BY 'url'' at line 1

Apparently the group and sort functions don't play well togeather. I don't want to put all the celumns in the GROUP BY section if there is an easier way to do this.

Each url can have several rows. I want to SELECT the most recent row for each url. Is there a simple way to do this? Maybe some sort of sub query?

Thanks!

7:33 am on Aug 22, 2005 (gmt 0)

New User

joined:Feb 2, 2005
posts:24
votes: 0


First thing is that ORDER BY *HAS* to go after the GROUP BY. The common rule: GROUP BY ... HAVING ... ORDER BY ... . So ithink this should work: SELECT * FROM `data` GROUP BY 'name' ORDER BY 'timestamp'.
2:11 pm on Aug 22, 2005 (gmt 0)

Full Member

10+ Year Member

joined:Nov 19, 2003
posts:291
votes: 0


Ok, putting order by last got rid of the error! I've tried both ASC and DESC, but it just changes the order of the whole list. I keep getting the first entry for the GROUP BY column 'url'. How do I get it to SELECT the last row for each 'url' instead of the first?

Thanks for your help!

2:45 pm on Aug 22, 2005 (gmt 0)

New User

joined:Feb 2, 2005
posts:24
votes: 0


That happens because the DB engine GROUPS table by URL first and then sorts the results.

Try:
SELECT url, MAX(timestamp) AS max, MIN(timestamp) AS min FROM data GROUP BY url ORDER BY timestamp

7:59 pm on Aug 22, 2005 (gmt 0)

Full Member

10+ Year Member

joined:Nov 19, 2003
posts:291
votes: 0


SELECT url, MAX(timestamp) AS max, MIN(timestamp) AS min FROM data GROUP BY url ORDER BY timestamp

This still just seems to change the order of the results in whole.

Is there some other way I can wirte the query? Maybe without GROUP By and ORDER BY? Maybe some sort of virtual table that I can ORDER BY temistamp ASC, and then doing another SELECT with a GROUP BY?

8:54 am on Aug 23, 2005 (gmt 0)

New User

10+ Year Member

joined:Jan 10, 2005
posts:7
votes: 0


Try

SELECT url, timestamp, ... FROM data ORDER BY 1, 2 DESC
10:07 am on Aug 23, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Apr 9, 2005
posts:1509
votes: 0


I have a table that contains duplicate data.

I am trying to pull all of the most recent rows from the table

Not sure why you need the aggregate data ability of group by when you are really looking for the same information in a number of rows, not a sum total of that information? Maybe I am not understanding...

Here are some ideas that have not been suggested:

"SELECT * FROM data WHERE col='duplicate info' ORDER BY timestamp DESC";

OR to get distinct associated url's from the table

"SELECT DISTINCT(url) FROM data WHERE col='duplicate info' ORDER BY timestamp DESC";

OR MAYBE

X=Number of duplicate records
SELECT *,MAX(timestamp) AS maxtime GROUP BY url ORDER BY timestamp DESC LIMIT X;

Hope this helps.

Justin

11:20 am on Aug 23, 2005 (gmt 0)

New User

joined:Feb 2, 2005
posts:24
votes: 0


Well.. could you please explain what you want to select then? Table structure, example of data, example of duplicate items and example of correct results would help (-;
3:25 pm on Aug 23, 2005 (gmt 0)

Full Member

10+ Year Member

joined:Nov 19, 2003
posts:291
votes: 0


Nope, those don't work either.

Some more clarifications on the data set. The table consists of:
id, url, ...a bunch of stored data for that url..., timestamp

The user can store historicaly data about different urls in the table. so it may look something like this as time goes on:


id ¦ url ¦ ...stored data ... ¦ timestamp
1 ¦ www.mydomoain.com ¦ ...blah... ¦ 2005-08-04 13:03:12
2 ¦ www.webmasterworld.com ¦ ...blah... ¦ 2005-08-04 13:33:12
3 ¦ www.cnn.com ¦ ...blah... ¦ 2005-08-04 15:03:12
4 ¦ www.cnn.com ¦ ...blah... ¦ 2005-08-06 10:00:02
5 ¦ www.mydomoain.com ¦ ...blah... ¦ 2005-08-10 13:03:12
6 ¦ www.mydomoain.com ¦ ...blah... ¦ 2005-08-11 13:03:12
7 ¦ www.mydomoain.com ¦ ...blah... ¦ 2005-08-12 13:03:12
8 ¦ www.msn.com ¦ ...blah... ¦ 2005-08-20 13:13:12
9 ¦ www.mydomoain.com ¦ ...blah... ¦ 2005-08-31 13:03:12
...

The query I am trying to write needs to SELECT the last entry for a url (latest date or greatest id) relative to the other rows for the same url. (the last entry for mydomain.com for example). (This is why I was using GROUP BY, so that I would only get one row per url)

The query needs to return the most recent entry for each url.


id ¦ url ¦ ...stored data ... ¦ timestamp
2 ¦ www.webmasterworld.com ¦ ...blah... ¦ 2005-08-04 13:33:12
4 ¦ www.cnn.com ¦ ...blah... ¦ 2005-08-06 10:00:02
8 ¦ www.msn.com ¦ ...blah... ¦ 2005-08-20 13:13:12
9 ¦ www.mydomoain.com ¦ ...blah... ¦ 2005-08-31 13:03:12

Does this give a more clear picture of the query I am trying to create?

I was thinking perhaps using a temporary table to hold the contents of 'data' ORDER BY timestamp DESC, then do a query on the temp table grouping by url. Not sure if this will create a big performance hit on my server though?

If there is a simpler/more efficient way to do this - I would love to know.

Thanks so much for all of your contributions!

7:17 am on Aug 24, 2005 (gmt 0)

New User

joined:Feb 2, 2005
posts:24
votes: 0


Okay.

This should work if your ID field really autoincrementing:

SELECT data.* FROM data INNER JOIN (SELECT MAX(id) AS id FROM data GROUP BY url) ids ON data.id = ids.id

7:24 pm on Aug 25, 2005 (gmt 0)

Full Member

10+ Year Member

joined:Nov 19, 2003
posts:291
votes: 0


That seemed to have done the trick! Now I just have to decipher exactly what it is doing so I can duplicate it!

Thanks!

 

Join The Conversation

Moderators and Top Contributors

Hot Threads This Week

Featured Threads

Free SEO Tools

Hire Expert Members