Forum Moderators: open

Message Too Old, No Replies

Find Rank Based on AVG()?

         

Tusserte

12:36 pm on Sep 3, 2008 (gmt 0)

10+ Year Member



Hey everyone, I currently have a MySQL database containing user-submitted scores that are given to items. I would like to do two things: first, I would like to build a SELECT statement that orders the items based on the averages of their scores. Second, I would like to create some sort of SELECT statement that finds the rank of a certain item based on the list that is sorted by averages. I have noticed that "SELECT * FROM reviews WHERE AVG(score)>'".average_for_this_item()."' GROUP BY item" does not work, and I kind of understand why (although I don't know how to fix it). I've looked into the subject and found that the OVER() and PARTITION BY statements don't seem to work, although I believe they are exactly what I need. Can anyone help me? Thanks in advance!

[edited by: Tusserte at 12:56 pm (utc) on Sep. 3, 2008]

ZydoSEO

7:40 pm on Sep 3, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'm not 100% sure about MySQL but most SQL DBs only allow the following in the list of values to be returned by a SELECT with a GROUP BY clause:

1) values that appear in the GROUP BY and
2) Aggregate functions like AVG(), SUM(), COUNT(), etc.

You were only using 'item' in the GROUP BY but you were returning all fields in the table (*) in the SELECT. Thus the error.

SELECT item, AVG(score)
FROM reviews
GROUP BY item
ORDER BY AVG(score) DESC

should get you a list of items ordered by their average score descending assuming 'item' uniquely identifies a particular thing being scored.

I'm guessing you have a separate 'item' table where the column 'item' is the primary key and that reviews.item is just a foreign key reference back to item.item. If that is the case and the item table had an item_name that you wanted to display you could do something like:

SELECT r.Item, i.ItemName, AVG(r.Score)
FROM review AS r
JOIN item AS i ON r.item = i.item
GROUP BY r.Item, i.ItemName
ORDER BY AVG(r.Score) DESC

NOTE: Because I wanted to return ItemName, it had to also appear in the GROUP BY.

As far as getting the ranking, I'm guessing you want to know which item has, say, the 3rd highest score or the 19th highest score.

The only way I could think to do that would be using another table. Not sure how often you would re-calculate the averages. If it's every time a user wants to query it then this approach is probably not good. But if you do it once per day or once per week, you could store the rankings and average scores in a table and have your web simply pull back the values from the table. In that case you could create a table something like:

CREATE TABLE tItemAverageScore (
Rank INT NOT NULL AUTO_INCREMENT,
Item INT,
AverageScore DECIMAL(2)
)

Schedule a stored procedure to run once per day or week that does something like:

DROP TABLE tItemAverageScore

CREATE TABLE tItemAverageScore (
Rank INT NOT NULL AUTO_INCREMENT,
Item INT,
AverageScore DECIMAL(2)
)

INSERT INTO tItemAverageScore (Item, AverageScore)
SELECT item, AVG(score)
FROM review
GROUP BY item
ORDER BY AVG(score) DESC

This would drop and recreate the table when the proc ran and then repopulate it based on results in your review table. It assumes you want the highest AVG(score) to rank as #1, 2nd highest AVG(score) to rank as #2, etc. I'm guessing auto increments always start at 1 in MySQL (sorry I'm MS SQL*Server / Oracle experienced).

Then your web app could do something similar to the following if you wanted to show the top 10 items in order of ranking:

SELECT i.*, ias.Rank, ias.AverageScore
FROM Item AS i
JOIN tItemAverageScore AS ias ON i.item = ias.Item
WHERE ias.Rank <= 11
ORDER BY ias.Rank

Hope that helps or at least gets your wheels turning...

DISCLAIMER: I haven't checked any of the above syntax. Just keyed it in on the fly. I don't have a copy of MySQL to play with. Sorry.

[edited by: ZydoSEO at 7:42 pm (utc) on Sep. 3, 2008]

Tusserte

9:37 pm on Sep 3, 2008 (gmt 0)

10+ Year Member



Thanks for replying. You guessed my intentions correctly. I understood the first part, but what I was really interested in was the "get rank" function. I was afraid that it would come to something like what you recommended. Do you believe that this is the only solution? I was hoping that there might be some sort of subquery that would get the job done without having to create a new table or do anything special. If no one else knows, however, I will resort to doing what you recommended, as it really seems like the only thing that would work at this point...