Forum Moderators: coopster

Message Too Old, No Replies

Boggled mind 2

complex query or a bunch of simple ones?

         

slade7

10:05 pm on Nov 21, 2003 (gmt 0)

10+ Year Member



I have a mysql table with 14 columns and over 25,000 rows

I need to select a few of these columns and do some statistical analysis on them like this:

codename¦¦name¦¦owner¦¦figure1¦¦figure2¦¦limiter¦¦identifier

In my table, each row is an event involving two parties. one party is irrelevant, and the other party is codename+owner (both of which are varchar and may appear far more than once in the table. The identifier is actually just a string containing the codename and the owner I use for working with the data in excel.

What I need to do is SELECT COUNT for every row in which the limiter (a float) is greater than 0, and it needs to be grouped by identifier (which has the same effect as GROUP BY codename, owner) and I need the AVG of all the values in figure1 and figure2 combined for each.. but averaged by the total number of values, not rows (which is really the COUNT *2 for each grouped set, because if the limiter is > 0 every row will have a value for fig1 and fig2) Then I want to print the top 20 averages onto a page like this ordered by avg DESC:

codename name¦¦owner¦¦COUNT¦¦AVG

Right now I do this by
SELECT codename, name, owner, COUNT(*)
WHERE limiter > '0'
GROUP BY codename, owner //identifier would work here too

then I loop through all of these w/ PHP applying other simple queries to each to obtain the average figure, and dump the end result into excel, re-sort, delete unwanted and dump that into another table and SELECT from there to display on web page. Is there a select that would do all that better? The number of rows returned by my query above is about 1700-1800 usually. I want the top 20 Averages

My apologies if this is too long or inappropriate. Brain is fried.

thawebmaster

2:34 pm on Nov 22, 2003 (gmt 0)

10+ Year Member



You can use the LIMIT keyword.

SELECT codename, name, owner, COUNT(*)
WHERE limiter > '0'
GROUP BY codename, owner
LIMIT 0, 20

Now you only get the first 20 items from the DB.

slade7

6:41 pm on Nov 22, 2003 (gmt 0)

10+ Year Member



I could, but it would not give me the 20 that I want, which are the TOP 20 averages. The average being a figure that is calculated on the fly, I don't know how to overcome this without trying different ways of splitting up the table.

I want to save myself the time of having to compile the averages manually and display from a compiled table.

I have a testing server set up and am looking at ways of splitting up my data to achieve this, but its a difficult problem. Most of the readily available advice out there deals with operational databases and there's not much I've found about OLAP.

I guess a similar situation would be if you had to calculate batting averages from a table of many, many at bats - each row being one at bat.

slade7

5:56 pm on Nov 26, 2003 (gmt 0)

10+ Year Member



I got most of the way with this:

SELECT codename, name, owner,
AVG(fig1 + fig2) AS daavg,
COUNT(*) AS dacount
FROM table
WHERE limiter > '0'
GROUP BY identifier
HAVING COUNT(*) >=5
ORDER BY whatever

I'd like to limit the results to HAVING COUNT >=5 AND HAVING AVG(fig1 + fig2) > 'whatever' - but it doesn't seem to work that way