Forum Moderators: coopster
I have used MySQL variables in queries like Paul DuBois' example of numbbering a result set.
Now I'd like to use them to keep track of the cumulative total of words with a given frequency in a text.
SELECT DISTINCT freq, COUNT(*) FROM words GROUP BY freq ORDER BY freq DESC
gives me a a list of frequencies and the number of words with that frequency.
What I want is a running total of the number of words looked at so far.
I tried
SET @total=0
SELECT DISTINCT freq, COUNT(*), @total=@total + COUNT(*) FROM words GROUP BY freq ORDER BY freq DESC
but each time the third column is only the current value of count.
How can I make it keep the running total?
TIA,
Bill
But since this is the PHP forum, it would be pretty easy to do what you need with it, just keep a running total in a PHP variable.
If using PHP isn't an option, it's still possible to do something like this with a subquery, but it might run (ahem) a little slow. It would look something like:
SELECT field1, count(*), (SELECT COUNT(*) FROM table_name AS s WHERE s.field1 < m.field1) FROM table_name AS m GROUP BY field1 ORDER BY field1
1. >> "You're going at this as if SQL were a procedural language (which it isn't). The second row can't be thought of as happening "after" the first."<<
>> On the other hand, that is the way it works in DuBois' example:
SET @n := 0;
SELECT @n := @n+1 AS rownum, thing, arms, legs FROM limbs;
In this example, @n is incremented with each row.
The problem seems to be with the aggregate functions like COUNT() and SUM(), not with field values, which can also be added to a variable. For example, in my code (error: it should have been @total:=)
SET @total=0
SELECT DISTINCT freq, COUNT(*), @total:=@total + freq FROM words GROUP BY freq ORDER BY freq DESC
the 3rd column is the cumulative frequency count.
2. >>But since this is the PHP forum, it would be pretty easy to do what you need with it, just keep a running total in a PHP variable. <<
>>> It can be done with PHP (which is how I am doing it now, but have had overflow problems with doing certain calculations which MySQL's double precision reals would avoid) and find an on the fly MySQL solution more elegant.
Does anyone with experience with MySQL variables have another suggestion?
Bill
Basically, since you just want a running count at any given point, and you are already grouping by freq, then all you need is a line counter. Unless I misunderstand what you are looking for, this should work:
SET @total=0;
SELECT freq, COUNT(*), @total:=@total+1 AS freqcount FROM words GROUP BY freq ORDER BY freq DESC;
Unfortunately that doesn't do it -- it just increments by 1 each time. Say there are 10 items with frequency 5, 6 with freq 4, 21 with freq 3 etc. I'd want the following results:
Frequency ¦ Count ¦ Cumulative Count
5 ¦ 10 ¦ 10
4 ¦ 6 ¦ 16
3 ¦ 21 ¦ 37
I do think the problem is the inability to use function return values as you can column values.
Bill
SELECT freq, COUNT(*) AS count, (SELECT COUNT(*) FROM words AS s WHERE s.freq >= m.freq) AS cumulative FROM words AS m GROUP BY freq, cumulative ORDER BY freq DESC;
Also, subqueries were made available in MySQL version >= 4.1. If your server is not at this version, you could go with a temporary table and accumulate the sum as you go:
--
-- Temporary table:
--
--
CREATE TEMPORARY TABLE tmp(
frequency SMALLINT UNSIGNED,
count SMALLINT UNSIGNED
);
--
INSERT INTO tmp
SELECT
freq,
COUNT(*) AS count
FROM words
GROUP BY freq
ORDER BY freq DESC
;
--
-- SELECT * FROM tmp;
--
SET @cumulative = 0;
--
SELECT frequency, count, @cumulative:=@cumulative + count AS cumulative FROM tmp;
--
DROP TABLE tmp;