Forum Moderators: open
I'm looking to get a set of numerical data representing the number of times a field appears in a DB table (MySQL), based on the top 10 in terms of quantity.
Example NAME field in the table might look like this:-
Bob
Adam
James
Bob
James
Andrew
Cecil
etc
The data I would need to get back would be:-
Bob - 2
James - 2
Adam - 1
Andrew - 1
Cecil - 1
.... order by most appearing first, then in descending order, limit of 10.
But I don't know what the names are beforehand, so I can't just count the number of times "james" appears, then the number of times "bob" appears etc.
Is it possible to do this in one or two queries?
Thanks!
TJ
One further addition (if this is possible) - the names are actually user-agents, and each has a version number which sometimes differs, eg:-
Adam 2.1.3
Adam 2.1.4
Adam 2.0.1.5
etc
But basically as far as my script is concerned, I'm only interested in whether or not it's "Adam".
Is it possible to implement a REGEX or "LIKE" into the query to ensure that duplicates of "Adam" with different version numbers are not returned separately, but all rolled into the one result?
Thanks again,
TJ
SELECT
name,
COUNT(name) AS occurs
FROM
(SELECT
CASE
WHEN POSITION(' ' IN name) > 0
THEN SUBSTRING(name FROM 1 FOR (POSITION(' ' IN name)) - 1)
ELSE name
END AS name
FROM names) AS t1
GROUP BY name
ORDER BY occurs DESC, name
LIMIT 10
;
Hmmm... problem is I have one in between:-
Adam/sgr-3.1.3(LS) 20
BillyBob
Adam/sgr-2.0.12(LS) 15
This might prove tricky.... I suppose I could always grab a set consisting of more records than I need, then parse them down?
We are going to SELECT the name from your database table but rather than select the entire name we just want the actual *name* part of it, not the numeric string. When you first showed the names it looked as if there was always a space in between the actual *name* and the *user-agent* part of it, the numbers.
Adam 2.1.3
1234567890
Next, we are using that value to extract just the name portion. SUBSTRING is taking just a piece of the name column and returning it for us. Remember that POSITION - 1 returned 4, so SUBSTRING(name FROM 1 FOR 4) gives us 'Adam'. I used a CASE logic construct just in case you had any user agents that did not have any spaces or numbers after their name, just the plain old name. If we don't do that, we would get a 0 value returned instead of the name because the POSTION would not find any space in the name, returning zero!
So, knowing that you can either add to your WHEN structure or as you said before you could use alternatives. REGEX is an option but not as portable. Sounds like you are using MySQL here, true? If so we could just throw down a REGEX. To adapt the existing query and keep it compliant you could:
SELECT
name,
COUNT(name) AS occurs
FROM
(SELECT
CASE
WHEN POSITION(' ' IN name) > 0
THEN SUBSTRING(name FROM 1 FOR (POSITION(' ' IN name)) - 1)
WHEN POSITION('/' IN name) > 0
THEN SUBSTRING(name FROM 1 FOR (POSITION('/' IN name)) - 1)
ELSE name
END AS name
FROM names) AS t1
GROUP BY name
ORDER BY occurs DESC, name
LIMIT 10
;