MYSQL: How to make this Query of Sub-Queries A LOT more EFFICIENT

Forum Moderators: open

Message Too Old, No Replies

MYSQL: How to make this Query of Sub-Queries A LOT more EFFICIENT

NeedExpertHelp

7:41 pm on Jan 15, 2010 (gmt 0)

Hello everyone,

I have the following query of sub-queries:

-----------------------------------------
SELECT *
FROM table1
WHERE

(
SELECT COUNT( col1 )
FROM table2
WHERE table1.col1 = table2.col2
) = 0

AND

(
SELECT COUNT( col1 )
FROM table3
WHERE table1.col1 = table3.col2
) = 0

AND

(
SELECT COUNT( col1)
FROM table4
WHERE table1.col1 = table4.col2
) < 2

ORDER BY table1.col3 ASC
-----------------------------------------

How can I make it A LOT more EFFICIENT?

It's terribly slow right now, although it does what I want.

I'm thinking JOIN's and Intersections, but I don't know how and it's beyond my level of expertise so I need expert help.

Thanks!

Demaestro

8:37 pm on Jan 15, 2010 (gmt 0)

try this:

select
tb1.*,
count(tb2.col1) as tb2_count,
count(tb3.col1) as tb3_count,
count(tb4.col1) as tb4_count
from
table1 tb1, table2 tb2, table3 tb3, table4 tb4
where
tb1.col1 = tb2.col2
and tb1.col1 = tb3.col2
and tb1.col1 = tb4.col2
and tb2_count = 0
and tb3_count = 0
and tb4_count < 2
group by
tb2.col1,
tb3.col1,
tb4.col1
order by
tb1.col3 ASC

Another thing you can try playing around with is the "HAVING CLAUSE"

it would look something like:
group by
tb2.col1,
tb3.col1,
tb4.col1
HAVING COUNT(tb2.col1) = 0, COUNT(tb3.col1) = 0, COUNT(tb4.col1) < 2

NeedExpertHelp

9:44 pm on Jan 15, 2010 (gmt 0)

Thanks Demaestro.

Your query is giving me this error:

#1054 - Unknown column 'tb2_count' in 'where clause'

Any ideas?

syber

10:20 pm on Jan 15, 2010 (gmt 0)

I believe the following is the same logically and should be more efficient:

SELECT *
FROM table1
WHERE NOT EXISTS (SELECT * FROM table2 WHERE table1.col1 = col2)
AND NOT EXISTS (SELECT * FROM table3 WHERE table1.col1 = col3)
AND ( SELECT COUNT( *) FROM table4 WHERE table1.col1 = col2) > 2
ORDER BY col3 ASC

NeedExpertHelp

7:30 am on Jan 16, 2010 (gmt 0)

Hi Syber, thanks for your input.

Unfortunately, your query took roughly the same amount of time as my original query: 40 seconds.

It also had a different row count.

Any other ideas?

tangor

8:47 am on Jan 16, 2010 (gmt 0)

Is this a very massive database, or perhaps not indexed?

NeedExpertHelp

7:09 pm on Jan 16, 2010 (gmt 0)

Hi tangor, it's actually not that big at all. Each table has under 20,000 records.

What do you mean if it's indexed?

Also, would it be better if I just added a column to table3 (which is the biggest table) keeping track of col2's total so that then I can quickly check that column vs. doing a count? That would probably speed the query up a lot.

Any ideas?

NeedExpertHelp

2:35 pm on Jan 18, 2010 (gmt 0)

Hello, I hope everyone had a great weekend.

Just wanted to follow up on this.

Thanks!

Demaestro

3:46 pm on Jan 18, 2010 (gmt 0)

When you join tables by a field it is more efficient to create an INDEX on that field.

An index simply creates sort of a "quick guide" to that field. It makes joining and performing functions on them quicker.

Most DBs will index a field if you make it the primary key but it also helps to index other fields that get used often for joining or functions.

Regardless of what solution you use you should index some of these fields.

Did you try HAVING?

It would look something like this:

select
tb1.*
from
table1 tb1, table2 tb2, table3 tb3, table4 tb4
where
tb1.col1 = tb2.col2
and tb1.col1 = tb3.col2
and tb1.col1 = tb4.col2
group by
tb2.col1,
tb3.col1,
tb4.col1
having
count(tb2.col1) = 0
and
count(tb3.col1) = 0
and
count(tb4.col1) < 2
order by
tb1.col3 ASC

syber

3:56 pm on Jan 18, 2010 (gmt 0)

The problem with the solution above is, if the COUNT for tb2.col1 and tb3.col1 is 0 - how will you have any row returned from the joins of tb2 and tb3?

NeedExpertHelp

2:34 pm on Jan 20, 2010 (gmt 0)

So where do we go from here?

syber

2:24 am on Jan 21, 2010 (gmt 0)

I think you are left with adding a column to tb3 to keep track of col2's total. The only downside is you need to come up with a way to keep the value current.