Welcome to WebmasterWorld Guest from 54.167.5.15

Forum Moderators: open

Message Too Old, No Replies

mysql query help

     

enotalone

8:43 pm on Mar 3, 2010 (gmt 0)

10+ Year Member



Hello, my mysql knowledge is limited and after doing some reading, tests I am unable to find any solution and would appreciate any help. The task is to find products that with specific tags.

The main products table

CREATE TABLE IF NOT EXISTS `ds_item` (
`iid` int(11) NOT NULL auto_increment,
`status` int(5) NOT NULL default '0',
`title` varchar(255) NOT NULL default '',
`date` int(11) NOT NULL default '0',
`dupdate` int(11) NOT NULL default '0',
`sale` smallint(2) NOT NULL default '0',
`price_list` varchar(50) NOT NULL default '0.00',
`price_sale` varchar(50) NOT NULL default '0.00',
`price_sale_avg` decimal(15,2) NOT NULL default '0.00',
`price_save` decimal(15,2) NOT NULL default '0.00',
`store_id` int(11) NOT NULL default '0',
`pid` varchar(110) NOT NULL,
`sku` varchar(100) NOT NULL,
`upc` varchar(100) NOT NULL,
`url` text NOT NULL,
`description` text NOT NULL,
PRIMARY KEY (`iid`),
KEY `store_id` (`store_id`),
KEY `status` (`status`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=2124683 ;


tags' table

CREATE TABLE IF NOT EXISTS `ds_item_cat` (
`rel_id` int(11) NOT NULL auto_increment,
`iid` int(11) NOT NULL default '0',
`cat` int(11) NOT NULL default '0',
PRIMARY KEY (`rel_id`),
KEY `iid_cat` (`iid`,`cat`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=6127161 ;


stores' table

CREATE TABLE IF NOT EXISTS `ds_store` (
`store_id` int(11) NOT NULL auto_increment,
`store_name` varchar(255) NOT NULL default '',
`store_status` smallint(3) NOT NULL default '0',
`currency` char(3) NOT NULL default 'usd',
`note` varchar(255) NOT NULL,
PRIMARY KEY (`store_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=539 ;


now a sample query to get Women's Black Steve Madden Sandals

SELECT ds_item.iid, title, sale, price_list, price_sale, price_sale_avg, ds_item.store_id, description, store_name, currency FROM ds_item
INNER JOIN ds_item_cat AS c1 ON ds_item.iid=c1.iid
INNER JOIN ds_item_cat AS gender ON ds_item.iid=gender.iid AND gender.cat='4118'
INNER JOIN ds_item_cat AS brand ON ds_item.iid=brand.iid AND brand.cat='715'
INNER JOIN ds_item_cat AS color ON ds_item.iid=color.iid AND color.cat='4127'
LEFT JOIN ds_store ON (ds_item.store_id=ds_store.store_id)
WHERE status='0' AND c1.cat IN ('4160')
ORDER BY sale DESC, ds_item.dupdate DESC
LIMIT 0, 10

The problem: when I run explain on this query it shows that about 93947 rows need to be scanned. Any way to avoid this by optimizing the query, tables? Thank you.


the results of explain

[img218.imageshack.us...]

syber

3:05 pm on Mar 4, 2010 (gmt 0)

10+ Year Member



Is there any reason that you need to JOIN ds_item_cat 5 times? Also doing a LEFT JOIN at the end is dragging everything back, even if it doesn't match.

Couldn't you just say:

WHERE status='0' AND c1.cat = '4160'

to find what you are looking for?

enotalone

4:33 pm on Mar 4, 2010 (gmt 0)

10+ Year Member



Thanks syber, that many joins are needed to pull products that are color: white, gender:women, brand:Steve Madden and category:Sandals.

with WHERE status='0' AND c1.cat = '4160' it would show sandals (cat='4160') that are in stock (status='0').

The last left join is simply use to connect store information where they are sold to products table on a common key (store_id). I will try the query without joining ds_store, but I have a feeling it is not going to change much.

enotalone

4:36 pm on Mar 4, 2010 (gmt 0)

10+ Year Member



Just tried without the last join, no difference.

explain SELECT ds_item.iid, title, sale, price_list, price_sale, price_sale_avg, ds_item.store_id, description FROM ds_item
INNER JOIN ds_item_cat AS c1 ON ds_item.iid=c1.iid
INNER JOIN ds_item_cat AS gender ON ds_item.iid=gender.iid AND gender.cat='4118'
INNER JOIN ds_item_cat AS brand ON ds_item.iid=brand.iid AND brand.cat='715'
INNER JOIN ds_item_cat AS color ON ds_item.iid=color.iid AND color.cat='4127'
WHERE status='0' AND c1.cat IN ('4160')
ORDER BY sale DESC, ds_item.dupdate DESC
LIMIT 0, 10

explain shows 152341 rows being scanned.

LifeinAsia

4:48 pm on Mar 4, 2010 (gmt 0)

WebmasterWorld Administrator lifeinasia is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



It *might* be more efficient if you combine the ds_item_cat JOINs with something like (may need to do some tweaking):
SELECT ds_item.iid, title, sale, price_list, price_sale, price_sale_avg, ds_item.store_id, description, store_name, currency
FROM ds_item INNER JOIN (SELECT ds_item.iid FROM ds_item_cat AS gender ON ds_item.iid=gender.iid AND gender.cat='4118'
INNER JOIN ds_item_cat AS brand ON ds_item.iid=brand.iid AND brand.cat='715'
INNER JOIN ds_item_cat AS color ON ds_item.iid=color.iid AND color.cat='4127') AS c1 ON ds_item.iid=c1.iid
LEFT JOIN ds_store ON (ds_item.store_id=ds_store.store_id)
WHERE status='0' AND c1.cat IN ('4160')
ORDER BY sale DESC, ds_item.dupdate DESC

You'll have to try and see if that helps any.

syber

4:53 pm on Mar 4, 2010 (gmt 0)

10+ Year Member



I still don't understand why you need to do so many joins of the same table. Can't you filter on gender, brand and color just using a where clause?

LifeinAsia

6:56 pm on Mar 4, 2010 (gmt 0)

WebmasterWorld Administrator lifeinasia is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



Those are all different rows in the same table. How do you propose to write the WHERE statement?

LifeinAsia

7:02 pm on Mar 4, 2010 (gmt 0)

WebmasterWorld Administrator lifeinasia is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



Actually, a more efficient way might be something like:
SELECT ds_item.iid, title, sale, price_list, price_sale, price_sale_avg, ds_item.store_id, description, store_name, currency
FROM ds_item INNER JOIN (SELECT iid
FROM ds_item_cat
WHERE cat IN ('4118','715','4127')
GROUP BY iid
HAVING COUNT(*)=3) AS c1 ON ds_item.iid=c1.iid
LEFT JOIN ds_store ON (ds_item.store_id=ds_store.store_id)
WHERE status='0' AND c1.cat IN ('4160')
ORDER BY sale DESC, ds_item.dupdate DESC

enotalone

8:35 pm on Mar 4, 2010 (gmt 0)

10+ Year Member



Thank you LifeinAsia, I tried the sql in your last post, it was saying unknown column c1.cat so I changed it around to:

explain SELECT ds_item.iid, title, sale, price_list, price_sale, price_sale_avg, ds_item.store_id, description, store_name, currency
FROM ds_item INNER JOIN (SELECT iid
FROM ds_item_cat
WHERE cat IN ('4118','715','4127', '4160')
GROUP BY iid
HAVING COUNT(*)=4) AS c1 ON ds_item.iid=c1.iid
LEFT JOIN ds_store ON (ds_item.store_id=ds_store.store_id)
WHERE status='0'
ORDER BY sale DESC, ds_item.dupdate DESC

But now according to explain ds_item_cat needs to scan 1761475 rows.

syber

9:42 pm on Mar 4, 2010 (gmt 0)

10+ Year Member



try correlated subqueries:


SELECT ds_item.iid, title, sale, price_list, price_sale, price_sale_avg, ds_item.store_id, description, store_name, currency
FROM ds_item INNER JOIN ds_item_cat AS c1 ON ds_item.iid=c1.iid
WHERE status='0' and cat = '4160' AND
c1.iid IN (SELECT iid FROM ds_item_cat
WHERE iid = c1.iid AND cat = '4118') AND
c1.iid IN (SELECT iid FROM ds_item_cat
WHERE iid = c1.iid AND cat = '715') AND
c1.iid IN (SELECT iid FROM ds_item_cat
WHERE iid = c1.iid AND cat = '4127')

LifeinAsia

9:43 pm on Mar 4, 2010 (gmt 0)

WebmasterWorld Administrator lifeinasia is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



See if switching the order helps:
explain SELECT ds_item.iid, title, sale, price_list, price_sale, price_sale_avg, ds_item.store_id, description, store_name, currency
FROM (SELECT iid
FROM ds_item_cat
WHERE cat IN ('4118','715','4127', '4160')
GROUP BY iid
HAVING COUNT(*)=4) AS c1 INNER JOIN ds_item ON ds_item.iid=c1.iid
LEFT JOIN ds_store ON (ds_item.store_id=ds_store.store_id)
WHERE status='0'
ORDER BY sale DESC, ds_item.dupdate DESC

It may be that with your table structure, your original query was the most efficient. How many rows do you have in the ds_item and ds_item_cat tables?

enotalone

9:59 pm on Mar 4, 2010 (gmt 0)

10+ Year Member



Thanks syber, the query works but the explain shows 152698 rows being scanned by ds_item.

Thanks LifeinAsia, works but did not change anything, still ds_item_cat scans 1761588 rows. at the moment ds_item has 480,007 rows, ds_item_cat 1,761,588 rows. I am starting to think like you said that there just might be no better way of doing this.

ds_item_cat has a key "KEY `iid_cat` (`iid`,`cat`)" (table structure in my first post). I am not sure if indexing iid and cat separately will improve anything.

syber

11:00 pm on Mar 4, 2010 (gmt 0)

10+ Year Member



one more query to try:


SELECT ds_item.iid, title, sale, price_list, price_sale, price_sale_avg, ds_item.store_id, description, store_name, currency
FROM ds_item INNER JOIN ds_item_cat AS c1 ON ds_item.iid=c1.iid
WHERE status='0' and cat = '4160' AND
EXISTS (SELECT * FROM ds_item_cat
WHERE iid = c1.iid AND cat = '4118') AND
EXISTS (SELECT * FROM ds_item_cat
WHERE iid = c1.iid AND cat = '715') AND
EXISTS (SELECT * FROM ds_item_cat
WHERE iid = c1.iid AND cat = '4127')

enotalone

11:29 pm on Mar 4, 2010 (gmt 0)

10+ Year Member



Thanks Syber, still the explain shows 152698 rows being scanned. I think either I have to change my keys, table structure somehow or get used to the idea that I have a query who enjoys scanning rows.
 

Featured Threads

Hot Threads This Week

Hot Threads This Month