Forum Moderators: coopster

Message Too Old, No Replies

Storing and selecting array in SQL (serialize & unserialize)

Looking for an efficient way to select element of an stored aray

         

tomda

11:20 am on Jul 4, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I have a picture gallery and every pic is tagged using array.
e.g. pic_array("blue","red","black","green")

METHOD I USE
************
I currently store my array in my db as a string (longtext), meaning that I implode pic_array and just insert the string.
So I have in my db "blue red black green".

If I want to do a search for pictures with red and black, I do:
SELECT pic_id,color FROM pic_gallery WHERE color like '%red%' OR color like '%black%';

It works great but as you may know the LIKE statement is not the best option as it is very slow (especially if I have thousands of pic).

I am looking for an efficient and fast way to output all pictures tagged "red" - something like in_array("red", $array) - in a SQL query and without using the LIKE statement.

I know that I could serialize my array and store it, but this doesn't help because even if MYSQL can store array it can't search for one element of the array, presumably because serialized arrays are store in a string format (varchar() or blob/text).

bennymack

1:01 pm on Jul 4, 2005 (gmt 0)

10+ Year Member



Hello, my inclination would be to set up a separate table to hold the different colors and then select the intersection of the two tables.

E.G.
CREATE TABLE pic_gallery (
pic_id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
INDEX (pic_id),
pic BLOB /* if you wanted to store the pic in the DB/*
)TYPE=INNODB;

CREATE TABLE pic_colors (
pic_id INT UNSIGNED NOT NULL FOREIGN KEY REFERENCES pic_gallery(pic_id) ON DELETE CASCADE,
ENUM color ('red','green','blue',...) NOT NULL,
INDEX (pic_id),
INDEX (color)
)TYPE=INNODB;

Then to select all pics with red:
SELECT pic_gallery.pic_id,pic_gallery.pic
FROM pic_gallery,pic_colors
WHERE pic_gallery.pic_id=pic_colors.pid_id
AND pic_colors.color='red'; /* color=0 would work as well since enums are also enumerated arrays */

So if there were 5 pics that had a row in pic_colors that equaled 'red', you would now have them in your result set.

tomda

2:35 pm on Jul 4, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hi Bennymack,

Not sure you understood me clearly. Nevertheless, your suggestion is approximately what I currently have.

1/ A database of keywords (blue, black, green, red, sky, tree, grass...)
2/ A database of pictures

When someone submit a pic through my GD Gallery, I tag the picture using checkbox[].

So one picture can have many keywords or their respective id ("blue sky green grass" or "1 5 3 6" for example). These keywords/id are inserted in my picture_db (in tags field).

If I want to do a search for pictures with green and grass, I do:
SELECT pic_id FROM picture_db WHERE tags like '%green%' OR color like '%grass%';

************************
I know, as you suggested it and if I got it right, I could do it the otherway round, that is store an array of pic_id for every keywords.
green -> 1 3 5
grass -> 2 3 4

So that the picture No3 is the only one showing "green grass"

********************
Thanks. I think that I will stick to the LIKE statement for the moment. It is a pity that there is not something similar to in_array() in MYSQL queries.

bennymack

3:02 pm on Jul 4, 2005 (gmt 0)

10+ Year Member



Here's some light reading for you.

[webopedia.com...]

[en.wikipedia.org...]


Not sure you understood me clearly. Nevertheless, your suggestion is approximately what I currently have.

1/ A database of keywords (blue, black, green, red, sky, tree, grass...)
2/ A database of pictures

In your database of keywords, you're storing them multiple times for each picture. In my suggestion, the 'keywords' are split up, one per row so you'd end up with pic_colors:
pic_id, color
1, purple
1, hazy
1, eye
2, green
3, yellow
so on,

So now there's several clever ways to select the info out of the database. Storing all the keywords in one string in the DB is not exactly optimal. If you use a MyISAM you could index the field or do a fulltext on it. But with my method you're really just selecting integer values.. Much quicker IMHO.


When someone submit a pic through my GD Gallery, I tag the picture using checkbox[].

So one picture can have many keywords or their respective id ("blue sky green grass" or "1 5 3 6" for example). These keywords/id are inserted in my picture_db (in tags field).


The latter seems a little bit better but why store 4 different keywors/id's in one row? have four rows in the pic_color table for the one image to store its description.


If I want to do a search for pictures with green and grass, I do:
SELECT pic_id FROM picture_db WHERE tags like '%green%' OR color like '%grass%';

************************
I know, as you suggested it and if I got it right, I could do it the otherway round, that is store an array of pic_id for every keywords.
green -> 1 3 5
grass -> 2 3 4

So that the picture No3 is the only one showing "green grass"


I fail to see where your two tables are being used in the above query. How about:
SELECT picture_db.pic_id, picture_db.pic
FROM picture_db,pic_keywords
WHERE picture_db.pic_id=pic_keywords.pic_id
AND pic_keywords.tag='red' /* or the index of 'red' */
OR pic_keywords.tag='fuzzy';


It is a pity that there is not something similar to in_array() in MYSQL queries.

You could, for example, update the ENUM every time a new color/description is added and that will store all the colors. Then select that into an array in you PHP and then simply us the array in conjunction with your selects.

Sorry for the lengthy post. I'm at work on the 4th and it's completely dead so I have a lot of spare time.

tomda

8:54 am on Jul 5, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



No problem for the long post...
Hope you have some spare time today :)

I am aware of normalization table (at least 1NF) but because I am still learning I am difficulties to make more advance normalized tables and queries with multiple tables.

Before closing the topic, you suggest me to do the following:

TABLE KEYWORD - which I have
keyword_id / keywords
1 / green
2 / grass
3 / blue
4 / sky

TABLE PICTURE - which I have also
pic_id / pic_name / width / height / etc...
1 / picture1.jpg / 240 / 160 / ...
2 / picture2.jpg / 240 / 160 / ...

And a third table as follows
e.g. if picture1 is tagged "green grass blue"

THIRD TABLE
pic_id / keyword
1 / 1 (that is green)
1 / 2 (that is grass)
1 / 3 (that is blue)

Am I wrong?
Again, thank you

bennymack

4:20 pm on Jul 5, 2005 (gmt 0)

10+ Year Member



The third table is unnecessary. Just take this table:

TABLE KEYWORD - which I have
keyword_id / keywords
1 / green
2 / grass
3 / blue
4 / sky

And have multiple keyword rows per pic:


1 / green
1 / grass
2 / blue
2 / eye
3 / blue
3 / beer

So pic_id:1 is "green" and "grass"
pic_id:2 is "blue" and "eye"
pic_id:3 is "blue" and "beer"
etc...

Still slow at work. This is my last day here. I start a new perl programming job a week from today. I don't want to say where but it's an Internet advertising company and it's been "reviewed" on WebmasterWorld before! Can't wait!

Sarah Atkinson

4:38 pm on Jul 5, 2005 (gmt 0)

10+ Year Member



I disagree with Benny I think having 3 tables is better

THIRD TABLE
pic_id / keyword
1 / 1 (that is green)
1 / 2 (that is grass)
1 / 3 (that is blue)

This way you can have a list of all of your key words in one list. It will make it easier to add keywords and delete them. then you will just use joins to get your data

Also to me this just seems neater and more organized. But I supose that could be a matter of personal opinion.

[edited by: jatar_k at 5:02 pm (utc) on July 5, 2005]
[edit reason] removed url [/edit]