Forum Moderators: coopster

Message Too Old, No Replies

Making searchable "tags"

Using mysql to make searchable tags

         

nfs2

8:24 am on Mar 14, 2006 (gmt 0)

10+ Year Member



i run a blogging site, and i was thinking of making a searchable tagging system, where people can "tag" entries with certain keywords (i'll assume most or all of you know what tagging is).

The way i'd like it to work is one text box, and people would be able to enter their tags simply by seperating the words, for example, a blog post about a vacation would be tagged like below

vacation plane beach mexico

and each of those words would be seperatly inserted into a database to be searched later.

Would there be a way to insert seperated words like that into a database so they could be individually searched? Im thinking in mysql if all of them are in the same table, id have to select all or none of them...

Any ideas?

tomda

8:50 am on Mar 14, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Yeah possible!

There are two different manners.

1st - SIMPLE - 1 DB - ADD TAGS TO ONE FIELD AS ARRAY
****************************************************
Just add all the tags to one field so that you get
BLOG_ID / TAGS
12 / plane recreation mexico

Then use explode() and implode() to work with the array.
And SELECT * FROM db WHERE tags LIKE '%recreation%' to search.

2nd - BIT HARDER - 2DB - USE OF A KEYWORDS DATABASE
***************************************************
The problem with the first method is that you can't control the keywords typed by users - which can be a mess in the long term and oblige you to check all blogs' tags - and that you use the LIKE statement to search - which can be slow for heavy databases.

The other method is to have one database full of keywords. Note that you can also try to have parent/children relation (optional)
KEYWORD_ID / KEYWORDS_NAME / PARENT
8 / vacation /
9 / transport / 8
10 / car / 9
11 / plane / 9
12 / train / 9

Have radio buttons that user can tick to tags pictures. Then, climb the parent tree to include all keywords.

1st choice - Do like in the first example, that is include all keywords in one field of your blog database - same problem as above, you'll have to use "LIKE" statement to search.

2nd choise - Create a new database to link the two databases (keywords and blog) - much faster to search since you'll use "="
e.g. blog No 12 is tagged "transport car plane"
KEYWORDS_ID / BLOG_ID
9 / 12
10 / 12
11 / 12

My 2cents

henry0

1:27 pm on Mar 14, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It could be a risky proposition
If without control and authentication a form/box is offered to the general public.
Watch for SQL injection
Your concept has merits but you need to secure the process.

nfs2

8:33 pm on Mar 15, 2006 (gmt 0)

10+ Year Member



I was thinking about making a table called tags, with one column being the persons username, and the other column being the tag. So if i enter

vacation mexico plane fun

It would create 4 rows where username = username

My problem is how to i seperate the keywords that come from one text box? How do i enter them as seperate values rather then one single value (because they come from the same text box)

"It could be a risky proposition
If without control and authentication a form/box is offered to the general public.
Watch for SQL injection
Your concept has merits but you need to secure the process. "

I've always wondered about sql injection, people always mention it. I'll google more info (id like to research before asking) but if you have any suggestions it would be helpful :)

henry0

9:07 pm on Mar 15, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Nfs2,
Use our PHP forum G box (top)
we have discussed it many times, if I remember correctly you should find here some good info.

nfs2

9:36 pm on Mar 15, 2006 (gmt 0)

10+ Year Member



Thanks i'll do that, first i'll focus on making my script work, then making sure its safe.

Right now i have a table with 3 columns. The table is called "journal_tags" and the tables are called "journal_id" (the username of the member) "entry_id" (the unique value assigned to the entry) and of course "tag".

The way my script is working now, it just throws all 4 tags into one row with the journal_id and entry_id.

 $tags_form = ($_POST['tags']);
$table2 = 'journal_tags';
$sql = "INSERT INTO $table2 (journal_id, entry_id, tag) values ('$journal_id','$entry_id','$tags_form')";
mysql_query($sql) or die(mysql_error());

Now, i tried exploding it so it but it just inserts the value "Array"

$tags_form = ($_POST['tags']);
$tags = explode(" ", $tags_form);

Now, i could just do $tags[1] $tags[2] etc, but there could be any number of tags and i dont want to limit it. Is there a way?

Ok i really suck at explaining things clearly. The table has all 4 tags in one row. So the value is all 4 tags unseperated. Lets call them tag1 tag2 tag3 tag4. Now if i were searching to tag3, is there a way to pick it out when its in there with all the rest? because the way it is, nothing == tag3, it only equals tag1 tag2 tag 3 tag4.

I hope i explained it well

henry0

11:48 pm on Mar 15, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



First you should be sure that you properly enter that array in your DB.

I did not test it, but you may get a feeling fot it
also look at the second part we use here "unserialize" function instead of implode/explode
Play with this it should get oyu on your way!
<<<
$array;
// what will go in your DB

$str = implode("`/`", $array);
// saves the array as a string ; separe pieces with:/

//use that $str to load input in DB

//get result:

// using $str;

$array = explode("`/`", $str);
// splits $str into array elements and del `/`.

>>>

<<<
//using your DB

if($numrows>0)
{
while($str=mysql_fetch_array($sql))
{
$str=unserialize($str['str']);

foreach($str as $tags)
{
print $tags.', ';
}
}
}
else
{
print 'No search terms in database.';
}
>>>

<edit> I would (if it was me) limit the search terms to 4 or 5 and have a col per tag1, tag2, tag3 etc...
Do not forget testing input to avoid garbage
grab a list of bad words, dislallow html char etc.. etc..
</edit>

tomda

6:25 am on Mar 16, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Your concept has merits but you need to secure the process.

Always! I didn't mention it because I was just explaining the concept.

nfs2

3:11 pm on Mar 16, 2006 (gmt 0)

10+ Year Member



<edit> I would (if it was me) limit the search terms to 4 or 5 and have a col per tag1, tag2, tag3 etc...
Do not forget testing input to avoid garbage
grab a list of bad words, dislallow html char etc.. etc..
</edit>

Id like to allow as many tags per entry as the poster sees fit. Wether it be zero or 100. I doubt people will assign nearly tha many tags to an entry though.

About your last point about cleaning, thats what im struggling with now (although im sure ill figure it out soon). The following code is what im using;

$table2 = 'journal_tags';
$tags1 = trim($_POST['tags'],"\x7f..\xff\x0..\x1f");
$tags = split(' ', $tags1);
foreach ($tags as $tag) {
$sql = "INSERT INTO $table2 (username, entry_id, tag) values ('$username','$entry_id','$tag')";
mysql_query($sql) or die(mysql_error());
}

This does exactly what i want, inserting a new row for each tag. This allows me to make use of indexes whereas the LIKE %tag& would be inefficent and mostly un-exact. Tags need to be exact. Not just search terms. searching for the tag "book" should return only entries marked with "book" and nothing else. No iBook, no Books, just "book".

The code works fine if people seperate their words with one space, but if 2 or more spaces are inserted, it creates a new row with a blank tag. Im trying to trim it with no luck..

henry0

3:33 pm on Mar 16, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



At least I am glad that we did lead you to some positive

Double spacing
Google exactly
"php trimming double spacing"
review the second link

Another option will be using ereg();
and preg_replace();
But that's not my forte, so it would be great to have some help :)

nfs2

3:43 pm on Mar 16, 2006 (gmt 0)

10+ Year Member



Thanks for your help, but i fixed it already :)

$tags1 = trim($_POST['tags'], " ");
$tags = split(' ', $tags1);
foreach ($tags as $tag) {
if ($tag!== "") {
$sql = "INSERT INTO tags (username, entry_id, tag) values ('$username','$id','$tag')";
mysql_query($sql) or die(mysql_error());
}
}

that was the easy part. Now i need to study up on sql injection.. My whole site is about inputting data so i need to make it safe :)

Thanks again

henry0

1:07 pm on Mar 19, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



While searching for an old post
I found this one:

SQL injection
[webmasterworld.com]

Jordo needs a drink

9:57 pm on Mar 24, 2006 (gmt 0)

10+ Year Member



This does exactly what i want, inserting a new row for each tag. This allows me to make use of indexes whereas the LIKE %tag& would be inefficent and mostly un-exact. Tags need to be exact. Not just search terms. searching for the tag "book" should return only entries marked with "book" and nothing else. No iBook, no Books, just "book".

I use tags and store all the tags into one field. I get over the LIKE problem you talk about by using

tag LIKE '% tag %' OR tag LIKE 'tag %' OR tag LIKE '% tag'
Notice that your storing the spaces so you use them in your query to keep the iBook example from happening. The no Books example you have above is a non issue because you're using spaces as your delimiter.