Forum Moderators: open

Message Too Old, No Replies

Why use association tables?

General question about designing tables.

         

Seidl

6:17 pm on Oct 13, 2009 (gmt 0)

10+ Year Member



I have often seen a database for a blog or e-commerce site using a special table dedicated only to indicating the relationship between fields of other databases. For example, a certain database will have one table for products, one for categories, and one for category-product associations. I have often wondered why people do this--is it for simplicity for the developer or does it enhance performance? To give a query example with the association table:

SELECT product_id, product_associate, category_id, associate_id FROM products, categories, associate WHERE product_id = associate_id AND category_id = associate_id

And an example without the association table:

SELECT product_id, product_category, category_id FROM products, categories WHERE product_category = category_id

Why not simply use the second of these examples and save the trouble of a third table?

LifeinAsia

6:51 pm on Oct 13, 2009 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Normalization and flexibility.

Let's say a product belongs to 2 categories. How would you set that up without an association table? You could:
A) Have Category1, Category2 fields in the Products table. Problems:
1) You're wasting space if there is only 1 category
2) You'd have to modify the table to add another field if you ever have a product in 3 categories
3) You'd have to check both Category1 and Category 2 against the Categories table (then change your code to check Category3 as well if you ever add another field).

B) Make Category_id a list of category IDs. Ick! Problems:
1) You'll need to make the field large enough to have multiple IDs- you may need to make larger in the future.
2) When joining tables, now you're trying to compare a character list against integer field- messy and inefficient.

With an association table, you can add categories to your heart's content without needing to make DB or code changes.

rocknbil

7:37 pm on Oct 13, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Let's say you have 5000 products. For the purpose of demonstration, let's say the categories are varchar (text) fields, and you have 3000 products in cat a, 1000 in cat b, 1000 in cat c.

So not your table has 3000 rows with "cat a", and 1000 rows each with "cat b" and "cat c." Textual data takes more space, and is slower on searches. With a relational table you have integer fields for these that relate to the unique id of the category, and a small table containing . . . three rows for the category titles.

Reason two: too often I see this

id夷tem_code¦title她ption1她ption2她ption3她ption4她ption5她ption6她ption7她ption8

So 50% of your products have only two options, color and size. So you're carrying around 6 null fields for most of your products. Eve though they are null/empty, it's a waste of space and resources.

And what if you want to add a 9th, 10th option?

Rather than modifying the table, you just add a record to the options table associating it with that product.

The most important reason of all, IMO, is to let databases to what they do by nature of structuring the tables properly, rather than applying a programmatic workaround - L.I.A.'s example of "comma separated values" for multiple items in a field is a perfect example. To search on such a field, you have to complicate your programming even more.

Last, when problems arise, having resources separated from other resource types makes debugging a heck of a lot easier.

Seidl

9:24 pm on Oct 13, 2009 (gmt 0)

10+ Year Member



Thanks for your explanations; everything makes perfect sense now. It may seem odd, but I've never had to build a database with multiple associations, although for flexibility's sake I think I'll do so in the future. By the way, I do use integers fields.

LifeinAsia

9:27 pm on Oct 13, 2009 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



If you are absolutely sure that you will never ever ever do more than a 1:1 relationship between the tables, then you're fine without an association table. In fact, most of mine are that way.

russkern

12:54 pm on Nov 14, 2009 (gmt 0)

10+ Year Member



Cool... I was looking for just this sort of discussion...

So if I am cataloging images and I'm allowing the user to associate each image with as many categories as apply PLUS i need to make sure that image is associated with a particular Client...

I have an IMAGE Table storing any info specific to just the image

A CATEGORIES Table - only storing the category name and ID

And a CLIENTS Table - only storing Client info...

Then It would then be most efficient to create an IMAGES_CATEGORY_CLIENT association table and store JUST the numeric IDs of each property - storing one record for each image-category-client relationship.

Correct?

Although this requires a bit more coding to then present to the screen information that a user can understand, it is definitely better than say...

Adding a category column to my images table and storing all associated categories as an array in this column?

This is a much easier structure (it seems) to update in the future if I need to create a form to allow someone to change category associations.

R

rocknbil

7:41 pm on Nov 14, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Then It would then be most efficient to create an IMAGES_CATEGORY_CLIENT association table and store JUST the numeric IDs of each property - storing one record for each image-category-client relationship.

It depends. :-) if each client is allowed only one image, it would be more efficient and easier to code if you just store the image ID in the client table, or even just the file name and dispense with the images table. If at some point you decide to allow multiple images, a relational table would be better as you can allow an infinite number of images. If you go with solution 1 first and decide to change later, now you have to scrap it and restructure the database, etc. A good example is a "forum avatar" versus a user's uploaded images.

better than say... Adding a category column to my images table and storing all associated categories as an array in this column?

Storing an array is highly inefficient and only has relevance in programming. It's inefficient because you need to have a text field to store

2,3,678,567

and text fields will always be slower than integer fields in search queries. It also presents difficulties in extracting the data in searches and joins; instead of

... where image_id=1234

or

... left join images_table on image_id=some_table.image_id

you have to resort to regexps or like operators and all the considerations that come with it.

You also have to have programming to extract and split/explode on delimiters, extra programming to avoid overwriting existing values, blank entries and trailing delimiters.

If you have multiple images assigned to a category, it **might** be one solution to duplicate the concept above, create a table to join image id against category id, allowing infinite associations with the category.

russkern

9:35 pm on Nov 14, 2009 (gmt 0)

10+ Year Member



OK... in my example, each client would be allowed hundreds of images... and each client be allowed multiple categories...

Thanks for the info...

R