homepage Welcome to WebmasterWorld Guest from 54.211.180.175
register, free tools, login, search, pro membership, help, library, announcements, recent posts, open posts,
Become a Pro Member

Visit PubCon.com
Home / Forums Index / Code, Content, and Presentation / Databases
Forum Library, Charter, Moderator: open

Databases Forum

    
Assistance Normalizing a Database
Normalizing a Database for An E-Commerce Site
RandallK




msg:4060911
 7:32 pm on Jan 14, 2010 (gmt 0)

I just posted a question, that led me to have 50,000 more questions. So I am starting a new thread to help keep things as clean as possible.

I know that Normalization can be confusing, so I am trying to be as clear as I can here. I appreciate any/all assistance that anyone can give.

****************

I have a couple hundred different products for a jewelry line. Around half of those products come in various sizes, which means I've got around 2,500 unique "SKUs" (Stock Keeping Unit).

Each SKU will have one product, but a product may have many SKUs associated. So I believe my 'central' table is going to be the 'SKU' table.

A product may have multiple categories. Originally I was going to put multiple categories into one field and the 'explode' the results (1¦3¦5), but that doesn't jive with 'normalization'.

I didn't take out size into its own table thinking I would just put "N/A" under items that do not have sizes, but I'm not sure that is proper.

I can't figure out if I've gotten this to where it needs to be, or if I need to take it further. I'm not sure how to properly connect these tables, or how/where to place the foreign keys.

Any help would be greatly appreciated!

****************

SKU
----------
SKU ID
SKU Name
SKU Description
Price Adjustment (Certain Sizes cost more than the "Base Price")
Size

Products
----------
Product ID
Weight
Name
Description
Part Number
Base Price
Introduction Date
Active?

Categories
----------
Category ID
Name
Parent ID (To deal with the issue of Sub-categories)

 

syber




msg:4061726
 10:42 pm on Jan 15, 2010 (gmt 0)

The relationship between the SKU and Products tables is one to many, so you need to add Product ID to the SKU table as a Foreign Key.

The relationship between Products and Categories is many to many (products can have many categories and categories can have many products). Therefore, you need to add a join table that has Category ID and Product ID as a compound primary key.

The problem with Size can be solved by having the column allow NULLS

physics




msg:4061768
 12:04 am on Jan 16, 2010 (gmt 0)

Welcome to WebmasterWorld.com RandallK!

You're doing pretty well here I think but I'm a little unclear on something.

Each SKU will have one product, but a product may have many SKUs associated.

Can you give a specific example of SKU's / products?

I.e. is a SKU a certain diamond while a product is a ring with a diamond in it?

rocknbil




msg:4062142
 9:14 pm on Jan 16, 2010 (gmt 0)

Input provided in your thread in Ecommerce
[webmasterworld.com...]

[edited by: engine at 10:00 am (utc) on Jan. 17, 2010]
[edit reason] Tidied up [/edit]

RandallK




msg:4061562
 6:44 pm on Jan 15, 2010 (gmt 0)


System: The following 2 messages were spliced on to this thread from: http://www.webmasterworld.com/databases_sql_mysql/4062904.htm [webmasterworld.com] by engine - 3:58 pm on Jan. 18, 2010 (utc 0)


I have an eCommerce site I am trying to set up the database for. I posted in the database forum, but wasn't really getting any traction there so I do apologize for the x-post.

I've done some additional work on it and I've got something I *think* may work, but I'd really like a second opinion. I'm new to database design, and I want to get it right.

I made a image of the database layout, but I don't think I am allowed to link to an image... is that right?

So here it is in text format, as best I know how to display it. Sorry if I am not using standard conventions, I am new to this.

Thank you in advance for any help you can provide. I really appreciate it.

Table Layout

"Size" 1 --- M "SKU"
"Products" 1 ---- M "SKU"
"SKU" 1 ---- M "SKU_Category"
"SKU_Category" M ---- 1 "Category"

Each individual table:

Size
------
Size_ID
Size

Products
------
Product_ID
Product_Name
Product_Description
Product_Number
Product_ID
BasePrice
IntroductionDate
Active
Weight

SKU
--------
SKU_ID
SKU
SKU_Description
PriceAdjustment
Size_ID
Product_ID

SKU_Category
---------
SKU_Category_ID
SKU_ID
Category_ID

Category
---------
Category_ID
Category_Name
ParentID

rocknbil




msg:4061685
 9:42 pm on Jan 15, 2010 (gmt 0)

Sorry RandallK, not ignoring you over there, just been busy as well as ill lately. :-)

This has been discussed quite a few times, see links below. I think you may be on the wrong track. Look at this:

Size_ID
Size

What if the "option" is not always size? What if, say, you have a ring with various colors of gemstones, and the only thing different is the color and SKU? I have answers below . . .

The one thing I see that's on the right track is

PriceAdjustment

Which is easy to manage. If a price exists in "priceAdjustment" (better named "option_price," below) this takes precedence over the base price in the products table. Furthermore, you are going to need another field, "required option." That is, if it's a required option to select this product, such as size, color, if there is a price here, it overrides the base price, otherwise it's an "add on" and increases the base price if selected ("same ring but with 3 stones, same ring but with four stones.")

Boiling it down, you have two main elements: products and options. these options can be anything - size, color, # stones, metal type, whatever. And you should be able to add options at any time, and the option types should be unlimited, as well as the option values.

You **might** reduce your database size by doing a lookup for existing options and implementing them, like

table option_types
rec_id¦option_name
1¦Size
2¦Color
3¦Stone
4¦Metal
So that you have a small table for these names. Examples, this might only contain size, color, stone, metal . . . then you would have

table option_values
rec_id¦option_type_id¦option_value
1¦2¦Red
2¦2¦Green
3¦2¦Blue
4¦4¦Gold
5¦4¦Silver
6¦3¦Diamond

Now we get to it: the options table. There are a couple things to note about this.

- You can have **unlimited**, or zero options for any given product
- You can have **unlimited**, or zero option values for any given product
- Your option table values are all numeric, making it light speed faster than text lookups
- The values are often "recycled" from the short and sweet option types and option values tables

The down side, which really isn't one: this is going to make for some pretty complex joins in your selects.

So a simple scenario:

products
rec_id¦product_id¦title... etc.

options
rec_id¦product_id¦sku¦option_type¦option_value¦required¦option_price
1¦123¦AZ123¦2¦1¦1¦0.00

Our "single product" with ID 123, is a ring with a color option of blue, and blue is a required selection, but it does not alter the base price. You would have multiple rows for this product to select size, etc.

It may be difficult to get your head around this, but the worst thing you can do is try to predict the option types you will have - this locks you in and once done, you're stuck with reprogramming it for "exceptions." Program for them now. :-)

Some more examples I've posted on this very topic, and only slight variations in approach:

Hammers and Nails [webmasterworld.com]
The unflappable Widget [webmasterworld.com]

RandallK




msg:4067391
 5:31 pm on Jan 25, 2010 (gmt 0)

I *really* want to link to a picture of the database structure I've come up with based off of these recommendations. Is there anyway to do it 'legally'?

whoisgregg




msg:4067571
 8:30 pm on Jan 25, 2010 (gmt 0)

There are some rare circumstances where a link to a picture would be helpful. However, one of the reasons WebmasterWorld still doesn't allow links to images/screenshots/etc. is because those links tend to rot very quickly.

Someone who comes back to a thread in a year that is crucially helpful to them may find that the "key" is in the image, and the link to that image is dead. I think we've all been there and realize how frustrating that is... I've even done that a few times on other forums where it was *my* link that I let die.

It takes a bit more work to talk it all out, but it helps a lot more people down the road. :)

RandallK




msg:4070092
 10:50 pm on Jan 28, 2010 (gmt 0)

I will try my best to explain this in words. Any help is greatly appreciated. I think I am close. I am going to ignore the 'category' part of this database since I am pretty sure I've got that nailed. I've taken the advice above, read the other threads, and came up with this. I feel close, I also think I might be a little off.

I have 5 Tables. SKU, Product, Options, Option_Type, Option_Value.

Table SKU has a relationship with Products and Options. There are many SKUS and each sku will have exactly 1 product, but might have many options.

The product table is only connected to the SKU table. It lists the name, description, price, etc..

The Option table is then related to two other tables. Option_Type and Option_Value. I'm not comfortable with the way this is linked to SKU, I think I might have it incorrect. Am I going too crazy here with Foreign Keys?

Listed out we have:

SKU
-----------
SKU_id(index)|SKU|SKU_description|product_id(fk)|option_id(fk)

Products
-----------
product_id(index)|product_name|product_description|product_number|base_price|intro_date|weight|active|

Options
-----------
option_id(index)|option_type_id(fk)|option_value_id(fk)|sku_id(fk)|price_adj|

Option_Type
-----------
option_type_id(index)|option_name

Option_Value
-----------
option_value_id(index)|option_type_id(fk)|option_value

Thank you so much for anyone taking time to help me think this through.

rocknbil




msg:4071542
 6:47 pm on Jan 31, 2010 (gmt 0)

The real way to test this is throw some garbage data in the table and experiment with some selects. Also, I think your idea many be rounded out a little more if you start visualizing other tables in your system - for example, inventory and purchase tables.

The one things I see, and am not sure about, is the management of SKU's.

There are many SKUS and each sku will have exactly 1 product, but might have many options.

So you are saying, different options do not affect the sku? Isn't the SKU a unique identifier in terms of cataloging? (not in terms of your database.)

There are two veins of thought here:

widget large blue SKU W001LB
widget large red SKU W001LR
widget large yellow SKU W001LY

It seems like this is the best case, as it allows you to uniquely locate a product variation by the unique SKU. W001LR identifies the widget W001 in the variation of large and red, different than the other widgets.

In this scenario, I maintain you'd have an SKU in both the products table and the product options table, which SKU is applied to the purchase is dependent on what they select. Returning to your original post,

Around half of those products come in various sizes, which means I've got around 2,500 unique "SKUs" (Stock Keeping Unit).

Following the above train of thought, size is an option, like # of stones or color. So I am still seeing these as one product, it's the option that has the unique SKU. So I'd probably dispense with the SKU table, move the SKU into the option for this product. If there's no "base SKU," leave it blank.

The other idea, as you've laid out, no unique SKU for varying options. I can't get my head around how you'd easily identify an option selection. For example, someone orders W001 in large and blue. So you'd have to store these in the purchase tables somehow. What you will have to do, then, is store the selected product options in the purchase, probably a related table, to look them up each time you need them.

With a unique SKU for each product combination, you'll only have to store a single value in the purchase table, the SKU. Without that, you'd need a relational table for selected options, which might complicate your selects a bit more.

At any rate, if the above is true and works for you, one SKU for each product, I don't see a need for a separate SKU table, I'd just put it in the products table.

RandallK




msg:4072856
 5:17 pm on Feb 2, 2010 (gmt 0)

One sku for each product is correct. The issue I have some products do not have sizes and thus there are no 'sub-skus'. Widget A doesn't come in any shapes or sizes so the 'product id' from the manufacturer and the 'sku' are the same. SKU is W001 AND Product Number is W001. That is it.

There are a number of widgets that DO have options though. This manufacturer at this time ONLY has one option of Size, but I take your earlier point and I do want to set this up so that if in the future they add a new option, I can add it without a complete redesign. So Widget B has a product number of W002, but related SKUs W002-5, W002-55, W002-6, etc... which align with the different sizes. We call these "Sub-Skus".

*I* am having trouble getting my head around this:

So I'd probably dispense with the SKU table, move the SKU into the option for this product.

If there is no SKU table, and the SKU is located in the options table, what about those products that do not have a "sub-sku"? I still need to have the SKU information, don't I?

And what about options that DON'T affect the product? Like "finish" or "stone". I'd like to have the information about what other features the product has, but they do not affect the SKU in anyway. Should they still be a part of the options table?

And not to throw *another* question out there... but you mentioned an inventory table. My original though was to maybe have a "QOH" field in the SKU table... is that not the correct way to think of it?

rocknbil




msg:4073000
 8:26 pm on Feb 2, 2010 (gmt 0)

OK, let's take this out of context for a second just to clarify the concept. For one, your tables are on the other page. :-) Another, any complex task can be broken down into a collection of simple sub-tasks, this is a "sub task."

I have some products do not have sizes and thus there are no 'sub-skus'. Widget A doesn't come in any shapes or sizes so the 'product id' from the manufacturer and the 'sku' are the same. SKU is W001 AND Product Number is W001. That is it.


Right, then,

There are a number of widgets that DO have options though. This manufacturer at this time ONLY has one option of Size, but I take your earlier point and I do want to set this up so that if in the future they add a new option, I can add it without a complete redesign.


Also right. Consider this simplification:

products
id|prod_id|sku|title|price

options
id|prod_id|sku|option_name_id|option_value_id|price|required

Take your first scenario:

The issue I have some products do not have sizes and thus there are no 'sub-skus'


I'm going to make this single select statements for clarity. In real application, they are likely to be joins and more efficient methods.

select * from products where prod_id=1234;
-> 12|1234|W001|Some widget|24.95

Now,

select count(*) from options where prod_id=1234;
->0

So:
- the output for ordering this item has no options, just a quantity field
- it's price is $24.95
- It's SKU is W001
- There are no options. We're done.

Step it up a bit:

select * from products where prod_id=1235;
-> 17|1235|W002|Some other widget|0.00

select count(*) from options where prod_id=1235;
->5

Since option count > 0, we now do this:

-- select * from options where prod_id=1235;

-- (I'm just putting the values you'd get from a join on the option names and option values table:)

-- 1007|1235|NULL|Size|10|0.00|1
-- 1008|1235|NULL|Size|11|0.00|1
-- 1009|1235|WOO2-D|Stone|Diamond|109.95|1
-- 1010|1235|WOO2-Z|Stone|Zirconium|19.95|1
-- 1011|1235|NULL|Engrave Name|30 characters|9.95|0

- Note base price is 0.00.
- The output for this form has two select lists for the required options Size and Stone. The SKU is assigned based on the stone they select. Another way to go: "create" an SKU that is defined by the options selections. That is, with the base as W001, the SKU's for stone would be D or Z. The SKU's (that are NULL in the example above) for size 10 and 11 could be . . 10 and 11. So when selecting, the SKU could be W001-10-D, W001-11-Z, etc. (Not a great way to work, as the SKU is not actually stored anywhere because it traverses several rows.)
- The form has a checkbox representing the non-required option Engrave Name.

So when the user selects Size 10, Diamond, and checks Engrave Name, your programming does this:

if ( selected option is required and price > 0 ) { price = selected option price }
if (selected option is NOT required and price > 0) { price = price PLUS the selected option }

An additional bonus, if the user submits this form without selecting the required options, this is a trigger to trap this user error in both the server side programming andd Javascript: "The size option is required."

Does that clarify it at all?

So I'd probably dispense with the SKU table, move the SKU into the option for this product.


This is because I didn't understand the problem, I see it now. Per the above: a product with no options will have no SKU, so as you see, the SKU comes directly from the products table.

My original though was to maybe have a "QOH" field in the SKU table... is that not the correct way to think of it?


You're going to have some of this option combo in stock, which will differ from that combination, and you're probably going to want to tally up all the W001's regardless of options. Based on that alone, sure, an added field - in both the products base and options - will work. However . . . when you really start looking at the needs of inventory tracking, you will want to know what's been sold, and what is "pending" - orders placed that have out of stock items in the order, so they'e not really sold yet but they're not in stock. When orders are cancelled, or returned, you will need to adjust stocks accordingly. It's much more efficient to have more skinny tables comprised of integer data types than really wide tables. And last, you won't **always** need acces to the inventory data, so it removes the overhead of carrying extra data around with select * from. So I like to keep my inventory tables - stock, pending, sold - separate from the product data.

But that . . . as a whole different sub task. :-P

RandallK




msg:4073055
 9:41 pm on Feb 2, 2010 (gmt 0)

Thanks for the thorough reply. I believe I understand where you are coming from.

So let's imagine a world where I want a simple inventory system for now.

I would have a QOH field in both Products and Options, and if options exist I would leave the QOH in Products as NULL and then look in the Options table to find the quantity for the specific option?

RandallK




msg:4073599
 4:03 pm on Feb 3, 2010 (gmt 0)

Okay, and also, if I wanted to do a simple "inventory" table... I'd need basically every SKU listed there and the number in stock, BO'ed etc...? But since there is no 'sku' id per se... what is the best way to construct that? Just use the SKU?

Sorry for so many questions, but the method you've presented to me is a bit different than what I originally had planned, so its a little hard to adjust to new ideas.

RandallK




msg:4096447
 4:37 pm on Mar 12, 2010 (gmt 0)

I thought I'd pop back in and say what I finally did for future reference. Thanks to rocknbil for all the help.

I have a table for the products that contains most of the important information. Title, Description, Base Price, etc... but most importantly "Parent SKU".

There are three tables to deal with products that have options that will affect the final SKU: option_type, option_values, and product_option. If a product has options it is listed here, if it does not there are no entries.

Lastly there is a table called "inventory" which stores all possible SKUs, the associated product, the QOH and the price_adjustment (if there is no price adjustment the value is 0).

When a product is selected it first checks for options. If there are no options, its sets "SKU" to the parent_SKU and then checks the inventory table for the QOH.

If there ARE options it creates a drop down for all possible options for that product. If any of these options have a price_adjustment the "Price" is listed as "Starting at: $base_price". If not then it just lists "Price: $base_price".

Until we know the final SKU, availability is listed as "Please select a size".

Once an option is selected we compare the value passed to a list of acceptable options and then set "SKU_modifier" to the proper value as determined by the vendor (So if the customer selects "Size 8" the SKU_modifier is set to "080". After the SKU_modifier has been found it builds the SKU by taking the parent_SKU and adding the modifier the the end. We then look in the inventory table for QOH and a price adjustment (if it exists).

Two things about this method I really like: 1) Each SKU exists as a physical row in a table 2) There are no NULL values which makes the BCNF in me happy.

Global Options:
 top home search open messages active posts  
 

Home / Forums Index / Code, Content, and Presentation / Databases
rss feed

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
Home ¦ Free Tools ¦ Terms of Service ¦ Privacy Policy ¦ Report Problem ¦ About ¦ Library ¦ Newsletter
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved