In the field of relational database design, normalization is a systematic way of ensuring that a database structure is suitable for general-purpose querying and free of certain undesirable characteristics—insertion, update, and deletion anomalies—that could lead to a loss of data integrity.
On a basic level, what that means is that you don't want data that you may update in more than one table. Similarly, you want to have common data separated into a different table to reduce duplicate data.
EmployeeName | EmployeePhone | EmployeeID
Bob Bobbins | 555-5555 | 5
EmployeeName | EmployeePhone | Position
Bob Bobbins | 555-5555 | Catcher
If Bob ever changes his phone number, you have to update the phone number in every single table where Bob's record may appear. (Similarly, if an employee changes his/her name, you would also have to update every field where Bob's record may appear.)
The way to normalize the second table would be as follows:
EmployeeID | Position
5 | catcher
If Bob Bobbins changes his phone number or name, you only update the Employees table.
EmployeeID | Street | City | State | ZIPCode
5 | 123 Main Street | Smallville | CA | 99987
7 | 321 1st Street | Smallville | CA | 99987
You know that the 99987 ZIPCode is for Smallville and that Smallville is in California. So you can normalize as follows:
EmployeeID | Street | ZIPCode
5 | 123 Main Street | 99987
7 | 321 1st Street | 99987
City | State | ZIPCode
Smallville | CA | 99987
(Again, if the city name changes for the 99987 ZIPCode, you only need to update the ZIPCodes table.)
There are times when you don't want to completely normalize your database. In Example 1, maybe you want to use Bob's home phone number for company records, but his cell phone for the softball league. In example 2, one ZIP Code may have 1 official name and 1 or more acceptable names and a person may prefer to use the unofficial name.