Normalization is the process of organizing data efficiently in a database and has two primary goals; the elimination of redundant data and ensuring that data dependencies are correct by having only related data stored in the same table. These goals are equally important since they ensure that data is stored in a logical manner and reduce the overall size of the database.
The Normal Forms
The database community has established formal normalization guidelines referred to as “normal forms” and are numbered from the lowest “First Normal Form (1NF)” to the highest “Fifth Normal Form (5NF).” In many cases you see first through third Normal Forms, occasional fourth, but rarely see the Fifth so we will leave it out of this discussion.
It is extremely important to mention that these are only guidelines and there are always exceptions to the rule. You might need to stray occasionally from the guidelines due to practical business issues, but if you do, it is important that you evaluate their potential impact on your account and system of possible data inconsistencies.
The following standards are extracted from Wikibooks.
First Normal Form (1NF)
First Normal Form (1NF) sets the very basic rules for an organized database as follows:
- Eliminate duplicate columns from the same table.
- Create separate tables for each group of related data and identify each row by using a unique column or set of columns (i.e., primary key).
Second Normal Form (2NF)
Second Normal Form (2NF) further addresses the concept of removing duplicate data as follows:
- Meet all the requirements of the First Normal Form.
- Remove subsets of data that apply to multiple rows of a table and place them in separate tables.
- Create relationships between these new tables and their predecessors through the use of foreign keys.
Third Normal Form (3NF)
Third normal form (3NF) goes one step further as follows:
- Meet all the requirements of the Second Normal Form.
- Remove columns that are not dependent upon the primary key.