Now that we’ve shown you some of the problems you might encounter, you need to
learn the ways to find and eliminate these anomalies. This process is known as normalization.
Understanding normalization is vital to working with relational databases.
But, to anyone who has database experience, normalization is not the be-all
and end-all of data design. Experience and instinct also play a part in creating a
good database. In the examples presented later in this book, the data will be normalized,
for the most part—but there will also be occasions when an unnormalized
structure is preferable.
One other quick caveat. The normalization process consists of several “normal
forms.” In this chapter we will cover 1st, 2nd, and 3rd normal forms. In addition to
these, the normalization process can continue through four other normal forms. (For
the curious, these are called Boyce-Codd normal form, 4th normal form, 5th normal
form, and Domain/Key normal form). I know about these because I read about them
in a book. In the real world, where real people actually develop database applications,
these normal forms just don’t get talked about. If you get your data into 3rd normal
form that’s about good enough. Yes, there is a possibility that anomalies will exist in
3rd normal form, but if you get this far you should be OK.
1st normal form
Getting data into 1st normal form is fairly easy. Data need to be in a table structure
and meet the following criteria:
Each column must contain an “atomic” value. That means that there will
be only one value per cell. No arrays or any other manner of representing
more than one value will exist in any cell.
Each column must have a unique name.
The table must have a set of values that uniquely identifies the row (This
is known as the primary key of the table).
No two rows can be identical.
No repeating groups of data are allowed.
10 Part I: Working with MySQL