The Fourth Normal Form (4NF) is an advanced level in the normalization process, aiming to handle certain types of anomalies which aren't addressed by the Third Normal Form (3NF). Specifically, 4NF addresses multi-valued dependencies.
A relation is in 4NF if:
1. It is already in 3NF.
2. No multi-valued dependencies exist. A multi-valued dependency occurs when an attribute depends on another attribute but not on the primary key.
To clarify, consider a relation R with attributes X, Y, and Z. We say that there is a multi-valued dependency from X to Y, denoted \(X \twoheadrightarrow Y\), if for a single value of X, there are multiple values of Y associated with it, independent of Z.
Let's illustrate 4NF with a scenario involving students, their hobbies, and the courses they've taken:
Initial Table
| Student_ID | Hobby | Course |
|------------|------------|------------|
| S1 | Painting | Math |
| S1 | Painting | Physics |
| S1 | Hiking | Math |
| S1 | Hiking | Physics |
| S2 | Reading | Chemistry |
| S2 | Reading | Biology |
In the table:
To bring the table to 4NF, we can decompose it into two separate tables:
StudentHobbies Table:
| Student_ID | Hobby |
|------------|------------|
| S1 | Painting |
| S1 | Hiking |
| S2 | Reading |
StudentCourses Table:
| Student_ID | Course |
|------------|------------|
| S1 | Math |
| S1 | Physics |
| S2 | Chemistry |
| S2 | Biology |
With this separation:
There are no more multi-valued dependencies. This setup not only reduces redundancy but also prevents the possibility of certain types of inconsistencies and anomalies in the data.
For most practical applications, normalization up to 3NF or BCNF is often adequate. However, when specific types of redundancy or data anomalies are a concern, proceeding to 4NF or even 5NF can be beneficial.