Learn Analytics Engineering

Learn Analytics Engineering

Share this post

Learn Analytics Engineering
Learn Analytics Engineering
The 5 Forms of Normalization

The 5 Forms of Normalization

They get more and more “normal” as they go on…

Jun 06, 2024
∙ Paid
2

Share this post

Learn Analytics Engineering
Learn Analytics Engineering
The 5 Forms of Normalization
1
Share

As data people, we are constantly hearing about normalized and denormalized data. Many of us understand the high-level concept of it, but have we taken the time to understand the different forms?

Well, coming from an untraditional tech background, where I had no formal engineering education, I never did. Until now.

Understanding the different forms of normalization will help you build better data pipelines, especially when deciding on how to ingest your data using a tool like Airbyte.

When you understand the configurations you are setting up, and why you would want to do them one way versus another, you produce high-quality data that you can properly assert your expectations on. You also become better at modeling the data because you know exactly how it’s going to behave, or how it should behave.

Learn Analytics Engineering is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.

In this article, I’ll walk through the 5 forms of normalization and the differences between them. We will then look at some examples of normalization in the wild!

Types of Anomalies

There are a few types of problems that can occur in your database when your data is not properly normalized. Let’s review what these are.

Update anomaly

This occurs when you update a record and unintentionally affect the validity of other records in your table.

Let’s say you have a table with order_id, product_id, and product_name. You have two records that have a product_id of 11 and a product_name of “serenity crayons”. The company decides to rename the product to “calm crayons”. You only update the one record’s product_name but not the other. Now you have two records with conflicting data, making it unclear which one has the correct value.

Deletion anomaly

This occurs when you delete a record in one place but not another. This causes inconsistencies across tables, making it hard to know which records are accurate.

If we think of the table from the previous example and instead want to delete product_id 11, it would be easy to delete one of these records but not the other. Then, we do not know if this product was truly deleted or not.

Insert anomaly

Insert anomalies prevent you from adding important data to your tables because of the lack of that data in another spot. This may occur if you need to add a product to an order record but can’t add the product in the system until someone orders it.

The 5 Forms of Normalization

They get more and more “normal” as they go on…

Keep reading with a 7-day free trial

Subscribe to Learn Analytics Engineering to keep reading this post and get 7 days of free access to the full post archives.

Already a paid subscriber? Sign in
© 2025 Madison Mae
Privacy ∙ Terms ∙ Collection notice
Start writingGet the app
Substack is the home for great culture

Share