Now that you’ve loaded your raw data into Snowflake, you are ready to transform it! This is where the real magic happens.
Raw data only becomes useful once it has been transformed into a usable form, and that goes double for raw data that is messy! You need to cast columns to the correct data types, rename fields to follow your naming conventions, and compute fields that don’t exist in the raw data.
In the last email on data ingestion, we talked about the incremental method for loading data. You can build your models the same way: instead of reprocessing all of the raw data on every run, an incremental model transforms only the rows that arrived since the last run, which makes your data models faster to build.
In today’s issue of Data Pipeline Summer, we will discuss the benefits of building data models incrementally, how incremental syncs work, and how to write one using dbt.
As always, we will end with a challenge for you to complete as the next step of building your data pipeline! By the end, you will have built an incremental model in dbt using the raw data you ingested into Snowflake last week.
Let’s dive in!
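To make the incremental idea concrete before we get to dbt, here is a minimal Python sketch of the pattern using the standard library’s sqlite3 module in place of Snowflake. It is not dbt syntax, and the table and column names (raw_orders, orders_model, id, amount_cents, loaded_at) are made up for illustration. It assumes loaded_at is stored as an ISO-8601 string, so comparing the text values also compares them in time order.

```python
import sqlite3


def run_incremental(conn: sqlite3.Connection) -> int:
    """Transform only the raw rows that arrived since the last run.

    Returns the number of rows written to the model table.
    All table and column names here are hypothetical.
    """
    cur = conn.cursor()

    # Create the model table on the first run only.
    cur.execute(
        """
        create table if not exists orders_model (
            order_id   integer primary key,
            amount_usd real,
            loaded_at  text
        )
        """
    )

    # High-water mark: the newest timestamp already in the model.
    # An empty string sorts before any ISO-8601 timestamp, so the
    # very first run picks up every raw row.
    (watermark,) = cur.execute(
        "select coalesce(max(loaded_at), '') from orders_model"
    ).fetchone()

    # Cast, rename, and compute in one pass, but only for new rows.
    cur.execute(
        """
        insert or replace into orders_model (order_id, amount_usd, loaded_at)
        select
            cast(id as integer)                as order_id,    -- cast + rename
            cast(amount_cents as real) / 100.0 as amount_usd,  -- computed field
            loaded_at
        from raw_orders
        where loaded_at > ?
        """,
        (watermark,),
    )
    written = cur.rowcount
    conn.commit()
    return written
```

Calling run_incremental twice in a row with no new raw data writes nothing the second time, which is where the performance gain comes from: each run only pays for the rows it has not seen yet.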