Master the 4-Step Dimensional Modeling Process
#4: Data Pipeline Summer: including a hands-on project that teaches you how to create your own dimensional data model using business data
Dimensional data modeling is one of the foundational skills that every analytics engineer needs to know. However, I didn’t start my career knowing anything about this.
Like most people, I started by learning modern data stack tools, always looking for the best tool to solve a particular problem. While that worked at the time, I quickly realized the importance of understanding the principles that apply regardless of the tool.
Dimensional data modeling allows you to build models with performance and analytics ease of use in mind. When you’re working with a small amount of data, you can get away with not thinking about these two things. However, as your data grows, designing for both becomes essential if your models are going to scale.
Unfortunately, free datasets on the internet don’t mimic actual business problems, making data modeling hard to master. Luckily, our Data Pipeline Summer Trail Trekker project does. The problem we’ll solve in this part’s challenge looks similar to problems I’ve faced in my full-time analytics engineering roles.
In the last part of Data Pipeline Summer, we learned the differences between traditional data transformation tools and SQLMesh. We discussed the main benefit of SQLMesh: when it builds models, it focuses only on the models affected by code changes.
We also walked through the creation of a SQLMesh project, including setting up our staging models for Trail Trekker.
If you missed that, be sure to check it out here so you can get your project up to speed.
What is Dimensional Data Modeling?
Dimensional modeling delivers on two things: fast query performance and data that’s understandable by the business. In short: simplicity, speed, and usability.
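To make that concrete, here is a minimal sketch of the shape a dimensional model usually takes: a fact table of events joined to a dimension table of descriptive attributes. The table names, columns, and rows below (dim_customer, fact_payment, plan_name) are hypothetical and made up for illustration. They are not the actual Trail Trekker schema, and the example uses Python's built-in sqlite3 rather than SQLMesh so it runs anywhere.

```python
import sqlite3


def build_example() -> sqlite3.Connection:
    """Create a tiny in-memory star schema with made-up data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Dimension: one row per customer, holding descriptive attributes.
        CREATE TABLE dim_customer (
            customer_key  INTEGER PRIMARY KEY,
            customer_name TEXT,
            plan_name     TEXT
        );

        -- Fact: one row per payment event, holding keys and measures.
        CREATE TABLE fact_payment (
            payment_id   INTEGER PRIMARY KEY,
            customer_key INTEGER REFERENCES dim_customer (customer_key),
            amount       REAL
        );

        INSERT INTO dim_customer VALUES
            (1, 'Ana', 'basic'),
            (2, 'Ben', 'premium'),
            (3, 'Cy',  'premium');

        INSERT INTO fact_payment VALUES
            (1, 1, 10.0),
            (2, 2, 25.0),
            (3, 3, 25.0),
            (4, 2, 25.0);
        """
    )
    return conn


def revenue_by_plan(conn: sqlite3.Connection) -> list:
    """Answer a business question with a single join from fact to dimension."""
    return conn.execute(
        """
        SELECT d.plan_name, SUM(f.amount) AS revenue
        FROM fact_payment AS f
        JOIN dim_customer AS d
          ON d.customer_key = f.customer_key
        GROUP BY d.plan_name
        ORDER BY d.plan_name
        """
    ).fetchall()


if __name__ == "__main__":
    for plan_name, revenue in revenue_by_plan(build_example()):
        print(plan_name, revenue)
```

The point of the sketch is the query: a question like "revenue by plan" needs one join and reads close to how the business would phrase it, which is where the usability and speed come from.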