For the longest time, I would write SQL queries based on what I thought was the most performant. No rhyme or reason, I would use the functions that I thought were optimized, while also writing my code to be readable.
This approach can only take you so far. Eventually, you’ll see run times creeping up and costs in your data warehouse exploding. When this happens, you need a concrete plan for how to tackle it.
You need to see the performance numbers and understand how you can improve them. This is where the EXPLAIN plan comes in.
Recently, someone on my team introduced me to the EXPLAIN plan. This is a feature of the data warehouse that shows you the operations that the data warehouse will perform to execute a given query.
The EXPLAIN plan allows you to identify the different operations happening, in what order they are executed, and the time it takes to execute each. It allows you to find the bottlenecks in your query by looking at how long the different steps take to execute.
In other words, the EXPLAIN plan is the key to understanding how performant your queries and data models are. It is the key to gaining a deeper understanding of your data warehouse and how you can optimize performance.
Why guess at what’s most performant when you can see it in the data? 😉
How to read an EXPLAIN plan
Keep reading with a 7-day free trial
Subscribe to Learn Analytics Engineering to keep reading this post and get 7 days of free access to the full post archives.