Exceptional SQL skills will always separate the best data professionals from the average ones. While I always emphasize how important SQL is, it can get monotonous to hear about it repeatedly in newsletters or on LinkedIn when you use it every single day.
After all, you won’t get much value from learning about the different types of aggregate functions for the millionth time.
However, you can learn more about SQL from the mistakes of other analytics engineers, like me, who are in the weeds every day transforming complicated data.
I keep mentioning the revenue model I’ve been building these last few months because it’s a huge project (the biggest of my career) and one I’ve learned so much from.
This past week I ran into a roadblock that kept driving me insane. For some reason, the MRR categories assigned to each month for certain customers made no sense. The values didn’t align with the customer’s behavior and I couldn’t figure out why.
Eventually, I traced the inconsistencies to a plan_id
field in a model upstream. I used multiple window functions to find the plan at the end of the month. The plan can change multiple times per month, and each plan is associated with different invoice types.
Thinking I knew window functions so well and that the data was quite predictable (it never is), I never stopped to validate that the window function worked as I expected. I kept going, transforming the data CTE after CTE, until I had the final result I wanted.
When validating this final result, I quickly learned my lesson. Code will never meet your expectations! Always test as you go so you can easily identify where the problem is occurring. When you leave it to the end like I did, you will drive yourself crazy tracing back each and every line of code.
In this newsletter, we will dive into the hidden behaviors of window functions and how to validate them so you don’t make the same mistakes.
If you’re wondering what the heck a window function even is, I recommend pausing to read this deep dive I wrote breaking down the different elements of a window function and why they are important.
Handling of NULL values
Keep reading with a 7-day free trial
Subscribe to Learn Analytics Engineering to keep reading this post and get 7 days of free access to the full post archives.