Oh, the CASE statement… such an easy thing to write but a real killer to the performance of complex queries. While this SQL function can feel like an easy go-to when writing code, it should be replaced with more efficient functions whenever possible.
Knowing SQL swaps like these is one of the ways to uplevel from a beginner analytics engineer to one that’s more advanced.
1. COALESCE
This SQL function is top of mind for me as I write this newsletter from dbt Labs’ Coalesce conference. In this example, I use a CASE statement to check the NULLness of a value.
If it’s NULL, return a status of unknown. If it’s NOT NULL, return the user_id
.
SELECT
session_id,
user_id,
CASE
WHEN user_id IS NULL THEN 'unknown'
ELSE user_id
END AS user_status,
session_created_at
FROM sessions
This can be replaced with a simple COALESCE function. This takes different inputs and returns the first that is NOT NULL.
SELECT
session_id,
user_id,
COALESCE(user_id, 'unknown') AS user_status,
session_created_at
FROM sessions
Doesn’t this look a lot cleaner? Most data people reading this code will understand this function a lot more compared to a CASE statement evaluating the NULLness of various fields.
For more on learning SQL, check out some of my past newsletters:
⭐️ Aggregates
⭐️ Joins
Keep reading with a 7-day free trial
Subscribe to Learn Analytics Engineering to keep reading this post and get 7 days of free access to the full post archives.