4 Things to STOP Doing
...if you want to become a better analytics engineer this year
Every analytics engineer learns lessons the hard way sooner or later. They learn them through failed projects, wasted time, and data products that do not scale.
In fact, I’ve learned all of the lessons I’m about to share with you the hard way. Nobody ever taught me these things when I started my career in analytics engineering, but I wish they had. Not so that I could have avoided my mistakes, but so that I could have made them earlier and pinpointed exactly what went wrong.
Most of us are trying to figure it out as we go. We don’t realize something is a problem until we face it head-on. However, being aware of these problems can help you make more informed mistakes.
Staying quiet in meetings with your stakeholders.
As an introvert, this is the soft skill I’ve struggled with most as an analytics engineer, especially in a room with outspoken stakeholders. It can feel hard to stand firm in your opinions and actually speak them into existence.
However, if you want a project to go smoothly, without making more work for yourself, you need to ask questions and make firm statements when you know something won’t work as they expect.
Asking thoughtful questions to get to the root of a business problem is the crucial first step in any data modeling project.
Here’s what I recommend:
create a meeting agenda with talking points (this way you are leading the flow of the meeting)
do your research beforehand and come with examples to back up any concerns
focus on the problem, not the solution
I feel way more comfortable setting expectations and addressing concerns when I have them written in a document, with examples to support the points I’m making. This way, I get everything I need answered and addressed without feeling like I’m being put on the spot.
Building exactly what the business asks for.
This has bitten me in the butt more times than I can count. A request comes in with something that a stakeholder needs. It could be a new field, a new dashboard, or a quick change to a data model. In my first few years of analytics engineering, I wouldn’t think twice, and I would just make the change that was requested.
Well, this often has unintended consequences. The small changes often compound, creating a data product that doesn’t behave as intended or isn’t able to scale in the way the data team would like. Whenever the business asks for something specific, push back and try to understand the problem they are facing rather than the solution they are asking you to implement.
Remember, you are the technical expert! You know best when it comes to translating a business problem to a technical solution.
Back when I frequently worked with Hubspot data and the sales team at Kit, I would get many requests to make small changes to fields. Eventually, the original field definitions had strayed so far from the standard that they were being used in ways that were never intended. If I had gotten to the root of why these fields needed to be changed, I could have created lasting solutions to their problems.
Modeling data based on a problem, not a process.
This advice is stolen from Ralph Kimball himself. He is known for emphasizing processes over problems.
“Each business process is represented by a dimensional model that consists of a fact table containing the event’s numeric measurements surrounded by a halo of dimension tables that contain the textual context.”
When someone comes to you with a problem, first consider the business process and how it can be modeled. Then think about how that model can be used to solve the problem at hand. Designing a data model around a single problem will give you something so narrowly scoped that it can’t be used for much else.
For example, right now I am working on building out a brand budgeting model. Instead of thinking about the problem of manual billing and proration calculations, I’m considering how the process of determining a brand’s budget works. This will let me create something that scales with changes to the budget process, rather than a model scoped only to today’s manual-work problem.
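To make Kimball’s pattern concrete, here is a minimal sketch of what a process-centered design could look like for a budgeting workflow like mine. Every table and column name below is hypothetical, invented purely for illustration:

```sql
-- Dimension tables: the "halo" of textual context around the process
create table dim_brands (
    brand_id   bigint primary key,
    brand_name text,
    region     text
);

create table dim_fiscal_periods (
    fiscal_period_id bigint primary key,
    period_start     date,
    period_end       date,
    fiscal_year      int
);

-- Fact table: one row per budget allocation event, holding the
-- numeric measurements of the budgeting process itself
create table fct_budget_allocations (
    budget_allocation_id bigint primary key,
    brand_id             bigint references dim_brands (brand_id),
    fiscal_period_id     bigint references dim_fiscal_periods (fiscal_period_id),
    allocated_amount     numeric(12, 2),
    spent_amount         numeric(12, 2)
);
```

Because the grain is “one row per allocation event in the budgeting process,” the same model can later answer questions about proration, overspend, or forecasting, not just the single manual-billing problem that prompted it.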
Modeling data on a BROKEN process.
Every time I’ve tried to model data around a business process that was inherently broken, I’ve regretted it. If you find yourself writing a lot of tricky case statements, this is a sign that you are modeling a broken business process.
Of course, processes are inherently messy. But if you want to use data to measure them, they ultimately need to be predictable. It’s not our job as analytics engineers to make sure people are following expected behaviors. When certain behaviors are expected but not followed, you have a process problem.
While building an MRR model last year, I discovered time after time that my stakeholders were following unpredictable processes. They were offering refunds and prorations without any real set of rules. This made MRR unpredictable, no matter how I coded the “rules” of the business process.
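To show the case-statement smell in practice, here is the shape of logic I kept writing. This isn’t my actual MRR code; the plan names, columns, and thresholds are all made up, but the pattern is real: every branch encodes an exception someone invented on the fly.

```sql
-- Hypothetical example of trying to codify an unpredictable process.
-- Each branch papers over an ad-hoc decision a human made, so the
-- model breaks the next time someone invents a new exception.
select
    subscription_id,
    case
        when refund_reason = 'goodwill'
            then 0                              -- rep-by-rep goodwill refunds
        when plan_name = 'legacy_pro' and is_prorated
            then amount * 0.5                   -- one-off proration deal
        when coupon_code like 'SALES-%'
            then amount - discount_amount       -- negotiated discounts, no policy
        else amount
    end as recognized_mrr
from billing_events;
```

If the rules can’t be written down without a comment explaining each exception, that’s the process problem talking, and no amount of SQL will make the output predictable.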
When this is the case, you are setting yourself up for failure. If you agree to codify something that can’t be codified, you are putting all of the blame on yourself when things go wrong. I highly encourage you to avoid this by making it clear what patterns you can model and what you can’t.
If a process is unpredictable, fix it before you agree to model it.
If you’re interested in learning more about my data model design process and improving the way you approach data modeling, I highly recommend checking out this newsletter.
How to Decide on a Data Model Design
Data model design is one of those skills that’s really hard to improve unless you are doing it every day, with real business processes to model. The more you do it, the more mistakes you make, and the more you learn.
Every month paid subscribers will receive technical deep dives like this to help them improve their skills as an analytics engineer.
Have a great week!
Madison


