Machine learning engineer, platform engineer, DevOps engineer, analytics engineer… there are a lot of different terms floating out there for a data engineer.
Many people get angry with the rise of all these different terms. Honestly, I’ve never really understood why.
They each have their own unique responsibilities and requirements. Just because they all technically “engineer data” doesn’t mean they should all simply be called data engineers.
If you’ve ever searched for a data engineering role, you’re familiar with how difficult it is to find something that matches your unique experience and what you want in a role.
Applying for any old data engineering role doesn’t work because it’s not guaranteed that you will have the skills the hiring manager is looking for. Heck, the role may not be what you are looking for!
With what we can do with data growing all the time, and new technologies and tools arriving every week, we need to accept that one job title can’t encompass it all.
It’s time we embrace the niches of data engineering, recognizing that they are all a bit different.
Before I get into the differences between the newly emerging data engineering roles, I have one last shameless plug for my upcoming course, Transform Your Data Stack with dbt.
We start with the first live class on Monday at 4pm PST! The course consists of 4 live sessions where you will be learning directly from me, getting the chance to ask all of your burning dbt and analytics engineering questions.
You will build a dbt project using real business data, utilizing tools like GitHub, Snowflake, and dbt.
You will:
write your own dbt style guide
document your data using best practices
build generic and custom tests into your project
write a custom macro
refactor your code to make it more reusable
You don’t want to miss it. Whip out the learning and development budget and let’s turn you into a dbt expert!
Sign up here :)
I started my career at Capital One as a data engineer. I was fresh out of college, only having taken one computer science class, and had no idea what to expect.
After a 6-month boot camp, I began my first data engineering role within the Commercial Tech division of the company. Because my cohort and I were just starting out, we were placed on whatever team had availability for entry-level data engineers.
My first role ended up being on a DevOps team. I knew little about DevOps, only that it required a deep knowledge of AWS and infrastructure. Given I wanted to learn Python and build fancy data science models, this was a far cry from the projects that interested me most.
As a data engineer, I didn’t expect to be deploying code that software engineers wrote and making sure the changes rolled out successfully. This introduced me to a different side of data engineering that my ignorant eyes hadn’t yet seen.
Site Reliability/DevOps/Platform Engineer
DevOps data engineers have a few different names. I’ve seen them called site reliability engineers and platform engineers. It really depends on the company, but they all require a similar skillset.
SREs are in charge of building out the infrastructure to support data and/or software. A common job description may look like this:
The software and/or data engineering experience required, along with the specific languages, may change from description to description. However, experience with cloud providers, Kubernetes, and infrastructure is a tell-tale sign of a platform engineering role.
After working as a DevOps data engineer for a year, I moved on to another data engineering role within Capital One. Except this one was completely different. I worked on a financial team which was much more business-facing than my previous team, where we served software engineers.
In this role, I strengthened my SQL skills and was introduced to dbt. I still used Python, writing applications here and there, but SQL was the bread and butter of this role. I frequently worked with product managers to meet deadlines and produce data that the business teams within the organization could use.
Looking back, this felt more like an analytics engineering role than a data engineering role. It just goes to show you that a job title doesn’t always fit the job description. There truly is a wide range of “data engineers”, all requiring different skill sets.
Data Engineer
Based on my personal experience in two niches of data engineering, I would consider a true data engineer someone who uses languages like Python, Scala, Java, or Go. They should be familiar with SQL, of course, but it isn’t necessarily what they code in every day.
I also see experience with distributed data tools like Spark and Kafka as a sign of a true data engineering role. These aren’t tools that DevOps or analytics engineers typically use.
Here is what a common job description may look like:
Interestingly enough, while interviewing for data engineering roles, I interviewed for one with a job description similar to the one above.
The interviews were much more technical and code-heavy compared to those of the analytics engineering roles I interviewed for. The content covered was like night and day.
I remember struggling through a Python interview focusing on algorithms, as I had never taken an algorithms course and never had to code them in my day-to-day at Capital One. I thought I bombed it, yet I was still offered a role.
Ultimately, I knew the role wasn’t the one I wanted. On paper, it looked perfect. The company was doing well, and the people were great. However, despite being a “data engineer” role, it didn’t match what I was looking for in my career.
After I turned down the data engineering offer, I accepted one as an analytics engineer. The role built on my previous data engineering experience while still aligning with where I wanted to go. It was a match made in heaven!
When I started interviewing for the role, it was my first time hearing of an analytics engineer. It felt like a breath of fresh air to have a role whose description matched what I wanted to do.
Searching through tons of job descriptions and not being able to find a role that matches what you want is defeating. Thank goodness data engineering is now being niched down into roles like analytics engineer, platform engineer, and machine learning engineer.
Analytics Engineer
Analytics engineers live and breathe SQL. They use it to model data using transformation tools like dbt. However, they also have experience building data pipelines using languages like Python.
Analytics engineers focus on the business value when it comes to data. They think about the business processes and how different objects relate to one another. This means lots of conversations directly with stakeholders so they can find a way to code what they need out of the data.
Analytics engineers also own data quality, focusing on how changes can affect downstream data used by the business. They ensure all analytics processes are properly documented. They are truly the bridge between data engineers (or software engineers) and data analysts!
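To make the data quality piece a bit more concrete, here is a minimal sketch of the kind of check an analytics engineer might write. It only loosely imitates the idea behind dbt’s `not_null` and `unique` tests, where a test is a query that returns the failing rows and passes when nothing comes back. It is not dbt itself: the `raw_orders` table, its columns, and the helper functions are all hypothetical, and it uses Python’s built-in `sqlite3` module rather than a real warehouse like Snowflake.

```python
import sqlite3

# Hypothetical table and data, invented purely for illustration.
# Order 2 is duplicated and order 3 has no customer_id.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE raw_orders (order_id INTEGER, customer_id INTEGER, amount REAL);
    INSERT INTO raw_orders VALUES
        (1, 10, 25.0),
        (2, 11, 40.0),
        (2, 11, 40.0),
        (3, NULL, 15.0);
    """
)


def not_null_test(table, column):
    """Build a query that returns every row where `column` is NULL."""
    return f"SELECT * FROM {table} WHERE {column} IS NULL"


def unique_test(table, column):
    """Build a query that returns each value of `column` seen more than once."""
    return (
        f"SELECT {column}, COUNT(*) AS n FROM {table} "
        f"WHERE {column} IS NOT NULL "
        f"GROUP BY {column} HAVING COUNT(*) > 1"
    )


def failing_rows(connection, query):
    """Run a test query; the test passes when no rows come back."""
    return connection.execute(query).fetchall()


# Run three checks against the hypothetical table.
results = {
    "not_null(order_id)": failing_rows(conn, not_null_test("raw_orders", "order_id")),
    "not_null(customer_id)": failing_rows(conn, not_null_test("raw_orders", "customer_id")),
    "unique(order_id)": failing_rows(conn, unique_test("raw_orders", "order_id")),
}

for name, rows in results.items():
    status = "PASS" if not rows else f"FAIL ({len(rows)} failing row(s))"
    print(f"{name}: {status}")
```

Returning the failing rows, rather than just true or false, is a deliberate choice in this sketch: when a check fails, you can see exactly which records would break the downstream data the business relies on.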
Here’s what a common analytics engineering job description looks like:
If you enjoy solving business problems with data, analytics engineering may be for you. You get to interact with stakeholders in a way I never did in any of my previous data engineering roles.