Learn Analytics Engineering

How to Write a Good README

#6: How to build out your GitHub portfolio projects to attract potential employers

Sep 04, 2025

After spending weeks working on a portfolio project, most people ship their code to GitHub and call it a day. They think the code is enough to prove their skill sets to future employers, clients, you name it.

It’s not.

Code without any explanation of the problem you're solving, the roadblocks you overcame along the way, and the tradeoffs you made is virtually pointless. Code without context means nothing.

Not to mention, someone looking at it has no idea whether the code was written by AI or another person entirely. You need to inject your own unique point of view into the repo.


Now that you’ve officially built a working data transformation pipeline using modern data stack tools like DuckDB and SQLMesh, it’s time to package all of your work up so you have something to show off on GitHub to your network and any potential employers.

First, let’s look back on all that you’ve accomplished in the last 5 weeks:

  • learned the purpose of a data pipeline and its different pieces

  • set up a DuckDB database and loaded it with Trail Trekker’s data from CSVs

  • created a SQLMesh project complete with models and tests

  • documented Trail Trekker source data and models

  • modeled Trail Trekker customer subscription changes

  • orchestrated data models to run hourly in SQLMesh

If you haven’t completed the Data Pipeline Summer challenge yet, it’s not too late! Start here with the first article and follow through each week until completion. Paid subscribers have access to the challenge indefinitely!

How to Build an Open-Source Data Pipeline
Madison Mae · Jul 31

Welcome to the first email of this 6-week series, Data Pipeline Summer! In the next 6 weeks, we will learn how to build a data pipeline from start to finish, using some of the most popular open-source data tools.

How to add your work to GitHub

If you don’t already have your project organized in a directory, create one and move all of your project files there. This includes your SQLMesh project as well as your orchestration script.

You can create a directory locally with the command:

mkdir trail_trekker

You can then move your project files to this directory with the following command:

mv <path of file> <path of trail_trekker directory>

This makes it easier to then turn your directory into a GitHub repo. If you don't have a GitHub account, or don't have Git installed locally, reference these instructions.

You can turn the directory into a GitHub repo with the following command:

git init -b main

The -b main flag initializes the repository with main as its default branch. To add your files to the repo, run the following command:

git add .

If you don’t want to add all of the files in the directory, replace . with the path of the file you want to add to GitHub.

For additional help, check out these instructions.
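Taken together, the steps above can be sketched as a single shell session. The placeholder file, commit message, and remote URL are examples, not part of the challenge; substitute your own. The push commands are shown commented out because they require a repository you've already created on GitHub:

```shell
# Create the project directory (skip if you already made it)
mkdir -p trail_trekker
cd trail_trekker

# A placeholder file so there is something to commit;
# your real SQLMesh project and orchestration script go here
echo "# Trail Trekker" > README.md

# Initialize a Git repository with main as the default branch
git init -b main

# Identify yourself to Git if you haven't configured this globally
git config user.name "Your Name"
git config user.email "you@example.com"

# Stage every file in the directory and commit
git add .
git commit -m "Initial commit: Trail Trekker data pipeline"

# Connect the repo to GitHub and push (replace the URL with your own repo)
# git remote add origin git@github.com:<your-username>/trail_trekker.git
# git push -u origin main
```

After the push, your project files and README will be visible on your GitHub profile.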

The importance of a README

The README (a Markdown .md file) in a GitHub repo is the heart and soul of your entire project. It is where you introduce the reader to the problem you are solving, the datasets you are using, and your thought process along the way.

It gives the reader the context and insight needed to answer two key questions: "Why should I care?" and "How do I know this is good work?"

A README for any technical data project should always include the following headlines:

© 2025 Madison Mae