How to create projects for myself to enrich my resume?

sqlfluff

35 7,199 9.6 Python

A modular SQL linter and auto-formatter with support for multiple dialects and templated code.

Include bells and whistles to impress the reader: Most projects will cover the common things like ETL scripts (e.g. SQL, Python, Airflow, dbt). To go the extra mile and stand out, also include data quality tests (e.g. dbt tests, Great Expectations, Soda), linting scripts (e.g. sqlfluff, black), and CI pipelines (e.g. GitHub Actions) that run linting and unit tests on your ETL code before it can be merged to main. In your README file, include instructions on how to run those tests, linters, and CI pipelines, along with screenshots of the success or failure output to give the reader an example.
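To make the "unit tests and data quality tests" idea concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the `clean_orders` transform, the `order_id` and `amount` columns, and the two check helpers are hypothetical and do not come from the original post. In a real project you would more likely express the quality checks with one of the tools named above.

```python
"""Minimal sketch: a tiny ETL transform plus checks a CI job could run.

All names here (clean_orders, order_id, amount) are hypothetical examples.
"""


def clean_orders(rows):
    """Drop rows without an order_id and convert amounts to float."""
    cleaned = []
    for row in rows:
        if row.get("order_id") is None:
            continue  # rows without a key cannot be loaded downstream
        cleaned.append({"order_id": row["order_id"], "amount": float(row["amount"])})
    return cleaned


def check_not_null(rows, column):
    """Data quality check: every row has a non-null value in `column`."""
    return all(row.get(column) is not None for row in rows)


def check_unique(rows, column):
    """Data quality check: no value in `column` appears more than once."""
    values = [row[column] for row in rows]
    return len(values) == len(set(values))


# Sample input with one bad row (missing order_id).
RAW_ROWS = [
    {"order_id": 1, "amount": "10.50"},
    {"order_id": None, "amount": "3.00"},
    {"order_id": 2, "amount": "7"},
]


def test_clean_orders_drops_rows_without_id():
    assert len(clean_orders(RAW_ROWS)) == 2


def test_clean_orders_output_passes_quality_checks():
    rows = clean_orders(RAW_ROWS)
    assert check_not_null(rows, "order_id")
    assert check_unique(rows, "order_id")
    assert all(isinstance(row["amount"], float) for row in rows)
```

If this were saved in a file named like `test_orders.py`, a test runner such as pytest would pick up the two `test_` functions, and a CI pipeline could run the same command on every pull request before merging to main.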

awesome-readme

30 16,912 6.4

A curated list of awesome READMEs

Provide a succinct and comprehensive README: Readers of your personal project will start with the README to know where to begin. The goal of the README is to give the reader an understanding of the business problem you are trying to solve, how your solution solves it (with a solution architecture diagram), and how to get started and run your code. There are plenty of great README examples here: https://github.com/matiassingers/awesome-readme

modern-elt-demo

14 20 10.0 PLpgSQL

A modern ELT demo using airbyte, dbt, snowflake and dagster

Break your project down into components and folders: Technical readers of your project will want to see that you have broken the project down into logical folders so that the code is organized. There's nothing worse than clicking on a GitHub link, seeing 40 files at the root of the repository, and asking yourself "where do I start?". Here is an example that I threw together in a day: https://github.com/Data-Engineer-Camp/modern-elt-demo
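As a rough illustration of "logical folders", the sketch below scaffolds one possible layout for a small ETL project. The folder and file names are hypothetical choices made for this example. They are not taken from the linked modern-elt-demo repository, and a real project should use whatever split matches its own components.

```python
"""Minimal sketch: scaffold a hypothetical folder layout for an ETL project."""
import tempfile
from pathlib import Path

# Hypothetical layout: one folder per concern, so the repository root stays small.
LAYOUT = {
    "extract": ["__init__.py"],
    "transform": ["__init__.py"],
    "tests": ["test_transform.py"],
    "docs": ["architecture.md"],
    ".github/workflows": ["ci.yml"],
}


def scaffold(root):
    """Create the folders and empty placeholder files under `root`.

    Returns the sorted relative paths of every file created.
    """
    root = Path(root)
    created = []
    for folder, files in LAYOUT.items():
        directory = root / folder
        directory.mkdir(parents=True, exist_ok=True)
        for name in files:
            path = directory / name
            path.touch(exist_ok=True)
            created.append(path.relative_to(root).as_posix())
    # The README sits at the root, where readers look first.
    (root / "README.md").touch(exist_ok=True)
    created.append("README.md")
    return sorted(created)


if __name__ == "__main__":
    # Demonstrate in a throwaway directory so nothing is left behind.
    with tempfile.TemporaryDirectory() as tmp:
        for relative_path in scaffold(tmp):
            print(relative_path)
```

The point of the split is that a reader landing on the repository sees a handful of named folders plus the README, and can guess where extraction code, transformations, tests, and CI configuration live.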

diataxis-documentation-framework

72 706 8.7 HTML

A systematic approach to creating better documentation.

High quality blog articles: Writing blog articles is a great way to (1) solidify your understanding of a topic and (2) show readers and potential employers that understanding. Solidifying your understanding is important for your personal development, and it will prove useful when an interviewer quizzes you on hard technical concepts and you can impress them with a concise and comprehensive explanation.

"Ok, you've convinced me - now how do I write a high quality blog article?" According to the diataxis documentation framework, there are several different kinds of documentation or blog article you can write. The ones I would recommend you focus on are explanation articles and how-to articles. Explanation articles, as the name suggests, explain a particular topic, e.g. "What is Spark?". How-to articles document the steps to perform a specific task, e.g. "How to dockerize your ETL project?". See the diataxis framework for more detailed definitions and examples. Once you've written your articles, you can publish them on a blog site like Substack or Medium.

Both of the above tasks take effort. You may have to invest several weekends to get them to a quality you are happy with. Not everyone who sees your resume or LinkedIn profile will go through your personal projects and blog articles in detail, but a small portion of people will see and recognize the effort you have put in, and those are the people who will give you your first opportunity. I hope this helps, and good luck!

black

322 37,348 9.4 Python

The uncompromising Python code formatter

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.

Front page news headline scraping data engineering project
3 projects | /r/dataengineering | 13 May 2023
How to setup Black and pre-commit in python for auto text-formatting on commit
3 projects | dev.to | 29 Mar 2024
Let's meet Black: Python Code Formatting
2 projects | dev.to | 7 Feb 2024
Show HN: Visualize the Entropy of a Codebase with a 3D Force-Directed Graph
6 projects | news.ycombinator.com | 31 Jan 2024
Introducing Flask-Muck: How To Build a Comprehensive Flask REST API in 5 Minutes
3 projects | dev.to | 20 Dec 2023

This page summarizes the projects mentioned and recommended in the original post on /r/dataengineering.
Tags: Code Analysis, awesome-list, sql-linter, Documentation, Code Formatters
Post date: 29 Oct 2022
