How to create projects for myself to enrich my resume?

This page summarizes the projects mentioned and recommended in the original post on reddit.com/r/dataengineering

  • sqlfluff

    A modular SQL linter and auto-formatter with support for multiple dialects and templated code.

    Include bells and whistles to impress the reader: most projects will cover the common things like ETL scripts (e.g. SQL, Python, Airflow, dbt). To go the extra mile and stand out, also include data quality tests (e.g. dbt tests, Great Expectations, Soda), linting (e.g. sqlfluff, black), and CI pipelines that run linting and unit tests on the ETL code before it can be merged to main (e.g. GitHub Actions). In your README, include instructions on how to run the tests, linters, and CI pipelines, along with screenshots of passing and failing output so the reader can see an example.
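
    As a rough sketch of the kind of unit test and data quality check described above, the following uses only the Python standard library. The function names (`clean_orders`, `check_quality`) and the order fields are hypothetical, invented purely for illustration; in a real project a tool such as dbt tests, Great Expectations, or Soda would likely replace the hand-written checks.

```python
from __future__ import annotations

from typing import Iterable


def clean_orders(rows: Iterable[dict]) -> list[dict]:
    """Hypothetical transform step: drop rows without an id, normalise amounts."""
    cleaned = []
    for row in rows:
        if row.get("order_id") is None:
            continue  # skip rows that cannot be keyed
        cleaned.append({
            "order_id": int(row["order_id"]),
            # treat a missing amount as 0 and round to two decimals
            "amount": round(float(row.get("amount") or 0), 2),
        })
    return cleaned


def check_quality(rows: list[dict]) -> list[str]:
    """Return human-readable data quality failures (an empty list means pass)."""
    failures = []
    ids = [r["order_id"] for r in rows]
    if len(ids) != len(set(ids)):
        failures.append("order_id is not unique")
    if any(r["amount"] < 0 for r in rows):
        failures.append("amount contains negative values")
    return failures


# Unit tests for the ETL code; functions named test_* are collected by pytest.
def test_clean_orders_drops_rows_without_id():
    raw = [{"order_id": "1", "amount": "9.999"}, {"order_id": None, "amount": "5"}]
    assert clean_orders(raw) == [{"order_id": 1, "amount": 10.0}]


def test_check_quality_flags_duplicates():
    rows = [{"order_id": 1, "amount": 1.0}, {"order_id": 1, "amount": 2.0}]
    assert check_quality(rows) == ["order_id is not unique"]
```

    Running `pytest` on a file like this locally, and again in a CI job before merging to main, is one way to produce the pass/fail output the README can show.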

  • awesome-readme

    A curated list of awesome READMEs

    Provide a succinct and comprehensive README: readers of your personal project will always start with the README to find out where to begin. The goal of the README is to give the reader an understanding of the business problem you are trying to solve, how your solution solves it (with a solution architecture diagram), and how to get started and run your code. There are plenty of great README examples here: https://github.com/matiassingers/awesome-readme

  • modern-elt-demo

    A modern ELT demo using airbyte, dbt, snowflake and dagster

    Break your project down into components and folders: technical readers of your project will want to see that you have organized the code into logical folders. There's nothing worse than clicking on a GitHub link, seeing 40 files at the root of the repository, and asking yourself "where do I start?". Here is an example that I threw together in a day: https://github.com/Data-Engineer-Camp/modern-elt-demo

  • diataxis-documentation-framework

    "The Grand Unified Theory of Documentation" (David Laing) - a popular and transformative documentation authoring framework

    High quality blog articles: writing blog articles is a great way to (1) solidify your understanding of a topic and (2) show readers and potential employers that understanding. Solidifying your understanding is really important for your personal development, and will prove useful when an interviewer quizzes you on hard technical concepts and you are able to impress them with a concise and comprehensive explanation.

    "Ok, you've convinced me - now how do I write a high quality blog article?" According to the diataxis documentation framework, there are several different kinds of documentation or blog article you can write. The ones I would recommend you focus on are explanation articles and how-to articles. Explanation articles, as the name suggests, explain a particular topic, e.g. "What is Spark?". How-to articles document the steps to perform a specific task, e.g. "How to dockerize your ETL project?". See the diataxis framework for more detailed definitions and examples. Once you've written your articles, you can publish them on a blog site like Substack or Medium.

    Both of the above tasks take effort. You may have to invest several weekends to get them to a quality you are happy with. Not everyone who sees your resume or LinkedIn profile will go through your personal projects and blog articles in detail, but a small portion of people will see and recognize the effort you have put in, and those are the people who will give you your first opportunity. I hope this helps, and good luck!

  • black

    The uncompromising Python code formatter
