How should I start learning/implementing DevOps in data engineering projects?

This page summarizes the projects mentioned and recommended in the original post on /r/dataengineering

  • DevOps-Roadmap

    DevOps Roadmap for 2024, with learning resources

  • For DevOps tools, I've worked with GitHub + Jenkins, GitLab + k8s, and I'm now primarily working in the Argo stack. Depending on where you're at technically, you might use something different. IaC is a must as well, and maybe some config management. I've found that as a Data Engineer with a lot of infra/CI/CD knowledge, I generally get pigeonholed into those positions on a team, so be prepared for that. I really like this roadmap for DevOps, so you can see where your tech skills are at currently and what you may need to learn. On top of that, you'll need to learn some data tools. Airflow + dbt is hot right now, Argo is sometimes used in MLOps, the Azure data stack (I'm not familiar with it) seems common, and probably Spark in almost all cases. You can also check out visualization tools, probably further down the line. I generally stick to something free when learning on my own: Superset or Google Data Studio (might be Looker Studio now? Not sure, it's been a while). Here's a roadmap for DE too. I love these roadmaps for getting started, but don't let them distract you from exploring a path more appropriate to what you want to achieve.

  • data-engineer-roadmap

    Roadmap to becoming a data engineer in 2021
