The Data Engineer Roadmap 🗺

This page summarizes the projects mentioned and recommended in the original post on

  • GitHub repo introduction-to-sql

    Free Introduction to SQL eBook

    SQL basics

  • GitHub repo data-engineer-roadmap

    Roadmap to becoming a data engineer in 2021

  • GitHub repo RabbitMQ

    Open source RabbitMQ: core server and tier 1 (built-in) plugins


  • GitHub repo Neo4j

    Graphs for Everyone

    Graph: Neo4j

  • GitHub repo materialize

    Materialize simplifies application development with streaming data. Incrementally-updated materialized views - in PostgreSQL and in real time. Materialize is powered by Timely Dataflow. (by MaterializeInc)

    Materialize - The Streaming Database for Real-time Analytics

  • GitHub repo Apache Hive

    Apache Hive

  • GitHub repo Apache HBase

    Apache HBase

    Wide column: Apache Cassandra, Apache HBase

  • GitHub repo Apache Hadoop

    Apache Hadoop

    Apache Hadoop and HDFS

  • GitHub repo beam

    Apache Beam is a unified programming model for Batch and Streaming

    Apache Beam

  • GitHub repo Apache Arrow

    Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing

    Apache Arrow

  • GitHub repo Airflow

    Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

    Apache Airflow
