How To Start Your Next Data Engineering Project

This page summarizes the projects mentioned and recommended in the original post on

  • Apache Spark

    Apache Spark - A unified analytics engine for large-scale data processing

  • Druid

    Apache Druid: a high performance real-time analytics database.

  • dagster

    An orchestration platform for the development, production, and observation of data assets.

    These are not the only choices, by any stretch of the imagination. Other popular options for orchestration are Dagster and Prefect. We actually recommend starting with Airflow and then looking at the others as you get more familiar with the processes.

  • d3

    Bring data to life with SVG, Canvas and HTML.


  • nodejs-bigquery

    Node.js client for Google Cloud BigQuery: A fast, economical and fully-managed enterprise data warehouse for large-scale data analytics.

    If you wanted to upgrade that idea, you could track down articles relating to that swing and post those for discussion. There is definite value in that data, and it is a pretty simple thing to do. You are just using Cloud Composer to ingest the data, storing it in a data warehouse like BigQuery or Snowflake, and creating a Twitter bot that posts the outputs to Twitter, scheduled with something like Airflow.

  • Airflow

    Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
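
The ingest, store, and post flow described in the nodejs-bigquery excerpt above can be sketched in a few lines. This is only an illustration under stated assumptions: `sqlite3` stands in for a warehouse such as BigQuery or Snowflake, the `post` callback stands in for the Twitter bot, and the sample rows, table name, and function names are hypothetical rather than taken from the original post. In practice each function would become one task in an orchestrator such as Airflow.

```python
import sqlite3
from typing import Callable, List, Tuple

# Hypothetical record shape (topic, day, value); not from the original post.
Row = Tuple[str, str, float]


def ingest() -> List[Row]:
    """Stand-in for the ingestion step; a real pipeline would call an API here."""
    return [
        ("topic_a", "2022-01-01", 10.0),
        ("topic_a", "2022-01-02", 14.0),
        ("topic_b", "2022-01-01", 5.0),
        ("topic_b", "2022-01-02", 4.0),
    ]


def store(conn: sqlite3.Connection, rows: List[Row]) -> None:
    """Load rows into a table; sqlite3 is an assumed stand-in for the warehouse."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS observations (topic TEXT, day TEXT, value REAL)"
    )
    conn.executemany("INSERT INTO observations VALUES (?, ?, ?)", rows)
    conn.commit()


def biggest_swing(conn: sqlite3.Connection) -> Tuple[str, float]:
    """Return the topic whose values span the widest range (max minus min)."""
    topic, swing = conn.execute(
        "SELECT topic, MAX(value) - MIN(value) AS swing "
        "FROM observations GROUP BY topic ORDER BY swing DESC LIMIT 1"
    ).fetchone()
    return topic, swing


def run_pipeline(post: Callable[[str], None]) -> str:
    """Run ingest -> store -> query -> post; `post` stands in for the bot."""
    conn = sqlite3.connect(":memory:")
    try:
        store(conn, ingest())
        topic, swing = biggest_swing(conn)
    finally:
        conn.close()
    message = f"Biggest swing: {topic} ({swing:+.1f})"
    post(message)
    return message


if __name__ == "__main__":
    run_pipeline(print)
```

Passing the posting step in as a callback keeps the pipeline testable without network access; swapping `print` for a real client call is the only change the last step would need.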


NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.
