How To Start Your Next Data Engineering Project

This page summarizes the projects mentioned and recommended in the original post on dev.to.

  • Apache Spark

    Apache Spark - A unified analytics engine for large-scale data processing

    Apache Spark

  • Druid

    Apache Druid: a high performance real-time analytics database.

    Apache Druid

  • dagster

    An orchestration platform for the development, production, and observation of data assets.

    These are not the only choices, by any stretch of the imagination. Other popular options for orchestration are Dagster and Prefect. We actually recommend starting with Airflow and then looking at the others as you get more familiar with the processes.

  • d3

    Bring data to life with SVG, Canvas and HTML.

    D3.js

  • nodejs-bigquery

    Node.js client for Google Cloud BigQuery: A fast, economical and fully-managed enterprise data warehouse for large-scale data analytics.

    If you wanted to upgrade that idea, track down articles relating to that swing for discussion and post those. There is definite value in that data, and it is a pretty simple thing to do. You are just using Cloud Composer to ingest the data, storing it in a data warehouse like BigQuery or Snowflake, and creating a Twitter bot that posts the outputs, scheduled with something like Airflow.

  • Airflow

    Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

    Airflow
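
The ingest, store, and post pipeline described under nodejs-bigquery above can be sketched in plain Python. This is only an illustrative outline under loud assumptions: the sample rows, the `metrics` table, and the function names are all hypothetical, an in-memory SQLite database stands in for a warehouse like BigQuery or Snowflake, and the final step returns the message text instead of calling any Twitter API. In an orchestrator such as Airflow, each function would typically become its own task.

```python
import sqlite3

# Hypothetical sample records standing in for whatever the ingest step pulls.
SAMPLE_ROWS = [
    ("2024-01-01", "topic_a", 12.5),
    ("2024-01-02", "topic_a", 14.0),
    ("2024-01-02", "topic_b", 7.25),
]


def ingest():
    """Stand-in for ingestion; a real version would call an external API."""
    return list(SAMPLE_ROWS)


def store(conn, rows):
    """Load rows into a table (SQLite here, assumed in place of a warehouse)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS metrics (day TEXT, topic TEXT, value REAL)"
    )
    conn.executemany("INSERT INTO metrics VALUES (?, ?, ?)", rows)
    conn.commit()


def build_post(conn):
    """Query the stored data and format the text a bot would post."""
    topic, total = conn.execute(
        "SELECT topic, SUM(value) AS total FROM metrics "
        "GROUP BY topic ORDER BY total DESC LIMIT 1"
    ).fetchone()
    return f"Top topic: {topic} (total {total:.2f})"


def run_pipeline():
    """Run the three steps in order, as a scheduler would."""
    conn = sqlite3.connect(":memory:")
    try:
        store(conn, ingest())
        return build_post(conn)
    finally:
        conn.close()


if __name__ == "__main__":
    print(run_pipeline())
```

Keeping each step as a separate function with explicit inputs and outputs is what makes it straightforward to move the same logic into scheduled tasks later.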
