Your default tool for ETL

This page summarizes the projects mentioned and recommended in the original post on /r/dataengineering.

  • dbt-spark

    dbt-spark contains all of the code enabling dbt to work with Apache Spark and Databricks

  • T: SQL (views and scheduled queries in BigQuery; planning to go hard with dbt as soon as I can find some breathing room)

  • airbyte

    The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

  • E & L: Airbyte and bespoke Google Cloud Functions (Python)

  • cargo-crates

    An easy way to build data extractors in Docker.

  • I went a little crazy and built my own set of data extractors that I can deploy with CDK to ECS.

  • damons-data-lake

    All the code related to building my own data lake

  • I went a little crazy and built my own set of data extractors that I can deploy with CDK to ECS.

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.

Related posts