Python ELT

Open-source Python projects categorized as ELT

Top 16 Python ELT Projects

  • Airflow

    Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

  • Project mention: Building in Public: Leveraging Tublian's AI Copilot for My Open Source Contributions | dev.to | 2024-02-12

    Contributing to Apache Airflow's open-source project immersed me in collaborative coding. Experienced maintainers rigorously reviewed my contributions, providing constructive feedback. This ongoing dialogue refined the codebase and honed my understanding of best practices.

  • airbyte

    The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

  • Project mention: Launch HN: Bracket (YC W22) – Two-Way Sync Between Salesforce and Postgres | news.ycombinator.com | 2023-12-12

    I'll also give a shout-out to Airbyte (https://airbyte.com/), with which I've had some limited success integrating Salesforce with a local database. The particular pull for Airbyte is that we can self-host the open source version, rather than pay Fivetran a significant sum to do this for us.

    It's an immature tool, so I don't yet know that I can claim we've spent _less_ than Fivetran on the additional engineering and ops time, but it feels like it has potential to do so once stabilized.

  • dbt-core

    dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.

  • Project mention: Dbt | news.ycombinator.com | 2024-02-18

  • Mage

    🧙 The modern replacement for Airflow. Mage is an open-source data pipeline tool for transforming and integrating data. https://github.com/mage-ai/mage-ai

  • Project mention: A mage on the Hero’s Journey: a fantasy epic on how a startup rose from the ashes | dev.to | 2023-06-12

    In the coming years, Mage will create a cooperative experience so that developers can build data pipelines with their team and level up together. After that journey, Mage will go on an epic quest to create the 1st open world community experience in the data universe.

  • dlt

    data load tool (dlt) is an open source Python library that makes data loading easy 🛠️

  • Project mention: Ask HN: Freelancer? Seeking freelancer? (December 2023) | news.ycombinator.com | 2023-12-03

    SEEKING FREELANCER | REMOTE | GERMANY

    dltHub is looking for freelance help in the following repos:

    - https://github.com/dlt-hub/dlt

  • meltano

    Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.

  • Project mention: meltano VS cloudquery - a user suggested alternative | libhunt.com/r/meltano | 2023-06-02

  • sqlmesh

    Efficient data transformation and modeling framework that is backwards compatible with dbt.

  • Project mention: Launch HN: Serra (YC S23) – Open-source, Python-based dbt alternative | news.ycombinator.com | 2023-08-14

    There is also sqlmesh (https://sqlmesh.com/). Pretty new as well. It introduces some interesting concepts. For smaller dbt projects it could be a drop-in replacement as it allows importing dbt projects.

  • dbt-metabase

    dbt + Metabase integration

  • versatile-data-kit

    One framework to develop, deploy and operate data workflows with Python and SQL.

  • Project mention: Looking for a data blogger | /r/opensource | 2023-05-19

    Here's the project: https://github.com/vmware/versatile-data-kit

  • astro-sdk

    Astro SDK allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow.

  • Project mention: Orchestration: Thoughts on Dagster, Airflow and Prefect? | /r/dataengineering | 2023-06-01

    Have you tried the Astro SDK? https://github.com/astronomer/astro-sdk

  • reddit-detective

    Play detective on Reddit: Discover political disinformation campaigns, secret influencers and more

  • dbt-coves

    CLI tool for dbt users to simplify creation of staging models (yml and sql) files

  • Project mention: Is there something wrong with me, I hate dbt, what am I missing ? | /r/dataengineering | 2023-05-15

    This just feels like you aren’t using the plentiful tools to make those “mind-numbingly slow” dev steps faster. For ex., using dbt-coves to generate the staging models with casting to types in a couple clicks. And pulling directly from Fivetran tables is just poor practice, with the additional steps needed to do it “right” being inconsequential at best.

  • sayn

    Data processing and modelling framework for automating tasks (incl. Python & SQL transformations).

  • Meltano Singer SDK

    Write 70% less code by using the SDK to build custom extractors and loaders that adhere to the Singer standard: https://sdk.meltano.com (by meltano)

  • dbd

    dbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL databases.

  • reddit_api_elt

  • Project mention: Reddit ELT Pipeline | /r/dataengineering | 2023-12-11

    Hi everyone, this is my first DE project: Baitur5/reddit_api_elt (github.com). It is basically a data pipeline that extracts Reddit data for a Google Data Studio report, focusing on a specific subreddit. Can you guys check it out and give some advice and tips on how to improve it, or on what I should add next?

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2024-02-18.

Index

What are some of the best open-source ELT projects in Python? This list will help you:

Rank  Project             Stars
1     Airflow             34,397
2     airbyte             13,821
3     dbt-core            8,842
4     Mage                6,953
5     dlt                 1,694
6     meltano             1,577
7     sqlmesh             1,231
8     dbt-metabase        422
9     versatile-data-kit  409
10    astro-sdk           315
11    reddit-detective    206
12    dbt-coves           204
13    sayn                117
14    Meltano Singer SDK  83
15    dbd                 55
16    reddit_api_elt      2