Python data-pipelines

Open-source Python projects categorized as data-pipelines

Top 7 Python data-pipeline Projects

  • dagster

    An orchestration platform for the development, production, and observation of data assets.

    Project mention: Experience with | 2023-07-25
  • Mage

    🧙 The modern replacement for Airflow. Mage is an open-source data pipeline tool for transforming and integrating data.

    Project mention: A mage on the Hero’s Journey: a fantasy epic on how a startup rose from the ashes | 2023-06-12

    In the coming years, Mage will create a cooperative experience so that developers can build data pipelines with their team and level up together. After that journey, Mage will go on an epic quest to create the 1st open world community experience in the data universe.

  • meltano

    Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.

    Project mention: meltano VS cloudquery - a user suggested alternative | 2023-06-02
  • versatile-data-kit

    One framework to develop, deploy and operate data workflows with Python and SQL.

    Project mention: Looking for a data blogger | /r/opensource | 2023-05-19

    Here's the project:

  • dbt-data-reliability

    Data anomalies monitoring as dbt tests and dbt artifacts uploader.

  • patterns-devkit

    Data pipelines from re-usable components

  • SmartPipeline

    A framework for rapid development of robust data pipelines following a simple design pattern

    Project mention: Show HN: SmartPipeline, robust and light data pipelines in Python | 2023-05-03

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2023-07-25.

What are some of the best open-source data-pipeline projects in Python? This list will help you:

Rank  Project               Stars
1     dagster               8,500
2     Mage                  5,516
3     meltano               1,185
4     versatile-data-kit      358
5     dbt-data-reliability    243
6     patterns-devkit         106
7     SmartPipeline            13