Top 13 Python apache-airflow Projects
-
Project mention: 8 Tool Tech Stack to Build an Enterprise-Grade RAG System (Without the Headaches) | dev.to | 2025-08-26
8. Data Ingestion & Scraping (Firecrawl, Airflow, etc.)
-
-
-
astronomer-cosmos
Run your dbt Core or dbt Fusion projects as Apache Airflow DAGs and Task Groups with a few lines of code
-
couler
Unified interface for constructing and managing workflows on different workflow engines, such as Argo Workflows, Tekton Pipelines, and Apache Airflow.
-
ethereum-etl-airflow
Airflow DAGs for exporting, loading, and parsing Ethereum blockchain data. Related article: "How to get any Ethereum smart contract into BigQuery", https://towardsdatascience.com/how-to-get-any-ethereum-smart-contract-into-bigquery-in-8-mins-bab5db1fdeee
-
-
-
-
MCP-Airflow-API
🔍 Model Context Protocol (MCP) server for Apache Airflow API integration. Provides comprehensive tools for managing Airflow clusters including service operations, configuration management, status monitoring, and request tracking.
Project mention: Show HN: MCP-Server for Control Airflow Cluster | news.ycombinator.com | 2025-08-16
-
covid-19-data-engineering-pipeline
A Covid-19 data pipeline on AWS featuring PySpark/Glue, Docker, Great Expectations, Airflow, and Redshift, templated in CloudFormation and CDK, deployable via GitHub Actions.
-
e2e-structured-streaming
End-to-end data pipeline that ingests, processes, and stores data. It uses Apache Airflow to schedule scripts that fetch data from an API, send the data to Kafka, and process it with Spark before writing to Cassandra. The pipeline, built with Python and Apache ZooKeeper, is containerized with Docker for easy deployment and scalability.
-
-
twitter_data-lakehouse_minio_drill_superset
Building a Data Lakehouse for Analyzing Elon Musk Tweets using MinIO, Apache Airflow, Apache Drill and Apache Superset
Python apache-airflow related posts
-
Top ETL Tools for MongoDB in 2025: Which One Fits Your Use Case?
-
Apache Airflow
-
Building Effective AI Agents \ Anthropic
-
AI Is Spamming Open Source Repos with Fake Issues
-
Enabling Apache Airflow to copy large S3 objects
-
New Apache Airflow Operators for Google Generative AI
-
Data on Kubernetes: Part 3 - Managing Workflows with Job Schedulers and Batch-Oriented Workflow Orchestrators
-
Index
What are some of the best open-source apache-airflow projects in Python? This list will help you:
# | Project | Stars |
---|---|---|
1 | Airflow | 41,770 |
2 | elyra | 1,949 |
3 | airflow-maintenance-dags | 1,724 |
4 | astronomer-cosmos | 1,018 |
5 | couler | 940 |
6 | ethereum-etl-airflow | 422 |
7 | airflow-chart | 287 |
8 | pyjaws | 43 |
9 | MCP-Airflow-API | 38 |
10 | covid-19-data-engineering-pipeline | 23 |
11 | e2e-structured-streaming | 20 |
12 | F2-Data-Pipeline | 10 |
13 | twitter_data-lakehouse_minio_drill_superset | 3 |