Airflow Alternatives
Similar projects and alternatives to Airflow
- terraform: Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.
- PostgreSQL: Mirror of the official PostgreSQL Git repository. Note that this is just a mirror; pull requests are not accepted on GitHub. To contribute, see https://wiki.postgresql.org/wiki/Submitting_a_Patch
- n8n: Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations.
- airbyte: The leading data integration platform for ETL/ELT data pipelines, moving data from APIs, databases, and files into data warehouses, data lakes, and data lakehouses. Available both self-hosted and cloud-hosted.
- InfluxDB: A high-performance time series database. Collect, organize, and act on massive volumes of high-resolution data to power real-time intelligent systems.
- dbt-core: dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
- nodejs-bigquery: Node.js client for Google Cloud BigQuery, a fast, economical, and fully managed enterprise data warehouse for large-scale data analytics.
- Apache Camel: An open source integration framework that empowers you to quickly and easily integrate various systems consuming or producing data.
- luigi: A Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization, etc. It also comes with built-in Hadoop support.
Airflow discussion
Airflow reviews and mentions
- Airflow AI SDK to build simple LLM workflows
Hi HN,
We've built an SDK for building DAGs / data pipelines with LLMs in Apache Airflow [1], using Pydantic AI [2] under the hood. I've seen success across the board with Airflow users building simple LLM workflows before moving on to "AI agents". In my experience, the noise around building agents means that people forget there are other ways to get more immediate value out of LLMs.
Coupling Airflow for orchestration with Pydantic AI for LLM interactions has turned out to be a very pragmatic approach to building these workflows (and agents). Neither tool "gets in the way" of what you're trying to do. Airflow has been around for 10+ years and has a very well-built orchestration engine with everything you need to write production-grade data pipelines, and Pydantic AI has been a refreshing take on working with LLMs.
Would love some feedback from this community!
[1] https://github.com/apache/airflow
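As a rough illustration of the pattern the post describes, here is a minimal sketch of an Airflow DAG that calls an LLM through Pydantic AI inside a task. The model name, prompt, and task names are illustrative assumptions, not the SDK's actual API, which wraps this more ergonomically:

    from datetime import datetime

    from airflow.decorators import dag, task
    from pydantic_ai import Agent

    @dag(schedule="@daily", start_date=datetime(2025, 1, 1), catchup=False)
    def summarize_feedback():
        @task
        def extract() -> list[str]:
            # Placeholder: pull raw records from your source system.
            return ["great product", "checkout keeps timing out on mobile"]

        @task
        def summarize(records: list[str]) -> str:
            # One LLM call per run; Pydantic AI handles the model plumbing.
            agent = Agent("openai:gpt-4o", system_prompt="Summarize user feedback.")
            result = agent.run_sync("\n".join(records))
            return result.output  # .data on older pydantic-ai releases

        summarize(extract())

    summarize_feedback()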
- The DOJ Still Wants Google to Sell Off Chrome
- 10 Must-Know Open Source Platform Engineering Tools for AI/ML Workflows
Apache Airflow offers simplicity when it comes to scheduling, authoring, and monitoring ML workflows using Python. Its greatest advantage is compatibility with virtually any system or process you run, which reduces manual intervention and increases team productivity, in line with the principles of Platform Engineering tools.
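For flavor, a minimal sketch of what that Python authoring looks like; the task contents are placeholders of my own, not from the article:

    from datetime import datetime, timedelta

    from airflow import DAG
    from airflow.operators.bash import BashOperator
    from airflow.operators.python import PythonOperator

    def train_model():
        print("training...")  # stand-in for your actual training code

    with DAG(
        dag_id="ml_training",
        schedule="@weekly",  # Airflow 2.4+ spelling; older versions use schedule_interval
        start_date=datetime(2025, 1, 1),
        catchup=False,
        default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
    ) as dag:
        # Airflow can drive whatever you already run: here a shell step...
        fetch = BashOperator(task_id="fetch_data", bash_command="echo fetching")
        # ...feeding an arbitrary Python callable, retried on failure.
        train = PythonOperator(task_id="train", python_callable=train_model)
        fetch >> train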
- AI Is Spamming Open Source Repos with Fake Issues
Examples: https://github.com/apache/airflow/issues?q=is%3Aissue%20stat...
Apart from the content (which indeed makes no sense), these can usually be recognized by subjective adjectives and polished language [1].
[1] https://news.ycombinator.com/item?id=42864854
- Data Orchestration Tool Analysis: Airflow, Dagster, Flyte
Data orchestration tools are key for managing data pipelines in modern workflows. Apache Airflow, Dagster, and Flyte are popular tools serving this need, but they serve different purposes and follow different philosophies. Choosing the right tool for your requirements is essential for scalability and efficiency. In this blog, I compare Apache Airflow, Dagster, and Flyte, exploring their evolution, features, and unique strengths, while sharing insights from my hands-on experience with these tools in a weather data pipeline project.
- AIOps, DevOps, MLOps, LLMOps – What’s the Difference?
Data pipelines: Apache Kafka and Airflow are often used for building data pipelines that can continuously feed data to models in production.
- Data Engineering with DLT and REST
This article demonstrates how to work with near real-time and historical data using the dlt package. Whether you need to scale data access across the enterprise or provide historical data for post-event analysis, you can use the same framework to provide customer data. In a future article, I'll demonstrate how to use dlt with a workflow orchestrator such as Apache Airflow or Dagster.
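A minimal sketch of what that looks like with dlt, assuming a hypothetical REST endpoint and field names; the incremental cursor is what lets the same code serve both near real-time and historical loads:

    import dlt
    import requests

    @dlt.resource(write_disposition="append")
    def customer_events(
        updated_at=dlt.sources.incremental("updated_at", initial_value="1970-01-01"),
    ):
        # Only fetch records newer than the last loaded watermark.
        resp = requests.get(
            "https://api.example.com/events",  # hypothetical endpoint
            params={"since": updated_at.last_value},
        )
        yield resp.json()

    pipeline = dlt.pipeline(
        pipeline_name="customer_events",
        destination="duckdb",  # swap for your warehouse destination
        dataset_name="raw",
    )
    print(pipeline.run(customer_events()))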
- Enabling Apache Airflow to copy large S3 objects
This approach means the API doesn't change, i.e., you can just replace the S3CopyObjectOperator instances with S3CopyOperator instances. Additionally, we only perform the extra work of the multipart upload when the simpler method is insufficient. The trade-off is that we're inefficient if almost every object is larger than 5 GB, because then we're doing a "useless" API call first. As usual, it depends. A similar approach has been discussed in this GitHub issue in the Airflow repository.
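A hedged sketch of that fallback logic in plain boto3 (the operator names above are the article's; the function name and error matching below are my assumptions):

    import boto3
    from botocore.exceptions import ClientError

    s3 = boto3.client("s3")

    def copy_s3_object(src_bucket, src_key, dst_bucket, dst_key):
        source = {"Bucket": src_bucket, "Key": src_key}
        try:
            # Single CopyObject call: cheap, but limited to 5 GB sources.
            s3.copy_object(CopySource=source, Bucket=dst_bucket, Key=dst_key)
        except ClientError as err:
            # S3 rejects over-limit sources; assuming an InvalidRequest code here.
            if err.response["Error"]["Code"] != "InvalidRequest":
                raise
            # boto3's managed copy performs a multipart copy under the hood.
            s3.copy(source, dst_bucket, dst_key)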
- Deploy Apache Airflow on AWS Elastic Kubernetes Service (EKS)
helm repo add apache-airflow https://airflow.apache.org
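The repo add is typically followed by an index update and an install of the official chart into a dedicated namespace (release and namespace names here follow the chart's documented defaults):
helm repo update
helm install airflow apache-airflow/airflow --namespace airflow --create-namespace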
- New Apache Airflow Operators for Google Generative AI
We only use KubernetesOperators, but this has many downsides, and it's very clearly an afterthought in the Airflow project. It creates confusion because users of Airflow expect features A, B, and C, and when using KubernetesOperators those aren't functional, because your business logic has to be separated out. A number of blog posts echo a similar critique [1]. Using KubernetesOperators creates a lot of wrong abstractions, impedes testability, and makes Airflow as a whole a pretty overkill system just to monitor external tasks. At that point, you might as well have kept your orchestration in client code to begin with; many other frameworks made this division between client and server correctly, which would also make it easier to support multiple languages.
According to their README: https://github.com/apache/airflow#approach-to-dependencies-o...
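To make the complaint concrete, here is roughly what the KubernetesPodOperator pattern looks like (image, registry, and names are illustrative). Airflow only sees the pod succeed or fail; the business logic lives in the container image, which is exactly the separation the commenter is criticizing:

    from datetime import datetime

    from airflow import DAG
    # Import path for recent provider versions; older releases expose it from
    # airflow.providers.cncf.kubernetes.operators.kubernetes_pod.
    from airflow.providers.cncf.kubernetes.operators.pod import KubernetesPodOperator

    with DAG(dag_id="external_etl", schedule=None, start_date=datetime(2025, 1, 1)):
        # All business logic is baked into the image; Airflow just watches the pod.
        transform = KubernetesPodOperator(
            task_id="transform",
            name="transform",
            image="registry.example.com/etl/transform:1.4.2",
            arguments=["--date", "{{ ds }}"],
        )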
Stats
apache/airflow is an open source project licensed under the Apache License 2.0, an OSI-approved license.
The primary programming language of Airflow is Python.