SaaSHub helps you find the best software and product alternatives Learn more →
Airflow Alternatives
Similar projects and alternatives to Airflow
-
dagster
An orchestration platform for the development, production, and observation of data assets.
-
Apache Spark
Apache Spark - A unified analytics engine for large-scale data processing
-
WorkOS
The modern identity platform for B2B SaaS. The APIs are flexible and easy-to-use, supporting authentication, user identity, and complex enterprise features like SSO and SCIM provisioning.
-
luigi
Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization etc. It also comes with Hadoop support built in.
-
n8n
Free and source-available fair-code licensed workflow automation tool. Easily automate tasks across different services.
-
Kedro
Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
-
nodejs-bigquery
Node.js client for Google Cloud BigQuery: A fast, economical and fully-managed enterprise data warehouse for large-scale data analytics.
-
airbyte
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
-
InfluxDB
Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.
-
Apache Camel
Apache Camel is an open source integration framework that empowers you to quickly and easily integrate various systems consuming or producing data.
-
superset
Apache Superset is a Data Visualization and Data Exploration Platform
-
terraform
Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.
-
dbt-core
dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
-
-
-
-
Pandas
Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
-
streamify
A data engineering project with Kafka, Spark Streaming, dbt, Docker, Airflow, Terraform, GCP and much more!
-
-
d3
Bring data to life with SVG, Canvas and HTML. :bar_chart::chart_with_upwards_trend::tada:
-
-
PostgreSQL
Mirror of the official PostgreSQL GIT repository. Note that this is just a *mirror* - we don't work with pull requests on github. To contribute, please see https://wiki.postgresql.org/wiki/Submitting_a_Patch
-
SaaSHub
SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives
Airflow reviews and mentions
-
Airflow VS quix-streams - a user suggested alternative
2 projects | 7 Dec 2023
-
Simplifying Data Transformation in Redshift: An Approach with DBT and Airflow
Airflow is the most widely used and well-known tool for orchestrating data workflows. It allows for efficient pipeline construction, scheduling, and monitoring.
-
Ask HN: What is the correct way to deal with pipelines?
I agree there are many options in this space. Two others to consider:
- https://github.com/spotify/luigi
There are also many Kubernetes based options out there. For the specific use case you specified, you might even consider a plain old Makefile and incrond if you expect these all to run on a single host and be triggered by a new file showing up in a directory…
- Cómo construir tu propia data platform. From zero to hero.
-
Is it impossible to contribute to open source as a data engineer?
You can try and contribute some new connectors/operators for workflow managers like Airflow or Airbyte
-
Exploring MLOps Tools and Frameworks: Enhancing Machine Learning Operations
Apache Airflow:
-
Python task scheduler with a web UI
Looks interesting as a light-weight alternative to https://www.prefect.io/ (which itself is a lighter-weight / more modern alternative to https://airflow.apache.org/ ).
-
Working with Managed Workflows for Apache Airflow (MWAA) and Amazon Redshift
You can actually setup and delete new Redshift clusters using Apache Airflow. We can see in the example_dags of a DAG that does a complete setup and delete of a Redshift cluster. There are a few things to think about however.
-
.NET Modern Task Scheduler
A few years ago, I opened a GitHub issue with Microsoft telling them that I think the .NET ecosystem needs its own equivalent of Apache Airflow or Prefect. Fast forward 'til now, and I still don't think we have anything close to these frameworks.
-
How do you decide when to keep a project in a single python file vs break it up into multiple files?
Check out taskinstance.py in the Airflow project, it's a well targeted file, it has only one main class TaskInstance and a few small supporting classes and functions. It is ~3000 lines long: https://github.com/apache/airflow/blob/main/airflow/models/taskinstance.py
-
A note from our sponsor - SaaSHub
www.saashub.com | 28 Mar 2024
Stats
apache/airflow is an open source project licensed under Apache License 2.0 which is an OSI approved license.
The primary programming language of Airflow is Python.