Airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows (by apache)

Airflow Alternatives

Similar projects and alternatives to Airflow

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a better Airflow alternative or higher similarity.


Reviews and mentions

Posts with mentions or reviews of Airflow. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2021-10-17.
  • Anything Comparable to power automate or flow for Linux?
    reddit.com/r/sysadmin | 2021-10-17
    I never used Power Automate, but it looks like a workflow orchestrator. So check out https://airflow.apache.org/
  • Airflow with different conda environments
    If Airflow is the way to go then try DockerOperators (https://github.com/apache/airflow/blob/main/airflow/providers/docker/example_dags/example_docker.py). It's not the easiest setup, but it will do what you need, from what I gather from your question.
  • Databricks jobs and Airflow on Kubernetes
    I have not used Databricks but it is something we are looking into integrating into our infrastructure in the future. Since Databricks is a service that does not run locally, I would use the Databricks Operators/Hooks that come with Airflow, rather than trying to build out anything of my own. https://github.com/apache/airflow/blob/main/airflow/providers/databricks/hooks/databricks.py
  • what do you think about airflow?
    I think one of the main design problems I have with Airflow is the fact that it tends to tightly couple processing/transform code with data movement code, which makes debugging tricky. The way I have solved this is by building a command line interface to all the processing code so I can debug the processing code outside of any Airflow infrastructure (which can be painful to get running locally if one does not use Airflow Breeze).
  • BigQuery vs Relational Databases
    reddit.com/r/bigquery | 2021-09-08
    However, my typical go-to is to utilize something like [DBT](https://www.getdbt.com/) or [Airflow](https://airflow.apache.org/) to orchestrate sets of related queries. There are a lot of powerful patterns you can adopt by using these kinds of orchestration services in conjunction with BigQuery.
  • Airflow, Spark, other tool ?
    I actually used this example from Airflow: https://github.com/apache/airflow/blob/main/airflow/example_dags/tutorial_taskflow_api_etl.py
  • Ask HN: What is a good Python project for a mid lv engineer to contribute to?
    news.ycombinator.com | 2021-08-26
    Come help out with Apache Airflow! It's a great project to get involved with because it has a ton of users and real world problems, but it's still early enough that there's low hanging fruit in terms of adding functionality.

    A helpful place to start is the provider packages, since Airflow has integrations with so many 3rd party providers, and if you have knowledge in any of them it can be a good jumping off point.

    https://github.com/apache/airflow/

  • Serverless data-engineering pipeline suggestions
    https://np.reddit.com/r/dataengineering/comments/p5w7xl/serverless_dataengineering_pipeline_suggestions/h99q86w/
  • Introducing Elyra pipelines with custom component support
    dev.to | 2021-08-10
    The Elyra open source project for JupyterLab aims to simplify common data science tasks. Its most popular feature is the Visual Pipeline Editor, which is used to create pipelines without the need for coding. You can run these pipelines in JupyterLab or on Kubeflow Pipelines or Apache Airflow.
  • Power Automate for ML workflows?
    Is the workflow on-prem or cloud? We've just done a project where we joined up a load of workflow tasks using Airflow. The Python bindings to turn scripts into jobs (DAGs) were really straightforward, just decorating the functions. Worked really nicely, and can be really low code - tasks can be scheduled or manually triggered.
  • Install Airflow 2 on a Raspberry Pi (using Python 3.x)
    dev.to | 2021-07-22
    Go to Airflow's GitHub repo and download the airflow-webserver.service and the airflow-scheduler.service files.
  • That one question again: an alternative to Jenkins?
    reddit.com/r/devops | 2021-07-05
    Circle CI has positioned themselves as a modern Jenkins replacement, but I should note I haven't seen them used over a long period of time which is the real measure of quality. Honestly most pipelines seem to be better suited to workflow frameworks like Apache Airflow than these CI/CD specific tools, they may be a little harder to learn up front but they seem to be more resilient to the type of customization that usually gets you in trouble with Jenkins down the line.
  • Occasional "No such file" errors even though file exists when trying to wget files from FTP?
    reddit.com/r/sysadmin | 2021-06-22
    Occasionally, we get "No such file" errors, even though the file does exist on the FTP server (when inspecting via FTP GUI), when trying to import. Note, this does not happen all the time nor for the same files each time. For context, we have an airflow (https://airflow.apache.org/) scheduled process (running on "local" executor mode) that imports multiple TSV files from an FTP server in a multithreaded manner in pools of 3 at a time.
  • New to data orchestration? Start here.
    dev.to | 2021-06-02
    First-generation data orchestration tools like Airflow are primarily focused on improving usability for data scientists with the introduction of Python support (vs previous tools that required queries to be written in JSON and YAML). This improved UI made it easier for data teams to manage their pipeline flows without getting as caught up in the process.
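
One of the posts above describes keeping processing code behind a command line interface so it can be debugged without any Airflow infrastructure. A minimal sketch of that pattern follows. It is only an illustration, not the poster's code: the names `transform` and `main`, the TSV input format, and the "id" column rule are all hypothetical.

```python
import argparse
import csv
import sys


def transform(rows):
    """Pure processing step with no Airflow imports.

    Hypothetical logic: normalise column names, trim values, and drop
    rows that have no "id" value.
    """
    cleaned = []
    for row in rows:
        normalised = {
            key.strip().lower(): (value or "").strip()
            for key, value in row.items()
            if isinstance(key, str)  # skip csv's overflow key for extra fields
        }
        if normalised.get("id"):
            cleaned.append(normalised)
    return cleaned


def main(argv=None):
    """Command-line entry point: read a TSV file, transform it, print TSV."""
    parser = argparse.ArgumentParser(
        description="Run the transform step without Airflow."
    )
    parser.add_argument("input", help="path to a tab-separated input file")
    args = parser.parse_args(argv)

    with open(args.input, newline="") as handle:
        rows = list(csv.DictReader(handle, delimiter="\t"))

    cleaned = transform(rows)
    if cleaned:
        writer = csv.DictWriter(
            sys.stdout, fieldnames=list(cleaned[0]), delimiter="\t"
        )
        writer.writeheader()
        writer.writerows(cleaned)
    return 0
```

With this split, an Airflow task would only need to call `transform`, while debugging can go through `main(["sample.tsv"])` in a Python shell or through a script that calls `main()` under the usual `if __name__ == "__main__":` guard. The file name `sample.tsv` is a placeholder.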

Stats

Basic Airflow repo stats
Mentions: 53
Stars: 23,377
Activity: 10.0
Last commit: 5 days ago

apache/airflow is an open source project licensed under Apache License 2.0 which is an OSI approved license.
