Projects mentioned in this thread:
- dbt-core: dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
- luigi: a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization, etc., and comes with built-in Hadoop support.
- magniv-core: a Python-decorator-based job orchestration platform that avoids responsibility handoffs by abstracting infra and DevOps.
- toil: a scalable, efficient, cross-platform (Linux/macOS), easy-to-use workflow engine in pure Python.
Shameless plug: I am building such a system, where the modules are code (TypeScript/Deno or Python) but the orchestration is no-code (flows). It is fully OSS: https://github.com/windmill-labs/windmill
We kept hearing this from our users. We've just released a k8s-operator-based deployment of Orchest that should give you a good experience running an orchestration tool on k8s without much trouble.
https://github.com/orchest/orchest
Is anybody out there doing anything interesting with Airflow monitoring?
At my startup, Cronitor, we have an Airflow SDK* that makes it pretty easy to provision monitoring for each DAG, but essentially we are only monitoring that a DAG started on time and the total time it took.
* https://github.com/cronitorio/cronitor-airflow
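For anyone curious what richer, per-task telemetry might look like, here is a minimal sketch in plain Python. This is not the Cronitor SDK's actual API; the decorator and event names are hypothetical, and a real integration would ship these events to a monitoring service instead of an in-memory list.

```python
import time
from functools import wraps

# Hypothetical in-memory event sink; a real monitor would send these
# events to an external service instead.
EVENTS = []

def monitored(task_name):
    """Record a start event plus duration and outcome for each task run."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            started = time.monotonic()
            EVENTS.append({"task": task_name, "state": "run"})
            try:
                result = fn(*args, **kwargs)
                state = "complete"
                return result
            except Exception:
                state = "fail"
                raise
            finally:
                EVENTS.append({
                    "task": task_name,
                    "state": state,
                    "duration_s": time.monotonic() - started,
                })
        return wrapper
    return decorator

@monitored("extract")
def extract():
    return [1, 2, 3]
```

Wrapping each task (rather than only the DAG entry point) is what lets you see which step started late or failed, not just that the run as a whole did.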
https://github.com/checkr/states-language-cadence lets you define workflows in Amazon States Language on top of Cadence.
When it comes to scale and DS work, I'd use the open-source Ploomber (https://github.com/ploomber/ploomber). It allows an easy transition between development and production, incrementally building the DAG so you avoid expensive compute time and costs. It's easier to maintain and integrates seamlessly with Airflow, generating the DAGs for you.
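The incremental-build idea mentioned above boils down to a timestamp check: rerun a task only if its product is missing or older than its source. A toy sketch of that check in plain Python (not Ploomber's actual API; the function names here are made up):

```python
import os

def is_outdated(source: str, product: str) -> bool:
    """A task must rerun if its product is missing or older than its source."""
    if not os.path.exists(product):
        return True
    return os.path.getmtime(source) > os.path.getmtime(product)

def maybe_run(task, source: str, product: str) -> bool:
    """Run the task only when its product is stale; return True if it ran."""
    if is_outdated(source, product):
        task(source, product)
        return True
    return False
```

Applied across a whole DAG, this is what saves compute: downstream tasks are skipped whenever their inputs haven't changed since the last build.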
dbt has just opened a serious conversation about supporting Python models. I'm sure they'd value your viewpoint! https://github.com/dbt-labs/dbt-core/discussions/5261
What are you trying to do? A distributed scheduler with a single instance? No database? Are you sure you don't just mean "a scheduler" à la Luigi? https://github.com/spotify/luigi
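To make the distinction concrete: the core of "a scheduler" in Luigi's sense is just dependency resolution plus skip-if-already-done, which needs no database or distributed coordination. A toy sketch using only the standard library (this is not Luigi's actual API):

```python
from graphlib import TopologicalSorter

def run_pipeline(deps, tasks, done=None):
    """Run tasks in dependency order, skipping completed ones.

    deps  maps task name -> set of prerequisite task names.
    tasks maps task name -> a zero-argument callable.
    done  is the set of already-completed task names (idempotent reruns).
    Returns the list of task names that actually ran.
    """
    done = set() if done is None else done
    ran = []
    for name in TopologicalSorter(deps).static_order():
        if name in done:
            continue  # already completed, skip
        tasks[name]()
        done.add(name)
        ran.append(name)
    return ran
```

Everything beyond this (retries, a central server, workers, a UI) is what separates a scheduler from a full distributed orchestrator.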
We at magniv.io are building an alternative.
Our core is open source: https://github.com/MagnivOrg/magniv-core
We can set you up with our hosted version if you'd like to poke around!
You're probably thinking of Temporal (https://temporal.io/), which is a fork of the Cadence project originally developed at Uber.
I feel you. That's why we wrote a little library on top of SFN (AWS Step Functions), so that we can program SFN in Clojure instead of YAML: https://github.com/Motiva-AI/stepwise. Application code sits with the SFN definition, and SFN Tasks are automatically integrated as polling Activities from Clojure code.
Thoughtworks made a case for this distinction in https://martinfowler.com/articles/cant-buy-integration.html#...
So, being completely transparent: we're the creators of Direktiv (https://github.com/direktiv/direktiv). We're genuinely curious to have users who have previously used Airflow and other DAG-based tools (Argo Workflows is mentioned in here) try Direktiv and give us more feedback.
- Direktiv runs containers as part of workflows, pulled from any compliant container registry, passing JSON-structured data between workflow states.