awesome-argo vs couler

| | awesome-argo | couler |
|---|---|---|
| Mentions | 6 | 1 |
| Stars | 1,792 | 889 |
| Growth | 2.1% | 1.1% |
| Activity | 7.3 | 5.2 |
| Last commit | 12 days ago | 14 days ago |
| Language | | Python |
| License | Apache License 2.0 | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
awesome-argo
awesome-argo: A curated list of awesome projects and resources related to Argo (a CNCF hosted project)
Great idea. Just added a 1-line summary as well as a description for each project. https://github.com/terrytangyuan/awesome-argo/commit/c306766a726b34cebfb70304bceda3cc71d24823
- terrytangyuan/awesome-argo: A curated list of awesome projects and resources related to Argo (a CNCF hosted project)
- awesome-argo: A curated list of projects and resources related to Argo
couler
(Not) to Write a Pipeline
author seems to be describing the kind of patterns you might make with https://argoproj.github.io/argo-workflows/ . or see for example https://github.com/couler-proj/couler , which is an sdk for describing tasks that may be submitted to different workflow engines on the backend.
it's a little confusing to me that the author seems to object to "pipelines" and then equate them with messaging-queues. for me at least, "pipeline" vs "workflow-engine" vs "scheduler" are all basically synonyms in this context. those things may or may not be implemented with a message-queue for persistence, but the persistence layer itself is usually below the level of abstraction that $current_problem is really concerned with. like the author says, eventually you have to track state/timestamps/logs, but you get that from the beginning if you start with a workflow engine.
i agree with the author that message-queues should not be a knee-jerk response to most problems, because the LoE for edge-cases/observability/monitoring is huge. (maybe reach for a queue only if you may actually overwhelm whatever the "scheduler" can handle.) but don't build the scheduler from scratch either.. use argo workflows, kubeflow, or a more opinionated framework like airflow, mlflow, databricks, aws lambda or step-functions. all/any of these should have config or an api that's robust enough to express rate-limit/retry stuff. almost any of these choices has better observability out-of-the-box than you can easily get from a queue. but most importantly.. they provide idioms for handling failure that data-science folks and junior devs can work with. the right way to structure code is just much clearer, and things like structuring messages/events, subclassing workers, and repeating/retrying tasks are just harder to mess up.
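to make the failure-handling point concrete: below is a minimal retry-with-backoff sketch in plain Python — roughly the logic that a workflow engine lets you express declaratively (e.g. Argo Workflows' `retryStrategy`) instead of hand-rolling it per task. the names and parameters here are illustrative, not from any specific engine's API.

```python
import time

def retry(fn, max_attempts=3, base_delay=0.01):
    """Call fn(), retrying with exponential backoff on failure.

    This hand-rolled loop is the kind of edge-case handling that a
    workflow engine expresses as a few lines of configuration.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                raise  # exhausted retries: surface the failure
            # back off: base_delay, 2*base_delay, 4*base_delay, ...
            time.sleep(base_delay * 2 ** (attempt - 1))

# a task that fails twice with a transient error, then succeeds
calls = {"n": 0}

def flaky_task():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "done"

result = retry(flaky_task)  # succeeds on the third attempt
```

once you add timeouts, per-error-type policies, and logging to this loop, you've rebuilt a slice of what the engines above already ship with.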
What are some alternatives?
argocd-lovely-plugin - A plugin to make Argo CD behave like we'd like.
soopervisor - ☁️ Export Ploomber pipelines to Kubernetes (Argo), Airflow, AWS Batch, SLURM, and Kubeflow.
hera - Hera is an Argo Python SDK. Hera aims to make construction and submission of various Argo Project resources easy and accessible to everyone! Hera abstracts away low-level setup details while still maintaining a consistent vocabulary with Argo. ⭐️ Remember to star!
community - Information about the Kubeflow community including proposals and governance information.
awesome-kubernetes-security - A curated list of awesome Kubernetes security resources
argo - Workflow Engine for Kubernetes
kubernetes-demo-gitops - This is the GitOps repo for project vjanz/kubernetes-demo-app
sig-release - Repo for SIG release
elyra - Elyra extends JupyterLab with an AI centric approach.
memorials - 🕯️💐CNCF Community Memorials
argo-workflows-aws-plugin - Argo Workflows Executor Plugin for AWS Services, e.g. SageMaker Pipelines, Glue, etc.