flyte
Celery-Kubernetes-Operator
Metric | flyte | Celery-Kubernetes-Operator
---|---|---
Mentions | 31 | 1
Stars | 4,645 | 77
Growth | 5.7% | -
Activity | 9.8 | 3.6
Last commit | 7 days ago | 4 months ago
Language | Go | Python
License | Apache License 2.0 | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
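As a rough illustration of the Growth metric, month-over-month growth is commonly computed as the percentage change between two monthly star counts. The sketch below assumes that definition; the page does not state its exact formula, and the function name and sample star counts are hypothetical.

```python
from typing import Optional


def month_over_month_growth(previous_stars: int, current_stars: int) -> Optional[float]:
    """Return the percentage change in stars from one month to the next.

    Returns None when there is no positive baseline to compare against.
    """
    if previous_stars <= 0:
        return None
    # Multiply before dividing to keep the arithmetic exact for round inputs.
    return (current_stars - previous_stars) * 100 / previous_stars


# Hypothetical star counts for two consecutive months.
growth = month_over_month_growth(4_000, 4_200)
print(f"{growth:.1f}%")  # prints "5.0%"
```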
flyte
- First 15 Open Source Advent projects
9. Flyte by Union AI | Github | tutorial
- Orchestration: Thoughts on Dagster, Airflow and Prefect?
Anyone tried Flyte?
- Flyte(v1.5.0) - Self-hosted solution to build production-grade data and ML pipelines; now ships with streaming support, pod templates, partial tasks and more 🚀 (3.2k stars on GitHub)
Flyte is an open source orchestration tool for managing the workflow of machine learning and AI projects. It runs on top of Kubernetes.
GitHub: https://github.com/flyteorg/flyte
- Kubernetes for Data Science with Kubeflow
- Dabbling with Dagster vs. Airflow
- Airflow's Problem
Some of these were the core problems that we wanted to address as part of https://flyte.org. We started with a team-first, multi-tenant approach at the core. For example, each team can have separate IAM roles, secrets are restricted to teams, tasks and workflows are shareable across teams without making libraries, and it is possible to trigger workflows across teams.
- Introducing Flyte (v1.1.0): Orchestrate Your Machine Learning and Data Pipelines with Ease (2.5K Stars on GitHub, Kubernetes-Native)
GitHub: https://github.com/flyteorg/flyte
Website: https://flyte.org/
Celery-Kubernetes-Operator
- Help: from Docker-Compose to Production (EC2, ECS, EKR)
The takeaway from that course is: don't deploy anything stateful on Kubernetes in production, period. Even disregarding that, don't deploy anything stateful that doesn't come in the form of an operator. For Celery, https://github.com/celery/Celery-Kubernetes-Operator is a WIP, so obviously not suitable for anything.
What are some alternatives?
metaflow - :rocket: Build and manage real-life ML, AI, and data science projects with ease!
argo - Workflow Engine for Kubernetes
temporal - Temporal service
kubeflow - Machine Learning Toolkit for Kubernetes
flower - Real-time monitor and web admin for Celery distributed task queue
Kedro - Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
rq - Simple job queues for Python
hera - Hera is an Argo Python SDK. Hera aims to make construction and submission of various Argo Project resources easy and accessible to everyone! Hera abstracts away low-level setup details while still maintaining a consistent vocabulary with Argo. ⭐️ Remember to star!
pachyderm - Data-Centric Pipelines and Data Versioning
polyaxon - MLOps Tools For Managing & Orchestrating The Machine Learning LifeCycle
kestra - Infinitely scalable, event-driven, language-agnostic orchestration and scheduling platform to manage millions of workflows declaratively in code.
whylogs - An open-source data logging library for machine learning models and data pipelines. 📚 Provides visibility into data quality & model performance over time. 🛡️ Supports privacy-preserving data collection, ensuring safety & robustness. 📈