Top 23 Python MLOps Projects
-
Airflow
Project mention: Building in Public: Leveraging Tublian's AI Copilot for My Open Source Contributions | dev.to | 2024-02-12
Contributing to Apache Airflow's open-source project immersed me in collaborative coding. Experienced maintainers rigorously reviewed my contributions, providing constructive feedback. This ongoing dialogue refined the codebase and honed my understanding of best practices.
-
nni
An open-source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression, and hyperparameter tuning.
-
great_expectations
Project mention: Data Quality at Scale with Great Expectations, Spark, and Airflow on EMR | dev.to | 2023-04-24
Great Expectations (GE) is an open-source data validation tool that helps ensure data quality.
-
Kedro
Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
Interesting, thanks for sharing. I'll definitely take a look, although at this point I am so comfortable with Snakemake that it is hard to imagine what would convince me to move to another tool. But I like the idea of composable pipelines: I am building a tool (too early to share) that would allow Snakemake pipelines to be layered on top of each other using semi-automatic data annotations, similar to how it is done in Kedro (https://github.com/kedro-org/kedro).
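The idea of modular, composable pipelines mentioned in the Kedro entry can be illustrated with a small standard-library sketch. This is not Kedro's API: `Node`, `run_pipeline`, and the dataset names are hypothetical, chosen only to show steps that declare named inputs and outputs so that separately written pipelines can be concatenated.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Node:
    """One pipeline step: a function plus the names of its inputs and output."""
    func: Callable[..., Any]
    inputs: List[str]
    output: str


def run_pipeline(nodes: List[Node], catalog: Dict[str, Any]) -> Dict[str, Any]:
    """Run nodes in dependency order, reading and writing named datasets."""
    data = dict(catalog)  # copy so the caller's catalog is not mutated
    pending = list(nodes)
    while pending:
        # A node is ready once every dataset it reads is available.
        ready = [n for n in pending if all(name in data for name in n.inputs)]
        if not ready:
            missing = sorted(
                {name for n in pending for name in n.inputs if name not in data}
            )
            raise ValueError(f"unresolvable inputs: {missing}")
        for node in ready:
            data[node.output] = node.func(*(data[name] for name in node.inputs))
            pending.remove(node)
    return data


def drop_missing(rows):
    return [r for r in rows if r is not None]


def mean(rows):
    return sum(rows) / len(rows)


# Two independently defined pipelines (hypothetical example data).
cleaning = [Node(drop_missing, ["raw"], "clean")]
reporting = [Node(mean, ["clean"], "average")]

# Pipelines compose by list concatenation; the order in the list does not matter
# because execution order is derived from the declared inputs and outputs.
result = run_pipeline(reporting + cleaning, {"raw": [1.0, None, 3.0]})
print(result["average"])
```

Because each step only names the datasets it reads and writes, the two lists above can be developed and tested separately and then joined, which is the property the comment describes as composability.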
-
wandb
🔥 A tool for visualizing and tracking your machine learning experiments. This repo contains the CLI and Python API.
Project mention: A list of SaaS, PaaS and IaaS offerings that have free tiers of interest to devops and infradev | dev.to | 2024-02-05
Weights & Biases: The developer-first MLOps platform. Build better models faster with experiment tracking, dataset versioning, and model management. Free tier for personal projects only, with 100 GB of storage included.
-
deeplake
Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai
Project mention: Qdrant, the Vector Search Database, raised $28M in a Series A round | news.ycombinator.com | 2024-01-23
I think Activeloop (YC) is too: https://github.com/activeloopai/deeplake/
-
Taipy
I’ve been working in tech for more than five years. I started as a Data Scientist, and now I’m exploring and loving the DevRel 🥑 role for Taipy. Needless to say, evolving in the tech scene has been a ride full of ups, downs, and everything in between.
-
clearml
ClearML - Auto-Magical CI/CD to streamline your ML workflow. Experiment Manager, MLOps and Data-Management
-
courses
This repository is a curated collection of links to various courses and resources about Artificial Intelligence (AI) (by SkalskiP)
Project mention: If you are looking for free courses about AI, LLMs, CV, or NLP, I created the repository with links to resources that I found super high quality and helpful. The link is in the comment. | /r/ChatGPT | 2023-07-02
I found it: https://github.com/SkalskiP/courses
-
superduperdb
🔮 SuperDuperDB: Bring AI to your database! Build, deploy and manage any AI application directly with your existing data infrastructure, without moving your data. Including streaming inference, scalable model training and vector search.
-
FedML
FEDML - The unified and scalable ML library for large-scale distributed training, model serving, and federated learning. FEDML Launch, a cross-cloud scheduler, further enables running any AI jobs on any GPU cloud or on-premise cluster. Built on this library, FEDML Nexus AI (https://fedml.ai) is the dedicated cloud service for generative AI
Project mention: [Experiment] The future of AI is open-source, and here is the plan | /r/samkoesnadi | 2023-06-05
FedML https://github.com/FedML-AI/FedML might already provide a lot of tools to do the job
-
Project mention: What are some open-source ML pipeline managers that are easy to use? | /r/mlops | 2023-05-03
-
lightning-hydra-template
PyTorch Lightning + Hydra. A very user-friendly template for ML experimentation. ⚡🔥⚡
Project mention: User-friendly PyTorch Lightning and Hydra template for ML experimentation | news.ycombinator.com | 2024-02-05
-
awesome-mlops
Lots of good projects on https://github.com/kelvins/awesome-mlops too
Python MLOps related posts
- VLLM Sacrifices Accuracy for Speed
- Detect, Defend, Prevail: Payments Fraud Detection using ML & Deepchecks
- Easy, fast, and cheap LLM serving for everyone
- Introduction to NannyML: Model Evaluation without labels
- vllm
- Mixtral Expert Parallelism
-
modeldb VS cascade - a user suggested alternative
2 projects | 12 Dec 2023
-
A note from our sponsor - InfluxDB
www.influxdata.com | 18 Mar 2024
Index
What are some of the best open-source MLOps projects in Python? This list will help you:
Rank | Project | Stars |
---|---|---|
1 | Airflow | 33,864 |
2 | jina | 19,773 |
3 | vllm | 16,141 |
4 | nni | 13,646 |
5 | dagster | 9,866 |
6 | great_expectations | 9,342 |
7 | Kedro | 9,249 |
8 | wandb | 7,999 |
9 | deeplake | 7,584 |
10 | metaflow | 7,450 |
11 | Taipy | 7,373 |
12 | BentoML | 6,389 |
13 | feast | 5,184 |
14 | clearml | 5,123 |
15 | aim | 4,672 |
16 | courses | 4,377 |
17 | superduperdb | 4,209 |
18 | FedML | 4,004 |
19 | zenml | 3,574 |
20 | lightning-hydra-template | 3,545 |
21 | polyaxon | 3,461 |
22 | pipelines | 3,406 |
23 | awesome-mlops | 3,370 |