Top 23 Python MLOps Projects
-
Data pipelines: Apache Kafka and Airflow are often used for building data pipelines that can continuously feed data to models in production.
-
vLLM stands for virtual large language models. It is an open-source library for fast LLM inference and serving. As the name suggests, "virtual" borrows the concepts of virtual memory and paging from operating systems. vLLM applies them through PagedAttention to make fuller use of GPU resources and generate tokens faster. Traditional LLM serving stores large attention key and value tensors in contiguous GPU memory, which leads to inefficient memory usage.
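The paging idea can be sketched with a toy block allocator. This is a simplified illustration under stated assumptions, not vLLM's implementation: the class `PagedKVCache`, the `BLOCK_SIZE` value, and the sequence ids are all hypothetical, and no tensors are stored. It only shows how a per-sequence block table lets the cache grow one fixed-size block at a time instead of reserving one large contiguous region per request.

```python
BLOCK_SIZE = 4  # tokens per block (hypothetical value, kept small for illustration)


class PagedKVCache:
    """Toy allocator: maps each sequence's logical blocks to physical block ids."""

    def __init__(self, num_blocks):
        self.free_blocks = list(range(num_blocks))  # pool of unused physical blocks
        self.block_tables = {}  # seq_id -> list of physical block ids
        self.seq_lens = {}      # seq_id -> number of tokens stored so far

    def append_token(self, seq_id):
        """Reserve a slot for one more token; allocate a block only when needed."""
        length = self.seq_lens.get(seq_id, 0)
        table = self.block_tables.setdefault(seq_id, [])
        if length % BLOCK_SIZE == 0:  # no block yet, or the last block is full
            if not self.free_blocks:
                raise MemoryError("no free KV-cache blocks")
            table.append(self.free_blocks.pop())
        self.seq_lens[seq_id] = length + 1
        # Physical location of the new token: (block id, offset inside the block)
        return table[length // BLOCK_SIZE], length % BLOCK_SIZE

    def free(self, seq_id):
        """Return all blocks of a finished sequence to the free pool."""
        self.free_blocks.extend(self.block_tables.pop(seq_id, []))
        self.seq_lens.pop(seq_id, None)


cache = PagedKVCache(num_blocks=8)
for _ in range(5):
    cache.append_token("seq-a")  # 5 tokens need 2 blocks of 4 slots
cache.append_token("seq-b")      # a second sequence takes 1 more block
```

In this sketch at most one partially filled block per sequence is wasted, and blocks freed by a finished sequence can be reused immediately by any other sequence.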
-
-
Project mention: Build a Stock Dashboard in less than 40 lines of Python code!🤓 | dev.to | 2024-12-05
Star ⭐ Taipy repo
-
This article demonstrates how to work with near real-time and historical data using the dlt package. Whether you need to scale data access across the enterprise or provide historical data for post-event analysis, you can use the same framework to provide customer data. In a future article, I'll demonstrate how to use dlt with a workflow orchestrator such as Apache Airflow or Dagster.
-
-
OpenLLM
Run any open-source LLMs, such as Llama, Mistral, as OpenAI compatible API endpoint in the cloud.
OpenLLM is a platform that helps developers put open-source large language models (LLMs) to work. It is like a Swiss Army knife for LLMs: a set of tools that helps developers overcome the hurdles of deploying them.
-
-
Kedro
Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
Project mention: 20 Open Source Tools I Recommend to Build, Share, and Run AI Projects | dev.to | 2024-11-13
Kedro is an ML development framework that brings data science projects from pilot development to production by creating reproducible, maintainable, and modular data science code. To do this, Kedro provides a data catalog for data handling, support for pipeline building, and a standardized template for code maintainability and consistency. Its data catalog uses lightweight data connectors to manage and track datasets. This allows you to use the same pipeline to build multiple production-level codes across your system.
-
wandb
The AI developer platform. Use Weights & Biases to train and fine-tune models, and manage models from experimentation to production.
Project mention: [Python] How do we lazyload a Python module? - analyzing LazyLoader from MLflow | dev.to | 2024-10-05
One day I was hopping around a few popular ML libraries in Python, including MLflow. While glancing at its source code, one class attracted my interest, LazyLoader in __init__.py (well, this actually mirrors from the wandb project, but the original code has changed from what MLflow is using now, as you can see).
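For readers unfamiliar with the pattern, lazy loading can be sketched with the standard library alone. The class below is a minimal illustration written for this page, not the LazyLoader code from MLflow or wandb; the name `LazyModule` and the `_real` attribute are hypothetical.

```python
import importlib
import types


class LazyModule(types.ModuleType):
    """Stand-in module that imports the real module on first attribute access."""

    def __init__(self, name):
        super().__init__(name)
        self._real = None  # the real module, filled in on first use

    def _load(self):
        if self._real is None:
            self._real = importlib.import_module(self.__name__)
        return self._real

    def __getattr__(self, item):
        # Python calls this only when normal lookup fails, which is the case
        # for every attribute that lives on the real module.
        return getattr(self._load(), item)


json_lazy = LazyModule("json")      # nothing is imported by this line
encoded = json_lazy.dumps([1])      # first attribute access triggers the import
```

The standard library also provides `importlib.util.LazyLoader`, which defers execution of a module at the import-system level. The sketch above only shows the attribute-proxy idea.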
-
Project mention: Show HN: Flow – A Dynamic Task Engine for AI Agents Without DAG | news.ycombinator.com | 2024-12-02
Interesting! I feel like this is a cross between https://github.com/dagworks-inc/burr (switch state for context) and https://github.com/Netflix/metaflow because the output of the "task" declares its next hop...
-
deeplake
Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai
Finally, we stored these vectors in our chosen database: the activeloop DeepLake database. This database is open source, something near and dear to our own open-source hearts. We will cover some additional details in a further section, but it is specifically designed to handle vector data and perform efficient similarity searches, which is crucial for quick and accurate retrieval during the RAG process.
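To make "similarity search" concrete, here is a brute-force sketch in plain Python. It does not use the Deep Lake API; the function names, the document ids, and the three-dimensional vectors are hypothetical, and a real vector database would use indexing rather than scoring every stored vector.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, store, k=2):
    """Return the ids of the k stored vectors most similar to the query."""
    ranked = sorted(
        store.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [doc_id for doc_id, _ in ranked[:k]]


# Hypothetical embeddings; real ones typically have hundreds of dimensions.
store = {
    "install-guide": [0.9, 0.1, 0.0],
    "api-reference": [0.1, 0.9, 0.1],
    "changelog": [0.0, 0.2, 0.9],
}
results = top_k([1.0, 0.2, 0.0], store, k=2)
```

In a RAG pipeline, the ids returned here would be used to fetch the matching text chunks that are passed to the LLM as context.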
-
BentoML
The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!
Project mention: Recapping the AI, Machine Learning and Computer Meetup — August 15, 2024 | dev.to | 2024-08-15
As a data scientist/ML practitioner, how would you feel if you could independently iterate on your data science projects without ever worrying about operational overheads like deployment or containerization? Let’s find out by walking you through a sample project that helps you do so! We’ll combine Python, AWS, Metaflow and BentoML into a template/scaffolding project with sample code to train, serve, and deploy ML models…while making it easy to swap in other ML models.
-
clearml
ClearML - Auto-Magical CI/CD to streamline your AI workload. Experiment Management, Data Management, Pipeline, Orchestration, Scheduling & Serving in one MLOps/LLMOps solution
-
Project mention: RFC: The Feast Kubernetes Operator (The Open Source Feature Store) | news.ycombinator.com | 2024-09-24
Hey folks!
I'm a maintainer for Feast (https://github.com/feast-dev/feast) (the Open Source Feature Store) and the Feast community is working on creating a Kubernetes Operator for deploying Feast on Kubernetes and would love any feedback you have before we get started!
Here is the GitHub issue: https://github.com/feast-dev/feast/issues/4561, a design doc: https://docs.google.com/document/d/1vGKMizf3_14IyiF_W_Ik7CR0..., and a Slack channel: https://communityinviter.com/apps/feastopensource/feast-the-...!
Thanks a ton in advance for your interest/comments!
-
courses
This repository is a curated collection of links to various courses and resources about Artificial Intelligence (AI) (by SkalskiP)
-
-
superduper
Superduper: Build end-to-end AI applications and agent workflows on your existing data infrastructure and preferred tools - without migrating your data.
Project mention: Build fully portable AI applications on top of Snowflake with SuperDuperDB | dev.to | 2024-06-26
Customize how AI and databases work together. Scale your AI projects to handle more data and users. Move AI projects between different environments easily. Extend the system with new AI features and database functionality. Check it out: Blog: https://blog.superduperdb.com/version-02 Github: https://github.com/SuperDuperDB/superduperdb (leave us a star ⭐️🥳)
-
lightning-hydra-template
PyTorch Lightning + Hydra. A very user-friendly template for ML experimentation. ⚡🔥⚡
Project mention: User-friendly PyTorch Lightning and Hydra template for ML experimentation | news.ycombinator.com | 2024-02-05
-
FedML
FEDML - The unified and scalable ML library for large-scale distributed training, model serving, and federated learning. FEDML Launch, a cross-cloud scheduler, further enables running any AI jobs on any GPU cloud or on-premise cluster. Built on this library, TensorOpera AI (https://TensorOpera.ai) is your generative AI platform at scale.
-
-
When originally designing the chatbot, we opted to build it in Python, despite being a heavily JavaScript-oriented shop. This decision was driven by the availability of more mature analytic tools for objectively testing chatbot hallucination and accuracy in Python. So far, we've been evaluating answers qualitatively, but we plan to incorporate a tool like Giskard to bring a more quantitative approach to our evaluations. This step is crucial and one that, anecdotally, is often overlooked in many production chatbots.
Python MLOps discussion
Python MLOps related posts
-
Data Engineering with DLT and REST
-
Argilla: Build high quality datasets for your AI models
-
Creation of the ApostropheCMS Documentation Chatbot
-
Kedro – An open-source framework for data science code
-
Recapping the AI, Machine Learning and Computer Meetup — August 15, 2024
-
10 Open Source MLOps Projects You Didn’t Know About
-
Show HN: We made glhf.chat – run almost any open-source LLM, including 405B
-
A note from our sponsor - SaaSHub
www.saashub.com | 14 Jan 2025
Index
What are some of the best open-source MLOps projects in Python? This list will help you:
# | Project | Stars |
---|---|---|
1 | Airflow | 38,226 |
2 | vllm | 33,579 |
3 | serve | 21,237 |
4 | Taipy | 17,605 |
5 | dagster | 12,243 |
6 | ml-engineering | 12,268 |
7 | OpenLLM | 10,368 |
8 | great_expectations | 10,100 |
9 | Kedro | 10,099 |
10 | wandb | 9,366 |
11 | metaflow | 8,409 |
12 | deeplake | 8,294 |
13 | BentoML | 7,262 |
14 | clearml | 5,771 |
15 | feast | 5,715 |
16 | courses | 5,529 |
17 | aim | 5,294 |
18 | superduper | 4,907 |
19 | lightning-hydra-template | 4,382 |
20 | zenml | 4,302 |
21 | FedML | 4,225 |
22 | awesome-mlops | 4,223 |
23 | giskard | 4,208 |