Python MLOps

Open-source Python projects categorized as MLOps

Top 23 Python MLOps Projects

  • Airflow

    Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

  • Project mention: AI Strategy Guide: How to Scale AI Across Your Business | dev.to | 2024-05-11

    Level 1 of MLOps is when you've put each lifecycle stage and its interfaces in an automated pipeline. The pipeline could be a Python or bash script, or it could be a directed acyclic graph run by some orchestration framework like Airflow, dagster or one of the cloud-provider offerings. AI- or data-specific platforms like MLflow, ClearML and dvc also feature pipeline capabilities.
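    As a rough illustration of the "pipeline as a directed acyclic graph" idea in the mention above, here is a minimal sketch using only the Python standard library (`graphlib`, Python 3.9+). The stage names, the toy data, and the `run_pipeline` helper are hypothetical and chosen for illustration; real orchestrators such as Airflow or dagster add scheduling, retries, and monitoring on top of the same dependency-ordering idea.

```python
from graphlib import TopologicalSorter


# Each stage reads from and writes to a shared context dict.
# All stage names and data below are hypothetical placeholders.
def ingest(ctx):
    ctx["raw"] = [1.0, 2.0, 3.0, 4.0]


def featurize(ctx):
    ctx["features"] = [x * 2 for x in ctx["raw"]]


def train(ctx):
    # Toy "model": the mean of the features.
    ctx["model"] = sum(ctx["features"]) / len(ctx["features"])


def evaluate(ctx):
    # Toy metric: distance of the "model" from a fixed target.
    ctx["error"] = abs(ctx["model"] - 5.0)


STAGES = {
    "ingest": ingest,
    "featurize": featurize,
    "train": train,
    "evaluate": evaluate,
}

# Maps each stage to the set of stages it depends on (its predecessors).
DEPENDENCIES = {
    "featurize": {"ingest"},
    "train": {"featurize"},
    "evaluate": {"train"},
}


def run_pipeline():
    """Run every stage once, in an order that respects the dependencies."""
    ctx = {}
    order = list(TopologicalSorter(DEPENDENCIES).static_order())
    for name in order:
        STAGES[name](ctx)
    return order, ctx


if __name__ == "__main__":
    order, ctx = run_pipeline()
    print("execution order:", order)
    print("model:", ctx["model"], "error:", ctx["error"])
```

    Declaring the dependencies as data rather than hard-coding the call order is what lets a framework decide what to run, re-run, or parallelize.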

  • jina

    ☁️ Build multimodal AI applications with cloud-native stack

  • Project mention: Jina.ai: Self-host Multimodal models | news.ycombinator.com | 2024-01-26
  • vllm

    A high-throughput and memory-efficient inference and serving engine for LLMs

  • Project mention: AI leaderboards are no longer useful. It's time to switch to Pareto curves | news.ycombinator.com | 2024-04-30

    I guess the root cause of my claim is that OpenAI won't tell us whether or not GPT-3.5 is an MoE model, and I assumed it wasn't. Since GPT-3.5 is clearly nondeterministic at temp=0, I believed the nondeterminism was due to FPU stuff, and this effect was amplified with GPT-4's MoE. But if GPT-3.5 is also MoE then that's just wrong.

    What makes this especially tricky is that small models are truly 100% deterministic at temp=0 because the relative likelihoods are too coarse for FPU issues to be a factor. I had thought 3.5 was big enough that some of its token probabilities were too fine-grained for the FPU. But that's probably wrong.

    On the other hand, it's not just GPT, there are currently floating-point difficulties in vllm which significantly affect the determinism of any model run on it: https://github.com/vllm-project/vllm/issues/966 Note that a suggested fix is upcasting to float32. So it's possible that GPT-3.5 is using an especially low-precision float and introducing nondeterminism by saving money on compute costs.

    Sadly I do not have the money[1] to actually run a test to falsify any of this. It seems like this would be a good little research project.

    [1] Or the time, or the motivation :) But this stuff is expensive.
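    The floating-point point in the comment above can be sketched without a GPU. The example below is an assumption-laden toy, not vllm's code: it emulates half-precision (float16) accumulation with the standard `struct` module and shows that the result depends on summation order, while accumulating in float32 (the "upcasting" idea) gives the same answer for both orders. The helper names and input values are made up for illustration.

```python
import struct


def to_f16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_f32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def accumulate(values, round_fn):
    """Sum values left to right, rounding the accumulator after every add."""
    acc = 0.0
    for v in values:
        acc = round_fn(acc + round_fn(v))
    return acc


# Same numbers, two different orders (as a parallel reduction might produce).
order_a = [4096.0, 1.0, -4096.0]
order_b = [4096.0, -4096.0, 1.0]

# In float16 the spacing between representable values near 4096 is 4,
# so 4096 + 1 rounds back to 4096 and the 1 is lost in order_a.
f16_a = accumulate(order_a, to_f16)  # 0.0
f16_b = accumulate(order_b, to_f16)  # 1.0

# In float32, 4097 is exactly representable, so both orders agree.
f32_a = accumulate(order_a, to_f32)  # 1.0
f32_b = accumulate(order_b, to_f32)  # 1.0

if __name__ == "__main__":
    print("float16 accumulation:", f16_a, f16_b)
    print("float32 accumulation:", f32_a, f32_b)
```

    If the order of additions varies between runs, low-precision accumulation can change the result, and small differences in logits can flip the selected token even at temp=0. Whether this is what happens in any particular hosted model is not something this sketch can show.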

  • nni

    An open-source AutoML toolkit to automate the machine learning lifecycle, including feature engineering, neural architecture search, model compression, and hyper-parameter tuning.

  • dagster

    An orchestration platform for the development, production, and observation of data assets.

  • Project mention: AI Strategy Guide: How to Scale AI Across Your Business | dev.to | 2024-05-11

    Level 1 of MLOps is when you've put each lifecycle stage and its interfaces in an automated pipeline. The pipeline could be a Python or bash script, or it could be a directed acyclic graph run by some orchestration framework like Airflow, dagster or one of the cloud-provider offerings. AI- or data-specific platforms like MLflow, ClearML and dvc also feature pipeline capabilities.

  • great_expectations

    Always know what to expect from your data.

  • Kedro

    Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.

  • Project mention: Nextflow: Data-Driven Computational Pipelines | news.ycombinator.com | 2023-08-10

    Interesting, thanks for sharing. I'll definitely take a look, although at this point I am so comfortable with Snakemake that it is a bit hard to imagine what would convince me to move to another tool. But I like the idea of composable pipelines: I am building a tool (too early to share) that would allow layering Snakemake pipelines on top of each other using semi-automatic data annotations, similar to how it is done in kedro (https://github.com/kedro-org/kedro).

  • Taipy

    Turns Data and AI algorithms into production-ready web applications in no time.

  • Project mention: Python Day 9: Building Interactive Web Apps without HTML/CSS and JavaScript | dev.to | 2024-04-26

    Taipy is an open-source Python library that enables data scientists and developers to build robust end-to-end data pipelines.

  • wandb

    🔥 A tool for visualizing and tracking your machine learning experiments. This repo contains the CLI and Python API.

  • Project mention: A list of SaaS, PaaS and IaaS offerings that have free tiers of interest to devops and infradev | dev.to | 2024-02-05

    Weights & Biases — The developer-first MLOps platform. Build better models faster with experiment tracking, dataset versioning, and model management. Free tier for personal projects only, with 100 GB of storage included.

  • deeplake

    Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai

  • Project mention: FLaNK AI Weekly 25 March 2025 | dev.to | 2024-03-25
  • metaflow

    🚀 Build and manage real-life ML, AI, and data science projects with ease!

  • Project mention: FLaNK Stack 05 Feb 2024 | dev.to | 2024-02-05
  • BentoML

    The easiest way to serve AI/ML models in production - Build Model Inference Service, LLM APIs, Multi-model Inference Graph/Pipelines, LLM/RAG apps, and more!

  • Project mention: Who's hiring developer advocates? (December 2023) | dev.to | 2023-12-04

  • feast

    The Open Source Feature Store for Machine Learning

  • Project mention: What's Happening with Feast? | news.ycombinator.com | 2023-12-07
  • clearml

    ClearML - Auto-Magical CI/CD to streamline your AI workload. Experiment Management, Data Management, Pipeline, Orchestration, Scheduling & Serving in one MLOps/LLMOps solution

  • Project mention: FLaNK Stack Weekly 12 February 2024 | dev.to | 2024-02-12
  • aim

    Aim 💫 — An easy-to-use & supercharged open-source experiment tracker.

  • Project mention: aim VS cascade - a user suggested alternative | libhunt.com/r/aim | 2023-12-05
  • courses

    This repository is a curated collection of links to various courses and resources about Artificial Intelligence (AI) (by SkalskiP)

  • Project mention: If you are looking for free courses about AI, LLMs, CV, or NLP, I created the repository with links to resources that I found super high quality and helpful. The link is in the comment. | /r/ChatGPT | 2023-07-02

    I found it: https://github.com/SkalskiP/courses

  • superduperdb

    🔮 SuperDuperDB: Bring AI to your database! Build, deploy and manage any AI application directly with your existing data infrastructure, without moving your data. Including streaming inference, scalable model training and vector search.

  • Project mention: FLaNK Stack Weekly 12 February 2024 | dev.to | 2024-02-12
  • FedML

    FEDML - The unified and scalable ML library for large-scale distributed training, model serving, and federated learning. FEDML Launch, a cross-cloud scheduler, further enables running any AI jobs on any GPU cloud or on-premise cluster. Built on this library, TensorOpera AI (https://TensorOpera.ai) is your generative AI platform at scale.

  • Project mention: [Experiment] The future of AI is open-source, and here is the plan | /r/samkoesnadi | 2023-06-05

    FedML https://github.com/FedML-AI/FedML might already provide a lot of tools to do the job

  • lightning-hydra-template

    PyTorch Lightning + Hydra. A very user-friendly template for ML experimentation. ⚡🔥⚡

  • Project mention: User-friendly PyTorch Lightning and Hydra template for ML experimentation | news.ycombinator.com | 2024-02-05
  • zenml

    ZenML 🙏: Build portable, production-ready MLOps pipelines. https://zenml.io.

  • Project mention: FLaNK AI - 01 April 2024 | dev.to | 2024-04-01
  • awesome-mlops

    😎 A curated list of awesome MLOps tools (by kelvins)

  • Project mention: Choosing an Orchestrator in a green-field setup | /r/mlops | 2023-12-07

    Lots of good projects on https://github.com/kelvins/awesome-mlops too

  • polyaxon

    MLOps Tools For Managing & Orchestrating The Machine Learning LifeCycle

  • pipelines

    Machine Learning Pipelines for Kubeflow

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python MLOps related posts

  • AI leaderboards are no longer useful. It's time to switch to Pareto curves

    1 project | news.ycombinator.com | 30 Apr 2024
  • Building an Email Assistant Application with Burr

    6 projects | dev.to | 26 Apr 2024
  • Show HN: Evaluate LLM-based RAG Applications with automated test set generation

    1 project | news.ycombinator.com | 11 Apr 2024
  • VLLM Sacrifices Accuracy for Speed

    1 project | news.ycombinator.com | 23 Jan 2024
  • Detect, Defend, Prevail: Payments Fraud Detection using ML & Deepchecks

    1 project | dev.to | 13 Jan 2024
  • Easy, fast, and cheap LLM serving for everyone

    1 project | news.ycombinator.com | 17 Dec 2023
  • Introduction to NannyML: Model Evaluation without labels

    1 project | dev.to | 15 Dec 2023

Index

What are some of the best open-source MLOps projects in Python? This list will help you:

 #  Project                    Stars
 1  Airflow                   34,877
 2  jina                      20,235
 3  vllm                      20,017
 4  nni                       13,819
 5  dagster                   10,468
 6  great_expectations         9,567
 7  Kedro                      9,409
 8  Taipy                      9,282
 9  wandb                      8,354
10  deeplake                   7,799
11  metaflow                   7,688
12  BentoML                    6,650
13  feast                      5,312
14  clearml                    5,324
15  aim                        4,865
16  courses                    4,618
17  superduperdb               4,433
18  FedML                      4,082
19  lightning-hydra-template   3,742
20  zenml                      3,703
21  awesome-mlops              3,654
22  polyaxon                   3,497
23  pipelines                  3,464
