Python MLOps

Open-source Python projects categorized as MLOps

Top 23 Python MLOps Projects

  • Airflow

    Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

    Project mention: Building in Public: Leveraging Tublian's AI Copilot for My Open Source Contributions | dev.to | 2024-02-12

    Contributing to Apache Airflow's open-source project immersed me in collaborative coding. Experienced maintainers rigorously reviewed my contributions, providing constructive feedback. This ongoing dialogue refined the codebase and honed my understanding of best practices.

  • jina

    ☁️ Build multimodal AI applications with cloud-native stack

    Project mention: Jina.ai: Self-host Multimodal models | news.ycombinator.com | 2024-01-26

  • vllm

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Project mention: Show HN: We got fine-tuning Mistral-7B to not suck | news.ycombinator.com | 2024-02-07

    Great question! Scheduling workloads onto GPUs in a way where VRAM is being utilised efficiently was quite the challenge.

    What we found was the IO latency for loading model weights into VRAM will kill responsiveness if you don't "re-use" sessions (i.e. where the model weights remain loaded and you run multiple inference sessions over the same loaded weights).

    Obviously projects like https://github.com/vllm-project/vllm exist but we needed to build out a scheduler that can run a fleet of GPUs for a matrix of text/image vs inference/finetune sessions.

    disclaimer: I work on Helix

  • nni

    An open source AutoML toolkit to automate the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.

    Project mention: Filter Pruning for PyTorch | /r/deeplearning | 2023-04-13
  • dagster

    An orchestration platform for the development, production, and observation of data assets.

    Project mention: Experience with Dagster.io? | news.ycombinator.com | 2023-07-25
  • great_expectations

    Always know what to expect from your data.

    Project mention: Data Quality at Scale with Great Expectations, Spark, and Airflow on EMR | dev.to | 2023-04-24

    Great Expectations (GE) is an open-source data validation tool that helps ensure data quality.

  • Kedro

    Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.

    Project mention: Nextflow: Data-Driven Computational Pipelines | news.ycombinator.com | 2023-08-10

    Interesting, thanks for sharing. I'll definitely take a look, although at this point I am so comfortable with Snakemake, it is a bit hard to imagine what would convince me to move to another tool. But I like the idea of composable pipelines: I am building a tool (too early to share) that would allow Snakemake pipelines to be laid on top of each other using semi-automatic data annotations, similar to how it is done in kedro (https://github.com/kedro-org/kedro).

  • wandb

    🔥 A tool for visualizing and tracking your machine learning experiments. This repo contains the CLI and Python API.

    Project mention: A list of SaaS, PaaS and IaaS offerings that have free tiers of interest to devops and infradev | dev.to | 2024-02-05

    Weights & Biases — The developer-first MLOps platform. Build better models faster with experiment tracking, dataset versioning, and model management. Free tier for personal projects only, with 100 GB of storage included.

  • deeplake

    Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai

    Project mention: Qdrant, the Vector Search Database, raised $28M in a Series A round | news.ycombinator.com | 2024-01-23

    I think Activeloop(YC) is too: https://github.com/activeloopai/deeplake/

  • metaflow

    🚀 Build and manage real-life ML, AI, and data science projects with ease!

    Project mention: FLaNK Stack 05 Feb 2024 | dev.to | 2024-02-05
  • BentoML

    Build Production-Grade AI Applications

    Project mention: Who's hiring developer advocates? (December 2023) | dev.to | 2023-12-04

  • taipy

    Turns Data and AI algorithms into production-ready web applications in no time.

    Project mention: Show HN: Building data and AI apps, an alternative to Streamlit | news.ycombinator.com | 2024-02-12
  • feast

    Feature Store for Machine Learning

    Project mention: What's Happening with Feast? | news.ycombinator.com | 2023-12-07
  • clearml

    ClearML - Auto-Magical CI/CD to streamline your ML workflow. Experiment Manager, MLOps and Data-Management

    Project mention: FLaNK Stack Weekly 12 February 2024 | dev.to | 2024-02-12
  • aim

    Aim 💫 — An easy-to-use & supercharged open-source experiment tracker.

    Project mention: aim VS cascade - a user suggested alternative | libhunt.com/r/aim | 2023-12-05
  • courses

    This repository is a curated collection of links to various courses and resources about Artificial Intelligence (AI) (by SkalskiP)

    Project mention: If you are looking for free courses about AI, LLMs, CV, or NLP, I created the repository with links to resources that I found super high quality and helpful. The link is in the comment. | /r/ChatGPT | 2023-07-02

    I found it: https://github.com/SkalskiP/courses

  • superduperdb

    🔮 SuperDuperDB: Bring AI to your database! Build, deploy and manage any AI application directly with your existing data infrastructure, without moving your data. Including streaming inference, scalable model training and vector search.

    Project mention: FLaNK Stack Weekly 12 February 2024 | dev.to | 2024-02-12
  • FedML

    FEDML - The unified and scalable ML library for large-scale distributed training, model serving, and federated learning. FEDML Launch, a cross-cloud scheduler, further enables running any AI jobs on any GPU cloud or on-premise cluster. Built on this library, FEDML Nexus AI (https://fedml.ai) is the dedicated cloud service for generative AI

    Project mention: [Experiment] The future of AI is open-source, and here is the plan | /r/samkoesnadi | 2023-06-05

    FedML https://github.com/FedML-AI/FedML might already provide a lot of tools to do the job

  • zenml

    ZenML 🙏: Build portable, production-ready MLOps pipelines. https://zenml.io.

    Project mention: What are some open-source ML pipeline managers that are easy to use? | /r/mlops | 2023-05-03
  • lightning-hydra-template

    PyTorch Lightning + Hydra. A very user-friendly template for ML experimentation. ⚡🔥⚡

    Project mention: User-friendly PyTorch Lightning and Hydra template for ML experimentation | news.ycombinator.com | 2024-02-05
  • polyaxon

    MLOps Tools For Managing & Orchestrating The Machine Learning LifeCycle

  • pipelines

    Machine Learning Pipelines for Kubeflow

  • ploomber

    The fastest ⚡️ way to build data pipelines. Develop iteratively, deploy anywhere. ☁️

    Project mention: Show HN: JupySQL – a SQL client for Jupyter (ipython-SQL successor) | news.ycombinator.com | 2023-12-06

    - One-click sharing powered by Ploomber Cloud: https://ploomber.io

    Documentation: https://jupysql.ploomber.io

    Note that JupySQL is a fork of ipython-sql, which is no longer actively developed. Catherine, ipython-sql's creator, was kind enough to pass the project to us (check out ipython-sql's README).

    We'd love to learn what you think and what features we can ship for JupySQL to be the best SQL client! Please let us know in the comments!

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2024-02-12.

What are some of the best open-source MLOps projects in Python? This list will help you:

Rank Project Stars
1 Airflow 33,691
2 jina 19,727
3 vllm 15,141
4 nni 13,610
5 dagster 9,744
6 great_expectations 9,291
7 Kedro 9,216
8 wandb 7,896
9 deeplake 7,537
10 metaflow 7,398
11 BentoML 6,343
12 taipy 5,824
13 feast 5,156
14 clearml 5,088
15 aim 4,634
16 courses 4,303
17 superduperdb 4,166
18 FedML 3,981
19 zenml 3,543
20 lightning-hydra-template 3,503
21 polyaxon 3,453
22 pipelines 3,383
23 ploomber 3,340