Python llm-serving

Open-source Python projects categorized as llm-serving

Top 5 Python llm-serving Projects

  • Ray

    Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

  • Project mention: Open Source Advent Fun Wraps Up! | dev.to | 2024-01-05

    22. Ray | Github | tutorial

  • vllm

    A high-throughput and memory-efficient inference and serving engine for LLMs

  • Project mention: Mistral AI Launches New 8x22B Moe Model | news.ycombinator.com | 2024-04-09

    The easiest is to use vllm (https://github.com/vllm-project/vllm) to run it on a couple of A100s, and you can benchmark this using this library (https://github.com/EleutherAI/lm-evaluation-harness).

  • OpenLLM

    Run any open-source LLM, such as Llama 2 or Mistral, as an OpenAI-compatible API endpoint, locally or in the cloud.

  • Project mention: First 15 Open Source Advent projects | dev.to | 2023-12-15

    13. OpenLLM by BentoML | Github | tutorial

  • skypilot

    SkyPilot: Run LLMs, AI, and Batch jobs on any cloud. Get maximum savings, highest GPU availability, and managed execution—all with a simple interface.

  • Project mention: Ask HN: Most efficient way to fine-tune an LLM in 2024? | news.ycombinator.com | 2024-04-04

  • mosec

    A high-performance ML model serving framework that offers dynamic batching and CPU/GPU pipelines to fully exploit your compute machine.

  • Project mention: 20x Faster as the Beginning: Introducing pgvecto.rs extension written in Rust | dev.to | 2023-08-06

    Mosec - A high-performance serving framework for ML models, offers dynamic batching and CPU/GPU pipelines to fully exploit your compute machine. Simple and faster alternative to NVIDIA Triton.

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source llm-serving projects in Python? This list will help you:

#  Project   Stars
1  Ray       31,101
2  vllm      18,041
3  OpenLLM    8,733
4  skypilot   5,636
5  mosec        703
