vLLM Alternatives

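Similar projects and alternatives to vllm. The figures under each project are, in order: mentions, stars, activity score, and primary language (where given).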

llama.cpp

769 56,891 10.0 C++ vllm VS llama.cpp

LLM inference in C/C++
ROCm

198 3,637 0.0 Python vllm VS ROCm

Discontinued AMD ROCm™ Software - GitHub Home [Moved to: https://github.com/ROCm/ROCm]
mlc-llm

89 16,955 9.9 Python vllm VS mlc-llm

Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.
FastChat

82 33,877 9.6 Python vllm VS FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
FLiPStackWeekly

79 14 9.9 vllm VS FLiPStackWeekly

FLaNK AI Weekly covering Apache NiFi, Apache Flink, Apache Kafka, Apache Spark, Apache Iceberg, Apache Ozone, Apache Pulsar, and more...
bruno

55 18,935 9.9 JavaScript vllm VS bruno

Open-source IDE for exploring and testing APIs (lightweight alternative to Postman/Insomnia)
text-generation-inference

29 7,881 9.6 Python vllm VS text-generation-inference

Large Language Model Text Generation Inference
TensorRT

22 9,110 5.0 C++ vllm VS TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
axolotl

29 5,641 9.8 Python vllm VS axolotl

Go ahead and axolotl questions
nerd-dictation

28 1,158 3.6 Python vllm VS nerd-dictation

Simple, hackable offline speech to text - using the VOSK-API.
CTranslate2

13 2,799 8.9 C++ vllm VS CTranslate2

Fast inference engine for Transformer models
AdaptiveCpp

19 1,040 9.7 C++ vllm VS AdaptiveCpp

Implementation of SYCL and C++ standard parallelism for CPUs and GPUs from all vendors: The independent, community-driven compiler for C++-based heterogeneous programming models. Lets applications adapt themselves to all the hardware in the system - even at runtime!
LAVIS

18 8,738 6.3 Jupyter Notebook vllm VS LAVIS

LAVIS - A One-stop Library for Language-Vision Intelligence
fiftyone

18 6,674 10.0 Python vllm VS fiftyone

The open-source tool for building high-quality datasets and computer vision models
virtualagc

13 2,482 8.9 Assembly vllm VS virtualagc

Virtual Apollo Guidance Computer (AGC) software
OpenPipe

13 2,367 9.9 TypeScript vllm VS OpenPipe

Turn expensive prompts into cheap fine-tuned models
oasdiff

12 580 9.2 Go vllm VS oasdiff

OpenAPI Diff and Breaking Changes
clip-retrieval

11 2,124 7.9 Jupyter Notebook vllm VS clip-retrieval

Easily compute clip embeddings and build a clip retrieval system with them
lmdeploy

3 2,324 9.8 Python vllm VS lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
Llama-2-Onnx

3 983 6.7 Python vllm VS Llama-2-Onnx

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. A higher number therefore suggests a more frequently recommended or more similar vllm alternative.

vllm reviews and mentions

Posts with mentions or reviews of vllm. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-04-09.

Mistral AI Launches New 8x22B MoE Model
4 projects | news.ycombinator.com | 9 Apr 2024

The easiest way is to use vllm (https://github.com/vllm-project/vllm) to run it on a couple of A100s, and you can benchmark it using this library (https://github.com/EleutherAI/lm-evaluation-harness).
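As a rough sketch of the benchmarking step mentioned in the comment above, the helper below assembles an lm-evaluation-harness command line that uses the harness's vllm backend. The model id `org/model-name`, the task name, and the GPU count of 2 are placeholders for illustration and do not come from the post. The flags follow the harness README as I understand it, so check them against your installed version. The sketch does not verify that a model fits in memory on that many GPUs.

```python
import shlex
from typing import List


def build_lm_eval_command(model: str, gpus: int, tasks: List[str]) -> List[str]:
    """Assemble an lm_eval invocation that evaluates `model` through vllm.

    `model`, `gpus`, and `tasks` are supplied by the caller; nothing here
    is taken from the original post.
    """
    # The vllm backend receives its settings as one comma-separated string.
    model_args = f"pretrained={model},tensor_parallel_size={gpus},dtype=auto"
    return [
        "lm_eval",
        "--model", "vllm",
        "--model_args", model_args,
        "--tasks", ",".join(tasks),
        "--batch_size", "auto",
    ]


if __name__ == "__main__":
    # Placeholder values (assumptions): swap in a real model id and task list.
    cmd = build_lm_eval_command("org/model-name", 2, ["hellaswag"])
    # Print the command instead of running it, so the sketch needs no GPU.
    print(shlex.join(cmd))
```

Printing the command rather than executing it keeps the example runnable on any machine. To launch it for real, pass the list to `subprocess.run`.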
FLaNK AI for 11 March 2024
46 projects | dev.to | 11 Mar 2024
Show HN: We got fine-tuning Mistral-7B to not suck
4 projects | news.ycombinator.com | 7 Feb 2024

Great question! Scheduling workloads onto GPUs so that VRAM is utilised efficiently was quite the challenge.
We found that the I/O latency of loading model weights into VRAM kills responsiveness unless you "re-use" sessions (i.e. the model weights remain loaded and you run multiple inference sessions over the same loaded weights).
Projects like https://github.com/vllm-project/vllm exist, of course, but we needed to build a scheduler that can run a fleet of GPUs for a matrix of text/image vs inference/finetune sessions.
Disclaimer: I work on Helix.
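The session re-use idea in the comment above can be illustrated with a small cache that keeps recently used model weights resident and only pays the load cost on a miss. This is a minimal sketch under stated assumptions, not Helix's scheduler: `WeightCache`, `fake_loader`, and the capacity of one model per GPU are all hypothetical names and values introduced here.

```python
from collections import OrderedDict
from typing import Any, Callable


class WeightCache:
    """Keep up to `capacity` models loaded, evicting the least recently used."""

    def __init__(self, loader: Callable[[str], Any], capacity: int = 1) -> None:
        self._loader = loader
        self._capacity = capacity
        self._loaded: "OrderedDict[str, Any]" = OrderedDict()
        self.loads = 0  # number of (expensive) weight loads performed

    def get(self, name: str) -> Any:
        if name in self._loaded:
            # Hit: weights are already resident, so no load latency is paid.
            self._loaded.move_to_end(name)
            return self._loaded[name]
        # Miss: pay the load cost once, then keep the weights for later sessions.
        model = self._loader(name)
        self.loads += 1
        self._loaded[name] = model
        if len(self._loaded) > self._capacity:
            self._loaded.popitem(last=False)  # evict least recently used
        return model


def fake_loader(name: str) -> dict:
    """Stand-in for reading weights from disk into VRAM (hypothetical)."""
    return {"name": name}


if __name__ == "__main__":
    cache = WeightCache(fake_loader, capacity=1)
    for _ in range(3):
        cache.get("text-model")  # three sessions share one load
    cache.get("image-model")     # a different model forces an eviction
    print("weight loads:", cache.loads)
```

A real scheduler would also have to account for VRAM size per model and for routing requests across a fleet of GPUs, which this sketch leaves out.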
Mistral CEO confirms 'leak' of new open source AI model nearing GPT4 performance
5 projects | news.ycombinator.com | 31 Jan 2024

FYI, vLLM also just added experimental multi-LoRA support: https://github.com/vllm-project/vllm/releases/tag/v0.3.0
Also check out the new prefix caching, I see huge potential for batch processing purposes there!
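To see why prefix caching matters for batch processing, consider a toy model that counts how many prompt tokens need fresh computation when many prompts share the same leading text. This is a conceptual sketch only and is not vLLM's implementation: `ToyPrefixCache`, its block size of 4, and the string "tokens" are assumptions made for illustration.

```python
from typing import List, Set, Tuple


class ToyPrefixCache:
    """Count tokens that must be computed when full prefix blocks are cached."""

    def __init__(self, block_size: int = 4) -> None:
        self.block_size = block_size
        self._seen: Set[Tuple[str, ...]] = set()

    def tokens_to_compute(self, tokens: List[str]) -> int:
        computed = 0
        # Only whole blocks are cacheable; the trailing partial block is not.
        full = len(tokens) - len(tokens) % self.block_size
        for end in range(self.block_size, full + 1, self.block_size):
            # A block is keyed by the entire prefix up to its end, because its
            # cached state depends on every token that came before it.
            prefix = tuple(tokens[:end])
            if prefix not in self._seen:
                self._seen.add(prefix)
                computed += self.block_size
        # The partial tail is always recomputed in this toy model.
        return computed + (len(tokens) - full)


if __name__ == "__main__":
    shared = list("abcdefgh")  # an 8-token prefix shared by the whole batch
    cache = ToyPrefixCache(block_size=4)
    for question in (["q1", "x"], ["q2", "y"], ["q3", "z"]):
        print(cache.tokens_to_compute(shared + question))
```

In a batch where every prompt starts with the same long instruction, only the first prompt pays for the shared part, which is the effect the comment above points at.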
VLLM Sacrifices Accuracy for Speed
1 project | news.ycombinator.com | 23 Jan 2024
Easy, fast, and cheap LLM serving for everyone
1 project | news.ycombinator.com | 17 Dec 2023
vllm
1 project | news.ycombinator.com | 15 Dec 2023
Mixtral Expert Parallelism
1 project | news.ycombinator.com | 15 Dec 2023
Mixtral 8x7B Support
1 project | news.ycombinator.com | 11 Dec 2023
Mixtral of Experts
4 projects | news.ycombinator.com | 11 Dec 2023

Stats

Basic vllm repo stats

Mentions

Stars: 18,571
Activity: 9.9
Last commit: about 6 hours ago

vllm-project/vllm is an open source project licensed under the Apache License 2.0, which is an OSI-approved license.

The primary programming language of vllm is Python.
