| | distributed-llama | PowerInfer |
|---|---|---|
| Mentions | 4 | 4 |
| Stars | 780 | 6,996 |
| Growth | - | 3.5% |
| Activity | 9.2 | 9.8 |
| Last commit | 6 days ago | 18 days ago |
| Language | C++ | C++ |
| License | MIT License | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
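The two metrics described above can be sketched in a few lines. This is a hypothetical illustration only: the site does not publish its exact formulas, so the recency weighting (an assumed 30-day half-life decay) and the function names are this sketch's own assumptions, not the tracker's implementation.

```python
from datetime import datetime, timedelta, timezone


def star_growth(stars_now: int, stars_month_ago: int) -> float:
    """Month-over-month star growth, expressed as a percentage."""
    return 100.0 * (stars_now - stars_month_ago) / stars_month_ago


def activity_score(commit_dates, now=None, half_life_days=30.0):
    """Recency-weighted commit count: each commit's contribution halves
    every `half_life_days`, so recent commits weigh more than older ones.
    The half-life value is an assumption for illustration."""
    now = now or datetime.now(timezone.utc)
    return sum(
        0.5 ** ((now - d).days / half_life_days) for d in commit_dates
    )


# Example: a repo going from 100 to 103 stars in a month grew 3.0%.
print(star_growth(103, 100))

# A commit today counts fully (weight 1.0); one from 30 days ago
# counts half (weight 0.5) under the assumed half-life.
now = datetime(2024, 1, 31, tzinfo=timezone.utc)
print(activity_score([now, now - timedelta(days=30)], now=now))
```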
distributed-llama
PowerInfer
- FLaNK 25 December 2023
- High-Speed Large Language Model Serving on PCs with Consumer-Grade GPUs
- PowerInfer: Fast Large Language Model Serving with a Consumer-Grade GPU [pdf]
> PowerInfer’s source code is publicly available at https://github.com/SJTU-IPADS/PowerInfer
- PowerInfer: High-Speed Large Language Model Serving on Consumer-Grade GPUs
What are some alternatives?
oceanbase - OceanBase is an enterprise distributed relational database with high availability, high performance, horizontal scalability, and compatibility with SQL standards.
Cgml - GPU-targeted, vendor-agnostic AI library for Windows, with a Mistral model implementation.
lizardfs - LizardFS is an Open Source Distributed File System licensed under GPLv3.
zilla-examples - A collection of pre-canned Zilla feature demos. Deploy on K8s via Helm.
LeanCopilot - LLMs as Copilots for Theorem Proving in Lean
llama.cpp - LLM inference in C/C++
GenerativeAIExamples - Generative AI reference workflows optimized for accelerated infrastructure and microservice architecture.
FLaNK-SaoPauloBrazil - FLaNK-SaoPauloBrazil
k3s - Lightweight Kubernetes
llama2-high-level-cpp - Llama2 inference in high-level C++.
mlx - MLX: An array framework for Apple silicon
direnv - unclutter your .profile