GenerativeAIExamples
PowerInfer
Metric | GenerativeAIExamples | PowerInfer
---|---|---
Mentions | 1 | 4
Stars | 1,597 | 7,008
Growth | 18.5% | 3.5%
Activity | 7.5 | 9.8
Last commit | 11 days ago | 1 day ago
Language | Python | C++
License | Apache License 2.0 | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
GenerativeAIExamples
PowerInfer
- FLaNK 25 December 2023
- High-Speed Large Language Model Serving on PCs with Consumer-Grade GPUs
- PowerInfer: Fast Large Language Model Serving with a Consumer-Grade GPU [pdf]
> PowerInfer’s source code is publicly available at https://github.com/SJTU-IPADS/PowerInfer
- PowerInfer: High-Speed Large Language Model Serving on Consumer-Grade GPUs
What are some alternatives?
Pearl - A Production-ready Reinforcement Learning AI Agent Library brought by the Applied Reinforcement Learning team at Meta.
Cgml - GPU-targeted vendor-agnostic AI library for Windows, and Mistral model implementation.
LLMCompiler - [ICML 2024] LLMCompiler: An LLM Compiler for Parallel Function Calling
zilla-examples - A collection of pre-canned Zilla feature demos. Deploy on K8s via Helm.
java-iceberg-toolkit - Java implementation for performing operations on Apache Iceberg and Hive tables
llama.cpp - LLM inference in C/C++
FLaNK-SaoPauloBrazil - FLaNK-SaoPauloBrazil
k3s - Lightweight Kubernetes
llama2-high-level-cpp - Inference Llama2 with High-Level C++.
mlx - MLX: An array framework for Apple silicon
direnv - unclutter your .profile