Top 3 Python exllama Projects
https://github.com/c0sogi/llama-api , right? This offers better performance on GPU-optimized models, right?
Other than Ooba, this is my fav (and works with a TON of model architectures) -> https://github.com/shinomakoi/magi_llm_gui
Project mention: Mixture-of-Depths: Dynamically allocating compute in transformers | news.ycombinator.com | 2024-04-08
There are already some implementations out there which attempt to accomplish this!
Here's an example: https://github.com/silphendio/sliced_llama
A gist pertaining to said example: https://gist.github.com/silphendio/535cd9c1821aa1290aa10d587...
Here's a discussion about integrating this capability with ExLlama: https://github.com/turboderp/exllamav2/pull/275
And same as above but for llama.cpp: https://github.com/ggerganov/llama.cpp/issues/4718#issuecomm...
Index
What are some of the best open-source exllama projects in Python? This list will help you:
Rank | Project | Stars |
---|---|---|
1 | llama-api | 104 |
2 | magi_llm_gui | 39 |
3 | sliced_llama | 15 |