LLaMA-8bit-LoRA
Repository for Chat LLaMA - training a LoRA for the LLaMA (1 or 2) models on HuggingFace with 8-bit or 4-bit quantization. Research only. (by serp-ai)
llama.cpp
LLM inference in C/C++ (by ggerganov)
| | LLaMA-8bit-LoRA | llama.cpp |
|---|---|---|
| Mentions | 3 | 773 |
| Stars | 145 | 56,891 |
| Growth | 0.7% | - |
| Activity | 5.1 | 10.0 |
| Last commit | 8 months ago | 5 days ago |
| Language | Python | C++ |
| License | - | MIT License |
Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
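The page does not publish the formula behind the activity number, so the following is only a sketch of one way such a score could be computed. It assumes an exponential decay with a 30-day half-life for commit weights and a simple percentile ranking scaled to 0-10. The function names, the half-life, and the rounding are all hypothetical and are not taken from the site.

```python
from bisect import bisect_right


def raw_activity(commit_ages_days, half_life_days=30.0):
    """Sum of per-commit weights, where recent commits count more.

    Assumption: a commit loses half of its weight every `half_life_days`.
    """
    return sum(0.5 ** (age / half_life_days) for age in commit_ages_days)


def relative_activity(raw_scores):
    """Map raw scores to a 0-10 scale by percentile rank.

    A project scores 9.0 when 90% of tracked projects have a raw score
    at or below its own, i.e. it sits in roughly the top 10%.
    """
    ordered = sorted(raw_scores.values())
    total = len(ordered)
    return {
        name: round(10.0 * bisect_right(ordered, score) / total, 1)
        for name, score in raw_scores.items()
    }


if __name__ == "__main__":
    # Hypothetical projects; each list holds commit ages in days.
    raw = {
        "busy": raw_activity([1, 2, 3, 5, 8]),
        "quiet": raw_activity([200, 240]),
    }
    print(relative_activity(raw))
```

Because the final number is a rank among tracked projects rather than an absolute count, two projects with very different commit volumes can land on adjacent scores.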
LLaMA-8bit-LoRA
Posts with mentions or reviews of LLaMA-8bit-LoRA.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2023-04-06.
- Any news on training LoRAs in 4-bit mode?
  https://github.com/serp-ai/LLaMA-8bit-LoRA/blob/main/docs/merging_the_weights.md < merge models
- [R] 🤖🌟 Unlock the Power of Personal AI: Introducing ChatLLaMA, Your Custom Personal Assistant! 🚀💬
llama.cpp
Posts with mentions or reviews of llama.cpp.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2024-04-21.
- Better and Faster Large Language Models via Multi-Token Prediction
  For anyone interested in exploring this, llama.cpp has an example implementation here:
  https://github.com/ggerganov/llama.cpp/tree/master/examples/...
- Llama.cpp Bfloat16 Support
- Fine-tune your first large language model (LLM) with LoRA, llama.cpp, and KitOps in 5 easy steps
  Getting started with LLMs can be intimidating. In this tutorial we will show you how to fine-tune a large language model using LoRA, facilitated by tools like llama.cpp and KitOps.
- GGML Flash Attention support merged into llama.cpp
- Phi-3 Weights Released
  well https://github.com/ggerganov/llama.cpp/issues/6849
- Lossless Acceleration of LLM via Adaptive N-Gram Parallel Decoding
- Llama.cpp Working on Support for Llama3
- Embeddings are a good starting point for the AI curious app developer
  I did this recently for the local chat-with-PDF feature in https://recurse.chat (a macOS app with a built-in llama.cpp server and a local vector database).
  Running an embedding server locally is pretty straightforward:
  - Get llama.cpp release binary: https://github.com/ggerganov/llama.cpp/releases
- Mixtral 8x22B
- Llama.cpp: Improve CPU prompt eval speed
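The recurse.chat comment above describes running a local llama.cpp embedding server for chat-with-PDF. As a rough sketch of the client side, the code below requests an embedding and ranks text chunks by cosine similarity. It makes several assumptions that do not come from the quoted post: the server listens on `http://localhost:8080`, was started with embeddings enabled, and exposes an OpenAI-style `/v1/embeddings` endpoint that returns `data[0].embedding`. Check the server documentation for the release you download before relying on any of this.

```python
import json
import math
import urllib.request


def fetch_embedding(text, base_url="http://localhost:8080"):
    """Request one embedding from a local server.

    Assumption: an OpenAI-style /v1/embeddings endpoint is available at
    `base_url`; the URL and response shape are not confirmed by the post.
    """
    payload = json.dumps({"input": text}).encode("utf-8")
    request = urllib.request.Request(
        base_url + "/v1/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["data"][0]["embedding"]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_chunks(query_vec, chunk_vecs):
    """Return chunk indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, vec) for vec in chunk_vecs]
    return sorted(range(len(chunk_vecs)), key=lambda i: scores[i], reverse=True)
```

With a server running, a query would be embedded with `fetch_embedding(question)`, each PDF chunk embedded once and stored, and `rank_chunks` used to pick the chunks to pass to the chat model. A vector database replaces the linear scan in `rank_chunks` once the number of chunks grows.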