llama.cpp vs llama
| | llama.cpp | llama |
|---|---|---|
| Mentions | 1 | 2 |
| Stars | - | 79 |
| Growth | - | - |
| Activity | - | 6.2 |
| Last Commit | - | 11 months ago |
| Language | - | Python |
| License | - | GNU General Public License v3.0 or later |
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
llama.cpp
- Llama 2 – Meta AI
  I think using this project https://github.com/ggerganov/llama.cpp on a CPU machine with AVX instructions would be a better bang for your buck than a GPU. Depends on whether your use case can tolerate the latency.
llama
- A comprehensive guide to running Llama 2 locally
  Self-plug. Here's a fork of the original Llama 2 code, adapted to run on the CPU or on MPS (M1/M2 GPU) if available: https://github.com/krychu/llama
  It runs with the original weights and gets you to ~4 tokens/sec on a MacBook Pro M1 with the 7B model.
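The fork's described behavior (use the M1/M2 GPU via MPS when available, otherwise fall back to CPU) can be sketched in PyTorch. This is a minimal sketch, not code from the krychu/llama repo; the helper name `pick_device` is my own:

```python
import torch

def pick_device() -> torch.device:
    # torch.backends.mps.is_available() reports whether the Apple-silicon
    # Metal backend can be used (PyTorch >= 1.12); the getattr guard keeps
    # this working on older builds that lack the mps attribute entirely.
    if getattr(torch.backends, "mps", None) is not None and torch.backends.mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")

device = pick_device()
model_input = torch.zeros(1, 8, device=device)  # tensors follow the chosen device
```

On a non-Apple machine this simply returns the CPU device, so the same script runs everywhere.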
- Llama 2 – Meta AI
  Version that runs on the CPU: https://github.com/krychu/llama
  I get 1 word per ~1.5 secs on a MacBook Pro M1.
What are some alternatives?
marsha - Marsha is a functional, higher-level, English-based programming language that gets compiled into tested Python software by an LLM
llama2-chatbot - LLaMA v2 Chatbot
OpenPipe - Turn expensive prompts into cheap fine-tuned models
cog - Containers for machine learning
ollama - Get up and running with Llama 3, Mistral, Gemma, and other large language models.
llama - Inference code for Llama models
llama.cpp - LLM inference in C/C++
stable-diffusion-webui - Stable Diffusion web UI