Llama.cpp Alternatives

Similar projects and alternatives to llama.cpp

text-generation-webui

Mentions: 876 | Stars: 36,552 | Activity: 9.9 | Language: Python

A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.
ollama

Mentions: 204 | Stars: 64,536 | Activity: 9.9 | Language: Go

Get up and running with Llama 3, Mistral, Gemma, and other large language models.
whisper.cpp

Mentions: 187 | Stars: 31,426 | Activity: 9.8 | Language: C

Port of OpenAI's Whisper model in C/C++
llama

Mentions: 184 | Stars: 53,227 | Activity: 8.1 | Language: Python

Inference code for Llama models
koboldcpp

Mentions: 180 | Stars: 3,887 | Activity: 10.0 | Language: C++

A simple one-file way to run various GGML and GGUF models with KoboldAI's UI
gpt4all

Mentions: 139 | Stars: 64,901 | Activity: 9.8 | Language: C++

gpt4all: run open-source LLMs anywhere
stanford_alpaca

Mentions: 108 | Stars: 28,856 | Activity: 2.0 | Language: Python

Code and documentation to train Stanford's Alpaca models, and generate the data.
alpaca-lora

Mentions: 107 | Stars: 18,217 | Activity: 3.6 | Language: Jupyter Notebook

Instruct-tune LLaMA on consumer hardware
alpaca.cpp

Mentions: 94 | Stars: 9,878 | Activity: 9.4 | Language: C

Discontinued: Locally run an Instruction-Tuned Chat-Style LLM
mlc-llm

Mentions: 89 | Stars: 17,053 | Activity: 9.9 | Language: Python

Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.
guidance

Mentions: 89 | Stars: 12,248 | Activity: 9.5 | Language: Jupyter Notebook

Discontinued: A guidance language for controlling large language models. [Moved to: https://github.com/guidance-ai/guidance] (by microsoft)
FastChat

Mentions: 83 | Stars: 34,277 | Activity: 9.6 | Language: Python

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
GPTQ-for-LLaMa

Mentions: 75 | Stars: 2,924 | Activity: 8.6 | Language: Python

4 bits quantization of LLaMA using GPTQ
ggml

Mentions: 69 | Stars: 9,802 | Activity: 9.8 | Language: C

Tensor library for machine learning
exllama

Mentions: 64 | Stars: 2,609 | Activity: 9.0 | Language: Python

A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.
llama-cpp-python

Mentions: 55 | Stars: 6,579 | Activity: 9.8 | Language: Python

Python bindings for llama.cpp
open_llama

Mentions: 52 | Stars: 7,211 | Activity: 5.3

OpenLLaMA, a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset
StableLM

Mentions: 43 | Stars: 15,851 | Activity: 5.0 | Language: Jupyter Notebook

StableLM: Stability AI Language Models
llm

Mentions: 41 | Stars: 5,931 | Activity: 9.4 | Language: Rust

An ecosystem of Rust libraries for working with large language models
llamafile

Mentions: 35 | Stars: 15,120 | Activity: 9.6 | Language: C++

Distribute and run LLMs with a single file.

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number indicates a better llama.cpp alternative or a higher degree of similarity.

llama.cpp reviews and mentions

Posts with mentions or reviews of llama.cpp. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-05-07.

IBM Granite: A Family of Open Foundation Models for Code Intelligence
3 projects | news.ycombinator.com | 7 May 2024

if you can compile stuff, then looking at llama.cpp (what ollama uses) is also interesting: https://github.com/ggerganov/llama.cpp
the server is here: https://github.com/ggerganov/llama.cpp/tree/master/examples/...
And you can search for any GGUF on huggingface
Ask HN: Affordable hardware for running local large language models?
1 project | news.ycombinator.com | 5 May 2024

Yes, Metal seems to allow a maximum of 1/2 of the RAM for one process, and 3/4 of the RAM allocated to the GPU overall. There’s a kernel hack to fix it, but that comes with the usual system integrity caveats. https://github.com/ggerganov/llama.cpp/discussions/2182
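
To put numbers on the limits described in the post above, here is a minimal Python sketch. The two fractions (1/2 of RAM for a single process, 3/4 of RAM for the GPU overall) are taken directly from the poster's claim and are not verified against Apple documentation; the function and constant names are hypothetical and exist only for illustration.

```python
# Fractions as quoted in the forum post; treat them as an assumption,
# not as documented Metal behavior.
PER_PROCESS_FRACTION = 0.5   # claimed cap for a single process
GPU_TOTAL_FRACTION = 0.75    # claimed cap for GPU allocations overall


def metal_memory_limits(total_ram_gb: float) -> tuple[float, float]:
    """Return (per_process_gb, gpu_total_gb) under the quoted fractions."""
    if total_ram_gb <= 0:
        raise ValueError("total_ram_gb must be positive")
    per_process_gb = total_ram_gb * PER_PROCESS_FRACTION
    gpu_total_gb = total_ram_gb * GPU_TOTAL_FRACTION
    return per_process_gb, gpu_total_gb


if __name__ == "__main__":
    # Print the estimated limits for a few common unified-memory sizes.
    for ram_gb in (16, 32, 64):
        per_process, gpu_total = metal_memory_limits(ram_gb)
        print(f"{ram_gb} GB RAM: about {per_process:g} GB per process, "
              f"{gpu_total:g} GB for the GPU overall")
```

If the claim holds, a model file needs to fit within the per-process figure, which is the tighter of the two.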
Xmake: A modern C/C++ build tool
7 projects | news.ycombinator.com | 4 May 2024
Better and Faster Large Language Models via Multi-Token Prediction
1 project | news.ycombinator.com | 1 May 2024

For anyone interested in exploring this, llama.cpp has an example implementation here:
https://github.com/ggerganov/llama.cpp/tree/master/examples/...
Llama.cpp Bfloat16 Support
1 project | news.ycombinator.com | 30 Apr 2024
Fine-tune your first large language model (LLM) with LoRA, llama.cpp, and KitOps in 5 easy steps
1 project | dev.to | 30 Apr 2024

Getting started with LLMs can be intimidating. In this tutorial we will show you how to fine-tune a large language model using LoRA, facilitated by tools like llama.cpp and KitOps.
GGML Flash Attention support merged into llama.cpp
1 project | news.ycombinator.com | 30 Apr 2024
Phi-3 Weights Released
1 project | news.ycombinator.com | 23 Apr 2024

well https://github.com/ggerganov/llama.cpp/issues/6849
Lossless Acceleration of LLM via Adaptive N-Gram Parallel Decoding
3 projects | news.ycombinator.com | 21 Apr 2024
Llama.cpp Working on Support for Llama3
1 project | news.ycombinator.com | 18 Apr 2024