Could I get a suggestion for a simple HTTP API with no GUI for llama.cpp?

InfluxDB – Built for High-Performance Time Series Workloads

InfluxDB 3 OSS is now GA. Transform, enrich, and act on time series data directly in the database. Automate critical tasks and eliminate the need to move data externally. Download now.

www.influxdata.com

featured

Sevalla - Deploy and host your apps and databases, now with $50 credit!

Sevalla is the PaaS you have been looking for! Advanced deployment pipelines, usage-based pricing, preview apps, templates, human support by developers, and much more!

sevalla.com

featured

llama.cpp-dotnet

1 1 72 8.8 C#

Minimal C# bindings for llama.cpp + .NET core library with API host/client.
InfluxDB

www.influxdata.com featured

InfluxDB – Built for High-Performance Time Series Workloads. InfluxDB 3 OSS is now GA. Transform, enrich, and act on time series data directly in the database. Automate critical tasks and eliminate the need to move data externally. Download now.
llama-cpp-python

2 60 9,515 8.8 Python

Python bindings for llama.cpp
go-llama.cpp

3 4 808 0.0 C++

LLama.cpp golang bindings

Go: go-skynet/go-llama.cpp
llama-node

4 2 863 8.6 Rust

Discontinued Believe in AI democratization. llama for nodejs backed by llama-rs, llama.cpp and rwkv.cpp, work locally on your laptop CPU. support llama/alpaca/gpt4all/vicuna/rwkv model.

Node.js: hlhr202/llama-node
llama_cpp.rb

5 3 224 9.5 C

llama_cpp.rb provides Ruby bindings for llama.cpp

Ruby: yoshoku/llama_cpp.rb
LLamaSharp

6 7 3,336 9.5 C#

A C#/.NET library to run LLM (🦙LLaMA/LLaVA) on your local device efficiently.

C#/.NET: SciSharp/LLamaSharp
FastChat

7 86 39,043 7.8 Python

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

I used the FastChat API to load two quantized Vicuna-13 models locally so I could repeatedly query them for the modern translation of a given paragraph from the complete works of Jonathan Swift. Then I LoRa+PEFTed Llama-7b to convert from modern English to Swift. Works great: https://huggingface.co/pcalhoun/LLaMA-7b-JonathanSwift
Sevalla

sevalla.com featured

Deploy and host your apps and databases, now with $50 credit! Sevalla is the PaaS you have been looking for! Advanced deployment pipelines, usage-based pricing, preview apps, templates, human support by developers, and much more!
LocalAI

8 90 34,903 9.9 Go

:robot: The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more models architectures. Features: Generate Text, Audio, Video, Images, Voice Cloning, Distributed, P2P inference

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Suggest a related project

Show HN: Run AI models directly in the browser – no server or internet required

3 projects | news.ycombinator.com | 23 Aug 2025
Paddler - open-source llama.cpp load balancer (self-host LLMs in production)

2 projects | dev.to | 28 Jun 2024
FreedomGPT: AI with no censorship

3 projects | /r/KotakuInAction | 12 May 2023
Show HN: Paddler – open-source LLMOps platform for hosting AI in your own infra

1 project | news.ycombinator.com | 15 Aug 2025
My Top Open-Source AI Tools for Building Smarter in 2025

7 projects | dev.to | 14 Aug 2025

Could I get a suggestion for a simple HTTP API with no GUI for llama.cpp?

This page summarizes the projects mentioned and recommended in the original post on /r/LocalLLaMA
hardware-buttons linkedin-bot template-engine-js
Post date: 16 May 2023

llama.cpp-dotnet

InfluxDB

llama-cpp-python

go-llama.cpp

llama-node

llama_cpp.rb

LLamaSharp

FastChat

Sevalla

LocalAI

Related posts

Show HN: Run AI models directly in the browser – no server or internet required

Paddler - open-source llama.cpp load balancer (self-host LLMs in production)

FreedomGPT: AI with no censorship

Show HN: Paddler – open-source LLMOps platform for hosting AI in your own infra

My Top Open-Source AI Tools for Building Smarter in 2025

Did you know that Python is
the 2nd most popular programming language
based on number of references?

Could I get a suggestion for a simple HTTP API with no GUI for llama.cpp?

This page summarizes the projects mentioned and recommended in the original post on /r/LocalLLaMA hardware-buttons linkedin-bot template-engine-js Post date: 16 May 2023

Related posts

Show HN: Run AI models directly in the browser – no server or internet required

Paddler - open-source llama.cpp load balancer (self-host LLMs in production)

FreedomGPT: AI with no censorship

Show HN: Paddler – open-source LLMOps platform for hosting AI in your own infra

My Top Open-Source AI Tools for Building Smarter in 2025

Did you know that Python is the 2nd most popular programming language based on number of references?

This page summarizes the projects mentioned and recommended in the original post on /r/LocalLLaMA
hardware-buttons linkedin-bot template-engine-js
Post date: 16 May 2023

Did you know that Python is
the 2nd most popular programming language
based on number of references?