Could I get a suggestion for a simple HTTP API with no GUI for llama.cpp?

This page summarizes the projects mentioned and recommended in the original post on /r/LocalLLaMA

Our great sponsors
  • WorkOS - The modern identity platform for B2B SaaS
  • InfluxDB - Power Real-Time Data Analytics at Scale
  • SaaSHub - Software Alternatives and Reviews
  • llama.cpp-dotnet

    Minimal C# bindings for llama.cpp + .NET core library with API host/client.

  • llama-cpp-python

    Python bindings for llama.cpp

  • WorkOS

    The modern identity platform for B2B SaaS. The APIs are flexible and easy-to-use, supporting authentication, user identity, and complex enterprise features like SSO and SCIM provisioning.

    WorkOS logo
  • go-llama.cpp

    LLama.cpp golang bindings

  • Go: go-skynet/go-llama.cpp

  • llama-node

    Believe in AI democratization. llama for nodejs backed by llama-rs, llama.cpp and rwkv.cpp, work locally on your laptop CPU. support llama/alpaca/gpt4all/vicuna/rwkv model.

  • Node.js: hlhr202/llama-node

  • llama_cpp.rb

    llama_cpp provides Ruby bindings for llama.cpp

  • Ruby: yoshoku/llama_cpp.rb

  • LLamaSharp

    A cross-platform library to run 🦙LLaMA/LLaVA model (and others) on your local device efficiently.

  • C#/.NET: SciSharp/LLamaSharp

  • FastChat

    An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

  • I used the FastChat API to load two quantized Vicuna-13 models locally so I could repeatedly query them for the modern translation of a given paragraph from the complete works of Jonathan Swift. Then I LoRa+PEFTed Llama-7b to convert from modern English to Swift. Works great: https://huggingface.co/pcalhoun/LLaMA-7b-JonathanSwift

  • InfluxDB

    Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.

    InfluxDB logo
  • LocalAI

    :robot: The free, Open Source OpenAI alternative. Self-hosted, community-driven and local-first. Drop-in replacement for OpenAI running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more models architectures. It allows to generate Text, Audio, Video, Images. Also with voice cloning capabilities.

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Suggest a related project

Related posts