llama-cpp-python

Python bindings for llama.cpp (by abetlen)

llama-cpp-python Alternatives

Similar projects and alternatives to llama-cpp-python

  1. llama.cpp

    LLM inference in C/C++

  2. SaaSHub

    SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives

  3. textgen

    Open-source desktop app for local LLMs. Text, vision, tool-calling, OpenAI/Anthropic-compatible API. 100% private.

  4. ollama

    Get up and running with Kimi-K2.6, GLM-5.1, MiniMax, DeepSeek, gpt-oss, Qwen, Gemma and other models.

  5. aider

    aider is AI pair programming in your terminal

  6. gpt4all

    GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use.

  7. LocalAI

    LocalAI is the open-source AI engine. Run any model - LLMs, vision, voice, image, video - on any hardware. No GPU required.

  8. mlc-llm

    Universal LLM Deployment Engine with ML Compilation

  9. FastChat

    An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

  10. ggml

    Tensor library for machine learning

  11. khoj

    53 llama-cpp-python VS khoj

    Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. Turn any online or local LLM into your personal, autonomous AI (gpt, claude, gemini, llama, qwen, mistral). Get started - free.

  12. refact

    Discontinued. AI Agent that handles engineering tasks end-to-end: integrates with developers’ tools, plans, executes, and iterates until it achieves a successful result.

  13. text-generation-inference

    Discontinued. Large Language Model Text Generation Inference

  14. continue

    37 llama-cpp-python VS continue

    ⏩ Source-controlled AI checks, enforceable in CI. Powered by the open-source Continue CLI

  15. TensorRT-LLM

    TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT LLM also contains components to create Python and C++ runtimes that orchestrate the inference execution in a performant way.

  16. basaran

    Discontinued. Basaran is an open-source alternative to the OpenAI text completion API. It provides a compatible streaming API for your Hugging Face Transformers-based text generation models.

  17. intel-extension-for-pytorch

    Discontinued. A Python package for extending the official PyTorch that can easily obtain performance on Intel platforms

  18. lmdeploy

    LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

  19. gpt4all-chat

    Discontinued. gpt4all-j chat

  20. localLLM_guidance

    Local LLM ReAct Agent with Guidance

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. A higher number therefore indicates a better llama-cpp-python alternative or greater similarity.

llama-cpp-python reviews and mentions

Posts with mentions or reviews of llama-cpp-python. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2026-04-21.

Stats

Basic llama-cpp-python repo stats
61
10,394
9.1
3 days ago
