SaaSHub helps you find the best software and product alternatives Learn more →
Llama-cpp-python Alternatives
Similar projects and alternatives to llama-cpp-python
-
text-generation-webui
A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.
-
WorkOS
The modern identity platform for B2B SaaS. The APIs are flexible and easy-to-use, supporting authentication, user identity, and complex enterprise features like SSO and SCIM provisioning.
-
LocalAI
:robot: The free, Open Source OpenAI alternative. Self-hosted, community-driven and local-first. Drop-in replacement for OpenAI running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more models architectures. It allows to generate Text, Audio, Video, Images. Also with voice cloning capabilities.
-
FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
-
InfluxDB
Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.
-
FLiPStackWeekly
FLaNK AI Weekly covering Apache NiFi, Apache Flink, Apache Kafka, Apache Spark, Apache Iceberg, Apache Ozone, Apache Pulsar, and more...
-
khoj
Your AI second brain. A copilot to get answers to your questions, whether they be from your own notes or from the internet. Use powerful, online (e.g gpt4) or private, local (e.g mistral) LLMs. Self-host locally or use our web app. Access from Obsidian, Emacs, Desktop app, Web or Whatsapp.
-
basaran
Discontinued Basaran is an open-source alternative to the OpenAI text completion API. It provides a compatible streaming API for your Hugging Face Transformers-based text generation models.
-
continue
⏩ The easiest way to code with any LLM—Continue is an open-source autopilot for VS Code and JetBrains
-
intel-extension-for-pytorch
A Python package for extending the official PyTorch that can easily obtain performance on Intel platform
-
SaaSHub
SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives
llama-cpp-python reviews and mentions
- FLaNK AI for 11 March 2024
-
OpenAI: Memory and New Controls for ChatGPT
I'll share the core bit that took a while to figure out the right format, my main script is a hot mess using embeddings with SentenceTransformer, so I won't share that yet. E.g: last night I did a PR for llama-cpp-python that shows how Phi might be used with JSON only for the author to write almost exactly the same code at pretty much the same time. https://github.com/abetlen/llama-cpp-python/pull/1184
-
TinyLlama LLM: A Step-by-Step Guide to Implementing the 1.1B Model on Google Colab
Python Bindings for llama.cpp
- Mistral-8x7B-Chat
-
Running Mistral LLM on Apple Silicon Using Apple's MLX Framework Is Much Faster
If the model could be made to work with llama.cpp, then https://github.com/abetlen/llama-cpp-python might be more compact. llama.cpp only supports a limited list of model types though.
- Run ChatGPT-like LLMs on your laptop in 3 lines of code
-
Code Llama, a state-of-the-art large language model for coding
https://github.com/abetlen/llama-cpp-python has a web server mode that replicates openai's API iirc and the readme shows it has docker builds already.
-
Meta: Code Llama, an AI Tool for Coding
LocalAI https://localai.io/ and LMStudio https://lmstudio.ai/ both have fairly complete OpenAI compatibility layers. llama-cpp-python has a FastAPI server as well: https://github.com/abetlen/llama-cpp-python/blob/main/llama_... (as of this moment it hasn't merged GGUF update yet though)
-
First steps with llama
I went with Python, llama-cpp-python, since my goal is just to get a small project up and running locally.
-
Show HN: Khoj – Chat Offline with Your Second Brain Using Llama 2
I see you’re using gpt4all; do you have a supported way to change the model being used for local inference?
A number of apps that are designed for OpenAI’s completion/chat APIs can simply point to the endpoints served by llama-cpp-python [0], and function in (largely) the same way, while supporting the various models and quants supported by llama.cpp. That would allow folks to run larger models on the hardware of their choice (including Apple Silicon with Metal acceleration) or using other proxies like openrouter.io.
[0]: https://github.com/abetlen/llama-cpp-python
-
A note from our sponsor - SaaSHub
www.saashub.com | 24 Apr 2024
Stats
abetlen/llama-cpp-python is an open source project licensed under MIT License which is an OSI approved license.
The primary programming language of llama-cpp-python is Python.
Popular Comparisons
- llama-cpp-python VS LocalAI
- llama-cpp-python VS intel-extension-for-pytorch
- llama-cpp-python VS llama.cpp
- llama-cpp-python VS text-generation-inference
- llama-cpp-python VS mlc-llm
- llama-cpp-python VS FastChat
- llama-cpp-python VS KoboldAI
- llama-cpp-python VS ollama
- llama-cpp-python VS text-generation-webui
- llama-cpp-python VS localLLM_guidance
Sponsored