LocalAI vs go-llama.cpp

| | LocalAI | go-llama.cpp |
|---|---|---|
| Mentions | 83 | 4 |
| Stars | 19,862 | 561 |
| Growth | 8.3% | 5.7% |
| Activity | 9.9 | 7.9 |
| Latest commit | 4 days ago | 6 days ago |
| Language | C++ | C++ |
| License | MIT License | MIT License |
Stars - the number of stars a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed; recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is among the top 10% of the most actively developed projects we track.
LocalAI
- LocalAI: Self-hosted OpenAI alternative reaches 2.14.0
- Drop-In Replacement for ChatGPT API
- Voxos.ai – An Open-Source Desktop Voice Assistant
- Ask HN: Set Up Local LLM
- FLaNK Stack Weekly 11 Dec 2023
- Is there any open source app to load a model and expose API like OpenAI?
- What do you use to run your models?
  If you're running this as a server, I would recommend LocalAI: https://github.com/mudler/LocalAI
- OpenAI Switch Kit: Swap OpenAI with any open-source model
  LocalAI can do that: https://github.com/mudler/LocalAI
  https://localai.io/features/openai-functions/
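LocalAI speaks the OpenAI wire format, so the swap above is mostly a base-URL change. A minimal client sketch (not from the thread); the port and model name are assumptions for a default local setup:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

func main() {
	// Same request body an OpenAI client would send.
	body, err := json.Marshal(map[string]any{
		"model": "gpt-3.5-turbo", // whatever name your LocalAI config exposes
		"messages": []map[string]string{
			{"role": "user", "content": "Say hello"},
		},
	})
	if err != nil {
		panic(err)
	}

	// Only the base URL differs from api.openai.com; 8080 is LocalAI's default port.
	resp, err := http.Post("http://localhost:8080/v1/chat/completions",
		"application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var out struct {
		Choices []struct {
			Message struct {
				Content string `json:"content"`
			} `json:"message"`
		} `json:"choices"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		panic(err)
	}
	if len(out.Choices) == 0 {
		panic("empty response")
	}
	fmt.Println(out.Choices[0].Message.Content)
}
```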
- "Romanian ChatGPT"
  For inspiration: LocalAI, a replacement for OpenAI. It's already hot on GitHub.
- Local LLMs to run on old iMac / hardware
  Your hardware should be fine for inference, as long as you don't bother trying to get the GPU working.
  My $0.02 would be to try getting LocalAI running on your machine with OpenCL/CLBlast acceleration for your CPU. If you're running other things, you can limit the inference process to 2 or 3 threads; that should get it working. I've been able to run inference with even 13B models on cheap Rockchip SoCs, so your CPU should be fine, even if it's a little outdated.
  LocalAI: https://github.com/mudler/LocalAI
  Some decent models to start with:
  TinyLlama (extremely small/fast): https://huggingface.co/TheBloke/TinyLlama-1.1B-Chat-v0.3-GGU...
  Dolphin Mistral (larger, better responses): https://huggingface.co/TheBloke/dolphin-2.1-mistral-7B-GGUF
go-llama.cpp
- Local LLMs: Are there any yet for <= 4 GB of VRAM?
- LocalAI v1.19.0 - CUDA GPU support!
  Full CUDA GPU offload support (PR by mudler; thanks to chnyda for handing over GPU access, and to lu-zero for helping with debugging).
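At the time, LocalAI's llama backend was built on go-llama.cpp, which exposes the offload knob directly. A minimal sketch of GPU offload at that layer; the model path and layer count are placeholders, not from the release notes:

```go
package main

import (
	"fmt"

	llama "github.com/go-skynet/go-llama.cpp"
)

func main() {
	// SetGPULayers tells llama.cpp how many transformer layers to
	// offload to the GPU; 0 keeps inference entirely on the CPU.
	l, err := llama.New("./models/model.gguf",
		llama.SetContext(512),
		llama.SetGPULayers(35),
	)
	if err != nil {
		panic(err)
	}
	defer l.Free()

	out, err := l.Predict("Hello", llama.SetTokens(32))
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```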
- Could I get a suggestion for a simple HTTP API with no GUI for llama.cpp?
  Go: go-skynet/go-llama.cpp
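A minimal sketch of such a GUI-less server, wrapping go-llama.cpp in net/http; the model path, port, and token budget are placeholder assumptions:

```go
package main

import (
	"io"
	"net/http"
	"sync"

	llama "github.com/go-skynet/go-llama.cpp"
)

func main() {
	model, err := llama.New("./models/model.gguf", llama.SetContext(512))
	if err != nil {
		panic(err)
	}
	defer model.Free()

	// A llama.cpp context is not safe for concurrent use, so requests
	// are serialized with a mutex.
	var mu sync.Mutex

	// POST a plain-text prompt to /completion; the completion comes back
	// as plain text.
	http.HandleFunc("/completion", func(w http.ResponseWriter, r *http.Request) {
		prompt, err := io.ReadAll(r.Body)
		if err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		mu.Lock()
		out, err := model.Predict(string(prompt), llama.SetTokens(128))
		mu.Unlock()
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		io.WriteString(w, out)
	})

	http.ListenAndServe(":8080", nil)
}
```

With the server running, something like `curl -d "Hello" http://localhost:8080/completion` returns the completion as plain text.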
- Redirecting model outputs from llama.cpp to a TXT file for easier tracking of results?
  I've had great success using go-llama.cpp to wrap llama.cpp in a much friendlier language. The install process is a bit clunky: Go does not like compiling submodules, so you need to use a replace directive in the go.mod file to point at a local copy of go-llama.cpp that you've already compiled manually.
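A sketch of that go.mod workaround, assuming your project sits next to a go-llama.cpp clone you have already built; the module name and relative path are illustrative:

```
// go.mod of your own project
module example.com/llama-to-txt

go 1.21

require github.com/go-skynet/go-llama.cpp v0.0.0

// Point the dependency at a local checkout in which the C bindings have
// already been compiled (per the go-llama.cpp README: `make libbinding.a`
// in a clone with submodules checked out).
replace github.com/go-skynet/go-llama.cpp => ../go-llama.cpp
```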
What are some alternatives?
gpt4all - gpt4all: run open-source LLMs anywhere
llama-cpp-python - Python bindings for llama.cpp
ollama - Get up and running with Llama 3, Mistral, Gemma, and other large language models.
llama.cpp-dotnet - Minimal C# bindings for llama.cpp + .NET core library with API host/client.
llama_cpp.rb - llama_cpp provides Ruby bindings for llama.cpp
private-gpt - Interact with your documents using the power of GPT, 100% privately, no data leaks
LLamaSharp - A C#/.NET library to run LLM models (🦙LLaMA/LLaVA) on your local device efficiently.
text-generation-webui - A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.
llama-node - Believe in AI democratization: llama for Node.js, backed by llama-rs, llama.cpp, and rwkv.cpp; works locally on your laptop CPU. Supports llama/alpaca/gpt4all/vicuna/rwkv models.
FastChat - An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.