GPTQ-for-LLaMa-API vs koboldcpp

| | GPTQ-for-LLaMa-API | koboldcpp |
|---|---|---|
| Mentions | 5 | 180 |
| Stars | 40 | 3,951 |
| Growth | - | - |
| Activity | 4.7 | 10.0 |
| Last commit | 12 months ago | 3 days ago |
| Language | Python | C++ |
| License | Apache License 2.0 | GNU Affero General Public License v3.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
GPTQ-for-LLaMa-API
- Alternative ways for running models locally and hosting APIs
- Can someone explain why there isn't a good interface for the oobabooga api in langchain?
oobabooga has to support way too many models, which makes the whole thing unnecessarily complicated. If you have some development experience, you could build your own API in a few lines of Python code. It's not hard if you build it from scratch and learn along the way. I have built some example repositories for hosting GPTQ-related models; you can have a look at them (a rough sketch of this approach appears at the end of this section). https://github.com/mzbac/GPTQ-for-LLaMa-API https://github.com/mzbac/gptq-cuda-api
- Looking to selfhost Llama on remote server, could use some help
I ran this https://github.com/mzbac/GPTQ-for-LLaMa-API for my home server. It should be easy enough to create a Dockerfile and make it hostable via Docker.
- How do I load a gptq LLaMA model (Vicuna) in .safetensors format?
If you have some experience with Python, you can take a look at my repo. It only has the minimal logic for loading a GPTQ model and serving it as an API (see the loading sketch at the end of this section). https://github.com/mzbac/GPTQ-for-LLaMa-API
- Just created a repository to show how to serve a GPTQ model via an API
Hopefully, it will make it easier for any developer who wants to build some integration with their app. https://github.com/mzbac/GPTQ-for-LLaMa-API
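For the .safetensors question above, a minimal loading sketch using the AutoGPTQ library is shown below; the local path, model choice, and generation settings are illustrative assumptions rather than code taken from the repo.

```python
# Minimal sketch: load a GPTQ-quantized model stored as .safetensors and run a
# single generation. Assumes the auto-gptq and transformers packages and a CUDA
# GPU; "./vicuna-gptq" is a hypothetical local checkpoint directory.
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

model_dir = "./vicuna-gptq"  # hypothetical path to the quantized checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_dir, use_fast=True)
model = AutoGPTQForCausalLM.from_quantized(
    model_dir,
    use_safetensors=True,  # read the .safetensors weights
    device="cuda:0",
)

inputs = tokenizer("Hello, how are you?", return_tensors="pt").to("cuda:0")
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```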
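Serving such a model "in a few lines of Python", as suggested in the first mention, can be sketched with FastAPI; the `/generate` endpoint and its request fields are made up for illustration and are not GPTQ-for-LLaMa-API's actual interface. The block reuses `model` and `tokenizer` from the loading sketch above.

```python
# Minimal sketch: expose the loaded model over HTTP with FastAPI.
# Assumes `model` and `tokenizer` from the previous sketch are importable from a
# hypothetical module named `load_model`.
from fastapi import FastAPI
from pydantic import BaseModel

from load_model import model, tokenizer  # hypothetical module from the sketch above

app = FastAPI()

class GenerateRequest(BaseModel):
    prompt: str
    max_new_tokens: int = 128

@app.post("/generate")  # illustrative endpoint, not the repo's real route
def generate(req: GenerateRequest):
    inputs = tokenizer(req.prompt, return_tensors="pt").to("cuda:0")
    output = model.generate(**inputs, max_new_tokens=req.max_new_tokens)
    return {"text": tokenizer.decode(output[0], skip_special_tokens=True)}

# Run with: uvicorn server:app --host 0.0.0.0 --port 8000
```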
koboldcpp
- Any Online Communities on Local/Home AI?
- Koboldcpp-1.62.1 adds support for Command-R+
- Show HN: I made an app to use local AI as daily driver
- Easiest way to show my model to my mom?
FYI this is the easiest way to host on the horde: https://github.com/LostRuins/koboldcpp
- IT Veteran... why am I struggling with all of this?
- What do you use to run your models?
- ByteDance AI researcher suggests that open source model more powerful than Gemini to be released soon
- i need some help guys
- [Guide] How to install KoboldAI on Android via Termux (Update 04-12-2023)
For more information on Koboldcpp, see this guide: https://github.com/LostRuins/koboldcpp/wiki
- SillyTavern 1.10.10 has been released
Out of curiosity, is there a specific reason for this? The most popular fork, KoboldCpp, is in active development, was the first to adopt the Min P sampler, and even distinguishes itself with the context shift feature. Just wondering what this means for the future. Thanks!
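For readers unfamiliar with the Min P sampler mentioned above, the rule is simple: keep only tokens whose probability is at least `min_p` times the probability of the most likely token, then renormalize. The sketch below illustrates that rule in plain Python/NumPy; it is not KoboldCpp's actual (C++) implementation.

```python
import numpy as np

def min_p_filter(probs, min_p=0.05):
    """Keep tokens with probability >= min_p * max(probs), then renormalize.
    Illustrative sketch of the Min P idea, not KoboldCpp's implementation."""
    probs = np.asarray(probs, dtype=np.float64)
    threshold = min_p * probs.max()
    filtered = np.where(probs >= threshold, probs, 0.0)
    return filtered / filtered.sum()

# A peaked distribution keeps few tokens; a flatter one keeps more.
vocab_probs = np.array([0.70, 0.15, 0.10, 0.04, 0.01])
print(min_p_filter(vocab_probs, min_p=0.1))  # tokens below 0.07 are dropped
```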
What are some alternatives?
gptq-cuda-api
text-generation-inference - Large Language Model Text Generation Inference
text-generation-webui - A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.
llama-cpp-python - Python bindings for llama.cpp
TavernAI - Atmospheric adventure chat for AI language models (KoboldAI, NovelAI, Pygmalion, OpenAI chatgpt, gpt-4)
learn-langchain
KoboldAI - KoboldAI is generative AI software optimized for fictional use, but capable of much more!
ChatRWKV - ChatRWKV is like ChatGPT but powered by RWKV (100% RNN) language model, and open source.
AgentOoba - An autonomous AI agent extension for Oobabooga's web ui
SillyTavern - LLM Frontend for Power Users. [Moved to: https://github.com/SillyTavern/SillyTavern]