-
serge
A web interface for chatting with Alpaca through llama.cpp. Fully dockerized, with an easy to use API.
-
Local-LLM-Comparison-Colab-UI
Compare the performance of different LLMs that can be deployed locally on consumer hardware. Run it yourself with the Colab WebUI.
-
SillyTavern
Discontinued LLM Frontend for Power Users. [Moved to: https://github.com/SillyTavern/SillyTavern] (by Cohee1207)
-
langflow
⛓️ Langflow is a dynamic graph where each node is an executable unit. Its modular and interactive design fosters rapid experimentation and prototyping, pushing hard on the limits of creativity.
No, I still use ooba's fork to ensure the widest compatibility. I would love to use a later version - specifically, I want to move to AutoGPTQ. But if I do that, people who are still using ooba's fork (which is like 90% of people) can't use CPU offloading. They get a ton of errors and it just breaks.
Here's the script I use to merge a LoRA onto a base model: https://gist.github.com/TheBloke/d31d289d3198c24e0ca68aaf37a19032 (a slightly modified version of https://github.com/bigcode-project/starcoder/blob/main/finetune/merge_peft_adapters.py)
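For readers who have not merged an adapter before, the general idea can be sketched as follows. This is not the linked gist, only a minimal sketch assuming the Hugging Face `transformers` and `peft` libraries are installed; the function names, the argument names (`--base_model`, `--lora`, `--out_dir`), and the choice of float16 are illustrative assumptions.

```python
import argparse


def build_arg_parser() -> argparse.ArgumentParser:
    """Command-line arguments for the sketch (names are hypothetical)."""
    parser = argparse.ArgumentParser(
        description="Merge a LoRA adapter into its base model."
    )
    parser.add_argument("--base_model", required=True,
                        help="Hub ID or local path of the base model")
    parser.add_argument("--lora", required=True,
                        help="Hub ID or local path of the LoRA adapter")
    parser.add_argument("--out_dir", required=True,
                        help="Directory to write the merged model to")
    return parser


def merge_lora(base_model: str, lora: str, out_dir: str) -> None:
    """Load the base model, apply the adapter, fold it in, and save."""
    # Heavy imports are kept local so the module can be imported cheaply.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # float16 is an assumption; pick the dtype that suits your hardware.
    base = AutoModelForCausalLM.from_pretrained(
        base_model, torch_dtype=torch.float16
    )
    # Wrap the base model with the LoRA adapter weights.
    model = PeftModel.from_pretrained(base, lora)
    # Fold the adapter weights into the base weights and drop the wrapper.
    merged = model.merge_and_unload()

    merged.save_pretrained(out_dir)
    # Save the tokenizer alongside so the output directory is self-contained.
    AutoTokenizer.from_pretrained(base_model).save_pretrained(out_dir)


def main(argv=None) -> None:
    args = build_arg_parser().parse_args(argv)
    merge_lora(args.base_model, args.lora, args.out_dir)
```

Calling `main(["--base_model", ..., "--lora", ..., "--out_dir", ...])` with real paths would run the merge; the merged directory can then be quantized like any other full model.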
BTW, are you using this llama.py for quantization? https://github.com/qwopqwop200/GPTQ-for-LLaMa/blob/triton/llama.py
u/The-Bloke, Serge is with you (https://github.com/nsarrazin/serge/pull/334/files). Can you suggest the best GGML models to set in the model manager currently? :)
Colab webui for the guanaco-13B-GPTQ: Link
https://github.com/Cohee1207/SillyTavern is the repo, and you will find everything you need there. I use the Ooba Text Generation API as the backend.
#1: OfflineAI example stack: PrivateGPT: Interact privately with your documents using the power of GPT, 100% privately, no data leaks | 0 comments
#2: 💫 Found Offline Code AI: StarCoder: How to use an LLM to code | 0 comments
#3: 🍿Oobabooga with NEW Uncensored Wizard Mega 13B Model | 1 comment