I am a noob. I saw your comment on GitHub and another post here. I am confused about what has changed and what we users have to do. Do we have to update llama.cpp and re-download all the models? (I am using something called catai instead of the webui; I think it also uses llama.cpp.) How do we know which versions of the models are compatible with which versions of llama.cpp?
If you don't have a usable GPU (you'll need an Nvidia GPU with at least 10GB of VRAM), the other option is CPU inference. text-generation-webui can do that too, but at the moment it doesn't support the new quantisation format that came out a couple of days ago. So the alternative would be to download llama.cpp and run it from the command line/cmd.exe. You can get it from https://github.com/ggerganov/llama.cpp.
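For reference, that route looks roughly like this. The model path below is a placeholder, and the flags match the llama.cpp README at the time of writing; check the repo, since the build steps and options change often:

```shell
# Clone and build llama.cpp (on Windows, see the repo's CMake instructions)
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make

# Run CPU inference against a quantised model file you've downloaded
# ("./models/7B/ggml-model-q4_0.bin" is a placeholder path)
./main -m ./models/7B/ggml-model-q4_0.bin -p "Hello" -n 128
```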
Ok, understood! So, two options: first, you could still use text-generation-webui with its --api option and then access the API it provides. That exposes a simple REST API you can call from whatever code you like; there is sample Python code in the repo: https://github.com/oobabooga/text-generation-webui/blob/main/api-example.py
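As a rough sketch of what calling that API looks like: the endpoint path and field names below are assumed from the api-example.py script as of mid-2023 and may have changed, and `build_payload`/`generate` are my own helper names, not part of the webui:

```python
import json
import urllib.request

def build_payload(prompt, max_new_tokens=200, temperature=0.7):
    """Assemble the request body the webui's /api/v1/generate endpoint
    expects (field names assumed from api-example.py; verify against
    your version of the webui)."""
    return {
        "prompt": prompt,
        "max_new_tokens": max_new_tokens,
        "temperature": temperature,
        "do_sample": True,
    }

def generate(prompt, host="http://127.0.0.1:5000"):
    """POST a prompt to a locally running webui started with --api."""
    req = urllib.request.Request(
        f"{host}/api/v1/generate",
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["results"][0]["text"]
```

Any language that can make HTTP requests works the same way; the webui just has to be running with the --api flag.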
But the ideal way would be to use your own Python code to load it directly. The future of GPTQ will be the AutoGPTQ repo (https://github.com/PanQiWei/AutoGPTQ). It's still quite new and under active development, with a few bugs and issues still to sort out. But it's making good progress.
The best way to start is to train an 8-bit or 4-bit LoRA of Alpaca 7B. You can do that on your own hardware: https://github.com/tloen/alpaca-lora
There's a bug report here: "The seed is not randomized?" · Issue #164 · LostRuins/koboldcpp. Not sure if that's where the issue is, but I'm watching it before I continue further analysis.
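For anyone following along, the behaviour at issue is roughly this: if a program seeds its RNG with a fixed value instead of something that varies per run, every run produces the same output. A minimal sketch of the distinction, using plain Python `random` rather than koboldcpp's actual sampler:

```python
import random
import time

def make_rng(seed=None):
    # A fixed seed reproduces the same sequence on every run; passing
    # None derives a fresh seed from the clock so each run differs.
    # The bug report suggests koboldcpp behaves like the fixed-seed case.
    if seed is None:
        seed = time.time_ns()
    return random.Random(seed)

# Same seed -> identical draws (deterministic, i.e. "not randomized")
assert make_rng(42).random() == make_rng(42).random()
```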