text-generation-webui
A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), and Llama models.
Get rid of everything you have and just use the installer I linked HERE. It will fetch everything for you and keep it up to date.
I'm just throwing it out there, but I had to use oobabooga's fork of GPTQ-for-LLaMa and make sure I was on the cuda branch (https://github.com/oobabooga/GPTQ-for-LLaMa.git).
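For reference, a minimal sketch of what that setup looks like. The repositories/ location and the setup_cuda.py build step are assumptions based on how text-generation-webui typically consumed this fork, not something the post spells out:

```bash
# Sketch: clone oobabooga's GPTQ-for-LLaMa (cuda branch) where
# text-generation-webui expects it. Paths and steps are assumptions.
cd text-generation-webui
mkdir -p repositories
cd repositories
git clone -b cuda https://github.com/oobabooga/GPTQ-for-LLaMa.git
cd GPTQ-for-LLaMa
pip install -r requirements.txt   # the fork's Python deps, if the file exists
python setup_cuda.py install      # compiles the CUDA quantization kernel
```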
I can load it with GPTQ on a 6700 XT using a fork: https://github.com/YellowRoseCx/GPTQ-for-LLaMa. You can update the post for AMD users if you want.
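A rough sketch of the AMD variant, assuming the fork keeps upstream's layout and setup script. The gfx override for RDNA2 cards like the 6700 XT is an assumption based on common ROCm workarounds, not something the post confirms:

```bash
# Sketch for AMD: same layout as the cuda-branch install above,
# but using the ROCm-friendly fork. Steps are assumptions.
cd text-generation-webui/repositories
git clone https://github.com/YellowRoseCx/GPTQ-for-LLaMa
cd GPTQ-for-LLaMa
export HSA_OVERRIDE_GFX_VERSION=10.3.0   # make ROCm treat the GPU as gfx1030 (common RDNA2 workaround)
python setup_cuda.py install             # hipified build, if the fork keeps upstream's setup script
```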
There are several ROCm ports of bitsandbytes, for example this one: https://github.com/broncotc/bitsandbytes-rocm. The problem is compatibility with the rest of the requirements. Kobold does a better job than ooba of offering a streamlined approach for AMD users.
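For context, a sketch of how that ROCm port was typically built; the make target and install step are assumptions based on the usual pattern for these forks:

```bash
# Sketch: build the ROCm port of bitsandbytes from source.
# Target and install commands are assumptions.
git clone https://github.com/broncotc/bitsandbytes-rocm
cd bitsandbytes-rocm
make hip                  # build the HIP kernels instead of the CUDA ones
python setup.py install   # install into the current environment
```

The compatibility problem mentioned above typically bites afterwards: reinstalling from webui's requirements.txt can pull the upstream CUDA bitsandbytes wheel back in and overwrite the ROCm build.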
On a successful import, bitsandbytes prints its banner: "Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues"