I'm running it using https://github.com/ggerganov/llama.cpp. The 4-bit quantized version of the 13B model runs OK without GPU acceleration.
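For context, a minimal CPU-only setup with llama.cpp looks roughly like this. The model path, prompt, and flag values below are illustrative placeholders, not from the original post; check the repo's README for the current build and quantization steps:

```shell
# Clone and build llama.cpp (plain CPU build; no GPU flags needed)
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make

# Run a 4-bit quantized 13B model entirely on the CPU.
# The model path is a placeholder -- convert and quantize the weights
# per the README first. -n caps generated tokens, -t sets CPU threads.
./main -m ./models/13B/ggml-model-q4_0.bin -p "Hello, world" -n 128 -t 8
```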
My question seems to have been answered here: it's a VRAM limitation. That last link appears to support 4-bit models as well. Doesn't seem too bad to set up... though I installed A1111 when it first came out, so I learned through the garbage of that. Lol.
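A rough sanity check on why a 4-bit 13B model fits in ordinary system RAM while full-precision weights blow past typical VRAM. The overhead factor below is an assumption for KV cache and runtime buffers, not a measured figure:

```python
# Back-of-envelope memory estimate for a quantized LLM.
# Assumption: ~20% overhead on top of the raw weights for the
# KV cache and runtime buffers (a rough guess, not a measurement).

def quantized_model_gib(n_params: float, bits_per_weight: float,
                        overhead: float = 0.20) -> float:
    """Approximate resident memory in GiB for a quantized model."""
    bytes_total = n_params * bits_per_weight / 8 * (1 + overhead)
    return bytes_total / 2**30

# 13B parameters at 4 bits per weight
print(f"{quantized_model_gib(13e9, 4):.1f} GiB")   # roughly 7.3 GiB
# Same model at 16-bit precision
print(f"{quantized_model_gib(13e9, 16):.1f} GiB")  # roughly 29.1 GiB
```

At around 7 GiB resident, a q4_0 13B model fits comfortably on a machine with 16 GB of RAM, whereas the same model at 16-bit precision needs more memory than most consumer GPUs have as VRAM.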