Also, probably the easiest way to get started would be to install oobabooga's web UI (there are one-click installers for various operating systems), then pair it with a GPTQ-quantized (not GGML) model. You'll also want the smaller 4-bit file (i.e., the one without groupsize 128) where applicable, to avoid running into issues at longer context lengths. Here are the appropriate files for GPT4-X-Alpaca-30b and WizardLM-30B, which are both good choices.
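To see why the file without groupsize 128 is the safer pick, here is a rough back-of-the-envelope VRAM estimate for a 30B model. The per-group overhead figures (an fp16 scale plus a 4-bit zero point per group) are an assumption for illustration, not exact numbers from any loader:

```python
def model_vram_gb(n_params, bits_per_weight=4.0, groupsize=None):
    """Rough VRAM estimate for a quantized model's weights.

    Assumption: groupsize quantization adds one fp16 scale (16 bits)
    and one 4-bit zero point per group of weights.
    """
    bits = bits_per_weight
    if groupsize:
        bits += (16 + 4) / groupsize  # per-group scale + zero-point overhead
    return n_params * bits / 8 / 1e9

# 30B model: plain 4-bit vs. 4-bit with groupsize 128
print(round(model_vram_gb(30e9), 1))                 # ~15.0 GB
print(round(model_vram_gb(30e9, groupsize=128), 1))  # ~15.6 GB
```

That extra half-gigabyte or so matters on a 16 GB card, since the KV cache for the context window has to fit in whatever headroom remains.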
It's worth noting that you'll need a recent release of llama.cpp to run GGML models with GPU acceleration (here is the latest build for CUDA 12.1), and you'll need to install a recent CUDA version if you haven't already (here is the CUDA 12.1 toolkit installer -- mind, it's over 3 GB).
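Once you have a CUDA-enabled build, GPU acceleration is controlled by how many layers you offload. As a minimal sketch, here is how such a command line is assembled; the binary name and model filename are illustrative, but `--n-gpu-layers` and `-c` are the actual llama.cpp flags:

```python
import shlex

def llama_cpp_cmd(model_path, n_gpu_layers=35, ctx=2048, prompt="Hello"):
    """Assemble a llama.cpp command line with GPU offloading.

    --n-gpu-layers controls how many transformer layers are offloaded
    to the GPU; -c sets the context length. The ./main binary name and
    model path here are placeholders.
    """
    args = [
        "./main",
        "-m", model_path,
        "--n-gpu-layers", str(n_gpu_layers),
        "-c", str(ctx),
        "-p", prompt,
    ]
    return shlex.join(args)

print(llama_cpp_cmd("models/wizardlm-30b.ggmlv3.q4_0.bin"))
```

Start with a modest layer count and raise it until VRAM is nearly full; offloading more layers is faster, but overshooting causes out-of-memory errors.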
Here are the codebase and dataset for WizardLM: https://github.com/nlpxucan/WizardLM https://github.com/AetherCortex/Llama-X https://huggingface.co/datasets/victor123/evol_instruct_70k
Here are the codebase and dataset for WizardVicuna: https://github.com/melodysdreamj/WizardVicunaLM https://github.com/lm-sys/FastChat https://huggingface.co/datasets/RyokoAI/ShareGPT52K
There was a special release of Koboldcpp that features GPU offloading; it's a 418 MB file due to all the libraries needed to support CUDA. There are hints that it might be a one-off thing, but it'll at least work until the model formats change again.
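A rough heuristic for picking the layer count to offload, whether in Koboldcpp or llama.cpp, is to divide your spare VRAM by the approximate per-layer size of the model. This assumes layers are roughly equal in size and reserves some VRAM for the scratch buffers and KV cache; both are simplifications, so treat the result as a starting point:

```python
def layers_to_offload(vram_gb, n_layers, model_gb, reserve_gb=1.5):
    """Estimate how many transformer layers fit on the GPU.

    Heuristic only: assumes all layers are roughly the same size and
    reserves reserve_gb of VRAM for scratch buffers and the KV cache.
    """
    per_layer_gb = model_gb / n_layers
    usable = max(vram_gb - reserve_gb, 0)
    return min(n_layers, int(usable / per_layer_gb))

# e.g. a 30B 4-bit model (~15 GB, 60 layers) on an 8 GB card
print(layers_to_offload(8, 60, 15.0))   # about 26 layers
```

Anything that doesn't fit stays on the CPU, so partial offloading still helps, it just scales speed with the fraction of layers on the GPU.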