I would also like to know how one would fine-tune this in 4-bit. I think one could take the merged 8K PEFT with the LLaMA weights, quantize it to 4-bit, and then train with https://github.com/johnsmith0031/alpaca_lora_4bit ?
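The two steps suggested above (merging the PEFT/LoRA delta into the base weights, then quantizing the result to 4-bit) can be sketched numerically. This is only a toy illustration with plain Python lists: a real pipeline would use PEFT's `merge_and_unload()` on actual model tensors and a GPTQ-style quantizer rather than simple round-to-nearest. The function names `merge_lora` and `quantize_4bit` are hypothetical, not part of any library.

```python
# Toy sketch (assumption, not the actual alpaca_lora_4bit pipeline):
# 1) merge a LoRA delta into the base weights: W' = W + alpha * (B @ A)
# 2) symmetric round-to-nearest 4-bit quantization of the merged weights

def matmul(B, A):
    """Multiply B (m x r) by A (r x n) as nested lists."""
    m, r, n = len(B), len(A), len(A[0])
    return [[sum(B[i][k] * A[k][j] for k in range(r)) for j in range(n)]
            for i in range(m)]

def merge_lora(W, A, B, alpha=1.0):
    """Return merged weight W' = W + alpha * (B @ A)."""
    delta = matmul(B, A)
    return [[W[i][j] + alpha * delta[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]

def quantize_4bit(W):
    """Symmetric RTN quantization to 4-bit integers in [-8, 7] plus a scale."""
    max_abs = max(abs(x) for row in W for x in row) or 1.0
    scale = max_abs / 7.0
    q = [[max(-8, min(7, round(x / scale))) for x in row] for row in W]
    return q, scale

W = [[0.10, -0.20], [0.30, 0.05]]   # base weight (2x2)
A = [[0.01, 0.02]]                  # LoRA A (rank 1, 1x2)
B = [[0.5], [-0.5]]                 # LoRA B (2x1)

merged = merge_lora(W, A, B)
q, scale = quantize_4bit(merged)
```

Training then continues on the quantized merged weights with new low-rank adapters, which is essentially what the linked alpaca_lora_4bit repo does at model scale.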
The one thing I have published is my Docker files for producing my two Runpod templates, which let people try GGML and GPTQ models on Runpod pods with full GPU acceleration (ExLlama and AutoGPTQ). They can be found at https://github.com/TheBlokeAI/dockerLLM/ .