peft
FastChat
peft | FastChat | |
---|---|---|
26 | 83 | |
13,877 | 34,277 | |
4.1% | 3.6% | |
9.7 | 9.6 | |
4 days ago | 2 days ago | |
Python | Python | |
Apache License 2.0 | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
peft
- LoftQ: LoRA-fine-tuning-aware Quantization
-
Fine Tuning Mistral 7B on Magic the Gathering Draft
There is not a lot of great content out there making this clear, but basically all that matters for basic fine tuning is how much VRAM you have -- since the 3090 / 4090 have 24GB VRAM they're both pretty decent fine tuning chips. I think you could probably fine-tune a model up to ~13B parameters on one of them with PEFT (https://github.com/huggingface/peft)
-
Whisper prompt tuning
Hi everyone. Recently I've been looking into the PEFT library (https://github.com/huggingface/peft) and I was wondering if it would be possible to do prompt tuning with OpenAI's Whisper model. They have an example notebook for tuning Whisper with LoRA (https://colab.research.google.com/drive/1vhF8yueFqha3Y3CpTHN6q9EVcII9EYzs?usp=sharing) but I'm not sure how to go about changing it to use prompt tuning instead.
-
Code Llama - The Hugging Face Edition
In the coming days, we'll work on sharing scripts to train models, optimizations for on-device inference, even nicer demos (and for more powerful models), and more. Feel free to like our GitHub repos (transformers, peft, accelerate). Enjoy!
- PEFT 0.5 supports fine-tuning GPTQ models
-
Exploding loss when trying to train OpenOrca-Platypus2-13B
image
-
[D] Is there a difference between p-tuning and prefix tuning ?
I discussed part of this here: https://github.com/huggingface/peft/issues/123
-
How does using QLoRAs when running Llama on CPU work?
It seems like the merge_and_unload function in this PEFT script might be what they are referring to: https://github.com/huggingface/peft/blob/main/src/peft/tuners/lora.py
-
How to merge the two weights into a single weight?
To obtain the original llama model, one may refer to this doc. To merge a lora model with a base model, one may refer to PEFT or use the merge script provided by LMFlow.
-
[D] [LoRA + weight merge every N step] for pre-training?
you could use a callback, like show here, https://github.com/huggingface/peft/issues/286 and call code to merge them here.
FastChat
-
GPT4.5 or GPT5 being tested on LMSYS?
gpt2-chatbot isn't the only "mystery model" on LMSYS. Another is "deluxe-chat".
When asked about it in October last year, LMSYS replied [0] "It is an experiment we are running currently. More details will be revealed later"
One distinguishing feature of "deluxe-chat": although it gives high quality answers, it is very slow, so slow that the arena displays a warning whenever it is invoked
[0] https://github.com/lm-sys/FastChat/issues/2527
-
LLMs on your local Computer (Part 1)
FastChat
- FLaNK AI for 11 March 2024
- FLaNK 04 March 2024
- ChatGPT for Teams
- FastChat: An open platform for training and serving large language models
-
LM Studio – Discover, download, and run local LLMs
How does it compare with something like FastChat? https://github.com/lm-sys/FastChat
Feature set seems like a decent amount of overlap. One limitation of FastChat, as far as I can tell, is that one is limited to the models that FastChat supports (though I think it would be minor to modify it to support arbitrary models?)
-
Video-LLaVA
Looks like the Vicuna repo is Apache 2.0 also[1].
What's the interpretation of copyright law that would prevent the code being Apache 2.0 based on the source of the fine-tuning dataset?
[1] https://github.com/lm-sys/FastChat
-
🔥🚀 Top 10 Open-Source Must-Have Tools for Crafting Your Own Chatbot 🤖💬
Check how to start with FastChat. Support FastChat on GitHub ⭐
-
Show HN: ChatAPI – PWA to Use ChatGPT by API Build with Alpine.js
For something a little heavier but much more robust in terms of features/functionality I've been enjoying FastChat: https://github.com/lm-sys/FastChat
It allows you to plug in different backends so that you can use OpenAI compatible clients with various LLM's, selfhosted or otherwise.
What are some alternatives?
lora - Using Low-rank adaptation to quickly fine-tune diffusion models.
text-generation-webui - A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.
LoRA - Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
llama.cpp - LLM inference in C/C++
alpaca-lora - Instruct-tune LLaMA on consumer hardware
gpt4all - gpt4all: run open-source LLMs anywhere
dalai - The simplest way to run LLaMA on your local machine
bitsandbytes - Accessible large language models via k-bit quantization for PyTorch.
mlc-llm - Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.
LocalAI - :robot: The free, Open Source OpenAI alternative. Self-hosted, community-driven and local-first. Drop-in replacement for OpenAI running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more models architectures. It allows to generate Text, Audio, Video, Images. Also with voice cloning capabilities.
minLoRA - minLoRA: a minimal PyTorch library that allows you to apply LoRA to any PyTorch model.
llama-cpp-python - Python bindings for llama.cpp