chat-ui
qlora
Metric | chat-ui | qlora
---|---|---
Mentions | 40 | 80
Stars | 6,369 | 9,500
Growth | 10.8% | -
Activity | 9.7 | 7.4
Last commit | 4 days ago | 8 months ago
Language | TypeScript | Jupyter Notebook
License | Apache License 2.0 | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
chat-ui
-
Zephyr 141B, a Mixtral 8x22B fine-tune, is now available in Hugging Chat
Zephyr 141B is a Mixtral 8x22B fine-tune. Here are some interesting details
- Base model: Mixtral 8x22B, 8 experts, 141B total params, 35B activated params
- Fine-tuned with ORPO, a new alignment algorithm with no SFT step (hence much faster than DPO/PPO)
- Trained with 7K open data instances -> high-quality, synthetic, multi-turn
- Apache 2
Everything is open:
- Final Model: https://huggingface.co/HuggingFaceH4/zephyr-orpo-141b-A35b-v...
- Base Model: https://huggingface.co/mistral-community/Mixtral-8x22B-v0.1
- Fine-tune data: https://huggingface.co/datasets/argilla/distilabel-capybara-...
- Recipe/code to train the model: https://huggingface.co/datasets/argilla/distilabel-capybara-...
- Open-source inference engine: https://github.com/huggingface/text-generation-inference
- Open-source UI code https://github.com/huggingface/chat-ui
Have fun!
-
AI enthusiasm - episode #2
As long as you have a free Hugging Face account, you can sign up and use HuggingChat, a web-based chat interface where you will find 5 large language models to play with (Mixtral-7B-it v0.1 and v0.2, Command R plus, Gemma 1.1-7B-it, Dolphin). You can also use several assistants made by the Hugging Face community, or even create your own!
-
OpenAI Startup Fund: GP Hallucination
I submitted something about this the other day (and it got flagged)- poked around a little bit and the only interesting thing I could find is this: https://github.com/huggingface/chat-ui/issues/254 and I don't really even understand what it is, it references the stuff the dude who wrote this is discussing. I had kinda written the whole thing off as someone with too much time on their hands and is just f'ing around with stuff for whatever reason.
I think they made this as well: https://chat.openai.com/g/g-KT4gusP3Y-a-l-i-s-t-a-i-r-e-earl... - it doesn't seem very useful.
¯\_(ツ)_/¯ To me, after spending an hr or so poking around, it seemed like a bored, modern, tech-savvy young person playing around.
- Embeddings, Chatbots RAG Arena and OPT-NC telecom plans
-
Show HN: I made an app to use local AI as daily driver
- https://github.com/huggingface/chat-ui
-
Deconstructing Hugging Face Chat: Explore open-source chat UI/UX for generative AI
Hugging Face Chat - open-source repo powering Hugging Chat!
-
What are you guys using local LLMs for?
If you don't want to do any coding, I think HuggingFace's chat-ui can come in handy, with web retrieval RAG and llama-cpp running as a server. Please check their documentation on how to set it up (see the "Running your own models using a custom endpoint" section on their GitHub).
-
The founder of OpenAI/ChatGPT is a Zionist calling people that are against Israeli genocide "antisemitist", how dare the American left speak against genocide!?
yes! it's proprietary, invasive, and harvests your data and use it for improving the AI, Ultman went to Israel weeks after Chatgpt was introduced, Israel like any other tech-giant-country needs to make sure that it has control over that data and/or use it to achieve its goals, so it's better to find offline FOSS alternatives (if you have a decent enough PC) or use HuggingChat as an online FOSS alternative, I find it better than GPT 3.5 in many aspects
-
Smartphone Brands Sorted Out, So You Don't Have To
I have categorized some of the smartphone brands by their parent company using HuggingChat based on RLHF, Google's Bard, ChatGPT, and Perplexity. All of them are powered by LLMs, and both ChatGPT and Perplexity use GPT-3.5.
-
Accessing ChatGPT in non-official UI
I'm looking for something like https://huggingface.co/chat/ or OpenAssistant, but it should target OpenAI's api.
qlora
- FLaNK Stack Weekly for 30 Oct 2023
-
I released Marx 3B V3.
Marx 3B V3 is StableLM 3B 4E1T instruction-tuned on EverythingLM Data V3 (ShareGPT format) for 2 epochs using QLoRA.
-
Tuning and Testing Llama 2, Flan-T5, and GPT-J with LoRA, Sematic, and Gradio
https://github.com/artidoro/qlora
The tools and mechanisms for getting a model to do what you want are changing constantly and quickly. Build and understand a notebook yourself, and reduce dependencies. You will need to switch them.
-
Yet another QLoRA tutorial
My own project right now is still in raw generated form, and this now makes me think about trying qlora's scripts since this gives me some confidence I should be able to get it to turn out now that someone else has carved a path and charted the map. I was going to target llamatune which was mentioned here the other day.
-
Creating a new Finetuned model
Most papers I read showed at least a thousand, even 10,000 in several cases, so I assumed that to be the trend for low-rank adapter (PEFT) training. (Sources: [2305.14314] QLoRA: Efficient Finetuning of Quantized LLMs (arxiv.org), Stanford CRFM (Alpaca), and, as the minimum, openchat/openchat · Hugging Face. There are a lot more examples.)
-
[R] LaVIN-lite: Training your own Multimodal Large Language Models on one single GPU with competitive performance! (Technical Details)
4-bit quantization training mainly refers to qlora. Simply put, qlora quantizes the weights of the LLM into 4-bit for storage, while dequantizing them into 16-bit during the training process to ensure training precision. This method significantly reduces GPU memory overhead during training (the training speed should not vary much). This approach is highly suitable to be combined with parameter-efficient methods. However, the original paper was designed for single-modal LLMs and the code has already been wrapped in HuggingFace's library. Therefore, we extracted the core code from HuggingFace's library and migrated it into LaVIN's code. The main principle is to replace all linear layers in LLM with 4-bit quantized layers. Those interested can refer to our implementation in quantization.py and mm_adaptation.py, which is roughly a dozen lines of code.
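The "store in 4 bits, dequantize for compute" idea described above can be sketched in a few lines of plain Python. This is a simplified illustration only: it assumes uniform absmax quantization over one small block, not the NF4 data type used by QLoRA, and the function names and sample weights are hypothetical rather than taken from LaVIN's quantization.py or any library.

```python
# Simplified sketch: uniform absmax 4-bit quantization of one weight block.
# Assumption: this is NOT the NF4 scheme from the QLoRA paper; it only shows
# the store-as-4-bit / dequantize-before-compute pattern.

def quantize_block(weights):
    """Map floats to signed 4-bit codes in [-7, 7] plus one scale per block."""
    absmax = max(abs(w) for w in weights)
    scale = absmax / 7 if absmax > 0 else 1.0
    codes = [max(-7, min(7, round(w / scale))) for w in weights]
    return codes, scale

def dequantize_block(codes, scale):
    """Recover approximate float weights for use in the forward pass."""
    return [c * scale for c in codes]

def pack_nibbles(codes):
    """Pack two 4-bit codes per byte (offset by 8 so each code is unsigned)."""
    if len(codes) % 2:
        codes = codes + [0]
    return bytes(((a + 8) << 4) | (b + 8) for a, b in zip(codes[::2], codes[1::2]))

def unpack_nibbles(packed, n):
    """Inverse of pack_nibbles, returning the first n signed codes."""
    codes = []
    for byte in packed:
        codes.append((byte >> 4) - 8)
        codes.append((byte & 0x0F) - 8)
    return codes[:n]

weights = [0.12, -0.50, 0.33, 0.07, -0.21, 0.49, -0.02, 0.00]  # made-up values
codes, scale = quantize_block(weights)
packed = pack_nibbles(codes)  # 8 weights stored in 4 bytes
restored = dequantize_block(unpack_nibbles(packed, len(weights)), scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

The rounding error per weight is bounded by half the block scale, which is why real schemes keep blocks small and choose code values that match the weight distribution.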
-
[D] To all the machine learning engineers: most difficult model task/type you've ever had to work with?
There have been some new developments like QLoRA, which help fine-tune LLMs without updating all the weights.
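To make "without updating all the weights" concrete, here is a rough parameter count for a LoRA-style adapter on a single linear layer. It assumes the standard low-rank form, where a frozen d_out x d_in weight gets a trainable update built from a d_out x r matrix and an r x d_in matrix; the layer size and rank below are illustrative values, not figures from the post.

```python
def full_params(d_out, d_in):
    """Parameters updated when fine-tuning the whole weight matrix."""
    return d_out * d_in

def lora_params(d_out, d_in, rank):
    """Parameters in the two low-rank factors (d_out x r and r x d_in)."""
    return rank * (d_out + d_in)

# Illustrative values only (assumed, not from the source).
d_out, d_in, rank = 4096, 4096, 16

full = full_params(d_out, d_in)
lora = lora_params(d_out, d_in, rank)
fraction = lora / full

print(f"full: {full:,}  lora: {lora:,}  trainable fraction: {fraction:.4%}")
```

With these numbers the adapter trains under 1% of the layer's parameters, while the original weights stay frozen.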
-
Finetune MPT-30B using QLORA
This might be helpful: https://github.com/artidoro/qlora/issues/10
-
is lora fine-tuning on 13B/33B/65B comparable to full fine-tuning?
Curious, since the qlora paper only reports the lora/qlora comparison against full fine-tuning for small 7B models. For 13B/33B/65B, it does not do so (table 4 in the paper). It would be helpful if anyone could provide links where I can read more on the efficacy of lora or the disadvantages of lora.
-
Need a detailed tutorial on how to create and use a dataset for QLoRA fine-tuning.
This might not be an appropriate answer, but did you take a look at this repository? https://github.com/artidoro/qlora With artidoro's repository it's pretty easy to train qlora. You just prepare your own dataset and run the following command:
python qlora.py --model_name_or_path --dataset="path/to/your/dataset" --dataset_format="self-instruct"
This is only available for several dataset formats, but every dataset format has to have input-output pairs. So the dataset JSON has to look like this:
[ { "input": "something", "output": "something" }, { "input": "something", "output": "something" } ]
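A small sketch of preparing a file in the input/output shape described in the post. The records, the file name my_dataset.json, and the validate helper are all hypothetical; whether qlora.py accepts exactly this file is taken from the post and has not been verified here.

```python
import json
import os
import tempfile

# Hypothetical example records in the input/output shape from the post.
records = [
    {"input": "Translate to French: cat", "output": "chat"},
    {"input": "What is 2 + 2?", "output": "4"},
]

def validate(items):
    """Check that every record is an input/output pair of strings."""
    for i, rec in enumerate(items):
        if set(rec) != {"input", "output"}:
            raise ValueError(f"record {i} must have exactly 'input' and 'output' keys")
        if not all(isinstance(rec[key], str) for key in ("input", "output")):
            raise ValueError(f"record {i} must contain only string values")
    return True

validate(records)

# Write to a temporary directory; the file name is an assumption.
path = os.path.join(tempfile.mkdtemp(), "my_dataset.json")
with open(path, "w", encoding="utf-8") as f:
    json.dump(records, f, ensure_ascii=False, indent=2)

# Read it back to confirm the file is valid JSON with the same content.
with open(path, encoding="utf-8") as f:
    loaded = json.load(f)
```

Validating before writing catches missing keys early, which is cheaper than finding out partway through a training run.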
What are some alternatives?
promptfoo - Test your prompts, models, and RAGs. Catch regressions and improve prompt quality. LLM evals for OpenAI, Azure, Anthropic, Gemini, Mistral, Llama, Bedrock, Ollama, and other local & private models with CI/CD integration.
alpaca-lora - Instruct-tune LLaMA on consumer hardware
DiscordChatExporter-frontend - Browse json files exported by Tyrrrz/DiscordChatExporter in familiar discord like user interface
GPTQ-for-LLaMa - 4 bits quantization of LLaMA using GPTQ
WizardLM - Family of instruction-following LLMs powered by Evol-Instruct: WizardLM, WizardCoder and WizardMath
bitsandbytes - Accessible large language models via k-bit quantization for PyTorch.
basaran - Basaran is an open-source alternative to the OpenAI text completion API. It provides a compatible streaming API for your Hugging Face Transformers-based text generation models.
ggml - Tensor library for machine learning
Open-Assistant - OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.
alpaca_lora_4bit
AgileRL - Streamlining reinforcement learning with RLOps. State-of-the-art RL algorithms and tools.
llm-foundry - LLM training code for Databricks foundation models