trl
qdrant
trl | qdrant | |
---|---|---|
13 | 141 | |
8,176 | 17,943 | |
4.9% | 3.4% | |
9.7 | 9.9 | |
1 day ago | 7 days ago | |
Python | Rust | |
Apache License 2.0 | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
trl
- FLaNK Stack 29 Jan 2024
-
OOM Error while using TRL for RLHF Fine-tuning
I am using TRL for RLHF fine-tuning the Llama-2-7B model and getting an OOM error (even with batch_size=1). If anyone used TRL for RLHF can please tell me what I am doing wrong? Code details can be found in the GitHub issue.
-
[D] Tokenizers Truncation during Fine-tuning with Large Texts
SFTtrainer from huggingface
-
New Open-source LLMs! 🤯 The Falcon has landed! 7B and 40B
For lora - PEFT seems to work. I don't have patience to wait 5 hours, but modifying this example seems to work. You don't even need to modify that much, as their model just as neo-x uses query_key_value name for self-attention.
-
[D] Using RLHF beyond preference tuning
They have examples of making GPT output more positive (code) by using a sentiment model as reward. There are other examples about reducing toxicity, summarization here: https://github.com/lvwerra/trl/tree/main/examples . Should be fairly simple to modify the sentiment example and try the calculator reward you mentioned above.
-
[R] 🤖🌟 Unlock the Power of Personal AI: Introducing ChatLLaMA, Your Custom Personal Assistant! 🚀💬
You can use this -> https://github.com/lvwerra/trl/blob/main/examples/sentiment/scripts/gpt-neox-20b_peft/merge_peft_adapter.py
-
[R] Stanford-Alpaca 7B model (an instruction tuned version of LLaMA) performs as well as text-davinci-003
Just the hh directly. From the results it seems like it might possibly be enough but I might also try instruction tuning then running the whole process from that base. I will also be running the reinforcement learning by using a Lora using this as an example https://github.com/lvwerra/trl/tree/main/examples/sentiment/scripts/gpt-neox-20b_peft
-
[R] A simple explanation of Reinforcement Learning from Human Feedback (RLHF)
This package is pretty simple to use! https://github.com/lvwerra/trl
- Transformer Reinforcement Learning
- trl: Train transformer language models with reinforcement learning
qdrant
-
Hindi-Language AI Chatbot for Enterprises Using Qdrant, MLFlow, and LangChain
Great. Now that we have the embeddings, we need to store them in a vector database. We will be using Qdrant for this purpose. Qdrant is an open-source vector database that allows you to store and query high-dimensional vectors. The easiest way to get started with the Qdrant database is using the docker.
-
Boost Your Code's Efficiency: Introducing Semantic Cache with Qdrant
I took Qdrant for this project. The reason was that Qdrant stands for high-performance vector search, the best choice against use cases like finding similar function calls based on semantic similarity. Qdrant is not only powerful but also scalable to support a variety of advanced search features that are greatly useful to nuanced caching mechanisms like ours.
-
Ask HN: Has Anyone Trained a personal LLM using their personal notes?
I'm currently looking to implement locally, using QDrant [1] for instance.
I'm just playing around, but it makes sense to have a runnable example for our users at work too :) [2].
[1]. https://qdrant.tech/
-
Show HN: A fast HNSW implementation in Rust
Also compare with qdrant's Rust implementation; they tout their performance. https://github.com/qdrant/qdrant/tree/master/lib/segment/src...
-
pgvecto.rs alternatives - qdrant and Weaviate
3 projects | 13 Mar 2024
-
Open-source Rust-based RAG
There are much better known examples, such as https://qdrant.tech/ and https://github.com/lancedb/lancedb
-
Qdrant 1.8.0 - Major Performance Enhancements
For more information, see our release notes. Qdrant is an open source project. We welcome your contributions; raise issues, or contribute via pull requests!
-
Perform Image-Driven Reverse Image Search on E-Commerce Sites with ImageBind and Qdrant
Initialize the Qdrant Client with in-memory storage. The collection name will be “imagebind_data” and we will be using cosine distance.
-
7 Vector Databases Every Developer Should Know!
Qdrant is an open-source vector search engine optimized for performance and flexibility. It supports both exact and approximate nearest neighbor search, providing a balance between accuracy and speed for various AI and ML applications.
- Ask HN: Who is hiring? (February 2024)
What are some alternatives?
lm-human-preferences - Code for the paper Fine-Tuning Language Models from Human Preferences
Milvus - A cloud-native vector database, storage for next generation AI applications
alpaca-lora - Instruct-tune LLaMA on consumer hardware
Weaviate - Weaviate is an open-source vector database that stores both objects and vectors, allowing for the combination of vector search with structured filtering with the fault tolerance and scalability of a cloud-native database.
trlx - A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)
faiss - A library for efficient similarity search and clustering of dense vectors.
LLaMA-8bit-LoRA - Repository for Chat LLaMA - training a LoRA for the LLaMA (1 or 2) models on HuggingFace with 8-bit or 4-bit quantization. Research only.
pgvector - Open-source vector similarity search for Postgres
sparsegpt-for-LLaMA - Code for the paper "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot" with LLaMA implementation.
Elasticsearch - Free and Open, Distributed, RESTful Search Engine
llama-recipes - Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supporting a number of candid inference solutions such as HF TGI, VLLM for local or cloud deployment. Demo apps to showcase Meta Llama3 for WhatsApp & Messenger.
towhee - Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.