Deep_Object_Pose vs trl

| | Deep_Object_Pose | trl |
|---|---|---|
| Mentions | 3 | 14 |
| Stars | 1,042 | 10,572 |
| Growth | 1.3% | 4.1% |
| Activity | 7.8 | 9.8 |
| Latest Commit | 5 months ago | 6 days ago |
| Language | Python | Python |
| License | GNU General Public License v3.0 or later | Apache License 2.0 |
- Stars: the number of stars that a project has on GitHub.
- Growth: month-over-month growth in stars.
- Activity: a relative number indicating how actively a project is being developed; recent commits have higher weight than older ones. For example, an activity of 9.0 indicates that a project is among the top 10% of the most actively developed projects we track.
Deep_Object_Pose
- FLaNK Stack 29 Jan 2024
- 6D object pose estimation by known 3D model
I've been doing some research in this area, and there are a few deep learning solutions to this problem. For example, NVIDIA's Deep Object Pose Estimation (DOPE) will estimate the 6-DoF pose of a known object, but you'll have to train the network if you want to detect a new object. PoseCNN, which someone else mentioned, does a similar thing. CenterPose is more interesting, as it can estimate the pose of an object from a known category, e.g. sneakers or laptops, rather than one specific object (as DOPE and PoseCNN do).
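For concreteness, here is a minimal sketch of the geometry these instance-level methods rely on: DOPE localizes the 2D projections of the known model's cuboid corners and then recovers the 6-DoF pose with a PnP solver. The keypoint and camera values below are hypothetical placeholders, and OpenCV's solvePnP stands in for DOPE's own PnP step.

```python
# Minimal sketch of the PnP step a DOPE-style pipeline uses to recover a
# 6-DoF pose once a network has localized the 2D projections of the known
# 3D model's cuboid corners. All numeric values are placeholders.
import numpy as np
import cv2

# 3D cuboid corners of the known model, in the object frame (meters).
object_points = np.array([
    [-0.05, -0.05, -0.05], [ 0.05, -0.05, -0.05],
    [ 0.05,  0.05, -0.05], [-0.05,  0.05, -0.05],
    [-0.05, -0.05,  0.05], [ 0.05, -0.05,  0.05],
    [ 0.05,  0.05,  0.05], [-0.05,  0.05,  0.05],
], dtype=np.float64)

# 2D detections of those corners from the network (pixels), placeholder values.
image_points = np.array([
    [320, 240], [400, 238], [405, 320], [322, 325],
    [330, 230], [410, 228], [415, 310], [332, 315],
], dtype=np.float64)

# Pinhole camera intrinsics; use your calibrated values here.
K = np.array([[615.0,   0.0, 320.0],
              [  0.0, 615.0, 240.0],
              [  0.0,   0.0,   1.0]])

ok, rvec, tvec = cv2.solvePnP(object_points, image_points, K, None)
if ok:
    R, _ = cv2.Rodrigues(rvec)  # rotation matrix; tvec is translation in meters
    print("rotation:\n", R, "\ntranslation:\n", tvec.ravel())
```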
- Machine Learning Workshop tonight 8-9pm hosted by Underwater Robotics!
For our last event of ArchE Week, the Ohio State Underwater Robotics Team (Website, Instagram) is hosting a workshop tonight on machine learning! The workshop is an interactive walkthrough of using machine learning solutions to make predictions. Some example problems we could try to solve are predicting a grade, predicting the weather, and the classic recognize-a-digit problem. Our team personally uses machine learning to do real-time object detection with YOLO and NVIDIA DOPE, so we may touch on that as well!
trl
- ORPO, DPO, and PPO: Optimizing Models for Human Preferences
Implementation: ORPO has been integrated into popular fine-tuning libraries like TRL, Axolotl, and LLaMA-Factory.
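As a rough illustration of what the TRL integration looks like, here is a hedged sketch using TRL's ORPOConfig and ORPOTrainer. The base model is a placeholder, the tiny inline dataset only shows the expected prompt/chosen/rejected format, and argument names vary slightly across TRL versions.

```python
# Hedged sketch of ORPO fine-tuning with TRL's ORPOTrainer. Model name and
# the tiny inline dataset are illustrative; ORPO expects preference pairs
# with "prompt", "chosen", and "rejected" fields.
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

model_name = "facebook/opt-350m"  # placeholder base model
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Minimal preference dataset in the format ORPOTrainer tokenizes itself.
train_dataset = Dataset.from_dict({
    "prompt":   ["What is the capital of France?"],
    "chosen":   ["The capital of France is Paris."],
    "rejected": ["France does not have a capital."],
})

config = ORPOConfig(
    output_dir="orpo-model",
    per_device_train_batch_size=1,
    beta=0.1,  # weight of the odds-ratio penalty relative to the NLL loss
)

trainer = ORPOTrainer(
    model=model,
    args=config,
    train_dataset=train_dataset,
    tokenizer=tokenizer,  # newer TRL versions rename this to processing_class
)
trainer.train()
```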
- FLaNK Stack 29 Jan 2024
- OOM Error while using TRL for RLHF Fine-tuning
I am using TRL for RLHF fine-tuning of the Llama-2-7B model and getting an OOM error (even with batch_size=1). If anyone has used TRL for RLHF, can you please tell me what I am doing wrong? Code details can be found in the GitHub issue.
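Not a diagnosis of this specific error, but the usual memory levers in this setup are a quantized base model, a LoRA adapter instead of full fine-tuning, and gradient accumulation in place of larger batches. A sketch against the classic TRL PPO API, with the model name and hyperparameters as illustrative choices:

```python
# Hedged sketch (not a diagnosis of the poster's exact issue): quantize the
# base model, train only a LoRA adapter via TRL's value-head wrapper, and use
# gradient accumulation to keep the effective batch size without the memory.
from peft import LoraConfig
from trl import AutoModelForCausalLMWithValueHead, PPOConfig

lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

# load_in_4bit is forwarded to transformers and requires bitsandbytes;
# newest transformers versions prefer a quantization_config instead.
model = AutoModelForCausalLMWithValueHead.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    peft_config=lora_config,
    load_in_4bit=True,
)

ppo_config = PPOConfig(
    batch_size=8,
    mini_batch_size=1,              # process one sample per backward pass
    gradient_accumulation_steps=8,  # 1 x 8 recovers the batch of 8
)
```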
- [D] Tokenizers Truncation during Fine-tuning with Large Texts
SFTTrainer from Hugging Face
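For context, a hedged sketch of the SFTTrainer knobs relevant to long texts: packing=True concatenates tokenized examples and slices them into fixed-length blocks, so long documents are split across samples rather than silently truncated at max_seq_length. In recent TRL versions these arguments move to SFTConfig.

```python
# Hedged sketch: SFTTrainer with packing=True chunks long texts into
# fixed-length blocks instead of truncating them. Classic TRL API; newer
# versions take these arguments via SFTConfig. Model name is illustrative.
from datasets import Dataset
from trl import SFTTrainer

train_dataset = Dataset.from_dict({
    "text": ["A very long document ... " * 200,
             "Another long document ... " * 200],
})

trainer = SFTTrainer(
    model="facebook/opt-350m",   # SFTTrainer accepts a name or a model object
    train_dataset=train_dataset,
    dataset_text_field="text",
    max_seq_length=1024,
    packing=True,                # concatenate and slice rather than truncate
)
trainer.train()
```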
- New Open-source LLMs! 🤯 The Falcon has landed! 7B and 40B
For LoRA, PEFT seems to work. I don't have the patience to wait 5 hours, but modifying this example seems to work. You don't even need to modify much, as their model, just like NeoX, uses the query_key_value name for self-attention.
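Concretely, that remark means pointing PEFT's LoraConfig at the fused query_key_value module. A minimal sketch, with the model name and hyperparameters as illustrative choices:

```python
# Minimal sketch: Falcon (like GPT-NeoX) fuses its attention projections into
# a single module named "query_key_value", so that is the name to target in
# PEFT's LoraConfig. Model name and hyperparameters are illustrative.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-7b", trust_remote_code=True
)

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["query_key_value"],  # the fused attention projection
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA adapters are trainable
```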
- [D] Using RLHF beyond preference tuning
They have examples of making GPT output more positive (code) by using a sentiment model as the reward. There are other examples, on reducing toxicity and on summarization, here: https://github.com/lvwerra/trl/tree/main/examples. It should be fairly simple to modify the sentiment example and try the calculator reward you mentioned above.
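A hedged sketch of the pattern those examples follow, using the classic TRL PPO API: generate with the policy, score the text with a sentiment classifier, and pass the scalar reward to PPOTrainer.step. Swapping the scoring line for a correctness check would implement the calculator-style reward mentioned above.

```python
# Hedged sketch of the TRL sentiment example's loop (classic PPO API):
# generate a response, score it with a sentiment classifier, and feed the
# scalar reward back through PPOTrainer.step.
import torch
from transformers import AutoTokenizer, pipeline
from trl import AutoModelForCausalLMWithValueHead, PPOConfig, PPOTrainer

model_name = "lvwerra/gpt2-imdb"  # base model used in the sentiment example
model = AutoModelForCausalLMWithValueHead.from_pretrained(model_name)
ref_model = AutoModelForCausalLMWithValueHead.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token

ppo_trainer = PPOTrainer(PPOConfig(batch_size=1, mini_batch_size=1),
                         model, ref_model, tokenizer)
sentiment = pipeline("sentiment-analysis", model="lvwerra/distilbert-imdb")

query = tokenizer("This movie was", return_tensors="pt").input_ids[0]
output = ppo_trainer.generate(query, max_new_tokens=20,
                              pad_token_id=tokenizer.eos_token_id)
response = output.squeeze()[len(query):]  # keep only the generated tokens

# Reward: signed positive-class score; a calculator reward would replace
# this with a correctness check on the generated text.
text = tokenizer.decode(torch.cat([query, response]))
score = sentiment(text)[0]
reward = torch.tensor(score["score"] if score["label"] == "POSITIVE"
                      else -score["score"])

stats = ppo_trainer.step([query], [response], [reward])
```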
- [R] 🤖🌟 Unlock the Power of Personal AI: Introducing ChatLLaMA, Your Custom Personal Assistant! 🚀💬
You can use this -> https://github.com/lvwerra/trl/blob/main/examples/sentiment/scripts/gpt-neox-20b_peft/merge_peft_adapter.py
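In outline, a script like that one loads the base model, applies the trained LoRA adapter, folds the adapter weights into the base weights, and saves a standalone checkpoint. A hedged sketch with placeholder paths:

```python
# Hedged sketch of what a merge-adapter script like the one linked above
# does: load the base model, apply the LoRA adapter, merge the adapter
# deltas into the base weights, and save a standalone checkpoint.
# Model name and paths are placeholders.
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neox-20b")
model = PeftModel.from_pretrained(base, "path/to/lora-adapter")

merged = model.merge_and_unload()  # bakes the LoRA deltas into base weights
merged.save_pretrained("merged-model")

AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b") \
    .save_pretrained("merged-model")
```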
- [R] Stanford-Alpaca 7B model (an instruction tuned version of LLaMA) performs as well as text-davinci-003
Just the hh directly. From the results it seems like it might possibly be enough, but I might also try instruction tuning and then running the whole process from that base. I will also be running the reinforcement learning with a LoRA, using this as an example: https://github.com/lvwerra/trl/tree/main/examples/sentiment/scripts/gpt-neox-20b_peft
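For the reward-modeling step on the hh data, TRL's RewardTrainer is one option. A hedged sketch assuming a classic TRL version, with the base model and preprocessing as illustrative choices; RewardTrainer expects tokenized chosen/rejected pairs.

```python
# Hedged sketch of reward modeling on Anthropic hh with TRL's RewardTrainer,
# which (in classic versions) trains on tokenized chosen/rejected pairs.
# Base model and preprocessing are illustrative.
from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from trl import RewardConfig, RewardTrainer

model_name = "gpt2"  # placeholder; a larger base model would be typical
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForSequenceClassification.from_pretrained(
    model_name, num_labels=1  # a single scalar reward head
)
model.config.pad_token_id = tokenizer.pad_token_id

dataset = load_dataset("Anthropic/hh-rlhf", split="train[:1000]")

def tokenize(example):
    # Tokenize both completions; the trainer scores each and applies a
    # pairwise ranking loss.
    chosen = tokenizer(example["chosen"], truncation=True, max_length=512)
    rejected = tokenizer(example["rejected"], truncation=True, max_length=512)
    return {
        "input_ids_chosen": chosen["input_ids"],
        "attention_mask_chosen": chosen["attention_mask"],
        "input_ids_rejected": rejected["input_ids"],
        "attention_mask_rejected": rejected["attention_mask"],
    }

trainer = RewardTrainer(
    model=model,
    args=RewardConfig(output_dir="reward-model",
                      per_device_train_batch_size=2,
                      max_length=512),
    tokenizer=tokenizer,
    train_dataset=dataset.map(tokenize, remove_columns=["chosen", "rejected"]),
)
trainer.train()
```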
- [R] A simple explanation of Reinforcement Learning from Human Feedback (RLHF)
This package is pretty simple to use! https://github.com/lvwerra/trl
- Transformer Reinforcement Learning
What are some alternatives?
reor - Private & local AI personal knowledge management app for high entropy people.
lm-human-preferences - Code for the paper Fine-Tuning Language Models from Human Preferences
Hierarchical-Localization - Visual localization made easy with hloc
trlx - A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)
PoseCNN-PyTorch - PyTorch implementation of the PoseCNN framework
alpaca-lora - Instruct-tune LLaMA on consumer hardware
CenterPose - Single-Stage Keypoint-based Category-level Object Pose Estimation from an RGB Image (ICRA 2022)
LLaMA-8bit-LoRA - Repository for Chat LLaMA - training a LoRA for the LLaMA (1 or 2) models on HuggingFace with 8-bit or 4-bit quantization. Research only.
llm-classifier - Classify data instantly using an LLM
llama-recipes - Scripts for fine-tuning Meta Llama with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A, plus a number of inference solutions such as HF TGI and vLLM for local or cloud deployment. Demo apps showcase Meta Llama for WhatsApp & Messenger.
iNeRF-public
java-snapshot-testing - Facebook-style snapshot testing for Java tests