lm-human-preferences VS trl

Compare lm-human-preferences and trl to see how they differ.

lm-human-preferences

Code for the paper Fine-Tuning Language Models from Human Preferences (by openai)

trl

Train transformer language models with reinforcement learning. (by huggingface)
Metric          lm-human-preferences    trl
Mentions        8                       13
Stars           1,113                   8,176
Growth          2.8%                    4.9%
Activity        2.7                     9.7
Last commit     10 months ago           2 days ago
Language        Python                  Python
License         MIT License             Apache License 2.0
Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity - a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is among the top 10% of the most actively developed projects that we are tracking.

lm-human-preferences

Posts with mentions or reviews of lm-human-preferences. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-02-14.

trl

Posts with mentions or reviews of trl. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-01-29.

What are some alternatives?

When comparing lm-human-preferences and trl you can also consider the following projects:

GLM-130B - GLM-130B: An Open Bilingual Pre-Trained Model (ICLR 2023)

alpaca-lora - Instruct-tune LLaMA on consumer hardware

dalle-mini - DALL·E Mini - Generate images from a text prompt

trlx - A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)

tensorrtx - Implementation of popular deep learning networks with TensorRT network definition API

LLaMA-8bit-LoRA - Repository for Chat LLaMA - training a LoRA for the LLaMA (1 or 2) models on HuggingFace with 8-bit or 4-bit quantization. Research only.

glide-text2im - GLIDE: a diffusion-based text-conditional image synthesis model

sparsegpt-for-LLaMA - Code for the paper "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot" with LLaMA implementation.

gpt-2 - Code for the paper "Language Models are Unsupervised Multitask Learners"

llama-recipes - Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supporting a number of candid inference solutions such as HF TGI, VLLM for local or cloud deployment. Demo apps to showcase Meta Llama3 for WhatsApp & Messenger.

dalle-2-preview

Deep_Object_Pose - Deep Object Pose Estimation (DOPE) – ROS inference (CoRL 2018)