lm-human-preferences VS trl

Compare lm-human-preferences and trl to see how they differ.

lm-human-preferences

Code for the paper Fine-Tuning Language Models from Human Preferences (by openai)

trl

Train transformer language models with reinforcement learning. (by huggingface)
                lm-human-preferences   trl
Mentions        8                      13
Stars           1,099                  7,935
Growth          4.7%                   6.0%
Activity        2.7                    9.6
Last commit     9 months ago           6 days ago
Language        Python                 Python
License         MIT License            Apache License 2.0
Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

lm-human-preferences

Posts with mentions or reviews of lm-human-preferences. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-02-14.

trl

Posts with mentions or reviews of trl. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-01-29.

What are some alternatives?

When comparing lm-human-preferences and trl you can also consider the following projects:

GLM-130B - GLM-130B: An Open Bilingual Pre-Trained Model (ICLR 2023)

alpaca-lora - Instruct-tune LLaMA on consumer hardware

dalle-mini - DALL·E Mini - Generate images from a text prompt

trlx - A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)

tensorrtx - Implementation of popular deep learning networks with TensorRT network definition API

LLaMA-8bit-LoRA - Repository for Chat LLaMA - training a LoRA for the LLaMA (1 or 2) models on HuggingFace with 8-bit or 4-bit quantization. Research only.

glide-text2im - GLIDE: a diffusion-based text-conditional image synthesis model

sparsegpt-for-LLaMA - Code for the paper "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot" with LLaMA implementation.

gpt-2 - Code for the paper "Language Models are Unsupervised Multitask Learners"

llama-recipes - Scripts for fine-tuning Llama2 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization & question answering. Supporting a number of candid inference solutions such as HF TGI, VLLM for local or cloud deployment. Demo apps to showcase Llama2 for WhatsApp & Messenger

dalle-2-preview

Deep_Object_Pose - Deep Object Pose Estimation (DOPE) – ROS inference (CoRL 2018)