[P] RLHF Learning to Summarize: Implementation by CarperAI with trlX

trlx

Mentions: 6 | Stars: 4,332 | Activity: 7.9 | Language: Python

A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)

trlX library here: https://github.com/CarperAI/trlx

summarize-from-feedback

Mentions: 4 | Stars: 949 | Activity: 2.8 | Language: Python

Code for "Learning to summarize from human feedback"

Found relevant code at https://github.com/openai/summarize-from-feedback + all code implementations here
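
For readers new to the paper behind this repo: "Learning to summarize from human feedback" trains a reward model on human comparisons between pairs of summaries, using a loss of the form -log(sigmoid(r_preferred - r_rejected)). The sketch below illustrates only that formula on plain floats. It is not code from summarize-from-feedback or trlX, and the function names are hypothetical.

```python
import math


def pairwise_preference_loss(reward_preferred: float, reward_rejected: float) -> float:
    """Return -log(sigmoid(reward_preferred - reward_rejected)).

    Hypothetical helper for illustration; in a real setup the rewards
    would be scalar outputs of a learned reward model, not plain floats.
    """
    margin = reward_preferred - reward_rejected
    # -log(sigmoid(m)) equals log(1 + exp(-m)). Branch on the sign of the
    # margin so that exp() is only ever called on a non-positive argument.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def mean_preference_loss(pairs):
    """Average the pairwise loss over (preferred, rejected) reward pairs."""
    return sum(pairwise_preference_loss(p, r) for p, r in pairs) / len(pairs)
```

The loss shrinks toward zero as the preferred summary's reward exceeds the rejected one's by a wider margin, and it grows roughly linearly when the ranking is reversed.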

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives, so a higher number indicates a more popular project.

Related posts

Recapping the AI, Machine Learning and Data Science Meetup — May 2, 2024
2 projects | dev.to | 2 May 2024

Show HN: An end-to-end reinforcement learning library for infinite horizon tasks
1 project | news.ycombinator.com | 29 Dec 2023

Problem with Truncated Quantile Critics (TQC) and n-step learning algorithm.
4 projects | /r/reinforcementlearning | 9 Dec 2023

[P] PettingZoo 1.24.0 has been released (including Stable-Baselines3 tutorials)
4 projects | /r/reinforcementlearning | 24 Aug 2023

SB3 - NotImplementedError: Box([-1. -1. -8.], [1. 1. 8.], (3,), <class 'numpy.float32'>) observation space is not supported
2 projects | /r/reinforcementlearning | 19 Jun 2023

This page summarizes the projects mentioned and recommended in the original post on /r/MachineLearning.
Tags: Machine Learning, PyTorch, reinforcement-learning
Post date: 12 Jan 2023
