[P] RLHF Learning to Summarize: Implementation by CarperAI with trlX

This page summarizes the projects mentioned and recommended in the original post on /r/MachineLearning.

  • trlx

    A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)

  • trlX library here: https://github.com/CarperAI/trlx

  • summarize-from-feedback

    Code for "Learning to summarize from human feedback"

  • Found relevant code at https://github.com/openai/summarize-from-feedback

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.
