Python reinforcement-learning-from-human-feedback Projects
-
safe-rlhf
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
Project mention: [R] Meet Beaver-7B: a Constrained Value-Aligned LLM via Safe RLHF Technique | /r/MachineLearning | 2023-05-16
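The Safe RLHF approach trains a reward model and a separate cost model, then optimizes the policy to maximize reward subject to a safety-cost constraint via a Lagrangian method. The sketch below illustrates only that core idea; the function names and normalization are hypothetical placeholders, not the repository's actual API.

```python
# Minimal sketch of a Lagrangian-constrained RLHF objective (maximize reward
# subject to a cost constraint). Placeholder names, not safe-rlhf's real API.
import torch

# Learnable Lagrange multiplier, kept positive by living in log-space.
log_lambda = torch.zeros(1, requires_grad=True)
lambda_opt = torch.optim.SGD([log_lambda], lr=1e-2)

def safe_rlhf_policy_loss(reward_adv: torch.Tensor, cost_adv: torch.Tensor) -> torch.Tensor:
    """Combine reward and cost advantages into one policy loss.

    reward_adv: advantages from the reward model (higher is better)
    cost_adv:   advantages from the cost model (higher means less safe)
    """
    lam = log_lambda.exp().detach()  # multiplier is not updated through this loss
    # Dividing by (1 + lambda) is one common normalization so the multiplier
    # trades reward against cost instead of inflating the loss scale.
    return -(reward_adv - lam * cost_adv).mean() / (1.0 + lam)

def update_lambda(mean_episode_cost: torch.Tensor, cost_limit: float = 0.0) -> None:
    """Gradient ascent on the multiplier: lambda grows while the constraint is violated."""
    lambda_opt.zero_grad()
    lam_loss = -log_lambda.exp() * (mean_episode_cost - cost_limit)
    lam_loss.backward()
    lambda_opt.step()
```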
-
alpaca_farm
A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data.
Project mention: [P] AlpacaEval: An Automatic Evaluator for Instruction-following Language Models | /r/MachineLearning | 2023-06-08. The AlpacaEval dataset contains 805 instructions, a simplification of AlpacaFarm's evaluation set.
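AlpacaFarm's central trick is replacing human preference labels with LLM-simulated annotators, including injected label noise to mimic human inter-annotator disagreement. Below is a minimal sketch of that idea under assumed names; `query_llm_judge` is a hypothetical stand-in for an API call, not AlpacaFarm's actual interface.

```python
# Sketch of simulated pairwise preference annotation: an LLM judge stands in
# for a human labeler, and random label flips mimic annotator noise.
import random

def query_llm_judge(instruction: str, output_a: str, output_b: str) -> str:
    """Placeholder judge; a real setup would prompt an API model to pick A or B."""
    return random.choice(["A", "B"])

def simulate_preference(instruction: str, output_a: str, output_b: str,
                        flip_prob: float = 0.25) -> int:
    """Return 0 if output_a is preferred, 1 otherwise.

    Flipping a fraction of labels approximates the agreement rate of real
    human annotators, so methods developed in simulation transfer better.
    """
    choice = 0 if query_llm_judge(instruction, output_a, output_b) == "A" else 1
    if random.random() < flip_prob:
        choice = 1 - choice
    return choice

# Usage: label a toy batch of (instruction, output_a, output_b) triples.
pairs = [("Name a prime number.", "7", "9")]
labels = [simulate_preference(i, a, b) for i, a, b in pairs]
```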
NOTE:
The open source projects on this list are ordered by number of GitHub stars.
The number of mentions counts repo mentions over the last 12 months or since we started tracking (Dec 2020).
Index
# | Project | Stars
---|---|---
1 | safe-rlhf | 1,160
2 | alpaca_farm | 717