Python reinforcement-learning-from-human-feedback Projects
-
safe-rlhf
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
Project mention: [R] Meet Beaver-7B: a Constrained Value-Aligned LLM via Safe RLHF Technique | /r/MachineLearning | 2023-05-16
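The Safe RLHF approach trains a reward model and a separate cost model, then optimizes the policy to maximize reward subject to a safety-cost constraint via a Lagrangian method. The sketch below illustrates only that core idea; the function names and normalization are hypothetical placeholders, not the repository's actual API.

```python
# Minimal sketch of a Lagrangian-constrained RLHF objective (maximize reward
# subject to a cost constraint). Placeholder names, not safe-rlhf's real API.
import torch

# Learnable Lagrange multiplier, kept positive by living in log-space.
log_lambda = torch.zeros(1, requires_grad=True)
lambda_opt = torch.optim.SGD([log_lambda], lr=1e-2)

def safe_rlhf_policy_loss(reward_adv: torch.Tensor, cost_adv: torch.Tensor) -> torch.Tensor:
    """Combine reward and cost advantages into one policy loss.

    reward_adv: advantages from the reward model (higher is better)
    cost_adv:   advantages from the cost model (higher means less safe)
    """
    lam = log_lambda.exp().detach()  # multiplier is not updated through this loss
    # Dividing by (1 + lambda) is one common normalization so the multiplier
    # trades reward against cost instead of inflating the loss scale.
    return -(reward_adv - lam * cost_adv).mean() / (1.0 + lam)

def update_lambda(mean_episode_cost: torch.Tensor, cost_limit: float = 0.0) -> None:
    """Gradient ascent on the multiplier: lambda grows while the constraint is violated."""
    lambda_opt.zero_grad()
    lam_loss = -log_lambda.exp() * (mean_episode_cost - cost_limit)
    lam_loss.backward()
    lambda_opt.step()
```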
-
alpaca_farm
A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data.
Project mention: [P] AlpacaEval: An Automatic Evaluator for Instruction-following Language Models | /r/MachineLearning | 2023-06-08. The AlpacaEval dataset contains 805 instructions, a simplification of AlpacaFarm's evaluation set.
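AlpacaFarm's central trick is replacing human preference labels with LLM-simulated annotators, including injected label noise to mimic human inter-annotator disagreement. Below is a minimal sketch of that idea under assumed names; `query_llm_judge` is a hypothetical stand-in for an API call, not AlpacaFarm's actual interface.

```python
# Sketch of simulated pairwise preference annotation: an LLM judge stands in
# for a human labeler, and random label flips mimic annotator noise.
import random

def query_llm_judge(instruction: str, output_a: str, output_b: str) -> str:
    """Placeholder judge; a real setup would prompt an API model to pick A or B."""
    return random.choice(["A", "B"])

def simulate_preference(instruction: str, output_a: str, output_b: str,
                        flip_prob: float = 0.25) -> int:
    """Return 0 if output_a is preferred, 1 otherwise.

    Flipping a fraction of labels approximates the agreement rate of real
    human annotators, so methods developed in simulation transfer better.
    """
    choice = 0 if query_llm_judge(instruction, output_a, output_b) == "A" else 1
    if random.random() < flip_prob:
        choice = 1 - choice
    return choice

# Usage: label a toy batch of (instruction, output_a, output_b) triples.
pairs = [("Name a prime number.", "7", "9")]
labels = [simulate_preference(i, a, b) for i, a, b in pairs]
```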
NOTE:
The open source projects on this list are ordered by number of GitHub stars.
The number of mentions counts repo mentions over the last 12 months or since we started tracking (Dec 2020).
Index
# | Project | Stars
---|---|---
1 | safe-rlhf | 1,160
2 | alpaca_farm | 717