"RL Fine-Tuning: Scalable Online Planning via Reinforcement Learning Fine-Tuning", Fickinger et al 2021 {FB}

This page summarizes the projects mentioned and recommended in the original post on /r/reinforcementlearning.

  • KataGo

    GTP engine and self-play learning in Go

  • MCTS is hard to beat for chess/Go, but I'm increasingly convinced that MCTS is a heuristic that's overfit to perfect-info deterministic board games. Even within chess/Go, David Wu (creator of KataGo and now a researcher at FAIR) has pointed out to me several interesting failure cases for MCTS. I do think with further algorithmic improvements and hardware scaling, RL Fine-Tuning might overtake MCTS in chess/Go, but the real goal is to develop more general algorithms that can be used in a wide variety of environments.

  • nnue-pytorch

    Stockfish NNUE (chess evaluation) trainer in PyTorch

  • Getting SOTA in chess would be earth-shattering, especially since Stockfish has now adopted very lightweight NNs (called NNUE) and has doubled down on alpha-beta search, regaining the upper hand against AlphaZero-style (A0) programs.
