can-wikipedia-help-offline-rl

Official code for "Can Wikipedia Help Offline Reinforcement Learning?" by Machel Reid, Yutaro Yamada and Shixiang Shane Gu (by machelreid)

Can-wikipedia-help-offline-rl Alternatives

Similar projects and alternatives to can-wikipedia-help-offline-rl

  • qw

    The DCSS-playing bot qw (by crawl)

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a better can-wikipedia-help-offline-rl alternative or higher similarity.

can-wikipedia-help-offline-rl reviews and mentions

Posts with mentions or reviews of can-wikipedia-help-offline-rl. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-06-14.
  • The NetHack Challenge: Dungeons, Dragons, and Tourists
    2 projects | news.ycombinator.com | 14 Jun 2022
    Because Nethack and video games remove most of the reasons that GOFAI failed; for example, everything hard about perception from pixels is removed when you have a little parsed text grid of cells presented to you or are hooked into the game engine and query objects directly. (The real world is not made of a clean little grid of discrete high-level objects like 'dragon' or 'black jelly'.)

    Moravec's paradox again: humans find perception easy but things like Sokoban hard, while GOFAI approaches find Sokoban so trivial that it's common to use it (or Sudoku) as a toy problem introduction to constraint solving. Nethack literally has levels which are just Sokoban and which are important to solve; GOFAI can push the boulders around perfectly, even though it would be unable to recognize a photograph of 'a boulder' or use the word 'boulder' in a story.

    Then you have the extensive hand-engineering of expert knowledge which goes into ascension agents and the symbolic winners, above and beyond merely plugging in a Sokoban solver. There are increasing experiments in making DRL agents exploit or initialize from pretrained language models (https://arxiv.org/abs/2005.07648#google https://arxiv.org/abs/2201.12122 https://arxiv.org/abs/2204.01691#google https://ai.stanford.edu/blog/DrRepair/ https://arxiv.org/abs/2009.03393 https://arxiv.org/abs/2204.00598 come to mind) or reading manuals (https://arxiv.org/abs/1401.5390 all the way back in 2012!), and of course, a DRL agent can learn a tremendous amount without actually doing any playing, through offline, off-policy, and imitation learning. But while it is exciting and things like Gato look like the future, there is a long way to go from feeding in a dump of the Nethack wiki which mentions offhandedly "you can do X" to an agent recognizing an opportunity for X in the wild and executing it. (Which is something that symbolic approaches also don't come anywhere close to doing, because they just cheat by having the capability given to them by hand-engineering rather than having to autonomously read, understand, and apply.)

  • A.I. Learns to Drive From Scratch in Trackmania
    1 project | /r/Games | 12 Mar 2022
    It has! The video is very introductory RL and doesn't represent the state of the art in AI research. For example, in 2020 DeepMind made an AI that can beat humans in 57 Atari games, and more recently there's been a study that showed that transformer models trained on language generation can be used for unrelated reinforcement learning tasks for which data is hard to generate, like the one in the posted video. This seems to indicate that transformer language models actually learn about the world and can transfer that knowledge to other tasks, with the caveat that these models are slow to train and take up a lot of memory (GPT-3 takes up ~700GB of GPU memory, for example).
  • [D] Paper Explained - Can Wikipedia Help Offline Reinforcement Learning? (Full Video Walkthrough)
    1 project | /r/MachineLearning | 26 Feb 2022
    Code: https://github.com/machelreid/can-wikipedia-help-offline-rl

Stats

Basic can-wikipedia-help-offline-rl repo stats
Mentions: 3
Stars: 98
Activity: 0.0
Last commit: almost 2 years ago

machelreid/can-wikipedia-help-offline-rl is an open source project licensed under the MIT License, which is an OSI-approved license.

The primary programming language of can-wikipedia-help-offline-rl is Python.

