The NetHack Challenge: Dungeons, Dragons, and Tourists

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • qw

    The DCSS-playing bot qw (by crawl)

  • DCSS (Dungeon Crawl Stone Soup) has been beaten by a handcrafted Lua bot, written by the (at the time) indisputably best player in the game. Unlike the approach here, it benefits from human expertise, and it can only win with a very narrow set of characters. Learndb entry:

    qw: A fully automated lua bot written by elliptic, with some code borrowed from parabolic and xw. The first DCSS bot to ever achieve an uninterrupted and unassisted win (see '!lg qw won 2'). Now maintained at https://github.com/crawl/qw by the DCSS devteam. See qw[2] for a summary of results.

    As of 0.29-a, qw has a 0.36% winrate with GrBe (1 win in 276 attempts); see https://crawl.dcss.io/crawl/morgue/qwqw/morgue-qwqw-20220613... Games are sometimes played on cdi. Historically, its best 3-rune winrate was 15% with DDFi^Makhleb and, for 15 runes, about 1% with GrFi^TSO before the 0.28 hell rework.

    Branch order: D -> Lair:8 -> Orc:3 -> D:15 -> S:5 -> Vaults:4 -> Depths:5 -> S:5 -> Vaults:5 -> Zot

    On the online servers, qw plays with an extra added delay so that it doesn't use too much server CPU. Playing locally without this delay, qw is much faster.

  • can-wikipedia-help-offline-rl

    Official code for "Can Wikipedia Help Offline Reinforcement Learning?" by Machel Reid, Yutaro Yamada and Shixiang Shane Gu

  • Because NetHack and video games remove most of the reasons that GOFAI failed; for example, everything hard about perception from pixels is removed when you have a little parsed text grid of cells presented to you or are hooked into the game engine and query objects directly. (The real world is not made of a clean little grid of discrete high-level objects like 'dragon' or 'black jelly'.)

    Moravec's paradox again: humans find perception easy but things like Sokoban hard, while GOFAI approaches find Sokoban so trivial that it's common to use it (or Sudoku) as a toy problem introduction to constraint solving. NetHack literally has levels which are just Sokoban and which are important to solve; GOFAI can push the boulders around perfectly, even though it would be unable to recognize a photograph of 'a boulder' or use the word 'boulder' in a story.

    Then you have the extensive hand-engineering of expert knowledge which goes into ascension agents and the symbolic winners, above and beyond merely plugging in a Sokoban solver. There are increasing experiments in making DRL agents exploit or initialize from pretrained language models (https://arxiv.org/abs/2005.07648#google https://arxiv.org/abs/2201.12122 https://arxiv.org/abs/2204.01691#google https://ai.stanford.edu/blog/DrRepair/ https://arxiv.org/abs/2009.03393 https://arxiv.org/abs/2204.00598 come to mind) or read manuals (https://arxiv.org/abs/1401.5390 all the way back in 2012!). And of course, a DRL agent can learn a tremendous amount without actually doing any playing, through offline, off-policy, and imitation learning. But while that is exciting and things like Gato look like the future, there is a long way to go from feeding in a dump of the NetHack wiki which mentions offhandedly "you can do X" to an agent recognizing an opportunity for X in the wild and executing it. (Symbolic approaches don't come anywhere close to doing that either: they just cheat by having the capability given to them by hand-engineering rather than having to autonomously read, understand, and apply.)
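
To make the Sokoban point above concrete, here is a minimal sketch of how a plain symbolic search handles such a puzzle once the level arrives as a parsed text grid. It is an illustration only, not the code of any NetHack agent mentioned in the thread. The function names `parse` and `solve` are hypothetical, and the level uses the common Sokoban text notation (`#` wall, `@` player, `$` box, `.` goal, `*` box on goal, `+` player on goal), which is an assumption here; NetHack's own Sokoban levels differ in that boulders are pushed into pits and holes. The sketch also assumes the level is fully enclosed by walls.

```python
from collections import deque

# Direction name -> (row delta, column delta).
MOVES = {"U": (-1, 0), "D": (1, 0), "L": (0, -1), "R": (0, 1)}


def parse(level):
    """Turn a list of strings into (walls, boxes, goals, player)."""
    walls, boxes, goals, player = set(), set(), set(), None
    for r, row in enumerate(level):
        for c, ch in enumerate(row):
            if ch == "#":
                walls.add((r, c))
            if ch in "$*":
                boxes.add((r, c))
            if ch in ".*+":
                goals.add((r, c))
            if ch in "@+":
                player = (r, c)
    return walls, frozenset(boxes), frozenset(goals), player


def solve(level):
    """Breadth-first search over (player, boxes) states.

    Returns a shortest move string such as "RRD", or None if the
    level has no solution. Assumes the level is enclosed by walls,
    so the set of reachable states is finite.
    """
    walls, boxes, goals, player = parse(level)
    start = (player, boxes)
    queue = deque([(start, "")])
    seen = {start}
    while queue:
        (pos, boxes), path = queue.popleft()
        if boxes == goals:  # every box sits on a goal square
            return path
        for name, (dr, dc) in MOVES.items():
            nxt = (pos[0] + dr, pos[1] + dc)
            if nxt in walls:
                continue
            new_boxes = boxes
            if nxt in boxes:
                # Walking into a box pushes it one square further.
                beyond = (nxt[0] + dr, nxt[1] + dc)
                if beyond in walls or beyond in boxes:
                    continue  # the box is blocked
                new_boxes = (boxes - {nxt}) | {beyond}
            state = (nxt, new_boxes)
            if state not in seen:
                seen.add(state)
                queue.append((state, path + name))
    return None


if __name__ == "__main__":
    print(solve(["######", "#@ $.#", "######"]))
```

Uninformed breadth-first search like this only scales to small levels; practical solvers add deadlock detection and heuristics. Even so, it shows why the puzzle is a standard toy problem: all of the difficulty of perception is gone once the state is a set of grid coordinates.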
