qw VS can-wikipedia-help-offline-rl

Compare qw vs can-wikipedia-help-offline-rl and see what their differences are.

can-wikipedia-help-offline-rl

Official code for "Can Wikipedia Help Offline Reinforcement Learning?" by Machel Reid, Yutaro Yamada and Shixiang Shane Gu (by machelreid)
                 qw                   can-wikipedia-help-offline-rl
Mentions         11                   3
Stars            9                    98
Growth           -                    -
Activity         9.2                  0.0
Last commit      3 months ago         almost 2 years ago
Language         Lua                  Python
License          -                    MIT License
The number of mentions indicates the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
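
The site doesn't publish the exact formula, but "recent commits have higher weight" is commonly implemented as exponential decay over commit age. A minimal sketch of that idea (illustrative only, not the site's actual scoring):

    -- Illustrative recency weighting: each commit contributes less the
    -- older it is, halving in weight every HALF_LIFE_DAYS days.
    local HALF_LIFE_DAYS = 30

    local function activity_score(commit_ages_days)
      local score = 0
      for _, age in ipairs(commit_ages_days) do
        score = score + 0.5 ^ (age / HALF_LIFE_DAYS)
      end
      return score
    end

    -- Three fresh commits outscore ten year-old ones under this weighting:
    print(activity_score({ 1, 2, 3 }))                     -- ~2.9
    print(activity_score({ 365, 365, 365, 365, 365,
                           365, 365, 365, 365, 365 }))     -- ~0.002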

qw

Posts with mentions or reviews of qw. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-06-08.
  • First ranged win for qw: GrHu of Okawaru
    1 project | /r/dcss | 8 Jun 2023
    qw is a bot by elliptic that's capable of playing and winning completely unassisted games of DCSS. It's written entirely in Lua and runs via a normal Crawl rc. Elliptic improved qw over many years to the point where it has won many combos using melee with a few different gods, has won some 15-rune games, and has completed a Zig. Over the past year, I've resumed qw's development to add a goal system, better travel and exploration, and other features described in qw's first official release.
  • How does the qw bot work?
    3 projects | /r/dcss | 15 Nov 2022
    Regarding the goal planning, this is rule-based, as advil mentioned. The gameplan system I've recently added is meant to make qw fully programmable in terms of its goals. To coordinate its goals with Crawl's travel system, qw does build a representation of branch and level layout that it "walks" with a simple algorithm incorporating persistently stored data about every staircase qw encounters. You can find this code in travel.lua. But there is nothing terribly sophisticated here algorithmically, since I'm trying to keep processing in qw as simple as possible to avoid Lua throttling. Some day qw may become so sophisticated that I have to move it to Python and talk to Crawl over the webtiles WebSocket, but that would require a full rewrite and lots of new code to build a representation of Crawl levels and data from the JSON. Hence I'm not going to do that until I feel I've reached the limit of what I can do with Crawl's clua.
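
    A minimal standalone sketch of that "walk" (hypothetical names and data; qw's real code is in travel.lua): levels are nodes, each remembered staircase is an edge, and a plain breadth-first search over the stored stairs yields a level route.

        -- Hypothetical sketch, not qw's actual travel.lua: walk a graph
        -- of levels using persistently stored staircase data.
        local stairs = {
          ["D:1"]    = { { dest = "D:2" } },
          ["D:2"]    = { { dest = "D:1" }, { dest = "Lair:1" } },
          ["Lair:1"] = { { dest = "D:2" } },
        }

        -- Breadth-first search from start to goal over remembered stairs.
        -- Kept deliberately simple: heavy per-turn processing in clua
        -- risks Lua throttling, as noted above.
        local function level_route(start, goal)
          local queue = { { start } }
          local seen = { [start] = true }
          while #queue > 0 do
            local path = table.remove(queue, 1)
            local here = path[#path]
            if here == goal then return path end
            for _, stair in ipairs(stairs[here] or {}) do
              if not seen[stair.dest] then
                seen[stair.dest] = true
                local longer = {}
                for i, lvl in ipairs(path) do longer[i] = lvl end
                longer[#longer + 1] = stair.dest
                queue[#queue + 1] = longer
              end
            end
          end
          return nil -- no route with the stairs recorded so far
        end

        print(table.concat(level_route("D:1", "Lair:1"), " -> "))
        -- D:1 -> D:2 -> Lair:1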
  • Init script API
    2 projects | /r/dcss | 15 Nov 2022
    There are a couple of examples in my rcfile that hook into Crawl using c_answer_prompt() and c_message() instead of ready(). You probably also want to check out qw.rc for examples: it's the most well-known bot that plays Crawl, and it uses Crawl's Lua API.
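
    For context, a rough sketch of the hook styles mentioned (bodies and prompt strings are hypothetical; in an rc file the Lua sits inside a { ... } block, and exact hook behavior should be checked against your Crawl version's Lua docs):

        -- Event-driven hooks: Crawl calls these when something happens,
        -- rather than qw-style per-turn polling in ready().
        function c_message(text, channel)
          -- called for every message the game prints
          if text:find("You feel hungry") then
            crawl.mpr("c_message saw the hunger message")
          end
        end

        function c_answer_prompt(name, prompt)
          -- called on yes/no prompts; return true to answer yes, false
          -- for no, or nil to leave the prompt to the player
          if prompt and prompt:find("Really walk into") then
            return false -- hypothetical example prompt
          end
        end

        function ready()
          -- per-turn hook: where a bot like qw makes its decisions,
          -- e.g. by sending keystrokes back to the game
        end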
  • The NetHack Challenge: Dungeons, Dragons, and Tourists
    2 projects | news.ycombinator.com | 14 Jun 2022
    DCSS has been beaten by a handcrafted Lua bot written by the (at the time) indisputably best player in the game. Unlike the approach here, it got to benefit from human expertise, and it can only win with a very narrow set of characters. Learndb entry:

    qw: A fully automated Lua bot written by elliptic, with some code borrowed from parabolic and xw. The first DCSS bot ever to achieve an uninterrupted and unassisted win (see '!lg qw won 2'). Now maintained at https://github.com/crawl/qw by the DCSS devteam. See qw[2] for a summary of results.

    As of 0.29-a, qw has a 0.36% winrate with GrBe, with 1 win in 276 attempts. See https://crawl.dcss.io/crawl/morgue/qwqw/morgue-qwqw-20220613...; games are sometimes played on cdi. Historically, its best 3-rune winrate was 15% with DDFi^Makhleb and, for 15 runes, about 1% with GrFi^TSO before the 0.28 hell rework.

    Branch order: D -> Lair:8 -> Orc:3 -> D:15 -> S:5 -> Vaults:4 -> Depths:5 -> S:5 -> Vaults:5 -> Zot

    On the online servers, qw plays with an added delay so that it doesn't use too much server CPU. Playing locally without this delay, qw is much faster.

  • RC file issue with new version of the game
    1 project | /r/dcss | 13 Jun 2022
    I don't really know the details on this but I think the slot name got renamed, and you may want to look at the following PR for qw: https://github.com/crawl/qw/pull/3
  • DCSS Python Bot (proof of concept)
    3 projects | /r/dcss | 23 Aug 2021
  • Macros and Crawl's more advanced features
    2 projects | /r/dcss | 6 Mar 2021
    One example is the bot qw, which is written in Lua and can win Crawl without human input.
  • QW Bot - LUA errors
    1 project | /r/dcss | 15 Feb 2021
    Elliptic isn't updating the bot any more; qw is now (semi-)maintained by the devteam at: https://github.com/crawl/qw

can-wikipedia-help-offline-rl

Posts with mentions or reviews of can-wikipedia-help-offline-rl. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-06-14.
  • The NetHack Challenge: Dungeons, Dragons, and Tourists
    2 projects | news.ycombinator.com | 14 Jun 2022
    Because NetHack and video games remove most of the reasons that GOFAI failed; for example, everything hard about perception from pixels is removed when you have a little parsed text grid of cells presented to you, or are hooked into the game engine and can query objects directly. (The real world is not made of a clean little grid of discrete high-level objects like 'dragon' or 'black jelly'.)

    Moravec's paradox again: humans find perception easy but things like Sokoban hard, while GOFAI approaches find Sokoban so trivial that it's common to use it (or Sudoku) as a toy-problem introduction to constraint solving. NetHack literally has levels which are just Sokoban and which are important to solve; GOFAI can push the boulders around perfectly, even though it would be unable to recognize a photograph of 'a boulder' or use the word 'boulder' in a story.

    Then you have the extensive hand-engineering of expert knowledge which goes into ascension agents and the symbolic winners, above and beyond merely plugging in a Sokoban solver. There are increasing experiments in making DRL agents exploit or initialize from pretrained language models (https://arxiv.org/abs/2005.07648#google https://arxiv.org/abs/2201.12122 https://arxiv.org/abs/2204.01691#google https://ai.stanford.edu/blog/DrRepair/ https://arxiv.org/abs/2009.03393 https://arxiv.org/abs/2204.00598 come to mind) or reading manuals (https://arxiv.org/abs/1401.5390 all the way back in 2012!), and of course, a DRL agent can learn a tremendous amount without actually doing any playing via offline, off-policy, and imitation learning. But while it is exciting and things like Gato look like the future, there is a long way to go from feeding in a dump of the NetHack wiki which mentions offhandedly "you can do X" to an agent recognizing an opportunity for X in the wild and executing it. (Which is something that symbolic approaches also don't come anywhere close to doing, because they just cheat by the capability being given to them by hand-engineering rather than having to autonomously read, understand, and apply.)
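
    To make the Sokoban point above concrete: a blind breadth-first search over (player, boulder) states, classic GOFAI machinery, solves a pre-parsed one-boulder puzzle instantly, with zero perception involved. This is hypothetical example code, not taken from any of the linked agents:

        -- One-boulder Sokoban via BFS. The B and G characters in the
        -- strings are just for setup; positions are tracked in the state.
        local grid = {
          "#######",
          "#@ B G#",   -- @ player, B boulder, G goal, # wall
          "#######",
        }

        local function cell(x, y)
          local row = grid[y]
          return row and row:sub(x, x) or "#"
        end

        local function find(ch)
          for y, row in ipairs(grid) do
            local x = row:find(ch, 1, true)
            if x then return x, y end
          end
        end

        local px, py = find("@")
        local bx, by = find("B")
        local gx, gy = find("G")
        local dirs = { { 1, 0 }, { -1, 0 }, { 0, 1 }, { 0, -1 } }

        local function open(x, y) return cell(x, y) ~= "#" end
        local function key(s) return table.concat(s, ",") end

        -- BFS over states { player_x, player_y, boulder_x, boulder_y }.
        local start = { px, py, bx, by }
        local queue = { { state = start, moves = 0 } }
        local seen = { [key(start)] = true }
        while #queue > 0 do
          local node = table.remove(queue, 1)
          local s = node.state
          if s[3] == gx and s[4] == gy then
            print("boulder on goal in " .. node.moves .. " moves") -- 3
            break
          end
          for _, d in ipairs(dirs) do
            local nx, ny = s[1] + d[1], s[2] + d[2]
            if open(nx, ny) then
              local nxt
              if nx == s[3] and ny == s[4] then
                -- stepping into the boulder pushes it; the far cell
                -- must be open for the push to succeed
                local bx2, by2 = s[3] + d[1], s[4] + d[2]
                if open(bx2, by2) then nxt = { nx, ny, bx2, by2 } end
              else
                nxt = { nx, ny, s[3], s[4] }
              end
              if nxt and not seen[key(nxt)] then
                seen[key(nxt)] = true
                queue[#queue + 1] = { state = nxt, moves = node.moves + 1 }
              end
            end
          end
        end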

  • A.I. Learns to Drive From Scratch in Trackmania
    1 project | /r/Games | 12 Mar 2022
    It has! The video is very introductory RL and doesn't represent the state of the art in AI research. For example, in 2020 DeepMind made an AI that can beat humans in 57 Atari games, and more recently a study showed that transformer models trained on language generation can be used for unrelated reinforcement learning tasks for which data is hard to generate, like the one in the posted video. This seems to indicate that transformer language models actually learn about the world and can transfer that knowledge to other tasks, with the caveat that these models are slow to train and take up a lot of memory (GPT-3 takes up ~700GB of GPU memory, for example).
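
    The ~700GB figure is consistent with GPT-3's 175 billion parameters stored as 32-bit floats (an assumption; lower-precision weights would halve or quarter it):

        -- Back-of-the-envelope check of the ~700GB memory figure.
        local params = 175e9          -- GPT-3 parameter count
        local bytes_per_param = 4     -- assumes fp32 weights
        print(params * bytes_per_param / 1e9) -- 700.0 (GB)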
  • [D] Paper Explained - Can Wikipedia Help Offline Reinforcement Learning? (Full Video Walkthrough)
    1 project | /r/MachineLearning | 26 Feb 2022
    Code: https://github.com/machelreid/can-wikipedia-help-offline-rl

What are some alternatives?

When comparing qw and can-wikipedia-help-offline-rl you can also consider the following projects:

dcss-ai-wrapper - An API for Dungeon Crawl Stone Soup for Artificial Intelligence research.

crawl - Dungeon Crawl: Stone Soup official repository

crawl-lua