Python reinforcement-learning

Open-source Python projects categorized as reinforcement-learning

Top 23 Python reinforcement-learning Projects

  • Ray

    An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.

    Project mention: Is it normal to have a negative and near-zero explained variance in PPO? | | 2021-12-25

    I guess I did, as I directly use the PPO agent provided by RLlib.

  • tensor2tensor

    Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.

    Project mention: [D] Resources for Understanding The Original Transformer Paper | | 2021-09-08

    Code for found:

  • pysc2

    StarCraft II Learning Environment

    Project mention: How A.I. Conquered Poker | | 2022-01-18

  • trax

    Trax — Deep Learning with Clear Code and Speed

    Project mention: [D] Paper Explained - Sparse is Enough in Scaling Transformers (aka Terraformer) | Video Walkthrough | | 2021-12-01


  • machine_learning_examples

    A collection of machine learning examples and tutorials.

    Project mention: How to save an attention model for deployment/exposing to an API? | | 2021-08-17

    I've been following a course teaching how to make an attention model for neural machine translation. This is the file inside the repo. I know that I'll have to use certain functions to process the textual input into encodings and tokens, but those functions use certain instances of the model, which I don't know if I should keep or not. If anyone can please take a look and help me out here, it'd be really appreciated.

  • client

    🔥 A tool for visualizing and tracking your machine learning experiments. This repo contains the CLI and Python API. (by wandb)

    Project mention: What's a sequel that got you thinking "the people who made this COMPLETELY missed the point of the first one"? | | 2022-01-16

    Can current CGI and AI tech bring back Leslie Nielsen? Might use Unreal Engine and or

  • stable-baselines

    A fork of OpenAI Baselines, implementations of reinforcement learning algorithms

    Project mention: Nvidia ISAAC gym/RL | | 2021-08-28

    Code for found:

  • tensorforce

    Tensorforce: a TensorFlow library for applied reinforcement learning

    Project mention: Advice on doing RL for Settlers of Catan? | | 2021-07-11

    The most promising approach has been using the TensorForce framework with a custom environment that represents a simpler game (1v1 against a bot that chooses actions randomly, no trading between players, and discarding fixed to happen automatically and at random).

  • polyaxon

    Machine Learning Management & Orchestration Platform (Monorepo for Polyaxon's MLOps Tools)

    Project mention: [D] Productionalizing machine learning pipelines for small teams | | 2021-08-08

    For running experiments, is a really good free open-source package that has lots of nice integrations so you can quickly run experiments in k8s but it might be overkill in some cases.

  • football

    Check out the new game server:

    Project mention: Creating a new football game | | 2021-07-26

    For fun, merging such an idea with Google's open source football research project and its AI could result in a very interesting game!

  • stable-baselines3

    PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.

    Project mention: Change learning rate of a saved model | | 2022-01-24

    Looking at the code, it looks like it uses a learning rate scheduler whose initial value is set from self.learning_rate. Have you tried setting the value of model.learning_rate?

  • pytorch-a2c-ppo-acktr-gail

    PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).

    Project mention: How to pretrain a model on expert data? | | 2021-09-12

    Try using an imitation learning algorithm. Two popular options are MaxEnt IRL and GAIL. This repository has a GAIL implementation, and this repository has MaxEnt IRL and GAIL implementations. There are other implementations too that you can check out.

  • dm_control

    DeepMind's software stack for physics-based simulation and Reinforcement Learning environments, using MuJoCo.

    Project mention: [D] MuJoCo vs PyBullet? (esp. for custom environment) | | 2021-12-07

    If you're interested in using MuJoCo, I'd suggest checking out the dm_control package for Python bindings rather than interfacing with C++ directly. I think one downside to MuJoCo currently is that you cannot dynamically add objects, and the entire simulation is initialized and loaded according to the MJCF / XML file.

  • acme

    A library of reinforcement learning components and agents

    Project mention: Applied resources in Pytorch? | | 2021-07-04

  • agents

    TF-Agents: A reliable, scalable and easy to use TensorFlow library for Contextual Bandits and Reinforcement Learning.

    Project mention: Trying to apply the TensorFlow agents from the examples to a custom environment | | 2022-01-09

    I followed the TensorFlow tutorial for agents and the multi-armed bandit tutorial, and now I'm trying to make one of the already implemented agents from the examples work on my own environment. Basically, my environment consists of 5 actions and 5 observations. Applying action i results in the same state i. Each action involves an extra step of sending that action number to a different program via a socket, and the answer from the program is interpreted as the reward. My environment seems to be working; I used the little test script below to test the observe and action functions. I know this is not a full proof, but it showed that it is at least working.

  • minimalRL

    Implementations of basic RL algorithms with minimal lines of codes! (pytorch based)

    Project mention: Rl algorithm implemented | | 2021-07-18

  • muzero-general


    Project mention: MuZero unable to solve non-slippery FrozenLake environment? | | 2021-08-09

    I have used this implementation from MuZero:

  • ElegantRL

    Scalable and Elastic Deep Reinforcement Learning Using PyTorch. Please star. 🔥

    Project mention: ElegantRL: A Lightweight and Stable Deep Reinforcement Learning Library | | 2021-03-15

  • rlcard

    Reinforcement Learning / AI Bots in Card (Poker) Games - Blackjack, Leduc, Texas, DouDizhu, Mahjong, UNO.

    Project mention: Self play environments | | 2021-11-26

    Hi. I’ve decided to do a project to adapt an rl library to support self-play. This is a project so I can teach myself more about building rl systems. I’ve been considering working with the environment system from rlcard but wonder if there are other more widely-used self play environment libraries. Thanks.

  • gym-minigrid

    Minimalistic gridworld package for OpenAI Gym

    Project mention: Using FastAI to navigate matterport spaces? | | 2021-10-07

    This is a pretty hard domain to start with as someone "brand new" to AI. If you're interested in the vision aspect, I'd suggest you start by training a DNN for the CIFAR-10 task. There are plenty of tutorials out there. If you're more interested in the navigation aspect, you could start by training a Q-learning agent to solve some of the simpler problems in gym-minigrid.

  • Advanced-Deep-Learning-with-Keras

    Advanced Deep Learning with Keras, published by Packt

    Project mention: Cannot understand how REINFORCE model is trained | | 2021-03-04

    I understand the concept of the REINFORCE algorithm and what a policy gradient is. However, when I looked at the code published by PacktPublishing, I got stuck.

  • deep-q-learning

    Minimal Deep Q Learning (DQN & DDQN) implementations in Keras (by keon)

    Project mention: Deep Q Network knapsack problem | | 2021-05-22

    So go on GitHub and find a DQN implementation that has an option for using a feedforward net (instead of a conv net, as your input isn't pixel-based). Any remotely modular piece of code will take the state space size and the action space size as parameters to its NN. This essentially sets the input layer equal to the state space (so 4) and the output layer equal to the action space (201). (This repo seems helpful at a cursory glance.)

  • Hypernets

    A General Automated Machine Learning framework to simplify the development of End-to-end AutoML toolkits in specific domains.

    Project mention: [N][R] A Brief Tutorial for Developing AutoML Tools with Hypernets | | 2021-06-28

    Please see here for the Hypernets library.
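The deep-q-learning mention above describes sizing a feedforward Q-network from the state and action spaces (4 inputs, 201 outputs). A minimal sketch of that idea in plain Python follows. It is not taken from any of the listed repositories: the function names, the hidden layer size, and the example state are all hypothetical, and a real DQN would use a deep learning framework and a training loop.

```python
import random


def make_q_network(state_size, action_size, hidden_size=32, seed=0):
    """Create weights for a one-hidden-layer feedforward Q-network.

    hidden_size is an arbitrary illustrative choice, not a recommendation.
    """
    rng = random.Random(seed)

    def layer(n_in, n_out):
        # One weight row per output unit, plus a zero-initialized bias.
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        return weights, biases

    return [layer(state_size, hidden_size), layer(hidden_size, action_size)]


def q_values(network, state):
    """Forward pass: ReLU hidden layer, linear output (one Q-value per action)."""
    activations = list(state)
    last = len(network) - 1
    for index, (weights, biases) in enumerate(network):
        activations = [
            sum(w * a for w, a in zip(row, activations)) + b
            for row, b in zip(weights, biases)
        ]
        if index < last:
            activations = [max(0.0, a) for a in activations]
    return activations


def greedy_action(network, state):
    """Pick the index of the action with the highest estimated Q-value."""
    values = q_values(network, state)
    return max(range(len(values)), key=values.__getitem__)


STATE_SIZE = 4     # input layer width, as stated in the mention
ACTION_SIZE = 201  # output layer width, as stated in the mention

network = make_q_network(STATE_SIZE, ACTION_SIZE)
example_state = [0.5, -1.0, 0.25, 2.0]  # made-up state vector
values = q_values(network, example_state)
action = greedy_action(network, example_state)
```

Because the sizes are passed in as parameters, the same functions would work for any environment with a flat state vector and a discrete action set.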

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2022-01-24.

What are some of the best open-source reinforcement-learning projects in Python? This list will help you:

#   Project                            Stars
1   Ray                                18,887
2   tensor2tensor                      11,940
3   pysc2                              7,413
4   trax                               6,736
5   machine_learning_examples          6,512
6   client                             3,653
7   stable-baselines                   3,422
8   tensorforce                        3,076
9   polyaxon                           2,984
10  football                           2,864
11  stable-baselines3                  2,778
12  pytorch-a2c-ppo-acktr-gail         2,666
13  dm_control                         2,644
14  acme                               2,491
15  agents                             2,162
16  minimalRL                          2,076
17  muzero-general                     1,631
18  ElegantRL                          1,617
19  rlcard                             1,572
20  gym-minigrid                       1,379
21  Advanced-Deep-Learning-with-Keras  1,193
22  deep-q-learning                    1,074
23  Hypernets                          1,037