Top 23 Python reinforcement-learning Projects
-
Ray
Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
-
d2l-en
Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 500 universities from 70 countries including Stanford, MIT, Harvard, and Cambridge.
Project mention: Which book to choose for deep learning: Ian Goodfellow or François Chollet | /r/learnmachinelearning | 2023-04-07
-
reinforcement-learning-an-introduction
Python Implementation of Reinforcement Learning: An Introduction
Project mention: Help request: Are the results of Sutton and Barto's Example 6.6 Cliff walking believable? What's likely the problem if my SARSA implementation can't replicate? | /r/reinforcementlearning | 2023-04-10
The Python code to generate any figure in this textbook is reproduced in a repo, and you can find the file for the figure in question here: https://github.com/ShangtongZhang/reinforcement-learning-an-introduction/blob/master/chapter06/cliff_walking.py
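For readers trying to reproduce the cliff-walking experiment mentioned in this post, the following is a minimal standard-library sketch of tabular SARSA on a 4x12 cliff grid. It is not the code from the linked repository. The grid layout and rewards follow the usual description of the example (-1 per step, -100 and a return to the start for stepping into the cliff); the hyperparameters (`alpha=0.5`, `gamma=1.0`, `epsilon=0.1`) and all function names are assumptions chosen for illustration.

```python
import random

ROWS, COLS = 4, 12
START, GOAL = (3, 0), (3, 11)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Apply one move and return (next_state, reward, done)."""
    r = min(max(state[0] + ACTIONS[action][0], 0), ROWS - 1)
    c = min(max(state[1] + ACTIONS[action][1], 0), COLS - 1)
    if r == ROWS - 1 and 0 < c < COLS - 1:
        # Stepped into the cliff: large penalty, back to the start,
        # and the episode continues.
        return START, -100, False
    return (r, c), -1, (r, c) == GOAL


def epsilon_greedy(q, state, epsilon, rng):
    """Pick a random action with probability epsilon, else a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    best = max(q[state])
    # Break ties between equally valued actions at random.
    return rng.choice([a for a, v in enumerate(q[state]) if v == best])


def sarsa(episodes=500, alpha=0.5, gamma=1.0, epsilon=0.1, seed=0):
    """Run tabular SARSA; hyperparameter defaults are assumed values."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(ACTIONS)
         for r in range(ROWS) for c in range(COLS)}
    returns = []
    for _ in range(episodes):
        state, total, done = START, 0, False
        action = epsilon_greedy(q, state, epsilon, rng)
        while not done:
            next_state, reward, done = step(state, action)
            next_action = epsilon_greedy(q, next_state, epsilon, rng)
            # On-policy target: uses the action actually selected next.
            future = 0.0 if done else gamma * q[next_state][next_action]
            q[state][action] += alpha * (reward + future - q[state][action])
            state, action = next_state, next_action
            total += reward
        returns.append(total)
    return q, returns
```

Calling `q, returns = sarsa()` gives the learned action values and the per-episode sum of rewards. When comparing against the book's figure, note that individual runs are noisy, so curves are usually averaged over many seeds before plotting.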
-
Does that mean that the example I found on the internet is wrong (I think it comes from a DL course, so I'd imagine it is not wrong)? Or does it mean that I am comparing two different things? I guess this has to do with right and left eigenvectors, as u/JanneJM pointed out in her comment?
-
wandb
🔥 A tool for visualizing and tracking your machine learning experiments. This repo contains the CLI and Python API.
Project mention: A list of SaaS, PaaS and IaaS offerings that have free tiers of interest to devops and infradev | dev.to | 2024-02-05
Weights & Biases: The developer-first MLOps platform. Build better models faster with experiment tracking, dataset versioning, and model management. Free tier for personal projects only, with 100 GB of storage included.
-
and the implementation https://github.com/google/trax/blob/master/trax/models/resea... if you are interested.
Hope you get to look into this!
-
stable-baselines3
PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
The latest release (v3.0.0) of Upkie's software brings a functional sim-to-real reinforcement learning pipeline based on Stable Baselines3, with standard sim-to-real tricks. The pipeline trains on the Gymnasium environments distributed in upkie.envs (setup: pip install upkie) and is implemented in the PPO balancer.
-
PaLM-rlhf-pytorch
Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM
Project mention: How should I get an in-depth mathematical understanding of generative AI? | /r/datascience | 2023-05-18
ChatGPT isn't open sourced, so we don't know what the actual implementation is. I think you can read Open Assistant's source code for application design. If that is too much, try Open Chat Toolkit's source code for developer tools. If you need a very bare implementation, you should go for lucidrains/PaLM-rlhf-pytorch.
-
Gymnasium
An API standard for single-agent reinforcement learning environments, with popular reference environments and related utilities (formerly Gym)
"Show HN: Ghidra Plays Mario" (2023) https://news.ycombinator.com/item?id=37475761 :
[RL, MuZero reduxxxx ]
> Farama-Foundation/Gymnasium is a fork of OpenAI/gym and it has support for additional Environments like MuJoCo: https://github.com/Farama-Foundation/Gymnasium#environments
> Farama-Foundation/MO-Gymnasium: "Multi-objective Gymnasium environments for reinforcement learning": https://github.com/Farama-Foundation/MO-Gymnasium
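The interface that Gymnasium standardizes is small: `reset` returns an `(observation, info)` pair, and `step` returns `(observation, reward, terminated, truncated, info)`. The sketch below mimics that convention in plain Python, without importing Gymnasium, so it can be read and run on its own. `CorridorEnv` and `run_episode` are hypothetical names invented for this illustration, not part of the library.

```python
class CorridorEnv:
    """Hypothetical toy environment following Gymnasium's reset/step shape.

    The agent starts at the left end of a corridor and must reach the
    right end. Actions: 0 = move left, 1 = move right.
    """

    def __init__(self, length=5, max_steps=20):
        self.length = length
        self.max_steps = max_steps
        self.position = 0
        self.steps = 0

    def reset(self, seed=None, options=None):
        # The environment is deterministic, so seed and options are unused.
        self.position = 0
        self.steps = 0
        return self.position, {}

    def step(self, action):
        move = 1 if action == 1 else -1
        self.position = min(max(self.position + move, 0), self.length - 1)
        self.steps += 1
        # terminated: the task itself ended (goal reached).
        terminated = self.position == self.length - 1
        # truncated: the episode was cut off by the step limit.
        truncated = not terminated and self.steps >= self.max_steps
        reward = 1.0 if terminated else 0.0
        return self.position, reward, terminated, truncated, {}


def run_episode(env, policy):
    """Roll out one episode; return (total reward, number of steps)."""
    obs, info = env.reset()
    total, done = 0.0, False
    while not done:
        obs, reward, terminated, truncated, info = env.step(policy(obs))
        total += reward
        done = terminated or truncated
    return total, env.steps
```

For example, `run_episode(CorridorEnv(), lambda obs: 1)` walks straight to the goal, while a policy that always returns `0` runs until the step limit truncates the episode. Keeping `terminated` and `truncated` separate lets learning code tell a real terminal state from a time limit.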
-
cleanrl
High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)
Project mention: [P] PettingZoo 1.24.0 has been released (including Stable-Baselines3 tutorials) | /r/reinforcementlearning | 2023-08-24
PettingZoo 1.24.0 is now live! This release includes Python 3.11 support, updated Chess and Hanabi environment versions, and many bugfixes, documentation updates and testing expansions. We are also very excited to announce 3 tutorials using Stable-Baselines3, and a full training script using CleanRL with TensorBoard and WandB.
-
trlx
A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)
Project mention: Why did Stability not copy Midjourney's RLHF process? And what's the future of Stable Diffusion? | /r/StableDiffusion | 2023-04-09
For example, we drove and released the top RLHF framework, TRLX, from our Carper AI lab; it is used by some of the biggest companies in the world: https://github.com/CarperAI/trlx
-
dm_control
Google DeepMind's software stack for physics-based simulation and Reinforcement Learning environments, using MuJoCo.
Project mention: Shimmy 1.0: Gymnasium & PettingZoo bindings for popular external RL environments | /r/farama | 2023-04-25
This includes single-agent Gymnasium wrappers for DM Control, DM Lab, Behavior Suite, Arcade Learning Environment, OpenAI Gym V21 & V26. Multi-agent PettingZoo wrappers support DM Control Soccer, OpenSpiel and Melting Pot.
-
pytorch-a2c-ppo-acktr-gail
PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).
-
Project mention: Instance segmentation of small objects in grainy drone imagery | /r/computervision | 2023-12-09
Python reinforcement-learning related posts
- Bayesianbandits: A Pythonic microframework for multi-armed bandit problems
- Adding Weapons
- Understand how transformers work by demystifying all the math behind them
- Show HN: An end-to-end reinforcement learning library for infinite horizon tasks
- Show HN: Easily train AlphaZero-like agents on any environment you want
- Problem with Truncated Quantile Critics (TQC) and n-step learning algorithm.
- Sim-to-real RL pipeline for open-source wheeled bipeds
Index
What are some of the best open-source reinforcement-learning projects in Python? This list will help you:
# | Project | Stars |
---|---|---|
1 | Ray | 30,474 |
2 | d2l-en | 21,335 |
3 | reinforcement-learning-an-introduction | 13,110 |
4 | machine_learning_examples | 8,040 |
5 | wandb | 8,036 |
6 | trax | 7,928 |
7 | pysc2 | 7,900 |
8 | stable-baselines3 | 7,704 |
9 | PaLM-rlhf-pytorch | 7,571 |
10 | TensorLayer | 7,275 |
11 | keras-rl | 5,478 |
12 | Gymnasium | 5,458 |
13 | cleanrl | 4,283 |
14 | trlx | 4,278 |
15 | stable-baselines | 4,000 |
16 | dm_control | 3,492 |
17 | polyaxon | 3,465 |
18 | pytorch-a2c-ppo-acktr-gail | 3,423 |
19 | ElegantRL | 3,376 |
20 | acme | 3,351 |
21 | tensorforce | 3,273 |
22 | football | 3,235 |
23 | catalyst | 3,216 |