Python reinforcement-learning

Open-source Python projects categorized as reinforcement-learning

Top 23 Python reinforcement-learning Projects

  1. nn

    🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans(cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠

  2. SaaSHub

    SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives

  3. unsloth

    Unsloth Studio is a web UI for training and running open models like Gemma 4, Qwen3.6, DeepSeek, gpt-oss locally.

    Project mention: I Trained an LLM on 75K of My Own Messages So It Would Stop Writing Like a Chatbot | dev.to | 2026-05-08

    Training: unsloth + trl (SFTTrainer). Unsloth handles the 4-bit quantization and gradient checkpointing; trl handles the training loop.

  4. Ray

    Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

    Project mention: GSoC 2026 Predictions: 30 NEW AI/ML/Security Organizations You Should Start Contributing to NOW! | dev.to | 2026-02-06

    Main: https://github.com/ray-project/ray ⭐ 34k+

  5. sglang

    SGLang is a high-performance serving framework for large language models and multimodal models.

    Project mention: DeepSeek makes the V4 Pro price discount permanent | news.ycombinator.com | 2026-05-22

    There are several things at play:

    Inference stack efficiency: Many of these providers take off-the-shelf sglang / vllm / trtllm and hope for the best. Meanwhile, the DeepSeek team is known for pushing the boundary of optimizations.

    Now, sglang and vllm are great pieces of software, but take DeepSeek's Sparse Attention (DSA). It was introduced 1.5 years ago (https://arxiv.org/abs/2512.02556) and is used by DeepSeek 3.2, GLM 5, and DeepSeek V4. Only now is it slowly starting to get optimized in the major inference engines (https://github.com/sgl-project/sglang/issues/19380, https://github.com/sgl-project/sglang/pull/22851, etc.). Of course, DS V4 adds extra optimizations into the model architecture on top of DSA, and the open source inference engines will need more time to take full advantage of those.

    Privacy: Betting that people will pay extra for inference hosted outside China. This is especially true with DeepSeek, because DeepSeek is transparent about using API data for model improvements.

    And a few other things: scale (which matters a lot for MoEs), reliability, soft enterprise lock-in, etc.

    ---

    There is also, likely, tacit collusion at play here. Look at GLM 5 and GLM 5.1 prices. GLM 5 and 5.1 cost the same to run, but providers decided to charge much more for 5.1 because it is a much better model, and because Z.AI raised their price as well.

  6. d2l-en

    Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 500 universities from 70 countries including Stanford, MIT, Harvard, and Cambridge.

  7. agent-lightning

    The absolute trainer to light up AI agents.

    Project mention: Daily Artificial Intelligence Digest - Oct 26, 2025 | dev.to | 2025-10-25

    Microsoft's agent-lightning project offers a comprehensive toolkit aimed at accelerating the process of building, testing, and deploying AI Agents. This open-source initiative highlights the industry's commitment to enabling faster development and implementation of advanced AI capabilities, providing developers with robust resources to streamline AI agent creation.

  8. reinforcement-learning-an-introduction

    Python Implementation of Reinforcement Learning: An Introduction

  9. ai-engineering-from-scratch

    Learn it. Build it. Ship it for others.

    Project mention: One Open Source Project per Day #74: ai-engineering-from-scratch - Build AI Full-stack Skills from Ground Up | dev.to | 2026-05-23

    git clone https://github.com/rohitg00/ai-engineering-from-scratch.git
    cd ai-engineering-from-scratch
    python phases/01-math-foundations/01-linear-algebra-intuition/code/vectors.py

  10. stable-baselines3

    PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.

  11. Gymnasium

    An API standard for single-agent reinforcement learning environments, with popular reference environments and related utilities (formerly Gym)

  12. wandb

    The AI developer platform. Use Weights & Biases to train and fine-tune models, and manage models from experimentation to production.

    Project mention: The $100 ChatGPT: Why Karpathy's nanochat Represents the Next Big Thing | dev.to | 2026-05-04

    Each stage is comprehensible. Each stage is hackable. You can literally watch it get smarter in real-time through the wandb plots.

  13. cleanrl

    High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)

  14. ART

    Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen3.6, GPT-OSS, Llama, and more!

    Project mention: Code is not being read anymore, right? | news.ycombinator.com | 2026-02-22

    Huh?

    I thought the latest advance in computing (spring 2025) was self-play / reinforcement learning. Like, we ran out of training data a few years ago.

    https://github.com/OpenPipe/ART

    Reinforcement learning has the large language model devise puzzles that it then solves, with the results scored via llm-as-judge.

    With llm-as-judge, your llm generates 8-12 trajectories and a different llm judges the results. I'd use an oracle like Windows or Linux operating system execution for the problem of ISA-assembly creation.

    The winning entries are used to train the large language model.

  15. OpenRLHF

    An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & VLM & TIS & vLLM & Ray & Async RL)

  16. machine_learning_examples

    A collection of machine learning examples and tutorials.

  17. pysc2

    StarCraft II Learning Environment

  18. TensorLayer

    Deep Learning and Reinforcement Learning Library for Scientists and Engineers

  19. keras-rl

    Deep Reinforcement Learning for Keras.

  20. AReaL

    The RL Bridge for LLM-based Agent Applications. Made Simple & Flexible.

    Project mention: Text-to-LoRA & AReaL: AI Builder Breakthroughs | dev.to | 2026-01-25

    While mainstream AI chatter circles ever-larger models, two research drops last week point to something more tactical: faster, cheaper ways to customize and train what you already have. Sakana AI's Text-to-LoRA (T2L) slashes adapter creation to a single prompt, and the AReaL framework squeezes 2-3× more throughput from your RLHF cluster. Let's unpack the wins and risks.

  21. xtuner

    A Next-Generation Training Engine Built for Ultra-Large MoE Models

  22. trlx

    A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)

  23. dm_control

    Google DeepMind's software stack for physics-based simulation and Reinforcement Learning environments, using MuJoCo.

  24. ElegantRL

    Massively Parallel Deep Reinforcement Learning. 🔥
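
The Gymnasium entry above describes an API standard for single-agent environments. As a rough illustration of what that interface looks like, here is a minimal sketch of a toy environment that mirrors the `reset`/`step` call shape (observation and info from `reset`; observation, reward, terminated, truncated, and info from `step`). It deliberately does not import Gymnasium: `CoinFlipEnv`, `run_episode`, and the matching-coin task are hypothetical names invented for this example, not part of any library on this list.

```python
import random


class CoinFlipEnv:
    """Hypothetical toy environment mirroring the Gymnasium-style call shape.

    The observation is the current coin face (0 or 1); the agent is rewarded
    for choosing the action equal to the face it was just shown.
    """

    def __init__(self, max_steps=5):
        self.max_steps = max_steps
        self._rng = random.Random()
        self._steps = 0
        self._coin = 0

    def reset(self, seed=None):
        # Re-seed only when a seed is given, then start a fresh episode.
        if seed is not None:
            self._rng = random.Random(seed)
        self._steps = 0
        self._coin = self._rng.randint(0, 1)
        return self._coin, {}  # (observation, info)

    def step(self, action):
        reward = 1.0 if action == self._coin else 0.0
        self._steps += 1
        self._coin = self._rng.randint(0, 1)  # next observation
        terminated = False  # this toy task has no terminal state
        truncated = self._steps >= self.max_steps  # time limit reached
        return self._coin, reward, terminated, truncated, {}


def run_episode(env, policy, seed=0):
    """Run one episode and return the summed reward."""
    obs, info = env.reset(seed=seed)
    total = 0.0
    done = False
    while not done:
        obs, reward, terminated, truncated, info = env.step(policy(obs))
        total += reward
        done = terminated or truncated
    return total
```

A policy that copies the observation, `run_episode(CoinFlipEnv(), lambda obs: obs)`, collects the full reward on every step. The same loop should carry over to a real Gymnasium environment, since it relies only on the `reset` and `step` return shapes.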

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).
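
The ART entry above mentions GRPO, and the quoted comment describes sampling a group of trajectories, having a judge score them, and training on the winners. A minimal sketch of the group-relative scoring step might look like the following. It assumes the judge scores are already available as plain numbers and uses the population standard deviation for normalization. The function names are hypothetical and this is not ART's actual code.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own sampling group.

    Each trajectory's advantage is (reward - group mean) / (group std + eps),
    so only how a sample compares with its siblings matters.
    Assumption: population standard deviation; implementations may differ.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def select_winners(trajectories, rewards):
    """Keep the trajectories that scored above their group average."""
    advantages = group_relative_advantages(rewards)
    return [t for t, a in zip(trajectories, advantages) if a > 0]
```

For example, `select_winners(["a", "b", "c", "d"], [1.0, 0.0, 1.0, 0.0])` keeps `"a"` and `"c"`. When every trajectory in a group gets the same score, all advantages are zero and the group carries no training signal.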

Python reinforcement-learning related posts

  • I'm Scared About Biological Computing

    1 project | news.ycombinator.com | 5 May 2026
  • The Evolution of GUI Agents: From RPA Scripts to AI That Sees Your Screen

    2 projects | dev.to | 9 Apr 2026
  • Open Source Project of the Day (Part 10): AgentEvolver - Self-Evolving Agent System for Autonomous Learning and Evolution

    1 project | dev.to | 8 Mar 2026
  • Simular Agent S hits 72.6% success on 369 real computer tasks (human: 72.36%)

    1 project | news.ycombinator.com | 16 Dec 2025
  • Learning to Model the World with Language

    1 project | news.ycombinator.com | 6 Nov 2025
  • maze VS pi-optimal - a user suggested alternative

    2 projects | 30 Oct 2025
  • Daily Artificial Intelligence Digest - Oct 26, 2025

    1 project | dev.to | 25 Oct 2025

Index

What are some of the best open-source reinforcement-learning projects in Python? This list will help you:

# Project Stars
1 nn 66,902
2 unsloth 65,904
3 Ray 42,791
4 sglang 28,872
5 d2l-en 28,853
6 agent-lightning 17,276
7 reinforcement-learning-an-introduction 14,640
8 ai-engineering-from-scratch 13,774
9 stable-baselines3 13,381
10 Gymnasium 12,001
11 wandb 11,104
12 cleanrl 9,911
13 ART 9,893
14 OpenRLHF 9,596
15 machine_learning_examples 8,877
16 pysc2 8,276
17 TensorLayer 7,389
18 keras-rl 5,554
19 AReaL 5,252
20 xtuner 5,151
21 trlx 4,743
22 dm_control 4,607
23 ElegantRL 4,337


Did you know that Python is the 1st most popular programming language based on number of references?