TinyZero

Clean, minimal, accessible reproduction of DeepSeek R1-Zero (by Jiayi-Pan)

TinyZero Alternatives

Similar projects and alternatives to TinyZero

  1. CodeRabbit

    CodeRabbit: AI Code Reviews for Developers. Revolutionize your code reviews with AI. CodeRabbit offers PR summaries, code walkthroughs, 1-click suggestions, and AST-based analysis. Boost productivity and code quality across all major languages with each PR.

  2. DeepSeek-LLM

    26 mentions: TinyZero vs DeepSeek-LLM

    DeepSeek LLM: Let there be answers

  3. open-r1

    4 mentions: TinyZero vs open-r1

    Fully open reproduction of DeepSeek-R1

  4. s1

    10 mentions: TinyZero vs s1

    s1: Simple test-time scaling

  5. CreuSAT

    CreuSAT - A formally verified SAT solver written in Rust and verified with Creusot.

  6. hai-platform

    A high-performance deep learning training platform with task-level time-sharing scheduling of GPU compute

  7. SaaSHub

    SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives

  8. LLMstudio

    Framework to bring LLM applications to production

  9. r2md

    Convert an entire code repository (local or remote) to a single markdown or pdf file

  10. verify-rust-std

    Verifying the Rust standard library

  11. LightZero

    3 mentions: TinyZero vs LightZero

    [NeurIPS 2023 Spotlight] LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios (awesome MCTS)

NOTE: The number shown for a project counts its mentions in common posts plus user-suggested alternatives. A higher number therefore indicates a better TinyZero alternative or a more similar project.

TinyZero reviews and mentions

Posts with mentions or reviews of TinyZero. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2025-02-09.
  • LIMO: Less Is More for Reasoning
    5 projects | news.ycombinator.com | 9 Feb 2025
    Yes, the authors explicitly highlighted those two points in the abstract, in terms of them being the elicitation threshold for complex reasoning, namely, an extremely complete pre-trained foundation model, and a set of extremely high quality examples post-training.

    To your question on finetuning on the initial 10 million pool: intuitively, it would require a tremendous amount of finetuning data to move the needle. You really won't be able to move the gradients much with just 817 examples; that initial pool is effectively enforcing pretty rigid regularization.

    There is now increasing interest in showing that small data combined with inference-time scaling provides significant yield. A couple of recent examples:

    * TinyZero: https://github.com/Jiayi-Pan/TinyZero

  • Mini-R1: Reproduce DeepSeek R1 "Aha Moment"
    2 projects | news.ycombinator.com | 30 Jan 2025
    They do mention it here

    > Note: This blog is inspired by Jiayi Pan [1] who initially explored the idea and proofed it with a small model.

    But I agree, that attribution could be more substantial.

    > Note: This blog is inspired by Jiayi Pan [1] who also reproduced the "Aha Moment" with their TinyZero [2] model.

    [1] https://x.com/jiayi_pirate/status/1882839370505621655 (1.1M views btw)

    [2] https://github.com/Jiayi-Pan/TinyZero

    A lot of people are busy reproing R1 right now. I think this is the spark.

  • Berkeley Researchers Replicate DeepSeek R1's Core Tech for Just $30: A Small Mod
    2 projects | news.ycombinator.com | 28 Jan 2025
  • Berkeley Researchers Replicate DeepSeek R1's Core Tech for Just $30
    1 project | news.ycombinator.com | 27 Jan 2025
    This is blogspam of https://github.com/Jiayi-Pan/TinyZero and https://nitter.lucabased.xyz/jiayi_pirate/status/18828393705.... This also doesn't mention that it's for one specific domain (playing Countdown).
  • Explainer: What's R1 and Everything Else?
    1 project | news.ycombinator.com | 26 Jan 2025
    This is indeed a massive exaggeration. I'm pretty sure the $30 experiment is this one: https://threadreaderapp.com/thread/1882839370505621655.html (github: https://github.com/Jiayi-Pan/TinyZero).

    And while it is true that this experiment shows you can reproduce the concept of direct reinforcement learning on an existing LLM, in a way that makes it develop reasoning in the same fashion DeepSeek-R1 did, this is very far from a re-creation of R1!

  • DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via RL
    8 projects | news.ycombinator.com | 25 Jan 2025
    >I wonder if this was a deliberate move by PRC or really our own fault in falling for the fallacy that more is always better.

    Well, let’s see …hmmm… are we discussing this on a platform run by people who made insane money flipping zero-value companies to greater fools during the dotcom bubble, only to pivot to doing the same thing to big tech during the FANG era, or one for discussing hard ML research among the no-nonsense math elite from some of the world’s top universities?

    More seriously, we don’t have to even speculate about any of this because the methods from DeepSeek’s work are already being reproduced:

    https://github.com/Jiayi-Pan/TinyZero

  • TinyZero
    1 project | news.ycombinator.com | 24 Jan 2025

Stats

Basic TinyZero repo stats
Mentions: 9
Stars: 11,298
Activity: 9.4
Last commit: 13 days ago

