open-r1 VS TinyZero

Compare open-r1 and TinyZero and see what their differences are.

open-r1

Fully open reproduction of DeepSeek-R1 (by huggingface)

TinyZero

Clean, minimal, accessible reproduction of DeepSeek R1-Zero (by Jiayi-Pan)
                 open-r1              TinyZero
Mentions         4                    9
Stars            22,359               11,174
Growth           99.5%                54.1%
Activity         9.4                  9.4
Last commit      7 days ago           5 days ago
Language         Python               Python
License          Apache License 2.0   Apache License 2.0
The number of mentions indicates the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

open-r1

Posts with mentions or reviews of open-r1. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2025-01-25.

TinyZero

Posts with mentions or reviews of TinyZero. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2025-02-09.
  • LIMO: Less Is More for Reasoning
    5 projects | news.ycombinator.com | 9 Feb 2025
    Yes, the authors explicitly highlight those two points in the abstract as the elicitation threshold for complex reasoning: an extremely complete pre-trained foundation model, and a set of extremely high-quality post-training examples.

    To your question on finetuning on the initial 10-million pool: intuitively, it would require a tremendous amount of finetuning data to move the needle. You really won't be able to move the gradients much with just 817 examples; that initial pool is effectively enforcing pretty rigid regularization.

    There is now increasing interest in showing that small data with inference-time scaling provides significant yield. A couple of recent examples:

    * TinyZero: https://github.com/Jiayi-Pan/TinyZero

  • Mini-R1: Reproduce DeepSeek R1 "Aha Moment"
    2 projects | news.ycombinator.com | 30 Jan 2025
    They do mention it here

    > Note: This blog is inspired by Jiayi Pan [1] who initially explored the idea and proofed it with a small model.

    But I agree that attribution could be more substantial.

    > Note: This blog is inspired by Jiayi Pan [1] who also reproduced the "Aha Moment" with their TinyZero [2] model.

    [1] https://x.com/jiayi_pirate/status/1882839370505621655 (1.1M views btw)

    [2] https://github.com/Jiayi-Pan/TinyZero

    A lot of people are busy reproing R1 right now. I think this is the spark.

  • Berkeley Researchers Replicate DeepSeek R1's Core Tech for Just $30: A Small Mod
    2 projects | news.ycombinator.com | 28 Jan 2025
  • Berkeley Researchers Replicate DeepSeek R1's Core Tech for Just $30
    1 project | news.ycombinator.com | 27 Jan 2025
    This is blogspam of https://github.com/Jiayi-Pan/TinyZero and https://nitter.lucabased.xyz/jiayi_pirate/status/18828393705.... This also doesn't mention that it's for one specific domain (playing Countdown).
  • Explainer: What's R1 and Everything Else?
    1 project | news.ycombinator.com | 26 Jan 2025
    This is indeed a massive exaggeration; I'm pretty sure the $30 experiment is this one: https://threadreaderapp.com/thread/1882839370505621655.html (github: https://github.com/Jiayi-Pan/TinyZero).

    And while it is true that this experiment shows you can reproduce the concept of direct reinforcement learning on an existing LLM, in a way that makes it develop reasoning in the same fashion DeepSeek-R1 did, it is very far from a re-creation of R1!

  • DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via RL
    8 projects | news.ycombinator.com | 25 Jan 2025
    >I wonder if this was a deliberate move by PRC or really our own fault in falling for the fallacy that more is always better.

    Well, let’s see… hmmm… are we discussing this on a platform run by people who made insane money flipping zero-value companies to greater fools during the dotcom bubble, only to pivot to doing the same thing to big tech during the FANG era, or one for discussing hard ML research among the no-nonsense math elite from some of the world’s top universities?

    More seriously, we don’t have to even speculate about any of this because the methods from DeepSeek’s work are already being reproduced:

    https://github.com/Jiayi-Pan/TinyZero

  • TinyZero
    1 project | news.ycombinator.com | 24 Jan 2025
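The Countdown setup referenced in the posts above is a big part of why the "direct RL on an existing LLM" recipe is cheap to reproduce: correctness is fully verifiable, so the reward can be a simple rule-based check rather than a learned reward model. The sketch below is illustrative only; the countdown_reward name and the <answer> tag convention are assumptions made for this example, not TinyZero's actual code.

    import re
    from collections import Counter

    def countdown_reward(completion: str, target: int, numbers: list[int]) -> float:
        """Rule-based reward for a Countdown-style completion.

        Assumes the model puts its final equation inside <answer>...</answer>;
        the tag format is an illustrative assumption, not TinyZero's exact convention.
        """
        match = re.search(r"<answer>(.*?)</answer>", completion, re.DOTALL)
        if match is None:
            return 0.0  # unparseable output earns no reward
        equation = match.group(1).strip()

        # Only digits, arithmetic operators, parentheses and spaces are allowed.
        if not re.fullmatch(r"[\d+\-*/() .]+", equation):
            return 0.0

        # Each provided number must be used exactly once, and no others.
        if Counter(int(n) for n in re.findall(r"\d+", equation)) != Counter(numbers):
            return 0.0

        try:
            value = eval(equation, {"__builtins__": {}}, {})  # arithmetic only, builtins disabled
        except Exception:
            return 0.0

        return 1.0 if abs(value - target) < 1e-6 else 0.0

    # Example: only an equation that uses 4, 5, 6 exactly once and equals 50 scores 1.0.
    print(countdown_reward("<answer>(6 + 4) * 5</answer>", target=50, numbers=[4, 5, 6]))

Because the check is exact, policy-gradient RL against a reward like this needs no human preference labels and no reward model, which is much of what keeps such single-domain reproductions inexpensive.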

What are some alternatives?

When comparing open-r1 and TinyZero you can also consider the following projects:

DeepSeek-R1

DeepSeek-V3

DeepSeek-LLM - DeepSeek LLM: Let there be answers

