TinyZero Alternatives
Similar projects and alternatives to TinyZero
-
LightZero
[NeurIPS 2023 Spotlight] LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios (awesome MCTS)
TinyZero discussion
TinyZero reviews and mentions
-
LIMO: Less Is More for Reasoning
Yes, the authors explicitly highlight those two points in the abstract as the elicitation threshold for complex reasoning: an extremely complete pre-trained foundation model and a set of extremely high-quality post-training examples.
To your question on finetuning on the initial 10 million pool: intuitively, it would require a tremendous amount of finetuning data to move the needle. You really won't be able to move the gradients much with just 817 examples; that initial pool is effectively enforcing pretty rigid regularization.
There is now increasing interest in showing that small data combined with inference-time scaling provides significant yield. A couple of recent examples:
* TinyZero: https://github.com/Jiayi-Pan/TinyZero
-
Mini-R1: Reproduce DeepSeek R1 "Aha Moment"
They do mention it here
> Note: This blog is inspired by Jiayi Pan [1] who initially explored the idea and proofed it with a small model.
But I agree that the attribution could be more substantial.
> Note: This blog is inspired by Jiayi Pan [1] who also reproduced the "Aha Moment" with their TinyZero [2] model.
[1] https://x.com/jiayi_pirate/status/1882839370505621655 (1.1M views btw)
[2] https://github.com/Jiayi-Pan/TinyZero
A lot of people are busy reproing R1 right now. I think this is the spark.
- Berkeley Researchers Replicate DeepSeek R1's Core Tech for Just $30: A Small Mod
-
Berkeley Researchers Replicate DeepSeek R1's Core Tech for Just $30
This is blogspam of https://github.com/Jiayi-Pan/TinyZero and https://nitter.lucabased.xyz/jiayi_pirate/status/18828393705.... This also doesn't mention that it's for one specific domain (playing Countdown).
-
Explainer: What's R1 and Everything Else?
This is indeed a massive exaggeration. I'm pretty sure the $30 experiment is this one: https://threadreaderapp.com/thread/1882839370505621655.html (github: https://github.com/Jiayi-Pan/TinyZero).
And while it is true that this experiment shows you can reproduce the concept of direct reinforcement learning on an existing LLM, in a way that makes it develop reasoning in the same fashion DeepSeek-R1 did, this is very far from a re-creation of R1!
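To make "direct reinforcement learning" on a task like Countdown more concrete, here is a minimal sketch of the kind of rule-based reward such a setup could use: the model proposes an arithmetic expression, and a checker scores it 1.0 only if it reaches the target using the given numbers. This is not TinyZero's actual code; the function name `countdown_reward`, the "each number at most once" rule, and the 0/1 scoring are all assumptions made for illustration.

```python
import ast
import operator
from collections import Counter
from fractions import Fraction

# Only the four basic arithmetic operators are allowed (an assumption).
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _evaluate(node, used):
    """Evaluate a restricted arithmetic AST, recording every integer literal."""
    if (
        isinstance(node, ast.Constant)
        and isinstance(node.value, int)
        and not isinstance(node.value, bool)
    ):
        used.append(node.value)
        return Fraction(node.value)  # exact arithmetic, no float rounding
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        left = _evaluate(node.left, used)
        right = _evaluate(node.right, used)
        return _OPS[type(node.op)](left, right)
    raise ValueError("unsupported syntax")


def countdown_reward(expression, numbers, target):
    """Hypothetical reward: 1.0 if `expression` hits `target` using each
    provided number at most once, otherwise 0.0."""
    try:
        tree = ast.parse(expression, mode="eval")
        used = []
        value = _evaluate(tree.body, used)
    except (SyntaxError, ValueError, ZeroDivisionError):
        return 0.0
    # Counter subtraction keeps only numbers used more often than provided.
    if Counter(used) - Counter(numbers):
        return 0.0
    return 1.0 if value == target else 0.0
```

For example, `countdown_reward("(25 - 5) * 3", [25, 5, 3, 7], 60)` scores the expression as correct, while an expression that reuses a number or misses the target scores zero. Because the reward is checked mechanically, no learned reward model is needed, which is part of why such an experiment can stay small.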
-
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via RL
>I wonder if this was a deliberate move by PRC or really our own fault in falling for the fallacy that more is always better.
Well, let’s see …hmmm… are we discussing this on a platform run by people who made insane money flipping zero-value companies to greater fools during the dotcom bubble, only to pivot to doing the same thing to big tech during the FANG era, or on one for discussing hard ML research among the no-nonsense math elite from some of the world’s top universities?
More seriously, we don’t have to even speculate about any of this because the methods from DeepSeek’s work are already being reproduced:
https://github.com/Jiayi-Pan/TinyZero
- TinyZero
Stats
Jiayi-Pan/TinyZero is an open source project licensed under Apache License 2.0 which is an OSI approved license.
The primary programming language of TinyZero is Python.