Top 13 Python pretraining Projects
-
OFA
Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
-
SparK
[ICLR'23 Spotlight🔥] The first successful BERT/MAE-style pretraining on any convolutional network; Pytorch impl. of "Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling" (by keyu-tian)
-
ImageNet21K
Official PyTorch implementation of the paper "ImageNet-21K Pretraining for the Masses" (NeurIPS 2021)
-
Revisiting-Contrastive-SSL
Revisiting Contrastive Methods for Unsupervised Learning of Visual Representations. [NeurIPS 2021]
-
awesome-clip-papers
The most impactful papers related to contrastive pretraining for multimodal models!
-
autolab
Companion tools for Karpathy's autoresearch - smarter evaluation, guided steering, and multi-agent competitions for GPT pretraining
Repo: github.com/dean0x/autolab
-
easy-torch-tpu
A flexible pipeline for training custom research-scale models on Google Cloud TPUs using PyTorch/XLA
Project mention: TorchTPU: Running PyTorch Natively on TPUs at Google Scale | news.ycombinator.com | 2026-04-23
This is great to see.
I trained some research models using the existing PyTorch/XLA on TPUs, and it was a mess of undocumented behavior and bugs (silently hanging after 8 hours of training!).
If anyone is trying to use PyTorch on TPU before TorchTPU is released, you can check out the training pipeline that I ended up building to support my research: https://github.com/aklein4/easy-torch-tpu
Index
What are some of the best open-source pretraining projects in Python? This list will help you:
| # | Project | Stars |
|---|---|---|
| 1 | LMOps | 4,403 |
| 2 | OFA | 2,557 |
| 3 | mPLUG-Owl | 2,539 |
| 4 | SparK | 1,370 |
| 5 | ImageNet21K | 780 |
| 6 | PITI | 502 |
| 7 | LinkBERT | 449 |
| 8 | llm-baselines | 118 |
| 9 | Revisiting-Contrastive-SSL | 89 |
| 10 | awesome-clip-papers | 78 |
| 11 | tabular-dl-pretrain-objectives | 68 |
| 12 | autolab | 6 |
| 13 | easy-torch-tpu | 5 |