Top 23 Python Transformer Projects
-
nn
🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), GANs (cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠
-
LLaMA-Factory
Take a look at the hardware requirements at https://github.com/hiyouga/LLaMA-Factory?tab=readme-ov-file#...
A 'LoRA' is a memory-efficient type of fine-tuning that updates only a small fraction of the LLM's parameters, and 'quantisation' reduces an LLM to, say, 4 bits per parameter. Together they make it feasible to fine-tune a 7B-parameter model at home (see the sketch below).
Anything bigger than 7B parameters and you'll want to look at renting GPUs on a platform like RunPod. In the current market, used 4090s are selling on eBay for around $2,100, while RunPod will rent you a 4090 for $0.34/hr - you do the math.
It's certainly possible to scale model training across multiple nodes, but scaling up through bigger GPUs and more GPUs per machine is generally easier.
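To make the LoRA-plus-quantisation recipe concrete, here is a minimal sketch using Hugging Face transformers and peft; the checkpoint name and hyperparameters are placeholders, not anything from the thread, and bitsandbytes must be installed for 4-bit loading:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# Load the base model with weights quantized to 4 bits (requires bitsandbytes)
bnb = BitsAndBytesConfig(load_in_4bit=True,
                         bnb_4bit_compute_dtype=torch.bfloat16)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",   # placeholder ~7B checkpoint
    quantization_config=bnb,
    device_map="auto",
)

# Attach LoRA adapters: only these low-rank matrices are trained
lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"],
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # typically well under 1% trainable
```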
-
vit-pytorch
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
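A minimal usage sketch following the shape of the repository's README (sizes are illustrative):

```python
import torch
from vit_pytorch import ViT

# Patchify a 256x256 image into 32x32 patches and classify with a single
# transformer encoder
v = ViT(image_size=256, patch_size=32, num_classes=1000,
        dim=1024, depth=6, heads=16, mlp_dim=2048)

img = torch.randn(1, 3, 256, 256)
preds = v(img)  # (1, 1000) class logits
```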
-
haystack
AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.
Project mention: Building a Prompt-Based Crypto Trading Platform with RAG and Reddit Sentiment Analysis using Haystack | dev.to | 2025-04-28
Haystack forms the backbone of our RAG system. It provides pipelines for processing documents, embedding text, and retrieving relevant information.
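A minimal retrieval pipeline in the Haystack 2.x style, assuming the in-memory document store (documents and query are made up for illustration):

```python
from haystack import Document, Pipeline
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever

# Index a couple of toy documents
store = InMemoryDocumentStore()
store.write_documents([
    Document(content="Haystack pipelines connect models, stores and converters."),
    Document(content="RAG retrieves context before generating an answer."),
])

# A one-component pipeline: BM25 retrieval over the store
pipe = Pipeline()
pipe.add_component("retriever", InMemoryBM25Retriever(document_store=store))

result = pipe.run({"retriever": {"query": "What does RAG do?"}})
print(result["retriever"]["documents"][0].content)
```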
-
peft
Project mention: 💡 TinyLlama Meets LoRA: A Lightweight Approach to Emotion Classification | dev.to | 2025-05-05
TinyLlama is a lean, mean machine from the Llama family, designed for efficiency without sacrificing power. Fine-tuning it for emotion classification—identifying sadness, joy, love, anger, fear, or surprise in tweets—can be resource-intensive. That’s where LoRA, implemented via the PEFT library, saves the day. By updating only low-rank weight matrices, I can fine-tune effectively even on modest hardware.
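A hedged sketch of that setup with peft; the checkpoint name is the public TinyLlama chat model, and the six labels follow the post's description:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from peft import LoraConfig, get_peft_model

labels = ["sadness", "joy", "love", "anger", "fear", "surprise"]

model = AutoModelForSequenceClassification.from_pretrained(
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0", num_labels=len(labels))
tok = AutoTokenizer.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")
tok.pad_token = tok.eos_token          # Llama models ship without a pad token
model.config.pad_token_id = tok.pad_token_id

# LoRA for sequence classification: only the low-rank adapter weights
# (plus the new classification head) are trained
config = LoraConfig(task_type="SEQ_CLS", r=8, lora_alpha=16,
                    target_modules=["q_proj", "v_proj"])
model = get_peft_model(model, config)
model.print_trainable_parameters()
```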
-
RWKV-LM
RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". So it's combining the best of RNN and transformer - great performance, linear time, constant space (no kv-cache), fast training, infinite ctx_len, and free sentence embedding.
Project mention: Ask HN: Is anybody building an alternative transformer? | news.ycombinator.com | 2025-02-14
You can see all the development directly from them: https://github.com/BlinkDL/RWKV-LM
Last week version 7 was released, and each release brings significant improvements.
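The "constant space, no kv-cache" claim is the defining RNN property: generation carries a fixed-size state instead of attending over every past token. A toy recurrence (not RWKV's actual WKV kernel) makes the idea concrete:

```python
import torch

d = 64
Wx, Ws = torch.randn(d, d) * 0.1, torch.randn(d, d) * 0.1
state = torch.zeros(d)        # fixed-size state, regardless of context length

def step(x, state):
    """One token step: O(d^2) compute, O(d) memory, independent of position."""
    state = torch.tanh(x @ Wx + state @ Ws)
    return state, state       # (output, new state)

for t in range(10_000):       # 10k tokens, memory use never grows
    out, state = step(torch.randn(d), state)
```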
-
Megatron-LM
Project mention: Exploring the Exciting Possibilities of NVIDIA Megatron LM: A Fun and Friendly Code Walkthrough with PyTorch & NVIDIA Apex! | dev.to | 2024-10-25
```bash
# Install necessary dependencies
sudo apt update
sudo apt install python3-pip

# Install PyTorch with GPU support
pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113

# Clone Megatron LM repository
git clone https://github.com/NVIDIA/Megatron-LM.git
cd Megatron-LM

# Install Megatron LM dependencies
pip3 install -r requirements.txt

# Install NVIDIA Apex for mixed-precision training
git clone https://github.com/NVIDIA/apex
cd apex
pip3 install -v --disable-pip-version-check --no-cache-dir ./
```
-
txtai
💡 All-in-one open-source AI framework for semantic search, LLM orchestration and language model workflows
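A minimal semantic-search sketch in the style of txtai's introductory example (the embedding model is an assumption; any sentence-transformers checkpoint works):

```python
from txtai.embeddings import Embeddings

data = ["US tops 5 million confirmed virus cases",
        "Maine man wins $1M from $25 lottery ticket"]

# Build an index over the toy corpus
embeddings = Embeddings({"path": "sentence-transformers/all-MiniLM-L6-v2"})
embeddings.index([(i, text, None) for i, text in enumerate(data)])

# Nearest match by embedding similarity, not keyword overlap
uid, score = embeddings.search("feel good story", 1)[0]
print(data[int(uid)], score)
```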
-
segmentation_models.pytorch
Semantic segmentation models with 500+ pretrained convolutional and transformer-based backbones.
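Creating a model is a one-liner; here is a hedged sketch pairing a U-Net head with a pretrained ResNet-34 encoder (the choices are illustrative):

```python
import torch
import segmentation_models_pytorch as smp

model = smp.Unet(
    encoder_name="resnet34",      # any of the pretrained backbones
    encoder_weights="imagenet",   # ImageNet-pretrained encoder weights
    in_channels=3,
    classes=1,                    # single-class (binary) mask
)
mask_logits = model(torch.randn(1, 3, 256, 256))  # -> (1, 1, 256, 256)
```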
-
speechbrain
Simple Diarizer is a speaker diarization library that utilizes pretrained models from SpeechBrain. To get started with simple_diarizer, see the sketch below.
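A minimal sketch following the project's README ('audio.wav' and the printed segment fields are assumptions on my part):

```python
from simple_diarizer.diarizer import Diarizer

diar = Diarizer(
    embed_model="xvec",    # SpeechBrain x-vector embeddings ('ecapa' also works)
    cluster_method="sc",   # spectral clustering ('ahc' also supported)
)
segments = diar.diarize("audio.wav", num_speakers=2)

for seg in segments:       # each segment carries start/end times and a speaker label
    print(seg["start"], seg["end"], seg["label"])
```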
-
BigDL
Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, DeepSeek, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, vLLM, DeepSpeed, Axolotl, etc.
Project mention: FlashMoE: DeepSeek-R1 671B and Qwen3MoE 235B with 1~2 Intel B580 GPU in IPEX-LLM | news.ycombinator.com | 2025-05-12
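A hedged sketch of the drop-in transformers-style API from the ipex-llm docs (the checkpoint name is a placeholder):

```python
from ipex_llm.transformers import AutoModelForCausalLM
from transformers import AutoTokenizer

name = "Qwen/Qwen2-1.5B-Instruct"   # placeholder checkpoint
model = AutoModelForCausalLM.from_pretrained(name, load_in_4bit=True,
                                             trust_remote_code=True)
model = model.to("xpu")             # run on the Intel GPU

tok = AutoTokenizer.from_pretrained(name, trust_remote_code=True)
inputs = tok("What is speculative decoding?", return_tensors="pt").to("xpu")
print(tok.decode(model.generate(**inputs, max_new_tokens=32)[0]))
```
-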
PaLM-rlhf-pytorch
Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM
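Usage mirrors the repository's README; the sizes here are toy values:

```python
import torch
from palm_rlhf_pytorch import PaLM

palm = PaLM(num_tokens=20000, dim=512, depth=12)

seq = torch.randint(0, 20000, (1, 2048))
loss = palm(seq, return_loss=True)   # standard autoregressive LM loss
loss.backward()
```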
-
gpt-neox
An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries
Closest to this would be https://www.eleuther.ai, whose training data is largely public and whose training processes are openly discussed, planned, and evaluated on their Discord server. Much of their training dataset is available at https://the-eye.eu (their onion link is considered "primary", however, due to copyright concerns).
-
OpenRLHF
An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & LoRA & vLLM & RFT)
-
courses
This repository is a curated collection of links to various courses and resources about Artificial Intelligence (AI) (by SkalskiP)
-
DALLE-pytorch
Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch
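A condensed sketch of the README's training interface (toy sizes; a trained DiscreteVAE would be needed for real generations):

```python
import torch
from dalle_pytorch import DiscreteVAE, DALLE

vae = DiscreteVAE(image_size=256, num_layers=3, num_tokens=8192,
                  codebook_dim=1024, hidden_dim=64)

dalle = DALLE(dim=1024, vae=vae, num_text_tokens=10000,
              text_seq_len=256, depth=1, heads=16)

text = torch.randint(0, 10000, (4, 256))
images = torch.randn(4, 3, 256, 256)
loss = dalle(text, images, return_loss=True)  # joint text-image LM loss
loss.backward()
```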
-
x-transformers
A concise but complete full-attention transformer with a set of promising experimental features from various papers
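A minimal decoder-only model in the README's style (sizes are illustrative):

```python
import torch
from x_transformers import TransformerWrapper, Decoder

model = TransformerWrapper(
    num_tokens=20000,
    max_seq_len=1024,
    attn_layers=Decoder(dim=512, depth=6, heads=8),
)

x = torch.randint(0, 20000, (1, 1024))
logits = model(x)   # (1, 1024, 20000)
```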
-
Chinese-CLIP
Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
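A hedged retrieval sketch with the cn_clip package (the image path and captions are made up; the captions mean "a cat" / "a dog"):

```python
import torch
from PIL import Image
import cn_clip.clip as clip
from cn_clip.clip import load_from_name

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = load_from_name("ViT-B-16", device=device)
model.eval()

image = preprocess(Image.open("cat.jpg")).unsqueeze(0).to(device)
texts = clip.tokenize(["一只猫", "一只狗"]).to(device)

with torch.no_grad():
    img = model.encode_image(image)
    txt = model.encode_text(texts)
    img = img / img.norm(dim=-1, keepdim=True)
    txt = txt / txt.norm(dim=-1, keepdim=True)
    print((100 * img @ txt.T).softmax(dim=-1))  # probability per caption
```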
Python Transformers related posts
-
FlashMoE: DeepSeek-R1 671B and Qwen3MoE 235B with 1~2 Intel B580 GPU in IPEX-LLM
-
💡 TinyLlama Meets LoRA: A Lightweight Approach to Emotion Classification
-
DeepSeek R1 671B Q4_K_M with 1~2 Arc A770 on Xeon
-
Show HN: Open-Source Windows AI assistant that works with Word, Excel through COM
-
IPEX-LLM Portable Zip for Ollama on Intel GPU
-
AI founders will learn The Bitter Lesson
-
Building an AI-Powered Background Remover with React and Transformers.js
Index
What are some of the best open-source Transformer projects in Python? This list will help you (star counts as of 14 May 2025):
# | Project | Stars |
---|---|---|
1 | nn | 60,495 |
2 | LLaMA-Factory | 48,648 |
3 | vit-pytorch | 22,795 |
4 | haystack | 20,614 |
5 | peft | 18,392 |
6 | ml-engineering | 13,653 |
7 | RWKV-LM | 13,575 |
8 | PaddleNLP | 12,569 |
9 | Megatron-LM | 12,297 |
10 | txtai | 10,893 |
11 | segmentation_models.pytorch | 10,440 |
12 | speechbrain | 9,808 |
13 | BigDL | 7,853 |
14 | PaLM-rlhf-pytorch | 7,805 |
15 | gpt-neox | 7,172 |
16 | bertviz | 7,147 |
17 | BERTopic | 6,750 |
18 | OpenRLHF | 6,666 |
19 | courses | 5,968 |
20 | DALLE-pytorch | 5,598 |
21 | openchat | 5,343 |
22 | x-transformers | 5,294 |
23 | Chinese-CLIP | 5,177 |