Python Transformers

Open-source Python projects categorized as Transformers

Top 23 Python Transformer Projects

Transformers
  1. nn

    🧑‍🏫 60+ implementations/tutorials of deep learning papers with side-by-side notes 📝, including transformers (original, XL, Switch, Feedback, ViT, ...), optimizers (Adam, AdaBelief, Sophia, ...), GANs (CycleGAN, StyleGAN2, ...), 🎮 reinforcement learning (PPO, DQN), CapsNet, distillation, and more 🧠

  2. LLaMA-Factory

    Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

    Project mention: Fine-tune Google's Gemma 3 | news.ycombinator.com | 2025-03-19

    Take a look at the hardware requirements at https://github.com/hiyouga/LLaMA-Factory?tab=readme-ov-file#...

    A 'LoRA' is a memory-efficient type of fine-tuning that tunes only a small fraction of the LLM's parameters, and 'quantisation' reduces an LLM to, say, 4 bits per parameter. Together, these make it feasible to fine-tune a 7B-parameter model at home.

    Anything bigger than 7B parameters and you'll want to look at renting GPUs on a platform like Runpod. Used 4090s are selling on eBay right now for around $2,100, while Runpod will rent you a 4090 for $0.34/hr; you do the math.

    It's certainly possible to scale model training to span multiple nodes, but generally scaling through bigger GPUs and more GPUs per machine is easier.
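
    That arithmetic is easy to sanity-check. A rough weights-only estimate (activations, optimizer state, and LoRA adapters add overhead on top):

      # Approximate VRAM needed just to hold the weights of a 7B-parameter model
      params = 7e9
      for bits in (16, 8, 4):
          gib = params * bits / 8 / 2**30
          print(f"{bits:>2}-bit: {gib:.1f} GiB")
      # 16-bit: 13.0 GiB -- tight even on a 16 GB card
      #  8-bit:  6.5 GiB
      #  4-bit:  3.3 GiB -- fits comfortably on a consumer GPU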

  3. vit-pytorch

    Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch
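
    A minimal usage sketch, following the pattern in the project's README (hyperparameters here are illustrative, not tuned):

      import torch
      from vit_pytorch import ViT

      # Vision Transformer that splits a 256x256 image into 32x32 patches
      v = ViT(
          image_size = 256,
          patch_size = 32,
          num_classes = 1000,
          dim = 1024,
          depth = 6,
          heads = 16,
          mlp_dim = 2048
      )

      img = torch.randn(1, 3, 256, 256)  # dummy batch of one image
      preds = v(img)                     # (1, 1000) class logits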

  4. haystack

    AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search, or conversational agent chatbots.

    Project mention: Building a Prompt-Based Crypto Trading Platform with RAG and Reddit Sentiment Analysis using Haystack | dev.to | 2025-04-28

    Haystack forms the backbone of our RAG system. It provides pipelines for processing documents, embedding text, and retrieving relevant information.
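
    A minimal retrieval pipeline sketch, assuming Haystack 2.x with its in-memory document store (the documents and query are made up):

      from haystack import Document, Pipeline
      from haystack.document_stores.in_memory import InMemoryDocumentStore
      from haystack.components.retrievers.in_memory import InMemoryBM25Retriever

      # Index a few documents in an in-memory store
      store = InMemoryDocumentStore()
      store.write_documents([
          Document(content="Haystack pipelines connect components into graphs."),
          Document(content="BM25 is a classic sparse retrieval method."),
      ])

      # One-component pipeline: query in, ranked documents out
      pipe = Pipeline()
      pipe.add_component("retriever", InMemoryBM25Retriever(document_store=store))

      result = pipe.run({"retriever": {"query": "How do pipelines work?"}})
      print(result["retriever"]["documents"][0].content)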

  5. peft

    🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

    Project mention: 💡 TinyLlama Meets LoRA: A Lightweight Approach to Emotion Classification | dev.to | 2025-05-05

    TinyLlama is a lean, mean machine from the Llama family, designed for efficiency without sacrificing power. Fine-tuning it for emotion classification—identifying sadness, joy, love, anger, fear, or surprise in tweets—can be resource-intensive. That’s where LoRA, implemented via the PEFT library, saves the day. By updating only low-rank weight matrices, I can fine-tune effectively even on modest hardware.
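
    A minimal sketch of that setup with PEFT (the model id and LoRA hyperparameters are illustrative, not the article's exact configuration):

      from transformers import AutoModelForCausalLM
      from peft import LoraConfig, TaskType, get_peft_model

      base = AutoModelForCausalLM.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")

      # LoRA: freeze the base weights and train small low-rank adapters instead
      config = LoraConfig(
          task_type=TaskType.CAUSAL_LM,
          r=8,                 # rank of the update matrices
          lora_alpha=16,       # scaling factor
          lora_dropout=0.1,
          target_modules=["q_proj", "v_proj"],  # adapters on attention projections
      )

      model = get_peft_model(base, config)
      model.print_trainable_parameters()  # typically well under 1% of all parameters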

  6. ml-engineering

    Machine Learning Engineering Open Book

  7. RWKV-LM

    RWKV (pronounced RwaKuv) is an RNN with great LLM performance that can also be trained directly like a GPT transformer (parallelizable). The current version is RWKV-7 "Goose". It combines the best of RNNs and transformers: great performance, linear time, constant space (no KV cache), fast training, infinite ctx_len, and free sentence embeddings.

    Project mention: Ask HN: Is anybody building an alternative transformer? | news.ycombinator.com | 2025-02-14

    You can see all the development directly from them: https://github.com/BlinkDL/RWKV-LM

    Version 7 was released last week, and each release brings significant improvements.
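
    The "constant space" claim is the key architectural difference. A toy sketch of the inference interface (illustrative only; RWKV-7's actual recurrence is far more elaborate) shows how a recurrent model sidesteps the transformer's growing KV cache:

      import torch

      # Toy recurrent cell: state size is fixed no matter how long the context gets,
      # whereas a transformer's KV cache grows linearly with sequence length.
      dim = 512
      state = torch.zeros(dim)

      def step(token_emb, state):
          return torch.tanh(token_emb + 0.9 * state)  # placeholder update rule

      for t in range(100_000):                    # arbitrarily long context
          state = step(torch.randn(dim), state)   # memory stays O(dim)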

  8. PaddleNLP

    Easy-to-use and powerful LLM and SLM library with awesome model zoo.
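
    A quick sketch using PaddleNLP's Taskflow interface (the task name follows its docs; the input sentence is made up):

      from paddlenlp import Taskflow

      # Taskflow wraps a pretrained model behind a one-line task interface
      senta = Taskflow("sentiment_analysis")
      print(senta("这家餐厅的菜品非常好吃"))  # "the food at this restaurant is delicious"
      # -> predicted label ("positive") with a confidence score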

  9. Megatron-LM

    Ongoing research training transformer models at scale

    Project mention: Exploring the Exciting Possibilities of NVIDIA Megatron LM: A Fun and Friendly Code Walkthrough with PyTorch & NVIDIA Apex! | dev.to | 2024-10-25

    # Install necessary dependencies
    sudo apt update
    sudo apt install python3-pip

    # Install PyTorch with GPU support
    pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113

    # Clone the Megatron-LM repository
    git clone https://github.com/NVIDIA/Megatron-LM.git
    cd Megatron-LM

    # Install Megatron-LM dependencies
    pip3 install -r requirements.txt

    # Install NVIDIA Apex for mixed-precision training
    git clone https://github.com/NVIDIA/apex
    cd apex
    pip3 install -v --disable-pip-version-check --no-cache-dir ./

  10. txtai

    💡 All-in-one open-source AI framework for semantic search, LLM orchestration and language model workflows

    Project mention: Chunking your data for RAG | dev.to | 2025-02-11
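
    A minimal semantic-search sketch with txtai (assuming txtai 6.x, where Embeddings accepts keyword arguments and plain lists of strings):

      from txtai import Embeddings

      # Index a few text chunks and run a similarity query
      embeddings = Embeddings(path="sentence-transformers/all-MiniLM-L6-v2")
      embeddings.index(["a chunk about transformers", "a chunk about databases"])

      uid, score = embeddings.search("attention mechanisms", 1)[0]
      print(uid, score)  # index of the best-matching chunk and its similarity
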
  11. segmentation_models.pytorch

    Semantic segmentation models with 500+ pretrained convolutional and transformer-based backbones.
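
    A typical usage sketch (encoder name and weights follow the library's README; the task setup is illustrative):

      import torch
      import segmentation_models_pytorch as smp

      # U-Net with a pretrained ResNet-34 encoder and one output mask channel
      model = smp.Unet(
          encoder_name="resnet34",
          encoder_weights="imagenet",
          in_channels=3,
          classes=1,
      )

      mask = model(torch.randn(1, 3, 256, 256))  # -> (1, 1, 256, 256) logits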

  12. speechbrain

    A PyTorch-based Speech Toolkit

    Project mention: Speaker Diarization in Python | dev.to | 2024-08-22

    Simple Diarizer is a speaker diarization library that utilizes pretrained models from SpeechBrain. To get started with simple_diarizer, follow these steps: …
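
    The underlying SpeechBrain models can also be used directly; for example, extracting speaker embeddings, a sketch assuming SpeechBrain 1.0+ (older releases import from speechbrain.pretrained instead of speechbrain.inference):

      import torchaudio
      from speechbrain.inference import EncoderClassifier

      # ECAPA-TDNN speaker-embedding model from the SpeechBrain model hub
      classifier = EncoderClassifier.from_hparams(
          source="speechbrain/spkrec-ecapa-voxceleb"
      )

      signal, fs = torchaudio.load("utterance.wav")  # path is illustrative
      embedding = classifier.encode_batch(signal)    # one vector per utterance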

  13. BigDL

    Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, DeepSeek, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, vLLM, DeepSpeed, Axolotl, etc.

    Project mention: FlashMoE: DeepSeek-R1 671B and Qwen3MoE 235B with 1~2 Intel B580 GPU in IPEX-LLM | news.ycombinator.com | 2025-05-12
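
    A minimal sketch of ipex-llm's drop-in Transformers-style API (the model id is illustrative, and the exact package layout may vary by release):

      from ipex_llm.transformers import AutoModelForCausalLM
      from transformers import AutoTokenizer

      # Load with 4-bit weight-only quantization, then move to an Intel XPU
      model = AutoModelForCausalLM.from_pretrained(
          "Qwen/Qwen2-1.5B-Instruct", load_in_4bit=True
      ).to("xpu")
      tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-1.5B-Instruct")

      inputs = tokenizer("What is an NPU?", return_tensors="pt").to("xpu")
      print(tokenizer.decode(model.generate(**inputs, max_new_tokens=32)[0]))
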
  14. PaLM-rlhf-pytorch

    Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM
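
    A minimal forward pass through the base PaLM model, the first stage of the repo's RLHF pipeline, following its README (sizes are illustrative):

      import torch
      from palm_rlhf_pytorch import PaLM

      palm = PaLM(num_tokens=20000, dim=512, depth=12)

      seq = torch.randint(0, 20000, (1, 1024))  # dummy token ids
      logits = palm(seq)                        # next-token logits per position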

  15. gpt-neox

    An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries

    Project mention: Why YC Went to DC | news.ycombinator.com | 2024-06-03

    Closest to this would be https://www.eleuther.ai, whose training data is largely public and whose training processes are openly discussed, planned, and evaluated on their Discord server. Much of their training dataset is available at https://the-eye.eu (their onion link is considered "primary", however, due to copyright concerns).

  16. bertviz

    BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)
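
    Typical notebook usage, following the pattern in the BertViz README (the input sentence is made up):

      from transformers import AutoModel, AutoTokenizer
      from bertviz import head_view

      tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
      model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)

      inputs = tokenizer.encode("The cat sat on the mat", return_tensors="pt")
      attention = model(inputs).attentions               # per-layer attention maps
      tokens = tokenizer.convert_ids_to_tokens(inputs[0])

      head_view(attention, tokens)  # interactive attention visualization (Jupyter)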

  17. BERTopic

    Leveraging BERT and c-TF-IDF to create easily interpretable topics.
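
    The core workflow is only a few lines; this sketch uses the 20 Newsgroups corpus from the BERTopic quickstart (any list of strings works):

      from sklearn.datasets import fetch_20newsgroups
      from bertopic import BERTopic

      docs = fetch_20newsgroups(subset="all", remove=("headers", "footers", "quotes"))["data"]

      topic_model = BERTopic()
      topics, probs = topic_model.fit_transform(docs)  # embed -> cluster -> c-TF-IDF
      print(topic_model.get_topic_info())              # one row per discovered topic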

  18. OpenRLHF

    An easy-to-use, scalable, and high-performance RLHF framework based on Ray (PPO & GRPO & REINFORCE++ & LoRA & vLLM & RFT)

    Project mention: AIM Weekly 27 May 2024 | dev.to | 2024-05-28
  19. courses

    This repository is a curated collection of links to various courses and resources about Artificial Intelligence (AI) (by SkalskiP)

  20. DALLE-pytorch

    Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch

  21. openchat

    OpenChat: Advancing Open-source Language Models with Imperfect Data (by imoneoi)

  22. x-transformers

    A concise but complete full-attention transformer with a set of promising experimental features from various papers
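
    A minimal decoder-only language model, following the README (sizes are illustrative):

      import torch
      from x_transformers import TransformerWrapper, Decoder

      model = TransformerWrapper(
          num_tokens = 20000,
          max_seq_len = 1024,
          attn_layers = Decoder(dim = 512, depth = 6, heads = 8)
      )

      x = torch.randint(0, 20000, (1, 1024))  # dummy token ids
      logits = model(x)                       # (1, 1024, 20000)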

  23. Chinese-CLIP

    Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
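
    A retrieval sketch following the README's pattern (the model name matches its examples; the image path and labels are made up):

      import torch
      from PIL import Image
      import cn_clip.clip as clip
      from cn_clip.clip import load_from_name

      device = "cuda" if torch.cuda.is_available() else "cpu"
      model, preprocess = load_from_name("ViT-B-16", device=device)
      model.eval()

      image = preprocess(Image.open("photo.jpg")).unsqueeze(0).to(device)
      text = clip.tokenize(["一只猫", "一只狗"]).to(device)  # "a cat", "a dog"

      with torch.no_grad():
          logits_per_image, _ = model.get_similarity(image, text)
          probs = logits_per_image.softmax(dim=-1)  # image-text match probabilities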

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python Transformers related posts

  • FlashMoE: DeepSeek-R1 671B and Qwen3MoE 235B with 1~2 Intel B580 GPU in IPEX-LLM

    1 project | news.ycombinator.com | 12 May 2025
  • 💡 TinyLlama Meets LoRA: A Lightweight Approach to Emotion Classification

    4 projects | dev.to | 5 May 2025
  • DeepSeek R1 671B Q4_K_M with 1~2 Arc A770 on Xeon

    3 projects | news.ycombinator.com | 5 Mar 2025
  • Show HN: Open-Source Windows AI assistant that uses with Word, Excel through COM

    3 projects | news.ycombinator.com | 3 Mar 2025
  • IPEX-LLM Portable Zip for Ollama on Intel GPU

    1 project | news.ycombinator.com | 13 Feb 2025
  • AI founders will learn The Bitter Lesson

    1 project | news.ycombinator.com | 12 Jan 2025
  • Building an AI-Powered Background Remover with React and Transformers.js

    2 projects | dev.to | 11 Jan 2025

Index

What are some of the best open-source Transformer projects in Python? This list will help you:

# Project Stars
1 nn 60,495
2 LLaMA-Factory 48,648
3 vit-pytorch 22,795
4 haystack 20,614
5 peft 18,392
6 ml-engineering 13,653
7 RWKV-LM 13,575
8 PaddleNLP 12,569
9 Megatron-LM 12,297
10 txtai 10,893
11 segmentation_models.pytorch 10,440
12 speechbrain 9,808
13 BigDL 7,853
14 PaLM-rlhf-pytorch 7,805
15 gpt-neox 7,172
16 bertviz 7,147
17 BERTopic 6,750
18 OpenRLHF 6,666
19 courses 5,968
20 DALLE-pytorch 5,598
21 openchat 5,343
22 x-transformers 5,294
23 Chinese-CLIP 5,177
