Python Transformers

Open-source Python projects categorized as Transformers

Top 23 Python Transformer Projects

  • GitHub repo gpt-neo

    An implementation of model parallel GPT-2 and GPT-3-style models using the mesh-tensorflow library.

    Project mention: Error: forward() got an unexpected keyword argument 'rotary_emb' | reddit.com/r/NovelAi | 2021-06-17
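
    Note that the trained GPT-Neo checkpoints were also released on the Hugging Face Hub, so the easiest way to sample from them (without touching the mesh-tensorflow training code) is the transformers pipeline. A minimal sketch, assuming transformers and PyTorch are installed:

    ```python
    # Minimal sketch: sampling from a released GPT-Neo checkpoint through the
    # Hugging Face transformers pipeline (not the mesh-tensorflow training code).
    from transformers import pipeline

    generator = pipeline("text-generation", model="EleutherAI/gpt-neo-1.3B")
    out = generator("The meaning of life is", max_length=50, do_sample=True)
    print(out[0]["generated_text"])
    ```
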
  • GitHub repo vit-pytorch

    Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch

    Project mention: [R] Rotary Positional Embeddings - a new relative positional embedding for Transformers that significantly improves convergence (20-30%) and works for both regular and efficient attention | reddit.com/r/MachineLearning | 2021-04-21

    I've attempted it here https://github.com/lucidrains/vit-pytorch/blob/main/vit_pytorch/rvt.py but those who have tried it haven't seen knockout results as in the 1d case. Perhaps the axial lengths are too small to see a benefit.
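
    For reference, the core usage of vit-pytorch itself is only a few lines; a minimal sketch (hyperparameters below are illustrative, not tuned):

    ```python
    import torch
    from vit_pytorch import ViT

    # Minimal ViT sketch; all hyperparameters are illustrative.
    v = ViT(
        image_size=256,   # input resolution
        patch_size=32,    # 256/32 = 8, so 8x8 = 64 patches per image
        num_classes=1000,
        dim=1024,         # transformer embedding dimension
        depth=6,          # number of transformer blocks
        heads=16,
        mlp_dim=2048,
    )

    img = torch.randn(1, 3, 256, 256)  # dummy batch of one RGB image
    preds = v(img)                     # logits of shape (1, 1000)
    ```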

  • GitHub repo jina

    An easier way to build neural search on the cloud

    Project mention: Open-source AI-powered games search engine | reddit.com/r/opensourcegames | 2021-06-17

    What advantage does neural search give over "traditional" search methods? (answering my own question - https://github.com/jina-ai/jina/blob/master/.github/2.0/neural-search.md - seems like there are some upsides!)
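
    For context, Jina 2.0 composes search pipelines from Executors wired into a Flow. A minimal sketch, assuming the 2.x API:

    ```python
    # Minimal Jina 2.x sketch: a toy Executor wired into a Flow.
    from jina import Document, Executor, Flow, requests

    class Indexer(Executor):
        @requests(on="/index")
        def index(self, docs, **kwargs):
            # A real indexer would embed and store the documents here.
            for doc in docs:
                print(f"indexing: {doc.text}")

    f = Flow().add(uses=Indexer)
    with f:
        f.post("/index", inputs=Document(text="hello, neural search"))
    ```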

  • GitHub repo deep-daze

    Simple command-line tool for text-to-image generation using OpenAI's CLIP and Siren (implicit neural representation network). The technique was originally created by https://twitter.com/advadnoun

    Project mention: 9 Command-Line Tools to Go to Infinity & Beyond | dev.to | 2021-04-30

    2. Deep Daze
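
    Besides the imagine CLI entry point, deep-daze exposes a small Python API; a minimal sketch (num_layers is illustrative, and generation needs a CUDA GPU):

    ```python
    # Minimal deep-daze sketch via its Python API; the `imagine` CLI wraps this.
    # Requires a CUDA-capable GPU; num_layers is illustrative.
    from deep_daze import Imagine

    imagine = Imagine(
        text="a house in the forest",
        num_layers=24,
    )
    imagine()  # runs the CLIP-guided SIREN optimization, saving images as it goes
    ```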

  • GitHub repo DALLE-pytorch

    Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch

    Project mention: (from the discord stream) I'm so hyped for this game. This generation is really good. | reddit.com/r/NovelAi | 2021-05-22

    I am very excited. When AI Dungeon was released and I saw them filtering stuff, I thought that one day there would be an open-source version of this without filters; the same goes for any future open-sourced GPT-X. Now if we can get to train an open-source DALL-E too and integrate it into NovelAI, wouldn't that be even more awesome?
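
    For a feel of what training the replication involves, the project pairs a discrete VAE with the DALL-E transformer; a minimal sketch with illustrative sizes (real training would first pretrain the VAE):

    ```python
    import torch
    from dalle_pytorch import DiscreteVAE, DALLE

    # Minimal DALLE-pytorch training sketch; all sizes are illustrative.
    vae = DiscreteVAE(
        image_size=256,
        num_layers=3,
        num_tokens=8192,   # visual codebook size
        codebook_dim=512,
        hidden_dim=64,
    )

    dalle = DALLE(
        dim=1024,
        vae=vae,             # image tokens come from the (ideally pretrained) VAE
        num_text_tokens=10000,
        text_seq_len=256,
        depth=12,
        heads=16,
    )

    text = torch.randint(0, 10000, (2, 256))  # dummy text token ids
    images = torch.randn(2, 3, 256, 256)      # dummy images

    loss = dalle(text, images, return_loss=True)
    loss.backward()
    ```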

  • GitHub repo simpletransformers

    Transformers for Classification, NER, QA, Language Modelling, Language Generation, T5, Multi-Modal, and Conversational AI

    Project mention: Gpt 2 124m using transformers | reddit.com/r/LanguageTechnology | 2021-06-14

    https://github.com/ThilinaRajapakse/simpletransformers/blob/master/simpletransformers/language_generation/language_generation_model.py#L146
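
    The linked module backs the library's LanguageGenerationModel; a hedged sketch of generating with the 124M "gpt2" checkpoint (the args dict is illustrative):

    ```python
    # Minimal sketch: GPT-2 text generation with simpletransformers.
    from simpletransformers.language_generation import LanguageGenerationModel

    # "gpt2" with no size suffix is the 124M-parameter checkpoint.
    model = LanguageGenerationModel("gpt2", "gpt2", use_cuda=False)
    outputs = model.generate("Once upon a time", args={"max_length": 50})
    print(outputs[0])
    ```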

  • GitHub repo reformer-pytorch

    Reformer, the efficient Transformer, in Pytorch

    Project mention: [R]How to go about non-reproducible research? | reddit.com/r/MachineLearning | 2021-03-06

    This is what I call great code: https://github.com/lucidrains/reformer-pytorch
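
    For a sense of that API, ReformerLM is a drop-in LSH-attention language model; a minimal sketch with illustrative hyperparameters:

    ```python
    import torch
    from reformer_pytorch import ReformerLM

    # Minimal ReformerLM sketch; hyperparameters are illustrative.
    model = ReformerLM(
        num_tokens=20000,
        dim=512,
        depth=6,
        max_seq_len=8192,   # LSH attention makes long contexts feasible
        heads=8,
        causal=True,        # autoregressive language modeling
    )

    x = torch.randint(0, 20000, (1, 8192))  # dummy token ids
    logits = model(x)                       # (1, 8192, 20000)
    ```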

  • GitHub repo jiant

    jiant is an NLP toolkit

    Project mention: Looking for a code base to implement multi-task learning in NLP | reddit.com/r/LanguageTechnology | 2021-02-22

    Jiant should fulfill 1, 2, 4 and 5.

  • GitHub repo BERTopic

    Leveraging BERT and c-TF-IDF to create easily interpretable topics.

    Project mention: Idea validation: users clustering based on transformer output | reddit.com/r/datascience | 2021-05-17

    not mine, but this was posted here a while back: https://github.com/MaartenGr/BERTopic
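
    The whole fit-and-inspect loop is a few lines; a minimal sketch using scikit-learn's 20 Newsgroups data:

    ```python
    # Minimal BERTopic sketch: fit topics on 20 Newsgroups and inspect one.
    from sklearn.datasets import fetch_20newsgroups
    from bertopic import BERTopic

    docs = fetch_20newsgroups(subset="all",
                              remove=("headers", "footers", "quotes"))["data"]

    topic_model = BERTopic()
    topics, probs = topic_model.fit_transform(docs)  # one topic id per document

    print(topic_model.get_topic(0))  # top words of a topic, with c-TF-IDF scores
    ```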

  • GitHub repo gpt-neox

    An implementation of model parallel GPT-3-like models on GPUs, based on the DeepSpeed library. Designed to be able to train models in the hundreds of billions of parameters or larger.

    Project mention: With the release of Eleuther's new 6B model, I decided to rerun a comparison I did a couple weeks ago. | reddit.com/r/AIDungeon | 2021-06-09

  • GitHub repo x-transformers

    A simple but complete full-attention transformer with a set of promising experimental features from various papers

    Project mention: Hacker News top posts: May 9, 2021 | reddit.com/r/hackerdigest | 2021-05-09

    X-Transformers: A fully-featured transformer with experimental features (25 comments)
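
    A minimal sketch of the API, with illustrative sizes: a GPT-like decoder-only model is built by handing an attention stack to TransformerWrapper, and the experimental features are toggled with keyword flags:

    ```python
    import torch
    from x_transformers import TransformerWrapper, Decoder

    # Minimal x-transformers sketch: a GPT-like decoder-only model.
    model = TransformerWrapper(
        num_tokens=20000,
        max_seq_len=1024,
        attn_layers=Decoder(
            dim=512,
            depth=12,
            heads=8,
            rotary_pos_emb=True,  # experimental features are flags like this one
        ),
    )

    x = torch.randint(0, 20000, (1, 1024))
    logits = model(x)  # (1, 1024, 20000)
    ```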

  • GitHub repo gansformer

    Generative Adversarial Transformers

    Project mention: Generative Adversarial Transformers [R] | reddit.com/r/MachineLearning | 2021-04-20

    As for whether the Ys are shared across layers, check the code.

  • GitHub repo performer-pytorch

    An implementation of Performer, a linear attention-based transformer, in Pytorch

    Project mention: [R] Rotary Positional Embeddings - a new relative positional embedding for Transformers that significantly improves convergence (20-30%) and works for both regular and efficient attention | reddit.com/r/MachineLearning | 2021-04-21

    Performer is the best linear attention variant, but linear attention is just one type of efficient attention solution. I have rotary embeddings already in the repo https://github.com/lucidrains/performer-pytorch and you can witness this phenomenon yourself by toggling it on / off
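
    PerformerLM is the FAVOR+ linear-attention language model behind that comparison; a minimal sketch with illustrative hyperparameters:

    ```python
    import torch
    from performer_pytorch import PerformerLM

    # Minimal PerformerLM sketch (FAVOR+ linear attention); sizes are illustrative.
    model = PerformerLM(
        num_tokens=20000,
        max_seq_len=2048,
        dim=512,
        depth=6,
        heads=8,
        causal=True,
    )

    x = torch.randint(0, 20000, (1, 2048))
    logits = model(x)  # (1, 2048, 20000)
    ```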

  • GitHub repo TimeSformer-pytorch

    Implementation of TimeSformer from Facebook AI, a pure attention-based solution for video classification

    Project mention: [N] Facebook AI Introduces TimeSformer: A New Video Architecture Based Purely On Transformers | reddit.com/r/MachineLearning | 2021-03-16
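
    A minimal sketch of the interface, with illustrative hyperparameters; the model consumes a batch of frame sequences directly:

    ```python
    import torch
    from timesformer_pytorch import TimeSformer

    # Minimal TimeSformer sketch; hyperparameters are illustrative.
    model = TimeSformer(
        dim=512,
        image_size=224,
        patch_size=16,
        num_frames=8,
        num_classes=10,
        depth=12,
        heads=8,
    )

    video = torch.randn(2, 8, 3, 224, 224)  # (batch, frames, channels, height, width)
    preds = model(video)                    # (2, 10)
    ```
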
  • GitHub repo adapter-transformers

    Huggingface Transformers + Adapters = ❤️

    Project mention: [P] AdapterHub v2: Lightweight Transfer Learning with Transformers and Adapters | reddit.com/r/MachineLearning | 2021-04-30

    GitHub: https://github.com/Adapter-Hub/adapter-transformers
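
    A hedged sketch of the AdapterHub workflow (adapter-transformers installs as a drop-in replacement for transformers; the sentiment/sst-2@ukp adapter is assumed to be published on the Hub):

    ```python
    # Minimal adapter-transformers sketch: load a pretrained adapter from
    # AdapterHub into BERT. "sentiment/sst-2@ukp" is assumed to exist on the Hub.
    from transformers import AutoModelWithHeads, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModelWithHeads.from_pretrained("bert-base-uncased")

    adapter_name = model.load_adapter("sentiment/sst-2@ukp")  # adapter + head
    model.set_active_adapters(adapter_name)

    inputs = tokenizer("Adapters make transfer learning cheap.", return_tensors="pt")
    outputs = model(**inputs)
    ```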

  • GitHub repo DALLE-mtf

    OpenAI's DALL-E for large-scale training in mesh-tensorflow.

    Project mention: Are we ever going to get access to DALL-E? | reddit.com/r/GPT3 | 2021-02-28

    There's this

  • GitHub repo long-range-arena

    Long Range Arena for Benchmarking Efficient Transformers

    Project mention: [R][D] Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting. Zhou et al. AAAI21 Best Paper. ProbSparse self-attention reduces complexity to O(n log n), generative style decoder to obtain sequence output in one step, and self-attention distilling for further reducing memory | reddit.com/r/MachineLearning | 2021-02-24

    I think the paper is written in a clear style and I like that the authors included many experiments, including hyperparameter effects, ablations and extensive baseline comparisons. One thing I would have liked is them comparing their Informer to more efficient transformers (they compared only against logtrans and reformer) using the LRA (https://github.com/google-research/long-range-arena) benchmark.

  • GitHub repo lightning-transformers

    Flexible interface for high-performance research using SOTA Transformers leveraging Pytorch Lightning, Transformers, and Hydra.

    Project mention: Lightning Transformers - Train HuggingFace Transformers with PyTorch | reddit.com/r/u_Grid_AI | 2021-04-23

    Lightning Transformers is for users who want to train, evaluate and predict using HuggingFace models and datasets with PyTorch Lightning. You get full customizability of the code through the LightningModule and Trainer, with Hydra config composition for quick and easy experimentation. No boilerplate code is required; you can swap out models, optimizers, schedulers, and more without touching the code. Check out the blog post Training Transformers at Scale with PyTorch Lightning or the documentation for more information.

  • GitHub repo transformer-in-transformer

    Implementation of Transformer in Transformer, pixel level attention paired with patch level attention for image classification, in Pytorch

    Project mention: “Transformer in Transformer” paper explained! | reddit.com/r/computervision | 2021-03-04

    A third-party implementation of "Transformer in Transformer": https://github.com/lucidrains/transformer-in-transformer
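
    The third-party API mirrors other lucidrains repos; a minimal sketch with illustrative sizes:

    ```python
    import torch
    from transformer_in_transformer import TNT

    # Minimal TNT sketch: outer (patch-level) plus inner (pixel-level) attention.
    # All sizes are illustrative.
    tnt = TNT(
        image_size=256,
        patch_dim=512,    # outer, patch-level embedding dimension
        pixel_dim=24,     # inner, pixel-level embedding dimension
        patch_size=16,
        pixel_size=4,
        depth=6,
        num_classes=1000,
    )

    img = torch.randn(2, 3, 256, 256)
    logits = tnt(img)  # (2, 1000)
    ```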

  • GitHub repo convolution-vision-transformers

    PyTorch Implementation of CvT: Introducing Convolutions to Vision Transformers

    Project mention: CvT: Introducing Convolutions to Vision Transformers | reddit.com/r/computervision | 2021-03-30

    code: https://github.com/rishikksh20/convolution-vision-transformers

  • GitHub repo Perceiver

    Implementation of Perceiver, General Perception with Iterative Attention in TensorFlow

    Project mention: I implemented Deepmind's new Perceiver Model | news.ycombinator.com | 2021-04-19

  • GitHub repo simpleT5

    simpleT5 is built on top of PyTorch Lightning⚡️ and Transformers🤗 and lets you quickly train your T5 models.

    Project mention: [P] SimpleT5 : Train T5 models in just 3 lines of code | reddit.com/r/MachineLearning | 2021-06-02

    🌟 GitHub: https://github.com/Shivanandroy/simpleT5
    🌟 Medium: https://snrspeaks.medium.com/simplet5-train-t5-models-in-just-3-lines-of-code-by-shivanand-roy-2021-354df5ae46ba
    🌟 Colab Notebook: https://colab.research.google.com/drive/1JZ8v9L0w0Ai3WbibTeuvYlytn0uHMP6O?usp=sharing
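
    The advertised three lines map onto load, train, predict; a minimal sketch, with a toy DataFrame standing in for real data (the library expects source_text and target_text columns):

    ```python
    import pandas as pd
    from simplet5 import SimpleT5

    # Minimal simpleT5 sketch; the one-row DataFrame is a stand-in for real data.
    df = pd.DataFrame({
        "source_text": ["summarize: simpleT5 wraps PyTorch Lightning and Transformers."],
        "target_text": ["simpleT5 makes T5 training easy."],
    })

    model = SimpleT5()
    model.from_pretrained(model_type="t5", model_name="t5-base")
    model.train(train_df=df, eval_df=df, max_epochs=1, use_gpu=False)
    print(model.predict("summarize: simpleT5 wraps PyTorch Lightning and Transformers."))
    ```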

  • GitHub repo deep-implicit-attention

    Experimental implementation of deep implicit attention in PyTorch

    Project mention: [P] Deep Implicit Attention: A Mean-Field Theory Perspective on Attention Mechanisms | reddit.com/r/MachineLearning | 2021-05-04

    Code: https://github.com/mcbal/deep-implicit-attention

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2021-06-17.

Index

What are some of the best open-source Transformer projects in Python? This list will help you:

Rank  Project                          Stars
   1  gpt-neo                          5,053
   2  vit-pytorch                      4,611
   3  jina                             3,919
   4  deep-daze                        3,435
   5  DALLE-pytorch                    3,039
   6  simpletransformers               2,433
   7  reformer-pytorch                 1,515
   8  jiant                            1,268
   9  BERTopic                         1,167
  10  gpt-neox                           888
  11  x-transformers                     845
  12  gansformer                         632
  13  performer-pytorch                  628
  14  TimeSformer-pytorch                440
  15  adapter-transformers               436
  16  DALLE-mtf                          333
  17  long-range-arena                   279
  18  lightning-transformers             263
  19  transformer-in-transformer         230
  20  convolution-vision-transformers    116
  21  Perceiver                           55
  22  simpleT5                            32
  23  deep-implicit-attention             29