Top 23 Python Transformer Projects
What are some of the best open-source Transformer projects in Python? This list will help you:
An implementation of model parallel GPT-2 and GPT-3-style models using the mesh-tensorflow library.
Project mention: Error: forward() got an unexpected keyword argument 'rotary_emb' | reddit.com/r/NovelAi | 2021-06-17
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch
Project mention: [R] Rotary Positional Embeddings - a new relative positional embedding for Transformers that significantly improves convergence (20-30%) and works for both regular and efficient attention | reddit.com/r/MachineLearning | 2021-04-21
I've attempted it here: https://github.com/lucidrains/vit-pytorch/blob/main/vit_pytorch/rvt.py, but those who have tried it haven't seen the knockout results that the 1d case shows. Perhaps the axial lengths are too small to see a benefit.
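For reference, the basic ViT interface follows the repo's README; a minimal sketch with illustrative hyperparameters:

```python
import torch
from vit_pytorch import ViT

# Vision Transformer over 32x32 patches of a 256x256 image
v = ViT(
    image_size = 256,
    patch_size = 32,
    num_classes = 1000,
    dim = 1024,
    depth = 6,
    heads = 16,
    mlp_dim = 2048,
    dropout = 0.1,
    emb_dropout = 0.1
)

img = torch.randn(1, 3, 256, 256)   # dummy image batch
preds = v(img)                       # (1, 1000) class logits
```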
An easier way to build neural search on the cloud
Project mention: Open-source AI-powered games search engine | reddit.com/r/opensourcegames | 2021-06-17
What advantage does neural search give over "traditional" search methods? (Answering my own question: https://github.com/jina-ai/jina/blob/master/.github/2.0/neural-search.md - it seems there are some upsides!)
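To give a flavor of the programming model, here is a minimal sketch built on Jina's Flow and Document primitives. Treat it as illustrative only: a real pipeline would plug encoder/indexer Executors into the Flow, and exact method names differ between Jina 1.x and 2.0.

```python
from jina import Document, Flow

# an (empty) pipeline; real use would be .add(uses=SomeEncoder) etc.
f = Flow().add()

with f:
    # index a couple of documents, then run a query against them
    f.index(inputs=[Document(text='hello world'), Document(text='goodbye world')])
    f.search(inputs=Document(text='hello'), on_done=print)
```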
Deep Daze
Simple command-line tool for text-to-image generation using OpenAI's CLIP and Siren (implicit neural representation network). The technique was originally created by https://twitter.com/advadnoun
Project mention: 9 Command-Line Tools to Go to Infinity & Beyond | dev.to | 2021-04-30
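Besides the CLI, Deep Daze can be driven from Python; a minimal sketch following the README (expects a CUDA-capable GPU):

```python
from deep_daze import Imagine

# optimizes a Siren network against CLIP so the rendered
# image matches the text prompt
imagine = Imagine(
    text = 'cosmic love and attention',
    num_layers = 24,
)
imagine()
```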
Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch
Project mention: (from the discord stream) I'm so hyped for this game. This generation is really good. | reddit.com/r/NovelAi | 2021-05-22
I am very excited. When AI Dungeon was released and I saw them filtering stuff, I thought that one day there would be an open-source version of this without filters, and the same goes for any future open-sourced GPT-X. Now if we can also train an open-source DALL-E and integrate it into NovelAI, wouldn't that be even more awesome?
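For context, DALLE-pytorch follows a two-stage recipe: first train a discrete VAE to tokenize images, then train the text-to-image transformer on top of it. A condensed sketch of the README (hyperparameters illustrative; exact arguments vary across versions):

```python
import torch
from dalle_pytorch import DiscreteVAE, DALLE

# stage 1: a discrete VAE that tokenizes 256x256 images (train this first)
vae = DiscreteVAE(
    image_size = 256,
    num_layers = 3,
    num_tokens = 8192,
    codebook_dim = 512,
    hidden_dim = 64,
)

# stage 2: the text-to-image transformer over the VAE's image tokens
dalle = DALLE(
    dim = 1024,
    vae = vae,
    num_text_tokens = 10000,
    text_seq_len = 256,
    depth = 12,
    heads = 16,
)

text = torch.randint(0, 10000, (1, 256))   # dummy text token ids
images = torch.randn(1, 3, 256, 256)        # dummy image batch

loss = dalle(text, images, return_loss = True)
loss.backward()

# after (a lot of) training, sample images conditioned on text
generated = dalle.generate_images(text)
```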
Transformers for Classification, NER, QA, Language Modelling, Language Generation, T5, Multi-Modal, and Conversational AI
Project mention: Gpt 2 124m using transformers | reddit.com/r/LanguageTechnology | 2021-06-14
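Simple Transformers wraps each task in a dedicated model class; a minimal classification sketch with toy data:

```python
import pandas as pd
from simpletransformers.classification import ClassificationModel

# toy training data: (text, label) pairs
train_df = pd.DataFrame(
    [["this movie was great", 1], ["this movie was terrible", 0]],
    columns=["text", "labels"],
)

# use_cuda=False lets the sketch run on CPU
model = ClassificationModel("bert", "bert-base-uncased", use_cuda=False)
model.train_model(train_df)

predictions, raw_outputs = model.predict(["an unseen sentence"])
```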
Reformer, the efficient Transformer, in Pytorch
Project mention: [R] How to go about non-reproducible research? | reddit.com/r/MachineLearning | 2021-03-06
This is what I call great code: https://github.com/lucidrains/reformer-pytorch
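A minimal causal language-model sketch following the repo's README:

```python
import torch
from reformer_pytorch import ReformerLM

model = ReformerLM(
    num_tokens = 20000,
    dim = 512,
    depth = 6,
    max_seq_len = 8192,
    heads = 8,
    lsh_dropout = 0.1,   # dropout inside the LSH attention
    causal = True        # autoregressive language modeling
)

x = torch.randint(0, 20000, (1, 8192))
logits = model(x)   # (1, 8192, 20000)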
jiant is an NLP toolkit
Project mention: Looking for a code base to implement multi-task learning in NLP | reddit.com/r/LanguageTechnology | 2021-02-22
Jiant should fulfill 1, 2, 4 and 5.
Leveraging BERT and c-TF-IDF to create easily interpretable topics.
Project mention: Idea validation: users clustering based on transformer output | reddit.com/r/datascience | 2021-05-17
not mine, but this was posted here a while back: https://github.com/MaartenGr/BERTopic
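BERTopic's core workflow is essentially two calls; a minimal sketch using a scikit-learn dataset:

```python
from bertopic import BERTopic
from sklearn.datasets import fetch_20newsgroups

docs = fetch_20newsgroups(subset='all',
                          remove=('headers', 'footers', 'quotes'))['data']

topic_model = BERTopic()
topics, probs = topic_model.fit_transform(docs)

# top c-TF-IDF terms for one of the discovered topics
# (topic -1 is the outlier bucket)
print(topic_model.get_topic(0))
```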
An implementation of model parallel GPT-3-like models on GPUs, based on the DeepSpeed library. Designed to be able to train models in the hundreds of billions of parameters or larger.
Project mention: With the release of Eleuther's new 6B model, I decided to rerun a comparison I did a couple weeks ago. | reddit.com/r/AIDungeon | 2021-06-09
A simple but complete full-attention transformer with a set of promising experimental features from various papers
Project mention: Hacker News top posts: May 9, 2021 | reddit.com/r/hackerdigest | 2021-05-09
X-Transformers: A fully-featured transformer with experimental features (25 comments)
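A minimal decoder-only sketch of the x-transformers interface, with one of the experimental toggles (rotary embeddings, as of mid-2021) switched on:

```python
import torch
from x_transformers import TransformerWrapper, Decoder

model = TransformerWrapper(
    num_tokens = 20000,
    max_seq_len = 1024,
    attn_layers = Decoder(
        dim = 512,
        depth = 6,
        heads = 8,
        rotary_pos_emb = True   # one of the experimental features from the papers
    )
)

x = torch.randint(0, 20000, (1, 1024))
logits = model(x)   # (1, 1024, 20000)
```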
Generative Adversarial Transformers
Project mention: Generative Adversarial Transformers [R] | reddit.com/r/MachineLearning | 2021-04-20
As for whether the Ys are shared across layers, check the code.
An implementation of Performer, a linear attention-based transformer, in Pytorch
Project mention: [R] Rotary Positional Embeddings - a new relative positional embedding for Transformers that significantly improves convergence (20-30%) and works for both regular and efficient attention | reddit.com/r/MachineLearning | 2021-04-21
Performer is the best linear attention variant, but linear attention is just one type of efficient attention solution. I have rotary embeddings already in the repo https://github.com/lucidrains/performer-pytorch and you can witness this phenomenon yourself by toggling it on / off
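A minimal sketch of the PerformerLM interface from the repo (hyperparameters illustrative; the rotary-embedding toggle mentioned above is a constructor flag whose exact name may vary by version, so it is omitted here):

```python
import torch
from performer_pytorch import PerformerLM

model = PerformerLM(
    num_tokens = 20000,
    max_seq_len = 2048,
    dim = 512,
    depth = 6,
    heads = 8,
    causal = True,   # autoregressive language modeling with linear attention
)

x = torch.randint(0, 20000, (1, 2048))
logits = model(x)   # (1, 2048, 20000)
```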
Implementation of TimeSformer from Facebook AI, a pure attention-based solution for video classification
Project mention: [N] Facebook AI Introduces TimeSformer: A New Video Architecture Based Purely On Transformers | reddit.com/r/MachineLearning | 2021-03-16
Huggingface Transformers + Adapters = ❤️
Project mention: [P] AdapterHub v2: Lightweight Transfer Learning with Transformers and Adapters | reddit.com/r/MachineLearning | 2021-04-30
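In broad strokes, AdapterHub layers adapter loading onto the familiar transformers interface. A sketch of the quickstart pattern, assuming the adapter-transformers package (which replaces the stock transformers install) and an adapter identifier taken from the AdapterHub docs:

```python
from transformers import AutoModelWithHeads, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
# AutoModelWithHeads is provided by the adapter-transformers fork
model = AutoModelWithHeads.from_pretrained("bert-base-uncased")

# download a pre-trained task adapter from the Hub and activate it
adapter_name = model.load_adapter("sentiment/sst-2@ukp")
model.set_active_adapters(adapter_name)

inputs = tokenizer("AdapterHub is great!", return_tensors="pt")
outputs = model(**inputs)
```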
OpenAI's DALL-E for large-scale training in mesh-tensorflow.
Project mention: Are we ever going to get access to DALL-E? | reddit.com/r/GPT3 | 2021-02-28
Long Range Arena for Benchmarking Efficient Transformers
Project mention: [R][D] Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting. Zhou et al. AAAI21 Best Paper. ProbSparse self-attention reduces complexity to O(n log n), a generative-style decoder obtains the sequence output in one step, and self-attention distilling further reduces memory | reddit.com/r/MachineLearning | 2021-02-24
I think the paper is written in a clear style, and I like that the authors included many experiments, including hyperparameter effects, ablations, and extensive baseline comparisons. One thing I would have liked is a comparison of their Informer against more efficient transformers (they compared only against LogTrans and Reformer) on the LRA (https://github.com/google-research/long-range-arena) benchmark.
Flexible interface for high-performance research using SOTA Transformers leveraging Pytorch Lightning, Transformers, and Hydra.
Project mention: Lightning Transformers - Train HuggingFace Transformers with PyTorch | reddit.com/r/u_Grid_AI | 2021-04-23
Lightning Transformers is for users who want to train, evaluate, and predict using HuggingFace models and datasets with PyTorch Lightning. It offers full customizability through the LightningModule and Trainer, with Hydra config composition for quick and easy experimentation. No boilerplate code is required; you can easily swap out models, optimizers, schedulers, and more without touching the code. For more information, check out the blog post "Training Transformers at Scale with PyTorch Lightning" or the documentation.
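Lightning Transformers ships its own task classes and Hydra configs; the underlying pattern it automates is wrapping a HuggingFace model in a LightningModule. A generic sketch of that pattern (not the library's own API):

```python
import pytorch_lightning as pl
import torch
from transformers import AutoModelForSequenceClassification

class HFClassifier(pl.LightningModule):
    """Generic sketch: a HuggingFace model driven by the Lightning Trainer."""

    def __init__(self, model_name: str = "bert-base-uncased", lr: float = 2e-5):
        super().__init__()
        self.model = AutoModelForSequenceClassification.from_pretrained(model_name)
        self.lr = lr

    def training_step(self, batch, batch_idx):
        # batch is a dict with input_ids, attention_mask, labels
        outputs = self.model(**batch)
        self.log("train_loss", outputs.loss)
        return outputs.loss

    def configure_optimizers(self):
        return torch.optim.AdamW(self.parameters(), lr=self.lr)

# trainer = pl.Trainer(max_epochs=3)
# trainer.fit(HFClassifier(), train_dataloaders=train_loader)
```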
Implementation of Transformer in Transformer, pixel level attention paired with patch level attention for image classification, in Pytorch
Project mention: "Transformer in Transformer" paper explained! | reddit.com/r/computervision | 2021-03-04
A third-party implementation of "Transformer in Transformer": https://github.com/lucidrains/transformer-in-transformer
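That repo exposes a single TNT module; a minimal sketch with illustrative hyperparameters (parameter names follow my reading of the repo's README and may differ by version):

```python
import torch
from transformer_in_transformer import TNT

tnt = TNT(
    image_size = 256,    # size of the input image
    patch_dim = 512,     # dimension of patch-level tokens
    pixel_dim = 24,      # dimension of pixel-level tokens
    patch_size = 16,     # patch size
    pixel_size = 4,      # pixel size within each patch
    depth = 6,
    num_classes = 1000,
)

img = torch.randn(1, 3, 256, 256)
logits = tnt(img)   # (1, 1000)
```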
PyTorch Implementation of CvT: Introducing Convolutions to Vision Transformers
Project mention: CvT: Introducing Convolutions to Vision Transformers | reddit.com/r/computervision | 2021-03-30
Implementation of Perceiver, General Perception with Iterative Attention, in TensorFlow
Project mention: I implemented Deepmind's new Perceiver Model | news.ycombinator.com | 2021-04-19
simpleT5 is built on top of PyTorch Lightning ⚡️ and Transformers 🤗 and lets you quickly train your T5 models.
Project mention: [P] SimpleT5: Train T5 models in just 3 lines of code | reddit.com/r/MachineLearning | 2021-06-02
🌟 GitHub: https://github.com/Shivanandroy/simpleT5
🌟 Medium: https://snrspeaks.medium.com/simplet5-train-t5-models-in-just-3-lines-of-code-by-shivanand-roy-2021-354df5ae46ba
🌟 Colab Notebook: https://colab.research.google.com/drive/1JZ8v9L0w0Ai3WbibTeuvYlytn0uHMP6O?usp=sharing
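The advertised three-line workflow looks roughly like this; a minimal sketch with toy DataFrames (simpleT5 expects "source_text" and "target_text" columns):

```python
import pandas as pd
from simplet5 import SimpleT5

# toy data; real use would load a proper dataset
train_df = pd.DataFrame({
    "source_text": ["summarize: simpleT5 makes it easy to fine-tune T5 models."],
    "target_text": ["simpleT5 simplifies T5 fine-tuning."],
})
eval_df = train_df.copy()

model = SimpleT5()
model.from_pretrained(model_type="t5", model_name="t5-base")
model.train(train_df=train_df, eval_df=eval_df,
            source_max_token_len=128, target_max_token_len=50,
            max_epochs=1, use_gpu=False)

print(model.predict("summarize: simpleT5 makes it easy to fine-tune T5 models."))
```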
Experimental implementation of deep implicit attention in PyTorch
Project mention: [P] Deep Implicit Attention: A Mean-Field Theory Perspective on Attention Mechanisms | reddit.com/r/MachineLearning | 2021-05-04