tldr-transformers (DISCONTINUED) vs long-range-arena
| | tldr-transformers | long-range-arena |
|---|---|---|
| Mentions | 4 | 6 |
| Stars | 167 | 677 |
| Growth | - | 3.6% |
| Activity | 0.0 | 0.0 |
| Last commit | over 1 year ago | 3 months ago |
| Language | Python | - |
| License | MIT License | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
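To make the "recent commits weigh more" idea concrete, here is a minimal, purely illustrative sketch that assumes an exponential-decay weighting; the tracker's actual formula is not published here, so the half-life and scaling are assumptions.

```python
from datetime import datetime, timezone

def activity_score(commit_dates, half_life_days=30.0):
    """Toy recency-weighted activity score: each commit contributes a
    weight that halves every `half_life_days` days, so recent commits
    count far more than old ones. Illustrative only, not the site's formula."""
    now = datetime.now(timezone.utc)
    score = 0.0
    for d in commit_dates:
        age_days = (now - d).total_seconds() / 86400.0
        score += 0.5 ** (age_days / half_life_days)
    return score

# A project whose last commit is more than a year old scores close to 0.0.
old_commits = [datetime(2022, 6, 1, tzinfo=timezone.utc)]
print(round(activity_score(old_commits), 2))
```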
tldr-transformers
- [P] NLP "tl;dr" Notes on Transformers
In any case, I'm liking the first glance so far. I'd just transpose the summary tables so they wouldn't get so tightly squeezed: https://github.com/will-thompson-k/tldr-transformers/blob/main/notes/bart.md
long-range-arena
- The Secret Sauce behind 100K context window in LLMs: all tricks in one place
- [D] Is there a repo on which many light-weight self-attention mechanisms are introduced?
1.1 Long Range Arena: A Benchmark for Efficient Transformers. From the authors of the above, who propose a benchmark for modeling long-range interactions. It also includes a repository.
- [R][D] Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting. Zhou et al., AAAI21 Best Paper. ProbSparse self-attention reduces complexity to O(n log n), a generative-style decoder obtains the sequence output in one step, and self-attention distilling further reduces memory.
I think the paper is written in a clear style, and I like that the authors included many experiments, including hyperparameter effects, ablations, and extensive baseline comparisons. One thing I would have liked is a comparison of their Informer against more efficient transformers (they compared only against LogTrans and Reformer) using the LRA (https://github.com/google-research/long-range-arena) benchmark.
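For intuition about the ProbSparse idea described in the post above, here is a minimal sketch in PyTorch: only the most "active" queries, ranked by a max-minus-mean score measure, receive full attention, while the remaining queries fall back to the mean of the values. This is an assumption-level illustration rather than the Informer implementation; in particular it omits the key-sampling step that is what actually yields the O(n log n) complexity.

```python
import math
import torch

def probsparse_attention_sketch(q, k, v, top_u):
    """Simplified ProbSparse-style attention (illustrative only).
    q, k, v: (batch, seq_len, d). Only `top_u` queries attend fully;
    "lazy" queries output the mean of the values."""
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d)            # (B, L, L)

    # Sparsity measure per query: max score minus mean score.
    measure = scores.max(dim=-1).values - scores.mean(dim=-1)  # (B, L)
    top_idx = measure.topk(top_u, dim=-1).indices              # (B, u)

    # Default output for lazy queries: mean of the values.
    out = v.mean(dim=1, keepdim=True).expand_as(v).clone()     # (B, L, d)

    # Full softmax attention only for the selected active queries.
    active_scores = torch.gather(
        scores, 1, top_idx.unsqueeze(-1).expand(-1, -1, scores.size(-1)))
    attn = torch.softmax(active_scores, dim=-1)                # (B, u, L)
    out.scatter_(1, top_idx.unsqueeze(-1).expand(-1, -1, d), attn @ v)
    return out

# Toy usage
B, L, D = 2, 16, 8
q, k, v = (torch.randn(B, L, D) for _ in range(3))
y = probsparse_attention_sketch(q, k, v, top_u=4)
print(y.shape)  # torch.Size([2, 16, 8])
```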
What are some alternatives?
performer-pytorch - An implementation of Performer, a linear attention-based transformer, in Pytorch
NLP-progress - Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.
FARM - :house_with_garden: Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.
jax-resnet - Implementations and checkpoints for ResNet, Wide ResNet, ResNeXt, ResNet-D, and ResNeSt in JAX (Flax).
lemmatization-lists - Machine-readable lists of lemma-token pairs in 23 languages.
attention-is-all-you-need-pytorch - A PyTorch implementation of the Transformer model in "Attention is All You Need".
HJxB - Continuous-Time/State/Action Fitted Value Iteration via Hamilton-Jacobi-Bellman (HJB)
scenic - Scenic: A Jax Library for Computer Vision Research and Beyond
azure-sql-db-openai - Samples on how to use Azure SQL database with Azure OpenAI
elegy - A High Level API for Deep Learning in JAX
LFattNet - Attention-based View Selection Networks for Light-field Disparity Estimation
flaxmodels - Pretrained deep learning models for Jax/Flax: StyleGAN2, GPT2, VGG, ResNet, etc.