Long-range-arena Alternatives
Similar projects and alternatives to long-range-arena
- RWKV-LM: RWKV is an RNN with transformer-level LLM performance that can be trained directly like a GPT (parallelizable). It combines the best of RNNs and transformers: great performance, fast inference, low VRAM use, fast training, "infinite" ctx_len, and free sentence embeddings. (A toy sketch of the recurrence idea follows this list.)
- attention-is-all-you-need-pytorch: A PyTorch implementation of the Transformer model from "Attention Is All You Need".
- jax-resnet: Implementations and checkpoints for ResNet, Wide ResNet, ResNeXt, ResNet-D, and ResNeSt in JAX (Flax).
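To make the RWKV claim above concrete, here is a toy single-channel sketch of a WKV-style time-mixing recurrence, the mechanism that lets RWKV run as an RNN at inference time. This is an illustrative reconstruction, not code from the RWKV-LM repository; it omits the per-channel parameters and the numerical-stability tricks the real implementation uses.

```python
import numpy as np

def wkv(k, v, w, u):
    """Toy single-channel WKV recurrence (RWKV-style time mixing).

    k, v : (T,) key and value sequences for one channel
    w    : decay rate (> 0); u : bonus weight for the current token
    The entire state is two scalars, so inference is O(1) memory per step.
    """
    T = len(k)
    out = np.empty(T)
    a = b = 0.0  # running weighted sums of values / weights (the RNN state)
    for t in range(T):
        # the current token contributes with an extra bonus u
        out[t] = (a + np.exp(u + k[t]) * v[t]) / (b + np.exp(u + k[t]))
        # decay the past, then absorb the current token into the state
        a = np.exp(-w) * a + np.exp(k[t]) * v[t]
        b = np.exp(-w) * b + np.exp(k[t])
    return out

# Example: attention-like weighted averages of v, computed recurrently.
rng = np.random.default_rng(0)
print(wkv(rng.normal(size=8), rng.normal(size=8), w=0.5, u=0.1))
```

Because a and b follow a linear recurrence, the same outputs can also be computed for all timesteps at once during training, which is the sense in which RWKV trains like a (parallelizable) GPT.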
long-range-arena discussion
long-range-arena reviews and mentions
- The Secret Sauce behind 100K context window in LLMs: all tricks in one place
  https://github.com/google-research/long-range-arena
- [R] The Annotated S4: Efficiently Modeling Long Sequences with Structured State Spaces
  The Structured State Space for Sequence Modeling (S4) architecture is a new approach to very long-range sequence modeling tasks for vision, language, and audio, showing a capacity to capture dependencies over tens of thousands of steps. Especially impressive are the model’s results on the challenging Long Range Arena benchmark, showing an ability to reason over sequences of up to 16,000+ elements with high accuracy.
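The core idea behind S4, as presented in the Annotated S4, is a discretized linear state-space model x_k = A x_{k-1} + B u_k, y_k = C x_k, which can be run either as a recurrence or, unrolled, as one long convolution. Below is a minimal NumPy sketch of that duality; it uses a toy diagonal state matrix rather than S4's structured HiPPO-based A, and skips the FFT-based kernel computation the real model relies on.

```python
import numpy as np

def ssm_recurrent(A, B, C, u):
    # Step-by-step SSM: x_k = A x_{k-1} + B u_k ; y_k = C x_k
    x = np.zeros(A.shape[0])
    ys = []
    for u_k in u:
        x = A @ x + B * u_k
        ys.append(C @ x)
    return np.array(ys)

def ssm_kernel(A, B, C, L):
    # Unrolling the recurrence gives a length-L convolution kernel:
    # K = (C B, C A B, ..., C A^{L-1} B)
    K, Ak = [], np.eye(A.shape[0])
    for _ in range(L):
        K.append(C @ Ak @ B)
        Ak = A @ Ak
    return np.array(K)

L = 16
rng = np.random.default_rng(0)
A = 0.9 * np.eye(4)  # toy stable state matrix (S4 uses a structured HiPPO-based A)
B, C = rng.normal(size=4), rng.normal(size=4)
u = rng.normal(size=L)

y_rec = ssm_recurrent(A, B, C, u)
y_conv = np.convolve(u, ssm_kernel(A, B, C, L))[:L]  # causal convolution, truncated
assert np.allclose(y_rec, y_conv)  # both views compute the same outputs
```

The convolutional view is what makes training parallel; the recurrent view gives fast autoregressive inference.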
- [D] Is there a repo on which many lightweight self-attention mechanisms are introduced?
  1.1 Long Range Arena: A Benchmark for Efficient Transformers. From the authors above, who proposed a benchmark for modeling long-range interactions. It also includes a repository.
- [R] Google’s H-Transformer-1D: Fast One-Dimensional Hierarchical Attention With Linear Complexity for Long Sequence Processing
- [2107.11906] H-Transformer-1D: Fast One-Dimensional Hierarchical Attention for Sequences
-
[R][D] Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting. Zhou et al. AAAI21 Best Paper. ProbSparse self-attention reduces complexity to O(nlogn), generative style decoder to obtainsequence output in one step, and self-attention distilling for further reducing memory
I think the paper is written in a clear style and I like that the authors included many experiments, including hyperparameter effects, ablations and extensive baseline comparisons. One thing I would have liked is them comparing their Informer to more efficient transformers (they compared only against logtrans and reformer) using the LRA (https://github.com/google-research/long-range-arena) benchmark.
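For readers skimming the thread, here is a rough sketch of what ProbSparse self-attention does: score each query's "sparsity" on a random sample of keys, let only the top-u ≈ c·ln(L) queries attend fully, and fall back to the mean of the values for the rest, which is where the O(n log n) cost mentioned above comes from. This is a simplified single-head reconstruction from the paper's description, not the authors' code; masking, multi-head projections, and the distilling step are omitted.

```python
import numpy as np

def probsparse_attention(Q, K, V, factor=5, seed=0):
    # Simplified ProbSparse self-attention (after Zhou et al., AAAI'21).
    # Only the top-u "most active" queries attend; the rest output mean(V).
    rng = np.random.default_rng(seed)
    L_Q, d = Q.shape
    L_K = K.shape[0]
    u = max(1, min(L_Q, int(factor * np.ceil(np.log(L_Q)))))        # u = c * ln(L_Q)
    n_sample = max(1, min(L_K, int(factor * np.ceil(np.log(L_K)))))  # sampled keys

    # Sparsity measurement on a random subset of keys:
    # M(q, K) ~ max_j(q.k_j / sqrt(d)) - mean_j(q.k_j / sqrt(d))
    idx = rng.choice(L_K, n_sample, replace=False)
    S = Q @ K[idx].T / np.sqrt(d)
    M = S.max(axis=1) - S.mean(axis=1)
    top = np.argsort(M)[-u:]                       # the u most "active" queries

    out = np.tile(V.mean(axis=0), (L_Q, 1))        # lazy queries: mean of values
    scores = Q[top] @ K.T / np.sqrt(d)             # full attention for top queries
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    out[top] = (w / w.sum(axis=1, keepdims=True)) @ V
    return out

# Example: 64 tokens, 16-dim heads; only ~5*ln(64) queries do a full pass.
rng = np.random.default_rng(1)
Q, K, V = (rng.normal(size=(64, 16)) for _ in range(3))
print(probsparse_attention(Q, K, V).shape)  # (64, 16)
```

Since only u ∝ log L queries compute full dot products against all L keys, the attention cost drops from O(L²) to O(L log L).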
Stats
google-research/long-range-arena is an open source project licensed under the Apache License 2.0, an OSI-approved license.
The primary programming language of long-range-arena is Python.
Popular Comparisons
- long-range-arena VS performer-pytorch
- long-range-arena VS attention-is-all-you-need-pytorch
- long-range-arena VS HJxB
- long-range-arena VS jax-resnet
- long-range-arena VS flaxmodels
- long-range-arena VS LFattNet
- long-range-arena VS tldr-transformers
- long-range-arena VS elegy
- long-range-arena VS gansformer
- long-range-arena VS scenic