Long Range Arena for Benchmarking Efficient Transformers
Why do you think that https://github.com/jadore801120/attention-is-all-you-need-pytorch is a good alternative to long-range-arena?