Training RNNs as Fast as CNNs (https://arxiv.org/abs/1709.02755)
Why do you think that https://github.com/jadore801120/attention-is-all-you-need-pytorch is a good alternative to SRU?