A TensorFlow implementation of Fastformer ("Additive Attention Can Be All You Need"), a Transformer variant.
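The core idea of Fastformer is to replace pairwise query-key attention with additive attention: each token gets a scalar importance score, the scores summarize the sequence into a single global query (and then a global key), and these global vectors interact with every token elementwise, giving linear rather than quadratic cost in sequence length. A minimal single-head sketch in plain numpy (the scoring vectors `w_q` and `w_k` and the function name are illustrative assumptions, not this repo's API):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def additive_attention(q, k, v, w_q, w_k):
    """Simplified single-head Fastformer-style additive attention.

    q, k, v : (n, d) projected token representations
    w_q, w_k: (d,) learned scoring vectors (hypothetical names)
    """
    d = q.shape[-1]
    alpha = softmax(q @ w_q / np.sqrt(d))   # (n,) per-token query weights
    global_q = alpha @ q                    # (d,) global query vector
    p = global_q * k                        # (n, d) elementwise query-key mix
    beta = softmax(p @ w_k / np.sqrt(d))    # (n,) per-token key weights
    global_k = beta @ p                     # (d,) global key vector
    return global_k * v                     # (n, d); total cost O(n*d), not O(n^2)
```

The full model adds per-head linear projections and an output transform with a query residual; this sketch only shows why the attention itself is linear in `n`.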
Why do you think that https://github.com/lucidrains/TimeSformer-pytorch is a good alternative to Fast-Transformer?