Reproducing the Linear Multihead Attention introduced in the Linformer paper (Linformer: Self-Attention with Linear Complexity).
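
Below is a minimal PyTorch sketch of the Linformer-style linear multi-head attention idea: keys and values are compressed along the sequence dimension with learned projections `E` and `F`, so attention costs O(n·k) instead of O(n²). The class name, `seq_len`, and `proj_k` arguments are illustrative assumptions, not the repository's actual API.

```python
import torch
import torch.nn as nn


class LinearMultiheadAttentionSketch(nn.Module):
    """Illustrative Linformer-style attention (not the repo's implementation)."""

    def __init__(self, embed_dim, num_heads, seq_len, proj_k=128):
        super().__init__()
        assert embed_dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = embed_dim // num_heads
        # Standard Q/K/V and output projections.
        self.q_proj = nn.Linear(embed_dim, embed_dim)
        self.k_proj = nn.Linear(embed_dim, embed_dim)
        self.v_proj = nn.Linear(embed_dim, embed_dim)
        self.out_proj = nn.Linear(embed_dim, embed_dim)
        # Linformer's low-rank projections map the sequence axis n -> k.
        self.E = nn.Linear(seq_len, proj_k, bias=False)
        self.F = nn.Linear(seq_len, proj_k, bias=False)

    def forward(self, x):
        # x: (batch, seq_len, embed_dim)
        B, N, D = x.shape
        H, Hd = self.num_heads, self.head_dim

        q = self.q_proj(x).view(B, N, H, Hd).transpose(1, 2)  # (B, H, N, Hd)
        k = self.k_proj(x).view(B, N, H, Hd).transpose(1, 2)  # (B, H, N, Hd)
        v = self.v_proj(x).view(B, N, H, Hd).transpose(1, 2)  # (B, H, N, Hd)

        # Compress keys/values along the sequence axis: (B, H, N, Hd) -> (B, H, k, Hd).
        k = self.E(k.transpose(-1, -2)).transpose(-1, -2)
        v = self.F(v.transpose(-1, -2)).transpose(-1, -2)

        # Attention map is (B, H, N, k) rather than (B, H, N, N).
        attn = torch.softmax(q @ k.transpose(-1, -2) / Hd ** 0.5, dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(B, N, D)      # (B, N, D)
        return self.out_proj(out)


if __name__ == "__main__":
    # Usage example with illustrative sizes.
    attn = LinearMultiheadAttentionSketch(embed_dim=64, num_heads=4, seq_len=256, proj_k=32)
    x = torch.randn(2, 256, 64)
    print(attn(x).shape)  # torch.Size([2, 256, 64])
```

The key design choice is that `E` and `F` act on the sequence dimension (of length `seq_len`) rather than the feature dimension, which is why a fixed maximum sequence length must be known when the module is built.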
Why do you think that https://github.com/lucidrains/tab-transformer-pytorch is a good alternative to Linear-Multihead-Attention?