A simple but complete full-attention transformer with a set of promising experimental features from various papers
Why do you think that https://github.com/facebookresearch/metaseq is a good alternative to x-transformers?