The official repository for our paper "The Devil is in the Detail: Simple Tricks Improve Systematic Generalization of Transformers". We significantly improve the systematic generalization of transformer models on a variety of datasets using simple tricks and careful considerations.
Why do you think that https://github.com/podgorskiy/ALAE is a good alternative to transformer_generalization?