An implementation of Memorizing Transformers (ICLR 2022), an attention network augmented with indexing and retrieval of memories using approximate nearest neighbors, in PyTorch.
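
To make the idea concrete, here is a minimal, dependency-free sketch of attention over local context plus retrieved memories. It is not this repository's code. The names `KNNMemory` and `memory_attention` are hypothetical, the lookup is an exact top-k search standing in for an approximate nearest neighbor index, and the local and retrieved entries share a single softmax, whereas the paper combines them through a learned gate.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class KNNMemory:
    """Hypothetical store of (key, value) pairs from earlier context."""

    def __init__(self):
        self.keys = []
        self.values = []

    def add(self, keys, values):
        # Append new key/value pairs so later queries can retrieve them.
        self.keys.extend(keys)
        self.values.extend(values)

    def search(self, query, k):
        # Exact top-k by dot product; an assumption made for clarity.
        # A real system would query an approximate nearest neighbor index.
        order = sorted(
            range(len(self.keys)),
            key=lambda i: dot(query, self.keys[i]),
            reverse=True,
        )[:k]
        return [(self.keys[i], self.values[i]) for i in order]


def memory_attention(query, local_keys, local_values, memory, k=2):
    """Attend over local keys/values plus the top-k retrieved memories."""
    retrieved = memory.search(query, k) if memory.keys else []
    keys = list(local_keys) + [key for key, _ in retrieved]
    values = list(local_values) + [value for _, value in retrieved]

    # Scaled dot-product scores, then one softmax over all candidates
    # (a simplification; the learned gate from the paper is omitted).
    scale = 1.0 / math.sqrt(len(query))
    weights = softmax([dot(query, key) * scale for key in keys])

    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
```

For example, after `memory.add(keys, values)` on an earlier segment, calling `memory_attention(query, local_keys, local_values, memory, k=1)` mixes the single best-matching stored value into the output alongside the local context.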
Why do you think that https://github.com/lucidrains/DALLE-pytorch is a good alternative to memorizing-transformers-pytorch?