memorizing-transformers-pytorch
Implementation of Memorizing Transformers (ICLR 2022), attention net augmented with indexing and retrieval of memories using approximate nearest neighbors, in Pytorch
There is a GitHub repo: https://github.com/lucidrains/memorizing-transformers-pytorch. The implementation deviates slightly from the paper: instead of the sigmoid gate setup, it uses hybrid attention across the local and distant attention logits. It also uses cosine similarity attention (with a learned temperature) for the KNN attention layer. It includes some features that are not mentioned in the paper as well, such as Transformer-XL memories and shifting memories down. There are no easy-to-use Memorizing Transformers implementations yet.
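To make the difference between the two combination schemes concrete, here is a minimal pure-Python sketch for a single query vector. It is not taken from the repo or the paper: the function names (`hybrid_attention`, `gated_attention`), the fixed `scale` value standing in for the learned temperature, and the scalar `gate_bias` are all illustrative assumptions. Cosine-similarity logits are used in both variants only so the two are directly comparable.

```python
import math

def l2_normalize(v):
    # Scale a vector to unit length (guarding against a zero vector).
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cosine_logits(query, keys, scale):
    # Cosine similarity between the query and each key, multiplied by a
    # temperature-like scale. In a real model the scale would be learned;
    # here it is a fixed, assumed constant.
    q = l2_normalize(query)
    return [scale * sum(a * b for a, b in zip(q, l2_normalize(k))) for k in keys]

def weighted_sum(weights, values):
    # Attention output: sum of value vectors weighted by attention weights.
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

def hybrid_attention(query, local_keys, local_values, mem_keys, mem_values, scale=10.0):
    # Hybrid scheme (hypothetical sketch): concatenate local and distant
    # (retrieved-memory) logits and normalize them with ONE softmax.
    logits = cosine_logits(query, local_keys + mem_keys, scale)
    weights = softmax(logits)
    return weighted_sum(weights, local_values + mem_values), weights

def gated_attention(query, local_keys, local_values, mem_keys, mem_values,
                    gate_bias=0.0, scale=10.0):
    # Sigmoid-gate scheme (hypothetical sketch): attend to local and memory
    # keys separately, then blend the two outputs with g = sigmoid(gate_bias).
    local_out = weighted_sum(softmax(cosine_logits(query, local_keys, scale)), local_values)
    mem_out = weighted_sum(softmax(cosine_logits(query, mem_keys, scale)), mem_values)
    g = 1.0 / (1.0 + math.exp(-gate_bias))
    return [g * m + (1.0 - g) * l for m, l in zip(mem_out, local_out)]

if __name__ == "__main__":
    query = [1.0, 0.0]
    local_keys = [[0.0, 1.0], [1.0, 1.0]]
    local_values = [[1.0, 0.0], [0.0, 1.0]]
    mem_keys = [[1.0, 0.0]]           # stands in for a KNN-retrieved memory
    mem_values = [[5.0, 5.0]]
    out, weights = hybrid_attention(query, local_keys, local_values, mem_keys, mem_values)
    print("hybrid output:", out, "weights:", weights)
    print("gated output:", gated_attention(query, local_keys, local_values, mem_keys, mem_values))
```

In the hybrid variant, local and distant keys compete inside a single softmax, so a strongly matching memory can dominate without a separate gate parameter. In the gated variant, the mix between the two sources is set by the gate regardless of how well any individual memory matches.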