[R] Rotary Positional Embeddings - a new relative positional embedding for Transformers that significantly improves convergence (20-30%) and works for both regular and efficient attention

This page summarizes the projects mentioned and recommended in the original post on /r/MachineLearning
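As background for the projects below, here is a minimal PyTorch sketch of the technique in the post's title, assuming the standard rotary formulation (RoFormer, Su et al.): each pair of feature dimensions in a query or key is rotated by an angle proportional to the token's absolute position, so the dot product between a rotated query and key depends only on their relative offset. The helper names are illustrative, not taken from either repo.

```python
import torch

def rotary_angles(seq_len: int, dim: int, base: float = 10000.0) -> torch.Tensor:
    # one frequency per pair of dims, as in sinusoidal embeddings
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
    pos = torch.arange(seq_len).float()
    angles = torch.einsum('i,j->ij', pos, inv_freq)  # (seq_len, dim / 2)
    return torch.cat((angles, angles), dim=-1)       # (seq_len, dim)

def rotate_half(x: torch.Tensor) -> torch.Tensor:
    # pairs dim i with dim i + dim/2 to form the 2D rotations
    x1, x2 = x.chunk(2, dim=-1)
    return torch.cat((-x2, x1), dim=-1)

def apply_rotary(x: torch.Tensor, angles: torch.Tensor) -> torch.Tensor:
    # x: (..., seq_len, dim); rotate each position's vector by its angles
    return x * angles.cos() + rotate_half(x) * angles.sin()

# usage: rotate queries and keys *before* the attention dot product;
# the attention mechanism itself is unchanged, which is why the method
# works with both regular and efficient attention
q = torch.randn(1, 8, 128, 64)  # (batch, heads, seq, head dim)
k = torch.randn(1, 8, 128, 64)
angles = rotary_angles(seq_len=128, dim=64)
q, k = apply_rotary(q, angles), apply_rotary(k, angles)
```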

  • vit-pytorch

    Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch

  • I've attempted it here: https://github.com/lucidrains/vit-pytorch/blob/main/vit_pytorch/rvt.py, but those who have tried it haven't seen knockout results like in the 1D case. Perhaps the axial lengths are too small to show a benefit; a sketch of the axial construction follows this list

  • performer-pytorch

    An implementation of Performer, a linear attention-based transformer, in PyTorch

  • Performer is the best linear attention variant, but linear attention is just one type of efficient attention. I have rotary embeddings already in the repo (https://github.com/lucidrains/performer-pytorch), and you can witness the phenomenon yourself by toggling them on and off; a usage sketch follows this list
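The 2D attempt in rvt.py splits the rotary dimensions between the two image axes: half the head dimension is rotated by the patch's row index and half by its column index. This is a hedged illustration of that axial idea, reusing the helpers from the sketch above, not the actual code in rvt.py.

```python
def axial_rotary(q, k, height: int, width: int):
    # q, k: (batch, heads, height * width, dim); half the dims encode the
    # patch's row position, the other half its column position
    dim = q.shape[-1]
    half = dim // 2
    row_angles = rotary_angles(height, half)  # (height, half)
    col_angles = rotary_angles(width, half)   # (width, half)
    # broadcast the per-axis angles over the full height x width patch grid
    rows = row_angles[:, None, :].expand(height, width, half).reshape(-1, half)
    cols = col_angles[None, :, :].expand(height, width, half).reshape(-1, half)

    def rotate(x):
        # rotate each half independently so row/column angle pairs stay intact
        x_row, x_col = x[..., :half], x[..., half:]
        return torch.cat((apply_rotary(x_row, rows),
                          apply_rotary(x_col, cols)), dim=-1)

    return rotate(q), rotate(k)

# e.g. a 14 x 14 patch grid: each axis spans only 14 positions, consistent
# with the "axial lengths are too small" hypothesis above
q2 = torch.randn(1, 8, 14 * 14, 64)
k2 = torch.randn(1, 8, 14 * 14, 64)
q2, k2 = axial_rotary(q2, k2, height=14, width=14)
```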
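Because the rotation is applied to queries and keys before attention is computed, it composes with Performer's kernelized attention unchanged. Below is a usage sketch for the on/off comparison; the flag name `rotary_position_emb` is an assumption about the performer-pytorch constructor at the time of the post, so check the repo's README for the exact argument.

```python
import torch
from performer_pytorch import PerformerLM

def make_model(use_rotary: bool) -> PerformerLM:
    return PerformerLM(
        num_tokens = 20000,
        max_seq_len = 2048,
        dim = 512,
        depth = 6,
        heads = 8,
        causal = True,
        rotary_position_emb = use_rotary,  # assumed flag name; see repo README
    )

tokens = torch.randint(0, 20000, (1, 2048))
with_rope = make_model(True)(tokens)      # (1, 2048, 20000) logits
without_rope = make_model(False)(tokens)  # train both and compare loss curves
```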

