One of the public Colabs using CLIP uses Fourier transforms for image generation, and it really is very fast. https://github.com/eps696/aphantasia
I don't think this is valid in the context of this article. The input tokens are not one-hot encodings of the input characters; they are learned embeddings over a 32K SentencePiece vocabulary (section 4.1.1). Since "STOP" and "SPOT" are probably fairly common words in the training data, it's safe to assume each word is assigned its own single token and embedding vector, rather than being represented by the four "subword units" of its character decomposition.
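The distinction can be shown with a toy sketch. This is not the real SentencePiece model or its API; the vocabularies and token IDs below are invented for illustration. The point is that a subword tokenizer assigns common whole words their own IDs and only falls back to smaller units for words it has never seen:

```python
# Hypothetical vocabularies for illustration only (IDs are made up).
# A real 32K SentencePiece vocab would contain both words as single tokens.
word_vocab = {"STOP": 101, "SPOT": 102}
char_vocab = {"S": 1, "T": 2, "O": 3, "P": 4}

def tokenize(word: str) -> list[int]:
    """One token for an in-vocabulary word; per-character fallback otherwise."""
    if word in word_vocab:
        return [word_vocab[word]]
    return [char_vocab[c] for c in word]

print(tokenize("STOP"))  # [101] -- a single token, distinct from "SPOT"
print(tokenize("SPOT"))  # [102]
print(tokenize("OPST"))  # [3, 4, 1, 2] -- rare string, character fallback
```

So even though "STOP" and "SPOT" share the same four characters, the model receives two unrelated embedding vectors, not two permutations of the same character inputs.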