flash-attention-minimal

Flash Attention in ~100 lines of CUDA (forward pass only) (by tspeterkim)

Flash-attention-minimal Alternatives

Similar projects and alternatives to flash-attention-minimal

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a better flash-attention-minimal alternative or higher similarity.

flash-attention-minimal reviews and mentions

Posts with mentions or reviews of flash-attention-minimal. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-03-31.
  • Google's First Tensor Processing Unit: Architecture
    2 projects | news.ycombinator.com | 31 Mar 2024
Vulkan is a driver-level API. It competes with DirectX and OpenGL.

CUDA is a language in which you write kernels. It competes with OpenAI's Triton language.

    Here's what CUDA looks like (a toy sketch of the same style appears after this list): https://github.com/tspeterkim/flash-attention-minimal/blob/m...

    This is what Triton looks like: https://triton-lang.org/main/getting-started/tutorials/06-fu...

    By contrast, Vulkan looks like this: https://github.com/KhronosGroup/Vulkan-Samples/blob/main/sam...

    (It's true that you could perhaps use Vulkan shaders to write deep learning kernels, but I'm not aware of anyone doing it.)

  • Show HN: Flash Attention in ~100 lines of CUDA
    2 projects | news.ycombinator.com | 16 Mar 2024
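
To give a concrete feel for the "write the kernel yourself" style the comment above contrasts with Triton and Vulkan, here is a minimal, hypothetical CUDA sketch. It is a toy vector-add kernel, not code from the flash-attention-minimal repo, and only illustrates the explicit thread indexing and launch configuration that CUDA kernels use.

```cuda
// Toy CUDA example (not from the linked repo): illustrates the hand-written
// kernel style, with explicit thread indexing and launch configuration.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void add(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // one thread per element
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;
    size_t bytes = n * sizeof(float);
    float *a, *b, *c;
    cudaMallocManaged(&a, bytes);   // unified memory, visible to CPU and GPU
    cudaMallocManaged(&b, bytes);
    cudaMallocManaged(&c, bytes);
    for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    add<<<blocks, threads>>>(a, b, c, n);  // explicit grid/block launch
    cudaDeviceSynchronize();

    printf("c[0] = %f\n", c[0]);  // expect 3.0
    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}
```

A real attention kernel like the one in flash-attention-minimal adds shared-memory tiling and an online softmax on top of this basic pattern, whereas Triton expresses the same tiling at a higher level of abstraction.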

Stats

Basic flash-attention-minimal repo stats
Mentions: 2
Stars: 410
Activity: 5.7
Last commit: 24 days ago

tspeterkim/flash-attention-minimal is an open source project licensed under the Apache License 2.0, which is an OSI-approved license.

The primary programming language of flash-attention-minimal is CUDA.
