memory-efficient-attention-pytorch VS flash-attention

Compare memory-efficient-attention-pytorch vs flash-attention and see what their differences are.

memory-efficient-attention-pytorch

Implementation of memory-efficient multi-head attention, as proposed in the paper "Self-attention Does Not Need O(n²) Memory" (by lucidrains)
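
The idea from the paper can be sketched in a few lines of plain PyTorch: process the keys and values in chunks while maintaining a running maximum, softmax denominator, and weighted sum, so the full n x n score matrix is never materialized. The snippet below is an illustrative toy (single head, non-causal, no dropout), not the library's actual code:

import torch

def chunked_attention(q, k, v, chunk_size=1024):
    # q, k, v: (batch, seq_len, dim). Keys/values are processed in chunks,
    # keeping running statistics instead of the full (seq_len x seq_len)
    # attention matrix, as described in "Self-attention Does Not Need O(n^2) Memory".
    scale = q.shape[-1] ** -0.5
    q = q * scale

    out = torch.zeros_like(q)
    row_max = q.new_full((*q.shape[:-1], 1), float('-inf'))
    row_sum = q.new_zeros((*q.shape[:-1], 1))

    for k_chunk, v_chunk in zip(k.split(chunk_size, dim=1), v.split(chunk_size, dim=1)):
        scores = q @ k_chunk.transpose(-1, -2)           # (batch, seq_len, chunk)
        new_max = torch.maximum(row_max, scores.amax(dim=-1, keepdim=True))
        exp_scores = (scores - new_max).exp()
        correction = (row_max - new_max).exp()           # rescale old statistics

        row_sum = row_sum * correction + exp_scores.sum(dim=-1, keepdim=True)
        out = out * correction + exp_scores @ v_chunk
        row_max = new_max

    return out / row_sum

On the first chunk the correction factor is zero, so the running statistics are initialized from that chunk; calling it as chunked_attention(torch.randn(1, 8192, 64), torch.randn(1, 8192, 64), torch.randn(1, 8192, 64)) matches a standard softmax attention up to numerical error.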

flash-attention

Fast and memory-efficient exact attention (by Dao-AILab)
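
For comparison, flash-attention ships fused CUDA kernels rather than a pure-PyTorch implementation. A minimal usage sketch, assuming the flash-attn package is installed and a supported NVIDIA GPU is available (the exact signature of flash_attn_func may vary between releases):

import torch
from flash_attn import flash_attn_func

batch, seqlen, nheads, headdim = 2, 4096, 8, 64

# The kernels expect fp16/bf16 tensors on a CUDA device,
# laid out as (batch, seqlen, nheads, headdim).
q = torch.randn(batch, seqlen, nheads, headdim, device='cuda', dtype=torch.float16)
k = torch.randn(batch, seqlen, nheads, headdim, device='cuda', dtype=torch.float16)
v = torch.randn(batch, seqlen, nheads, headdim, device='cuda', dtype=torch.float16)

out = flash_attn_func(q, k, v, causal=True)   # same shape as q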
                  memory-efficient-attention-pytorch     flash-attention
Mentions          2                                       27
Stars             227                                     15,142
Growth            -                                       4.9%
Activity          6.1                                     9.2
Latest commit     almost 2 years ago                      7 days ago
Language          Python                                  Python
License           MIT License                             BSD 3-clause "New" or "Revised" License
The number of mentions indicates the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

memory-efficient-attention-pytorch

Posts with mentions or reviews of memory-efficient-attention-pytorch. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-06-09.

flash-attention

Posts with mentions or reviews of flash-attention. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-07-11.

What are some alternatives?

When comparing memory-efficient-attention-pytorch and flash-attention, you can also consider the following projects:

vit-pytorch - Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch

xformers - Hackable and optimized Transformers building blocks, supporting a composable construction.

performer-pytorch - An implementation of Performer, a linear attention-based transformer, in Pytorch

TensorRT - NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

Compact-Transformers - Escaping the Big Data Paradigm with Compact Transformers, 2021 (Train your Vision Transformers in 30 mins on CIFAR-10 with a single GPU!)

DeepSpeed - DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

memory-efficient-attention-pyt

RWKV-LM - RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". So it's combining the best of RNN and transformer - great performance, linear time, constant space (no kv-cache), fast training, infinite ctx_len, and free sentence embedding.

routing-transformer - Fully featured implementation of Routing Transformer

XMem - [ECCV 2022] XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model

x-transformers - A concise but complete full-attention transformer with a set of promising experimental features from various papers

alpaca_lora_4bit
