Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackable.
Why do you think https://github.com/Dao-AILab/flash-attention is a good alternative to Kernl?