TransformerEngine vs fastaudio

TransformerEngine

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference. (by NVIDIA)

docs.nvidia.com

fastaudio

🔊 Audio and fastai v2 (by fastaudio)

Tags: Audio, Fastai, Pytorch, Python, Deep Learning, GPU

fastaudio.github.io

Project	TransformerEngine	fastaudio
Mentions	2	1
Stars	1,428	165
Growth	13.1%	1.8%
Activity	9.5	0.0
Latest Commit	4 days ago	4 months ago
Language	Python	Python
License	Apache License 2.0	MIT License

Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub.
Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
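
The page does not give the exact formulas behind Growth and Activity, so the following is only a minimal sketch of how such metrics could be computed. The function names, the exponential decay, the 30-day half-life, and the star counts used in the example are all assumptions made for illustration; they are not the site's actual method or data.

```python
def month_over_month_growth(stars_prev: int, stars_now: int) -> float:
    """Percentage change in stars between two monthly snapshots."""
    if stars_prev <= 0:
        raise ValueError("previous star count must be positive")
    return (stars_now - stars_prev) / stars_prev * 100.0


def recency_weighted_activity(commit_ages_days, half_life_days: float = 30.0) -> float:
    """Sum of per-commit weights, where a commit's weight halves every
    `half_life_days` (hypothetical decay; the real weighting is not published here)."""
    return sum(0.5 ** (age / half_life_days) for age in commit_ages_days)


if __name__ == "__main__":
    # Hypothetical snapshots: 1000 stars last month, 1131 stars this month.
    print(f"growth: {month_over_month_growth(1000, 1131):.1f}%")
    # A commit from 4 days ago outweighs one from 120 days ago.
    print(recency_weighted_activity([4]), recency_weighted_activity([120]))
```

Under this kind of weighting, a project whose latest commit is a few days old scores well above one whose latest commit is months old, which matches the direction of the Activity values described above.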

TransformerEngine

Posts with mentions or reviews of TransformerEngine. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-04-30.

Benchmarking Large Language Models on NVIDIA H100 GPUs with CoreWeave (Part 1)
1 project | /r/nvidia | 30 Apr 2023

4090 now has its 8-bit float enabled as well, see the [transformer engine issue](https://github.com/NVIDIA/TransformerEngine/issues/15)

GPUs for Deep Learning in 2023 – An In-depth Analysis
4 projects | news.ycombinator.com | 18 Jan 2023

Would be curious to see your benchmarks. Btw, Nvidia will be providing support for fp8 in a future release of CUDA - https://github.com/NVIDIA/TransformerEngine/issues/15
I think TMA may not matter as much for consumer cards given the disproportionate amount of fp32 / int32 compute that they have.
Would be interesting to see how close to theoretical folks are able to get once CUDA support comes through.

fastaudio

Posts with mentions or reviews of fastaudio. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2021-06-18.

CatMeows: A Publicly-Available Dataset of Cat Vocalizations
2 projects | news.ycombinator.com | 18 Jun 2021

What are some alternatives?

When comparing TransformerEngine and fastaudio you can also consider the following projects:

Whisper - High-performance GPGPU inference of OpenAI's Whisper automatic speech recognition (ASR) model

pytorch-forecasting - Time series forecasting with PyTorch

autocvd - Tool to automatically set CUDA_VISIBLE_DEVICES based on GPU utilization. Usable from command line and code.

RealCatTranslator - Project in several phases to ultimately build an app that translates from human to cat and from cat to human using deep learning algorithms

warp-drive - Extremely Fast End-to-End Deep Multi-Agent Reinforcement Learning Framework on a GPU (JMLR 2022)

ivy - The Unified AI Framework

fps_highlights - Make beautiful highlights from FPS videos

nanoGPT - The simplest, fastest repository for training/finetuning medium-sized GPTs.

ivy - The Unified Machine Learning Framework [Moved to: https://github.com/unifyai/ivy]

liberate-fhe - A Fully Homomorphic Encryption (FHE) library for bridging the gap between theory and practice with a focus on performance and accuracy.

FastFold - Optimizing AlphaFold Training and Inference on GPU Clusters