Open Source Libraries

Each entry lists the project name, then its mentions, stars, activity score, and primary language, followed by a short description and the repository.

TTS

231 29,174 9.5 Python

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

coqui-ai/TTS

tortoise-tts

144 11,755 8.2 Jupyter Notebook

A multi-voice TTS system trained with an emphasis on quality

neonbjb/tortoise-tts

bark

66 32,517 6.5 Jupyter Notebook

🔊 Text-Prompted Generative Audio Model

suno-ai/bark

piper

33 3,902 8.9 C++

A fast, local neural text to speech system (by rhasspy)

rhasspy/piper

Matcha-TTS

1 381 8.2 Jupyter Notebook

[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching

shivammehta25/Matcha-TTS

whisper

343 60,303 6.4 Python

Robust Speech Recognition via Large-Scale Weak Supervision

openai/whisper

whisper.cpp

187 30,942 9.8 C

Port of OpenAI's Whisper model in C/C++

ggerganov/whisper.cpp

faster-whisper

22 8,723 8.3 Python

Faster Whisper transcription with CTranslate2

guillaumekln/faster-whisper

wenet

5 3,691 9.6 Python

Production First and Production Ready End-to-End Speech Recognition Toolkit

wenet-e2e/wenet

seamless_communication

11 10,181 8.6 Jupyter Notebook

Foundational Models for State-of-the-Art Speech and Text Translation

facebookresearch/seamless_communication: Speech translation

pyannote-audio

15 5,027 8.6 Jupyter Notebook

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

pyannote/pyannote-audio

PaddleSpeech

6 10,120 7.6 Python

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

PaddlePaddle/PaddleSpeech

audio-webui

15 897 9.0 Python

A webui for different audio related Neural Networks

gitmylo/audio-webui

audiocraft

37 19,649 8.3 Python

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.

facebookresearch/audiocraft/MUSICGEN: Music Generation

jukebox

129 7,563 0.0 Python

Code for the paper "Jukebox: A Generative Model for Music"

openai/jukebox: Music Generation

Retrieval-based-Voice-Conversion-WebUI

56 18,860 9.6 Python

Voice data <= 10 mins can also be used to train a good VC model!

RVC-Project/Retrieval-based-Voice-Conversion-WebUI: Singing Voice Conversion

fish-diffusion

1 568 8.3 Python

An easy to understand TTS / SVS / SVC framework

fishaudio/fish-diffusion: Singing Voice Conversion

demucs

108 7,644 5.4 Python

Code for the paper Hybrid Spectrogram and Waveform Source Separation

facebookresearch/demucs: Stem separation

ultimatevocalremovergui

82 14,833 8.9 Python

GUI for a Vocal Remover that uses Deep Neural Networks.

Anjok07/UltimateVocalRemoverGUI: Vocal isolation

DeepFilterNet

10 1,914 9.1 Python

Noise suppression using deep filtering

Rikorose/DeepFilterNet: A Low Complexity Speech Enhancement Framework for Full-Band Audio (48kHz) based on Deep Filtering

PiDTLN

1 52 10.0 Python

Apply machine learning model DTLN for noise suppression and acoustic echo cancellation on Raspberry Pi

SaneBow/PiDTLN: DTLN model for noise suppression and acoustic echo cancellation on Raspberry Pi

versatile_audio_super_resolution

1 870 7.9 Python

Versatile audio super resolution (any -> 48kHz) with AudioSR.

haoheliu/versatile_audio_super_resolution: any -> 48kHz high fidelity Enhancer

basic-pitch

8 2,901 8.4 Python

A lightweight yet powerful audio-to-MIDI converter with pitch bend detection

spotify/basic-pitch: Audio to midi converter

pedalboard

24 4,846 8.3 C++

🎛 🔊 A Python library for audio.

spotify/pedalboard: audio effects for Python and TensorFlow

librosa

14 6,681 7.2 Python

Python library for audio and music analysis

librosa/librosa: Python library for audio and music analysis

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives, so a higher number indicates a more popular project.

This page summarizes the projects mentioned and recommended in the original post on /r/AudioAI
Tags: Audio, PyTorch, Transformer, Music, Python
Post date: 2 Oct 2023
