Top 23 Python Audio Projects
-
Project mention: Ultimate Vocal Remover GUI, a FOSS audio stem splitter | news.ycombinator.com | 2025-05-09
-
I've used beets to import and tag a huge personal music library:
https://beets.io/
-
Simple Diarizer
Simple Diarizer is a speaker diarization library that uses pretrained models from SpeechBrain.
-
SpeechRecognition
Speech recognition module for Python, supporting several engines and APIs, online and offline.
-
Project mention: Librosa: Python library for audio and music analysis | news.ycombinator.com | 2024-09-03
-
Project mention: Ten years after the last release, Aegisub 3.4.0 released | news.ycombinator.com | 2024-12-21
Aegisub is great for authoring new subtitles, but if you're just looking to sync, take a look at https://github.com/smacke/ffsubsync
Plex also recently added auto-sync subtitles to the Plex Pass:
https://support.plex.tv/articles/auto-sync-subtitles/
-
pyAudioAnalysis
Python Audio Analysis Library: Feature Extraction, Classification, Segmentation and Applications
-
picard
A cross-platform music tagger powered by the MusicBrainz database. Picard organizes your music collection by updating your tags, renaming your files, and sorting them into a folder structure, exactly the way you want it.
Make sure to check out Picard:
https://picard.musicbrainz.org/
It uses the MusicBrainz DB to auto-tag files and correct audio file names, which makes it really easy to organize a large collection of (pirated) audio.
-
distil-whisper
Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, and within 1% word error rate of the original model.
Details will be shared tomorrow, but from what I have read, they have distilled the large model's decoder into this turbo model, which has only 4 layers instead of 32; the encoder should remain the same size. It is similar to https://github.com/huggingface/distil-whisper, but the model is distilled using multilingual data instead of just English, and the decoder has 4 layers instead of 2.
-
Project mention: Benn Jordan's AI poison pill and the weird world of adversarial noise | news.ycombinator.com | 2025-04-15
https://github.com/riffusion/riffusion-hobby
The more advanced music generators out now, I believe, take more of a 'stems' approach and use a larger processing pipeline to increase fidelity and add vocal-tracking capability, but the underlying idea is the same.
Any adversarial attack that hides information in the spectrogram to fool the model into categorizing the track as something it is not is no different from the adversarial attacks on images, for which mitigations have been found.
Various forms of filtering for inaudible spectral information, coupled with methods that destroy and re-synthesize or randomize phase information, would likely break this poisoning attack.
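To make the filtering idea in the comment above concrete, here is a minimal NumPy sketch. It is an illustration only, not the method of any tool mentioned on this page: the function name `lowpass_and_randomize_phase`, the 16 kHz cutoff, and the whole-signal FFT are all assumptions chosen for brevity. It zeroes spectral content above a cutoff and replaces every phase with a random one while keeping the magnitude spectrum.

```python
import numpy as np


def lowpass_and_randomize_phase(signal, sample_rate, cutoff_hz=16000.0, seed=0):
    """Zero content above cutoff_hz and replace all phases with random values.

    Hypothetical helper for illustration; the cutoff is an assumed value.
    """
    rng = np.random.default_rng(seed)
    n = len(signal)

    # Real-input FFT of the whole clip and the frequency of each bin.
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(n, d=1.0 / sample_rate)

    # Keep magnitudes, but drop everything above the cutoff.
    magnitude = np.abs(spectrum)
    magnitude[freqs > cutoff_hz] = 0.0

    # Draw a new random phase for every bin.
    phase = rng.uniform(-np.pi, np.pi, size=magnitude.shape)
    # The DC bin (and the Nyquist bin for even n) must stay real-valued.
    phase[0] = 0.0
    if n % 2 == 0:
        phase[-1] = 0.0

    # Back to the time domain with the same length as the input.
    return np.fft.irfft(magnitude * np.exp(1j * phase), n=n)
```

Randomizing phase over the whole clip like this keeps only the long-term magnitude spectrum, so the result would not sound like the original. A practical pipeline would more plausibly work on short overlapping frames and re-estimate a consistent phase, which this sketch does not attempt.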
-
Project mention: Real-time ML audio noise suppression on Raspberry Pi Pico 2 | news.ycombinator.com | 2024-08-09
Very cool! Would be curious to see how this compares to https://github.com/Rikorose/DeepFilterNet, which is written in Rust.
Or this Samsung Research paper https://research.samsung.com/blog/FSPEN-AN-ULTRA-LIGHTWEIGHT...
-
aeneas
aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)
-
Python Audio discussion
Python Audio related posts
- This Week In Python
- Benn Jordan's AI poison pill and the weird world of adversarial noise
- Pulsemixer: TUI Alternative to Pavucontrol
- Show HN: Video processing pipeline for LLM – Python
- So you would like to digitise your CD collection? (Part 2)
- What You Might Miss When Backing Up CDs
- Ten years after the last release, Aegisub 3.4.0 released
A note from our sponsor - SaaSHub
www.saashub.com | 17 May 2025
Index
What are some of the best open-source Audio projects in Python? This list will help you:
| # | Project | Stars |
|---|---|---|
| 1 | ultimatevocalremovergui | 20,601 |
| 2 | beets | 13,475 |
| 3 | AudioGPT | 10,143 |
| 4 | speechbrain | 9,808 |
| 5 | pydub | 9,355 |
| 6 | SpeechRecognition | 8,723 |
| 7 | jukebox | 7,978 |
| 8 | librosa | 7,612 |
| 9 | ffsubsync | 7,144 |
| 10 | dejavu | 6,534 |
| 11 | pyAudioAnalysis | 6,009 |
| 12 | Porcupine | 4,103 |
| 13 | picard | 4,074 |
| 14 | basic-pitch | 3,913 |
| 15 | distil-whisper | 3,850 |
| 16 | riffusion-hobby | 3,682 |
| 17 | auto-editor | 3,296 |
| 18 | DeepFilterNet | 3,038 |
| 19 | aeneas | 2,640 |
| 20 | mkchromecast | 2,257 |
| 21 | m3u8 | 2,149 |
| 22 | matchering | 2,048 |
| 23 | Tauon | 2,047 |