Top 23 Python audio-processing Projects
-
Project mention: DeepMind releases Lyria 2 music generation model | news.ycombinator.com | 2025-04-24
-
Simple Diarizer is a speaker diarization library that utilizes pretrained models from SpeechBrain. To get started with simple_diarizer, follow these steps:
-
audio-reactive-led-strip
Real-time LED strip music visualization using Python and the ESP8266 or Raspberry Pi
-
LedFx
LedFx is a network based LED effect engine designed to deliver advanced real-time audio effects to a wide variety of devices.
-
StreamSpeech
StreamSpeech is an "All in One" seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.
Has anyone had any luck with an offline, free, open-source real-time speech-to-speech translation app on under-powered devices (i.e., older smart phones)?
* https://github.com/ictnlp/StreamSpeech
* https://github.com/k2-fsa/sherpa-onnx
* https://github.com/openai/whisper
I'm looking for a simple app that can listen for English, translate into Korean (and other languages), then perform speech synthesis on the translation. Basically, a Babelfish that doesn't stick in the ear. Although real-time would be great, a 3- to 5-second delay is manageable.
RTranslator is awkward (couldn't get it to perform speech-to-speech using a single phone). 3PO sprouts errors like dandelions and requires an online connection.
Any suggestions?
-
wunjo.wladradchenko.ru
Wunjo CE: Face Swap, Lip Sync, Control Remove Objects & Text & Background, Restyling, Audio Separator, Clone Voice, Video Generation. Open Source, Local & Free.
Project mention: Why My Open Source Project Wunjo Can't Reach 1K Stars? 😢 | dev.to | 2025-03-25
I've been building Wunjo, an open-source, AI-powered video editing tool that can now automatically cut, highlight, and transform videos with a simple text prompt. Sounds cool, right? Yet getting to 1K stars on GitHub feels like an endless grind. It is a set of tools for streamlining video and photo editing, with a built-in API (API Docs) for use in other pet projects.
-
FoleyCrafter
FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds. An AI Foley artist that adds vivid, synchronized sound effects to your silent videos.
Project mention: Bring Silent Videos to Life Sounds(Open-Source) | news.ycombinator.com | 2025-02-27
-
whisper-at
Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event Taggers"
Project mention: Show HN: Voice-Pro - AI Voice Cloning Magic: Transform Any Voice in 15 Seconds | news.ycombinator.com | 2024-11-27
Have you considered supporting whisper-at (https://github.com/YuanGongND/whisper-at)? Being able to identify sounds on a timeline can be useful, e.g. for a politician's speech and how the audience reacts to it (clapping, applauding).
-
stemgen
Stemgen is a Stem file generator. Convert any track into a Stem and have fun with Traktor.
-
pyCrossfade
pyCrossfade is the result of a personal project to use beat matching, gradual bpm shift on bars, and EQ modification to provide smooth and tunable transitions between music files.
-
Project mention: Initiative from Google, OpenAI, Discord, others could transform trust and safety | news.ycombinator.com | 2025-02-11
Python audio-processing discussion
Python audio-processing related posts
-
Are stems a good way of making mashups
-
Big News!
-
Anybody here know what AI model does Steinberg's Spectralayers use to do stem separation?
-
Comparing Humans, GPT-4, and GPT-4V on Abstraction and Reasoning Tasks
-
Help needed in developing this! It's an AI-powered audio manipulation application designed to provide a rich yet simplified experience for sound designers that features AI text-to-audio, onboard fx, onscreen ChatGPT, and more. Send a line if you can help!
-
AI tools list sorted by category in one place
-
Software to lower tracks?
-
A note from our sponsor - InfluxDB
www.influxdata.com | 19 May 2025
Index
What are some of the best open-source audio-processing projects in Python? This list will help you:
# | Project | Stars |
---|---|---|
1 | spleeter | 26,840 |
2 | speechbrain | 9,834 |
3 | auto-editor | 3,296 |
4 | audio-reactive-led-strip | 2,750 |
5 | ailia-models | 2,196 |
6 | LedFx | 1,545 |
7 | audio-slicer | 1,344 |
8 | SALMONN | 1,227 |
9 | SincNet | 1,171 |
10 | StreamSpeech | 1,073 |
11 | nnAudio | 1,064 |
12 | wunjo.wladradchenko.ru | 1,024 |
13 | FoleyCrafter | 582 |
14 | unsilence | 576 |
15 | TimeSide | 385 |
16 | whisper-at | 355 |
17 | moseca | 322 |
18 | spectrographic | 277 |
19 | stemgen | 234 |
20 | pyCrossfade | 128 |
21 | see2sound | 124 |
22 | voice-safety-classifier | 84 |
23 | gensound | 81 |