Top 11 voice-activity-detection Open-Source Projects

NoiseTorch

106 8,948 5.9 Go

Real-time microphone noise suppression on Linux.

Project mention: Ask HN: What are some unpopular technologies you wish people knew more about? | news.ycombinator.com | 2023-12-02

Noisetorch. https://github.com/noisetorch/NoiseTorch
ffsubsync

31 6,478 4.8 Python

Automagically synchronize subtitles with video.

Project mention: The GitHub Black Market That Helps Coders Cheat the Popularity Contest | news.ycombinator.com | 2023-10-23

> Another giveaway is the ratio of stars to watchers / forks. I remember one project with thousands of stars but only 10 users "watching" it. They went on to raise a sizable seed round too.
Not necessarily indicative of foul play. I have two projects like this (https://github.com/smacke/ffsubsync and https://github.com/ipyflow/ipyflow) and I attribute it to not having great developer documentation.
pyannote-audio

15 4,930 8.7 Jupyter Notebook

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Project mention: Open Source Libraries | /r/AudioAI | 2023-10-02

pyannote/pyannote-audio
FunASR

2 3,023 9.9 Python

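A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models. A speech recognition toolkit with a rich set of high-performing open-source pretrained models; supports speech recognition, voice endpoint detection, text post-processing, and more, and can be deployed as a service.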

Project mention: FunASR: Fundamental End-to-End Speech Recognition Toolkit | news.ycombinator.com | 2024-01-13
silero-vad

10 2,780 6.5 Python

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Project mention: New models and developer products announced at OpenAI DevDay | news.ycombinator.com | 2023-11-06

>How do you detect speech starting and stopping?
https://github.com/snakers4/silero-vad
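
As a rough illustration of the question quoted above (detecting where speech starts and stops), here is a minimal energy-threshold sketch. It is not how silero-vad works (that project uses a pre-trained neural model); the function names, the 30 ms frame size, and the 0.1 threshold are all assumptions chosen for the example.

```python
import math


def frame_energies(samples, frame_len):
    """Return the RMS energy of each non-overlapping frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    return energies


def detect_speech(samples, sample_rate, frame_ms=30, threshold=0.1):
    """Return (start_sec, end_sec) spans whose frame RMS meets the threshold.

    Hypothetical helper: a naive energy gate, not a trained detector.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    energies = frame_energies(samples, frame_len)
    segments = []
    seg_start = None
    for i, energy in enumerate(energies):
        t = i * frame_len / sample_rate
        if energy >= threshold and seg_start is None:
            seg_start = t                      # speech begins
        elif energy < threshold and seg_start is not None:
            segments.append((seg_start, t))    # speech ends
            seg_start = None
    if seg_start is not None:                  # still active at end of audio
        segments.append((seg_start, len(energies) * frame_len / sample_rate))
    return segments
```

A gate like this is easily fooled by background noise, which is one reason the model-based projects on this list exist.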
voice_datasets

3 1,525 3.5

🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).
Python-ai-assistant

1 852 0.0 Python

Python AI assistant 🧠

Project mention: Jarvis: A Voice Virtual Assistant in Python (OpenAI, ElevenLabs, Deepgram) | news.ycombinator.com | 2023-12-18

There is another one (Also Jarvis) that's been around for a while and is more useful, wonder if they can combine forces? https://github.com/ggeop/Python-ai-assistant
Not sure if anyone has noticed but OpenAI now has a mobile app (I've been using the PWA all this time) and the voice assistant on there is really strong. Sounds good, fast, and seems to even run a pass on my voice before it submits the query.
inaSpeechSegmenter

3 692 6.4 Python

CNN-based audio segmentation toolkit. Detects speech, music, noise, and speaker gender. Designed for large-scale gender equality studies based on speech time per gender.

Project mention: Listen to HD radio with a $30 RTL SDR dongle | news.ycombinator.com | 2023-11-05

I have a little hobby project where I record an FM radio music station using a SDR and then remove all the non-music portions for offline listening. I like the music selections the DJs pick, but I prefer not to listen to the DJ commentary and the advertisements.
I evaluated three methods of recording: analog capture from a standalone FM receiver, using this nrsc5 library to record the "HD" radio stream, and using an AirSpy SDR with this library: https://github.com/jj1bdx/airspy-fmradion
Recording the "HD" (what a misnomer) radio was nice in that there was no hiss or multipath effects, but in comparison to the other methods the digital compression artifacts became impossible to un-hear. It seems to top out at about 96 kbps.
The airspy-fmradion library has some nice stuff in it to address multipath, resulting in the best audio quality of the three methods I tested.
I use https://github.com/ina-foss/inaSpeechSegmenter to identify which segments of the recordings are speech vs. music.
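
The filtering step described in that comment (keeping only the music portions) might look like the sketch below. It assumes the segmenter output has already been collected as a list of (label, start_sec, end_sec) tuples; the helper name `keep_label`, the `max_gap` parameter, and the sample data are hypothetical, and the actual cutting of audio is left out.

```python
def keep_label(segments, wanted="music", max_gap=0.5):
    """Return merged (start, end) spans for segments carrying the wanted label.

    Spans separated by at most max_gap seconds are joined, so a brief
    silence between two music segments does not split the recording.
    """
    kept = []
    for label, start, end in segments:
        if label != wanted:
            continue
        if kept and start - kept[-1][1] <= max_gap:
            kept[-1] = (kept[-1][0], end)   # extend the previous span
        else:
            kept.append((start, end))
    return kept


# Made-up example labels and times, for illustration only.
example = [
    ("music", 0.0, 10.0),
    ("speech", 10.0, 40.0),
    ("music", 40.0, 55.0),
    ("noEnergy", 55.0, 55.3),
    ("music", 55.3, 80.0),
]
music_spans = keep_label(example)
```

The resulting spans could then be handed to an audio tool to cut and concatenate the recording.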
subaligner

3 411 6.5 Python

Automatically synchronize and translate subtitles, or create new ones by transcribing, using pre-trained DNNs, Forced Alignments and Transformers. https://subaligner.readthedocs.io/
whisper-auto-transcribe

8 192 6.1 Python

Auto transcribe tool based on whisper

Project mention: Using Whisper to transcribe the entire Forensic Files series | /r/DataHoarder | 2023-06-04
android-vad

1 185 9.0 C

Android Voice Activity Detection (VAD) library. Supports WebRTC VAD GMM, Silero VAD DNN, Yamnet VAD DNN models.

Project mention: Android Voice Activity Detection | news.ycombinator.com | 2024-01-02

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2024-01-13.

voice-activity-detection related posts

Audio crackling woes on Pop_OS 22.04
1 project | /r/pop_os | 22 May 2023
Steam Deck's fan noises interfere with a built in mic
1 project | /r/SteamDeck | 14 May 2023
Mic problems in game (Apex legends)
1 project | /r/linux_gaming | 20 Feb 2023
FOSS open source version of adobe enhance - Enhance voice recordings
1 project | /r/selfhosted | 2 Feb 2023
Noise cancellation for linux
1 project | /r/linuxquestions | 24 Jan 2023
Noisetorch becoming glitchy
1 project | /r/linuxquestions | 4 Jan 2023
PSA, Discord GPU acceleration doesn't work correctly on Linux, here's how to properly enable it
2 projects | /r/linux_gaming | 21 Dec 2022

Index

What are some of the best open-source voice-activity-detection projects? This list will help you:

#	Project	Stars
1	NoiseTorch	8,948
2	ffsubsync	6,478
3	pyannote-audio	4,930
4	FunASR	3,023
5	silero-vad	2,780
6	voice_datasets	1,525
7	Python-ai-assistant	852
8	inaSpeechSegmenter	692
9	subaligner	411
10	whisper-auto-transcribe	192
11	android-vad	185