voice-activity-detection

Open-source projects categorized as voice-activity-detection

Top 11 voice-activity-detection Open-Source Projects

  • NoiseTorch

    Real-time microphone noise suppression on Linux.

    Project mention: Ask HN: What are some unpopular technologies you wish people knew more about? | news.ycombinator.com | 2023-12-02

    Noisetorch. https://github.com/noisetorch/NoiseTorch

  • ffsubsync

    Automagically synchronize subtitles with video.

    Project mention: The GitHub Black Market That Helps Coders Cheat the Popularity Contest | news.ycombinator.com | 2023-10-23

    > Another giveaway is the ratio of stars to watchers / forks. I remember one project with thousands of stars but only 10 users "watching" it. They went on to raise a sizable seed round too.

    Not necessarily indicative of foul play. I have two projects like this (https://github.com/smacke/ffsubsync and https://github.com/ipyflow/ipyflow) and I attribute it to not having great developer documentation.

  • pyannote-audio

    Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

    Project mention: Open Source Libraries | /r/AudioAI | 2023-10-02

    pyannote/pyannote-audio

  • FunASR

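    A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models. A speech recognition toolkit with a rich set of high-performing open-source pretrained models; supports speech recognition, voice endpoint detection, text post-processing, and more, with service deployment capability.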

    Project mention: FunASR: Fundamental End-to-End Speech Recognition Toolkit | news.ycombinator.com | 2024-01-13

  • silero-vad

    Silero VAD: pre-trained enterprise-grade Voice Activity Detector

    Project mention: New models and developer products announced at OpenAI DevDay | news.ycombinator.com | 2023-11-06

    >How do you detect speech starting and stopping?

    https://github.com/snakers4/silero-vad

  • voice_datasets

    🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).

  • Python-ai-assistant

    Python AI assistant 🧠

    Project mention: Jarvis: A Voice Virtual Assistant in Python (OpenAI, ElevenLabs, Deepgram) | news.ycombinator.com | 2023-12-18

    There is another one (Also Jarvis) that's been around for a while and is more useful, wonder if they can combine forces? https://github.com/ggeop/Python-ai-assistant

    Not sure if anyone has noticed but OpenAI now has a mobile app (I've been using the PWA all this time) and the voice assistant on there is really strong. Sounds good, fast, and seems to even run a pass on my voice before it submits the query.

  • inaSpeechSegmenter

    CNN-based audio segmentation toolkit. Detects speech, music, noise and speaker gender. Designed for large-scale gender equality studies based on speech time per gender.

    Project mention: Listen to HD radio with a $30 RTL SDR dongle | news.ycombinator.com | 2023-11-05

    I have a little hobby project where I record an FM radio music station using an SDR and then remove all the non-music portions for offline listening. I like the music selections the DJs pick, but I prefer not to listen to the DJ commentary and the advertisements.

    I evaluated three methods of recording: analog capture from a standalone FM receiver, using this nrsc5 library to record the "HD" radio stream, and using an AirSpy SDR with this library: https://github.com/jj1bdx/airspy-fmradion

    Recording the "HD" (what a misnomer) radio was nice in that there was no hiss or multipath effects, but in comparison to the other methods the digital compression artifacts became impossible to un-hear. It seems to top out at about 96 kbps.

    The airspy-fmradion library has some nice stuff in it to address multipath, resulting in the best audio quality of the three methods I tested.

    I use https://github.com/ina-foss/inaSpeechSegmenter to identify which segments of the recordings are speech vs. music.
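The filtering step described in this comment (keep the music, drop speech and ads) could be sketched roughly as follows. This is not the commenter's code. It assumes the segmenter's output has already been converted to a list of `(label, start_sec, end_sec)` tuples; the label strings, the helper name `music_spans`, and the `max_gap` parameter are all hypothetical, chosen for illustration.

```python
from typing import List, Tuple

# Assumed segment shape: (label, start_sec, end_sec). Labels are illustrative.
Segment = Tuple[str, float, float]


def music_spans(segments: List[Segment], max_gap: float = 0.5) -> List[Tuple[float, float]]:
    """Return (start, end) spans labeled "music", merging spans that are
    separated by at most `max_gap` seconds of anything else."""
    spans: List[Tuple[float, float]] = []
    for label, start, end in sorted(segments, key=lambda s: s[1]):
        if label != "music":
            continue
        if spans and start - spans[-1][1] <= max_gap:
            # Close enough to the previous music span: extend it.
            spans[-1] = (spans[-1][0], max(spans[-1][1], end))
        else:
            spans.append((start, end))
    return spans


# Hypothetical segmentation of a ten-minute recording.
example = [
    ("music", 0.0, 180.0),
    ("male", 180.0, 240.0),    # DJ commentary, dropped
    ("music", 240.0, 400.0),
    ("noise", 400.0, 400.3),   # short glitch, bridged by max_gap
    ("music", 400.3, 600.0),
]
print(music_spans(example))
```

The resulting spans could then be handed to any audio cutting tool. Merging across short gaps avoids chopping a song in two when a brief stretch is labeled as something other than music.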

  • subaligner

    Automatically synchronize and translate subtitles, or create new ones by transcribing, using pre-trained DNNs, Forced Alignments and Transformers. https://subaligner.readthedocs.io/

  • whisper-auto-transcribe

    Auto transcribe tool based on whisper

    Project mention: Using Whisper to transcribe the entire Forensic Files series | /r/DataHoarder | 2023-06-04

  • android-vad

    Android Voice Activity Detection (VAD) library. Supports WebRTC VAD GMM, Silero VAD DNN, Yamnet VAD DNN models.

    Project mention: Android Voice Activity Detection | news.ycombinator.com | 2024-01-02

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2024-01-13.

Index

What are some of the best open-source voice-activity-detection projects? This list will help you:

#   Project                  Stars
1   NoiseTorch               8,948
2   ffsubsync                6,478
3   pyannote-audio           4,930
4   FunASR                   3,023
5   silero-vad               2,780
6   voice_datasets           1,525
7   Python-ai-assistant        852
8   inaSpeechSegmenter         692
9   subaligner                 411
10  whisper-auto-transcribe    192
11  android-vad                185