Python voice-activity-detection

Open-source Python projects categorized as voice-activity-detection

Top 8 Python voice-activity-detection Projects

  1. FunASR

    A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

    Project mention: Omni SenseVoice: High-Speed Speech Recognition with Words Timestamps | news.ycombinator.com | 2024-10-12

    Apparently not. See https://github.com/lifeiteng/OmniSenseVoice/blob/main/src/om.... See also FunASR running SenseVoice but using Kaldi for speaker identification https://github.com/modelscope/FunASR/blob/cd684580991661b9a0...

  2. ffsubsync

    Automagically synchronize subtitles with video.

    Project mention: Ten years after the last release, Aegisub 3.4.0 released | news.ycombinator.com | 2024-12-21

    Aegis is great for authoring new subtitles but if you're just looking to sync then take a look at https://github.com/smacke/ffsubsync

    Plex also recently added auto-sync subtitles to the Plex Pass

    https://support.plex.tv/articles/auto-sync-subtitles/

  3. silero-vad

    Silero VAD: pre-trained enterprise-grade Voice Activity Detector

    Project mention: AI Voice Agents: Opensource, Pre-Trained Voice Activity Detector | news.ycombinator.com | 2024-07-28

  4. diart

    A Python package to build AI-powered real-time audio applications

    Project mention: Ask HN: What is the current state of the art for transcribing with diarization? | news.ycombinator.com | 2024-10-18

    Why do you think the space is stalled? There are quite a few apps in that space. https://github.com/juanmc2005/diart

  5. Python-ai-assistant

    Python AI assistant 🧠

  6. inaSpeechSegmenter

    CNN-based audio segmentation toolkit. Detects speech, music, noise, and speaker gender. Designed for large-scale gender equality studies based on speech time per gender.

  7. subaligner

    Automatically synchronize and translate subtitles, or create new ones by transcribing, using pre-trained DNNs, Forced Alignments and Transformers. https://subaligner.readthedocs.io/

  8. whisper-auto-transcribe

    Auto transcribe tool based on whisper

    Project mention: Whisper-WebUI | news.ycombinator.com | 2024-08-21

    I've used https://github.com/tomchang25/whisper-auto-transcribe to generate subtitles and then translate them to English and it worked fairly well. It's not professional-level, but it was good enough to understand what they were saying and enjoy foreign TV.

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source voice-activity-detection projects in Python? This list will help you:

# Project Stars
1 FunASR 11,452
2 ffsubsync 7,239
3 silero-vad 6,295
4 diart 1,361
5 Python-ai-assistant 984
6 inaSpeechSegmenter 818
7 subaligner 479
8 whisper-auto-transcribe 225
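
The ordering rule stated in the note (projects sorted by GitHub star count, highest first) can be sketched in a few lines of Python. This is only an illustration, not how the site builds the list: the star counts are a snapshot copied from the index table, and `rank_by_stars` is a hypothetical helper name.

```python
# Star counts copied from the index table (a snapshot, not live data).
# The entries are deliberately listed out of order to show the sort.
projects = [
    ("silero-vad", 6295),
    ("whisper-auto-transcribe", 225),
    ("FunASR", 11452),
    ("subaligner", 479),
    ("diart", 1361),
    ("inaSpeechSegmenter", 818),
    ("ffsubsync", 7239),
    ("Python-ai-assistant", 984),
]


def rank_by_stars(entries):
    """Return (rank, name, stars) tuples, highest star count first.

    Hypothetical helper: it only mirrors the ordering rule in the note.
    """
    ordered = sorted(entries, key=lambda entry: entry[1], reverse=True)
    return [
        (rank, name, stars)
        for rank, (name, stars) in enumerate(ordered, start=1)
    ]


ranking = rank_by_stars(projects)
for rank, name, stars in ranking:
    # Print in the same "# Project Stars" layout as the index table.
    print(f"{rank} {name} {stars:,}")
```

Because the counts are a snapshot, the ranking can drift from the live repositories over time.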
