Python speaker-diarization

Open-source Python projects categorized as speaker-diarization

Top 7 Python speaker-diarization Projects

  • espnet

    End-to-End Speech Processing Toolkit

  • Project mention: WhisperSpeech – An Open Source text-to-speech system built by inverting Whisper | news.ycombinator.com | 2024-01-17

    You might check out this list from espnet. They list the different corpuses they use to train their models sorted by language and task (ASR, TTS etc):

    https://github.com/espnet/espnet/blob/master/egs2/README.md

  • speechbrain

    A PyTorch-based Speech Toolkit

  • Project mention: SpeechBrain 1.0: A free and open-source AI toolkit for all things speech | news.ycombinator.com | 2024-02-28
  • FunASR

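    A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models. A speech recognition toolkit with a rich set of high-performing open-source pretrained models, supporting speech recognition, voice activity detection, text post-processing, and more, with service deployment capability.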

  • Project mention: FunASR: Fundamental End-to-End Speech Recognition Toolkit | news.ycombinator.com | 2024-01-13
  • uis-rnn

    This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.

  • whisper-timestamped

    Multilingual Automatic Speech Recognition with word-level timestamps and confidence

  • Project mention: Show HN: AI Dub Tool I Made to Watch Foreign Language Videos with My 7-Year-Old | news.ycombinator.com | 2024-02-28

    Yes. But Whisper's word-level timings are actually quite inaccurate out of the box. There are some Python libraries that mitigate that. I tested several of them. whisper-timestamped seems to be the best one. [0]

    [0] https://github.com/linto-ai/whisper-timestamped

  • diart

    A Python package to build AI-powered real-time audio applications

  • pyannote-whisper

  • Project mention: Summarization of long transcriptions | /r/LocalLLaMA | 2023-07-18

    These will be 3-5 hour recordings of 4-5 people. I plan to use https://github.com/yinruiqing/pyannote-whisper to generate the transcript from the recording.

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source speaker-diarization projects in Python? This list will help you:

#  Project              Stars
1  espnet               7,872
2  speechbrain          7,869
3  FunASR               3,299
4  uis-rnn              1,529
5  whisper-timestamped  1,501
6  diart                  789
7  pyannote-whisper       414
