Top 8 speaker-verification Open-Source Projects

Kaldi Speech Recognition Toolkit

22 13,706 7.4 Shell

kaldi-asr/kaldi is the official location of the Kaldi project.

Project mention: Amazon plans to charge for Alexa in June–unless internal conflict delays revamp | news.ycombinator.com | 2024-01-20

Yeah, whisper is the closest thing we have, but even it requires more processing power than is present in most of these edge devices in order to feel smooth. I've started a voice interface project on a Raspberry Pi 4, and it takes about 3 seconds to produce a result. That's impressive, but not fast enough for Alexa.
From what I gather a Pi 5 can do it in 1.5 seconds, which is closer, so I suspect it's only a matter of time before we do have fully local STT running directly on speakers.
> Probably anathema to the space, but if the devices leaned into the ~five tasks people use them for (timers, weather, todo list?) could probably tighten up the AI models to be more accurate and/or resource efficient.
Yes, this is the approach taken by a lot of streaming STT systems, like Kaldi [0]. Rather than use a fully capable model, you train a specialized one that knows what kinds of things people are likely to say to it.
[0] http://kaldi-asr.org/

speechbrain

26 7,869 9.8 Python

A PyTorch-based Speech Toolkit

Project mention: SpeechBrain 1.0: A free and open-source AI toolkit for all things speech | news.ycombinator.com | 2024-02-28

vosk-api

59 7,025 5.9 Jupyter Notebook

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

Project mention: VOSK Offline Speech Recognition API | news.ycombinator.com | 2024-04-13

pyannote-audio

15 5,027 8.6 Jupyter Notebook

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Project mention: Open Source Libraries | /r/AudioAI | 2023-10-02

pyannote/pyannote-audio

awesome-speech-recognition-speech-synthesis-papers

0 2,870 3.5

Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)

SincNet

3 1,097 0.0 Python

SincNet is a neural architecture for efficiently processing raw audio samples.

ECAPA-TDNN

1 525 1.0 Python

Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER = 0.86 on Vox1_O when trained only on Vox2)
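
The EER (equal error rate) quoted in the ECAPA-TDNN description is the standard speaker-verification metric: the operating point where the false-acceptance rate equals the false-rejection rate, conventionally reported as a percentage. As a rough illustration only, and not the code used by that repository, a minimal sketch of the calculation could look like this. The function name `equal_error_rate` and the toy scores are assumptions made for this example.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over every observed score.

    target_scores: similarity scores for same-speaker trials.
    nontarget_scores: similarity scores for different-speaker trials.
    Returns a fraction in [0, 1]; multiply by 100 for a percentage.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # False rejection: same-speaker trials scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance: different-speaker trials scored at or above it.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the threshold where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy scores (hypothetical): one overlap between the two distributions.
same_speaker = [0.9, 0.8, 0.7, 0.4]
different_speaker = [0.1, 0.2, 0.3, 0.6]
print(equal_error_rate(same_speaker, different_speaker))
```

Production evaluations usually interpolate between thresholds on a full ROC or DET curve rather than picking the nearest observed score, so this sketch is only an approximation on small trial lists.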
UniSpeech

1 386 4.5 Python

UniSpeech - Large Scale Self-Supervised Learning for Speech

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

speaker-verification related posts

How to get high-quality, low-cost Speech-to-Text transcription?
3 projects | /r/AskProgramming | 24 Jul 2022
5 Best Open Source Libraries and APIs for Speaker Diarization
2 projects | dev.to | 10 Feb 2022
Nerd-dictation, hackable speech to text on Linux
10 projects | news.ycombinator.com | 17 Jan 2022
[D] ASR/Automatic Speech Recognition toolkit that provides precise word-level timing data? (eg, where in the audio stream a word starts and ends?)
2 projects | /r/MachineLearning | 23 Aug 2021

Index

What are some of the best open-source speaker-verification projects? This list will help you:

#	Project	Stars
1	Kaldi Speech Recognition Toolkit	13,706
2	speechbrain	7,869
3	vosk-api	7,025
4	pyannote-audio	5,027
5	awesome-speech-recognition-speech-synthesis-papers	2,870
6	SincNet	1,097
7	ECAPA-TDNN	525
8	UniSpeech	386