Top 8 speaker-verification Open-Source Projects
-
InfluxDB
Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.
-
vosk-api
Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
-
pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
-
awesome-speech-recognition-speech-synthesis-papers
Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)
-
ECAPA-TDNN
Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 for Vox1_O when train only in Vox2)
-
WorkOS
The modern identity platform for B2B SaaS. The APIs are flexible and easy-to-use, supporting authentication, user identity, and complex enterprise features like SSO and SCIM provisioning.
Project mention: Amazon plans to charge for Alexa in June–unless internal conflict delays revamp | news.ycombinator.com | 2024-01-20Yeah, whisper is the closest thing we have, but even it requires more processing power than is present in most of these edge devices in order to feel smooth. I've started a voice interface project on a Raspberry Pi 4, and it takes about 3 seconds to produce a result. That's impressive, but not fast enough for Alexa.
From what I gather a Pi 5 can do it in 1.5 seconds, which is closer, so I suspect it's only a matter of time before we do have fully local STT running directly on speakers.
> Probably anathema to the space, but if the devices leaned into the ~five tasks people use them for (timers, weather, todo list?) could probably tighten up the AI models to be more accurate and/or resource efficient.
Yes, this is the approach taken by a lot of streaming STT systems, like Kaldi [0]. Rather than use a fully capable model, you train a specialized one that knows what kinds of things people are likely to say to it.
[0] http://kaldi-asr.org/
Project mention: SpeechBrain 1.0: A free and open-source AI toolkit for all things speech | news.ycombinator.com | 2024-02-28
pyannote/pyannote-audio
speaker-verification related posts
- How to get high-quality, low-cost Speech-to-Text transcription?
- 5 Best Open Source Libraries and APIs for Speaker Diarization
- Nerd-dictation, hackable speech to text on Linux
- [D] ASR/Automatic Speech Recognition toolkit that provides precise word-level timing data? (eg, where in the audio stream a word starts and ends?)
Index
What are some of the best open-source speaker-verification projects? This list will help you:
Project | Stars | |
---|---|---|
1 | Kaldi Speech Recognition Toolkit | 13,706 |
2 | speechbrain | 7,869 |
3 | vosk-api | 7,025 |
4 | pyannote-audio | 5,027 |
5 | awesome-speech-recognition-speech-synthesis-papers | 2,870 |
6 | SincNet | 1,097 |
7 | ECAPA-TDNN | 525 |
8 | UniSpeech | 386 |
Sponsored