Top 14 Kaldi Open-Source Projects

Kaldi Speech Recognition Toolkit

22 13,788 6.7 Shell

kaldi-asr/kaldi is the official location of the Kaldi project.

Project mention: Amazon plans to charge for Alexa in June–unless internal conflict delays revamp | news.ycombinator.com | 2024-01-20

Yeah, whisper is the closest thing we have, but even it requires more processing power than is present in most of these edge devices in order to feel smooth. I've started a voice interface project on a Raspberry Pi 4, and it takes about 3 seconds to produce a result. That's impressive, but not fast enough for Alexa.
From what I gather a Pi 5 can do it in 1.5 seconds, which is closer, so I suspect it's only a matter of time before we do have fully local STT running directly on speakers.
> Probably anathema to the space, but if the devices leaned into the ~five tasks people use them for (timers, weather, todo list?) could probably tighten up the AI models to be more accurate and/or resource efficient.
Yes, this is the approach taken by a lot of streaming STT systems, like Kaldi [0]. Rather than use a fully capable model, you train a specialized one that knows what kinds of things people are likely to say to it.
[0] http://kaldi-asr.org/
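
The idea in the comment above, limiting a recognizer to the handful of phrases people actually say, can be illustrated with a toy sketch. This is not how Kaldi works internally: real systems constrain the decoder itself with a grammar or language model, whereas this example only snaps an already transcribed string to the nearest entry in a small command list. The command list, the function name `snap_to_command`, and the 0.6 similarity cutoff are all assumptions made up for illustration.

```python
import difflib

# Hypothetical command set for a voice assistant (illustrative only,
# not taken from Kaldi or any real product).
COMMANDS = [
    "set a timer",
    "what is the weather",
    "add to my todo list",
    "stop",
]


def snap_to_command(transcript, commands=COMMANDS, cutoff=0.6):
    """Return the known command closest to a noisy transcript, or None.

    Similarity is difflib's SequenceMatcher ratio; anything scoring
    below `cutoff` is treated as out of vocabulary.
    """
    matches = difflib.get_close_matches(
        transcript.lower().strip(), commands, n=1, cutoff=cutoff
    )
    return matches[0] if matches else None


if __name__ == "__main__":
    for heard in ["set a time", "whats the weather", "play some jazz music"]:
        print(repr(heard), "->", snap_to_command(heard))
```

Because the output space is only a few phrases, small transcription errors are absorbed and unrelated requests are rejected, which is the accuracy and efficiency trade-off the comment describes.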

espnet

15 7,932 10.0 Python

End-to-End Speech Processing Toolkit

Project mention: WhisperSpeech – An Open Source text-to-speech system built by inverting Whisper | news.ycombinator.com | 2024-01-17

You might check out this list from espnet. They list the different corpuses they use to train their models sorted by language and task (ASR, TTS etc):
https://github.com/espnet/espnet/blob/master/egs2/README.md

vosk-api

61 7,149 6.6 Jupyter Notebook

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

Project mention: Infini-Gram: Scaling unbounded n-gram language models to a trillion tokens | news.ycombinator.com | 2024-05-05

Dragonfire

2 1,382 0.0 Python

the open-source virtual assistant for Ubuntu based Linux distributions
pykaldi

2 979 5.4 Python

A Python wrapper for Kaldi
lhotse

1 869 9.0 Python

Tools for handling speech data in machine learning projects.

Project mention: Does anyone else find lhotse a pain to use | /r/speechtech | 2023-06-14

vosk-server

4 848 5.0 Python

WebSocket, gRPC and WebRTC speech recognition server based on Vosk and Kaldi libraries
vosk-android-demo

7 685 1.2 Java

Offline speech recognition for Android with Vosk library.
react-transcript-editor

1 536 0.0 JavaScript

A React component to make correcting automated transcriptions of audio and video easier and faster. By BBC News Labs. - Work in progress
kaldi-active-grammar

10 329 0.0 Python

Python Kaldi speech recognition with grammars that can be set active/inactive dynamically at decode-time

Project mention: Ask HN: How do you get started with adding voice commands to a computer system? | news.ycombinator.com | 2023-11-21

https://github.com/dictation-toolbox/dragonfly
https://github.com/daanzu/kaldi-active-grammar

vosk-browser

3 336 0.0 JavaScript

A speech recognition library running in the browser thanks to a WebAssembly build of Vosk
docker-kaldi-gstreamer-server

1 288 0.0 Dockerfile

Dockerfile for kaldi-gstreamer-server.
vosk-build-model

1 61 0.0 Shell

How to create your own model for vosk
ovos-stt-plugin-vosk

1 14 2.9 Python

vosk STT plugin for mycroft

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Kaldi related posts

Amazon plans to charge for Alexa in June–unless internal conflict delays revamp

1 project | news.ycombinator.com | 20 Jan 2024
Unsupervised (Semi-Supervised) ASR/STT training recipes

2 projects | /r/deeplearning | 3 Nov 2023
Steve's Explanation of the Viterbi Algorithm

1 project | news.ycombinator.com | 16 Oct 2023
add a TTS (text-to-speach) and ASR (automatic-speech-recognition) capabilities to obscure language?

2 projects | /r/androiddev | 14 Feb 2023
C++ for machine learning

2 projects | /r/cscareerquestions | 7 Jan 2023
Íslensk talgervilsrödd sem hægt er að nota á Macca (Icelandic text-to-speech voice that can be used on a Mac)

1 project | /r/Iceland | 16 Dec 2022
The Advantages and disadvantages of In-House Speech Acknowledgment

1 project | /r/datatangblogbotshare | 12 Dec 2022

Index

What are some of the best open-source Kaldi projects? This list will help you:

	Project	Stars
1	Kaldi Speech Recognition Toolkit	13,788
2	espnet	7,932
3	vosk-api	7,149
4	Dragonfire	1,382
5	pykaldi	979
6	lhotse	869
7	vosk-server	848
8	vosk-android-demo	685
9	react-transcript-editor	536
10	kaldi-active-grammar	329
11	vosk-browser	336
12	docker-kaldi-gstreamer-server	288
13	vosk-build-model	61
14	ovos-stt-plugin-vosk	14
