kaldi-gstreamer-server vs vosk-server

kaldi-gstreamer-server

Real-time full-duplex speech recognition server, based on the Kaldi toolkit and the GStreamer framework. (by alumae)

speech-recognition

vosk-server

WebSocket, gRPC and WebRTC speech recognition server based on Vosk and Kaldi libraries (by alphacep)

WebSocket speech-recognition Kaldi Python Asr Grpc SaaS WebRTC vosk

Project	kaldi-gstreamer-server	vosk-server
Mentions	4	4
Stars	1,054	837
Growth	-	3.0%
Activity	0.0	5.5
Latest Commit	over 3 years ago	19 days ago
Language	Python	Python
License	BSD 2-clause "Simplified" License	Apache License 2.0

The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

kaldi-gstreamer-server

Posts with mentions or reviews of kaldi-gstreamer-server. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2021-08-29.

Real-time full-duplex speech recognition server, based on Kaldi and GStreamer
1 project | news.ycombinator.com | 1 Dec 2022
Ask HN: What problem are you close to solving and how can we help?
17 projects | news.ycombinator.com | 29 Aug 2021
Open Source ASR with user-specific custom vocabularies?
2 projects | /r/LanguageTechnology | 17 Jul 2021

Through my research, the most promising real-time transcription options appear to be Vosk or Kaldi Gstreamer. I’ve set them both up & they appear to work well for general transcription, but I’m not sure how to handle the user-specific custom vocabularies.
Speech to text software
2 projects | /r/opensource | 18 Mar 2021

It is kind of difficult to find something like this free of charge (and open source), since the ASR service needs to be hosted somewhere. If you are really interested in the topic, you could take a look at Kaldi and its pretrained models (but Kaldi is kind of difficult to learn, so I don't really recommend it if you want something quick), and then combine that with kaldi-gstreamer to set up a server that you can turn on and off whenever you like.

vosk-server

Posts with mentions or reviews of vosk-server. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-08-04.

Self-hosted audio transcription?
3 projects | /r/selfhosted | 4 Aug 2022
Open Source ASR with user-specific custom vocabularies?
2 projects | /r/LanguageTechnology | 17 Jul 2021

Through my research, the most promising real-time transcription options appear to be Vosk or Kaldi Gstreamer. I’ve set them both up & they appear to work well for general transcription, but I’m not sure how to handle the user-specific custom vocabularies.
Voice2json: Offline speech and intent recognition on Linux
4 projects | news.ycombinator.com | 21 May 2021
Connecting vosk python model with react
1 project | /r/speechrecognition | 21 Apr 2021

What are some alternatives?

When comparing kaldi-gstreamer-server and vosk-server you can also consider the following projects:

espnet - End-to-End Speech Processing Toolkit

vosk-api - Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

Kaldi Speech Recognition Toolkit - kaldi-asr/kaldi is the official location of the Kaldi project.

common-voice - Common Voice is part of Mozilla's initiative to help teach machines how real people speak.

rhasspy - Offline private voice assistant for many human languages

TTS - 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

bert-for-inference - A small repo showing how to easily use BERT (or other transformers) for inference

julius - Open-Source Large Vocabulary Continuous Speech Recognition Engine

ChessPositionRanking - Software suite for ranking chess positions and accurately estimating the number of legal chess positions

vosk-android-demo - Offline speech recognition for Android with Vosk library.

mtpng - A parallelized PNG encoder in Rust

ovos-stt-plugin-vosk - vosk STT plugin for mycroft

