vosk-server vs kaldi-gstreamer-server

Project	vosk-server	kaldi-gstreamer-server
Mentions	4	4
Stars	837	1,054
Growth	1.1%	-
Activity	5.5	0.0
Latest Commit	23 days ago	over 3 years ago
Language	Python	Python
License	Apache License 2.0	BSD 2-clause "Simplified" License

The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
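
As a rough illustration of the Growth figure, the sketch below computes month-over-month growth as a plain percentage change. The site does not state its exact formula, so this is an assumption, and the earlier star count of 828 is hypothetical, chosen only because it is consistent with 837 stars and 1.1% growth.

```python
def month_over_month_growth(previous: int, current: int) -> float:
    """Return the percentage change from `previous` to `current`.

    Assumption: growth is a simple percentage change; the site's
    actual formula is not documented in this page.
    """
    if previous <= 0:
        raise ValueError("previous must be positive")
    return (current - previous) / previous * 100.0


# Hypothetical star count one month earlier (not taken from the page).
stars_last_month = 828
stars_now = 837

growth = month_over_month_growth(stars_last_month, stars_now)
print(f"{growth:.1f}%")  # prints "1.1%"
```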

vosk-server

Posts with mentions or reviews of vosk-server. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-08-04.

Self-hosted audio transcription?
3 projects | /r/selfhosted | 4 Aug 2022
Open Source ASR with user-specific custom vocabularies?
2 projects | /r/LanguageTechnology | 17 Jul 2021

Through my research, the most promising real-time transcription options appear to be Vosk or Kaldi Gstreamer. I’ve set them both up & they appear to work well for general transcription, but I’m not sure how to handle the user-specific custom vocabularies.
Voice2json: Offline speech and intent recognition on Linux
4 projects | news.ycombinator.com | 21 May 2021
Connecting vosk python model with react
1 project | /r/speechrecognition | 21 Apr 2021

kaldi-gstreamer-server

Posts with mentions or reviews of kaldi-gstreamer-server. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2021-08-29.

Real-time full-duplex speech recognition server, based on Kaldi and GStreamer
1 project | news.ycombinator.com | 1 Dec 2022
Ask HN: What problem are you close to solving and how can we help?
17 projects | news.ycombinator.com | 29 Aug 2021
Open Source ASR with user-specific custom vocabularies?
2 projects | /r/LanguageTechnology | 17 Jul 2021

Through my research, the most promising real-time transcription options appear to be Vosk or Kaldi Gstreamer. I’ve set them both up & they appear to work well for general transcription, but I’m not sure how to handle the user-specific custom vocabularies.
Speech to text software
2 projects | /r/opensource | 18 Mar 2021

It is kind of difficult to find something like this free of charge (and open source), since the ASR service needs to be hosted somewhere. If you are really interested in the topic, you could take a look at kaldi and its pretrained models (but kaldi is kind of difficult to learn, so I don't really recommend it if you want something quick), and you could also combine that with kaldi-gstreamer to set up a server which you can turn on and off whenever you like.

What are some alternatives?

When comparing vosk-server and kaldi-gstreamer-server, you can also consider the following projects:

vosk-api - Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

espnet - End-to-End Speech Processing Toolkit

common-voice - Common Voice is part of Mozilla's initiative to help teach machines how real people speak.

Kaldi Speech Recognition Toolkit - kaldi-asr/kaldi is the official location of the Kaldi project.

TTS - 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

rhasspy - Offline private voice assistant for many human languages

julius - Open-Source Large Vocabulary Continuous Speech Recognition Engine

bert-for-inference - A small repo showing how to easily use BERT (or other transformers) for inference

vosk-android-demo - Offline speech recognition for Android with Vosk library.

ChessPositionRanking - Software suite for ranking chess positions and accurately estimating the number of legal chess positions

ovos-stt-plugin-vosk - vosk STT plugin for mycroft

mtpng - A parallelized PNG encoder in Rust

