vosk-browser vs vosk-server

Project	vosk-browser	vosk-server
Mentions	3	4
Stars	330	843
Growth	-	1.8%
Activity	0.0	5.5
Latest Commit	4 months ago	29 days ago
Language	JavaScript	Python
License	Apache License 2.0	Apache License 2.0

Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

vosk-browser

Posts with mentions or reviews of vosk-browser. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-11-15.

Show HN: I record myself on audio 24x7 and use an AI to process the information
13 projects | news.ycombinator.com | 15 Nov 2022

Not the OP but I've been tinkering with the same concept (24/7 processing).
I'm using vosk-browser (https://github.com/ccoreilly/vosk-browser) to do speech to text locally, and it works very well for English.
Speech-to-Text Client-Side?
1 project | news.ycombinator.com | 19 Aug 2022
On-device browser translations with Firefox Translations
5 projects | news.ycombinator.com | 10 Jul 2022

I believe this is called the Bergamot project, more can be found here: https://browser.mt/
The GitHub repo for it is here: https://github.com/browsermt/bergamot-translator
The repo contains some details about how to run it in WASM which is quite interesting for embedding it in pages. I've been playing around with using WASM to capture speech to text (https://github.com/ccoreilly/vosk-browser) and automatically translating it using Bergamot.
Results have been OK. I don't think the tech is quite there yet, and the speech to text obviously struggles with multiple speakers.

vosk-server

Posts with mentions or reviews of vosk-server. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-08-04.

Self-hosted audio transcription?
3 projects | /r/selfhosted | 4 Aug 2022
Open Source ASR with user-specific custom vocabularies?
2 projects | /r/LanguageTechnology | 17 Jul 2021

Through my research, the most promising real-time transcription options appear to be Vosk or Kaldi Gstreamer. I’ve set them both up & they appear to work well for general transcription, but I’m not sure how to handle the user-specific custom vocabularies.
Voice2json: Offline speech and intent recognition on Linux
4 projects | news.ycombinator.com | 21 May 2021
Connecting vosk python model with react
1 project | /r/speechrecognition | 21 Apr 2021

What are some alternatives?

When comparing vosk-browser and vosk-server you can also consider the following projects:

cheetah - On-device streaming speech-to-text engine powered by deep learning

vosk-api - Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

common-voice - Common Voice is part of Mozilla's initiative to help teach machines how real people speak.

ovos-stt-plugin-vosk - vosk STT plugin for mycroft

kaldi-gstreamer-server - Real-time full-duplex speech recognition server, based on the Kaldi toolkit and the GStreamer framework.

react-native-vosk - Speech recognition module for react native using Vosk library

TTS - 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

haven - Haven is for people who need a way to protect their personal spaces and possessions without compromising their own privacy, through an Android app and on-device sensors

julius - Open-Source Large Vocabulary Continuous Speech Recognition Engine

whisper - Robust Speech Recognition via Large-Scale Weak Supervision

vosk-android-demo - Offline speech recognition for Android with Vosk library.
