vosk-browser
ovos-stt-plugin-vosk
Metric | vosk-browser | ovos-stt-plugin-vosk
---|---|---
Mentions | 3 | 1
Stars | 326 | 14
Growth | - | -
Activity | 0.0 | 2.9
Last commit | 4 months ago | 4 months ago
Language | JavaScript | Python
License | Apache License 2.0 | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
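The exact activity formula is not given here. As a rough illustration of how recent commits can be weighted more heavily than older ones, the sketch below uses exponential decay. The function name, the 30-day half-life, and the decay form are all assumptions made for illustration, not the tracker's actual method.

```python
from datetime import date


def recency_weighted_activity(commit_dates, today, half_life_days=30.0):
    """Sum per-commit weights that halve every `half_life_days`.

    Hypothetical scoring rule: a commit made today counts as 1.0, a commit
    made one half-life ago counts as 0.5, and so on.
    """
    score = 0.0
    for commit_date in commit_dates:
        age_days = (today - commit_date).days
        if age_days < 0:
            continue  # skip commits dated after `today`
        score += 0.5 ** (age_days / half_life_days)
    return score


today = date(2024, 1, 31)
recent = [date(2024, 1, 31), date(2024, 1, 1)]   # two commits in the last month
stale = [date(2023, 1, 31), date(2023, 1, 1)]    # two commits about a year ago

# Same number of commits, but the recent project scores higher.
print(recency_weighted_activity(recent, today))
print(recency_weighted_activity(stale, today))
```

Under this rule, two projects with the same commit count can receive very different scores, which is the behavior the description above points to.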
vosk-browser
- Show HN: I record myself on audio 24x7 and use an AI to process the information
Not the OP but I've been tinkering with the same concept (24/7 processing).
I'm using vosk-browser (https://github.com/ccoreilly/vosk-browser) to do speech-to-text locally, and it works very well for English.
- Speech-to-Text Client-Side?
- On-device browser translations with Firefox Translations
I believe this is called the Bergamot project, more can be found here: https://browser.mt/
The GitHub repo for it is here: https://github.com/browsermt/bergamot-translator
The repo contains some details about how to run it in WASM which is quite interesting for embedding it in pages. I've been playing around with using WASM to capture speech to text (https://github.com/ccoreilly/vosk-browser) and automatically translating it using Bergamot.
Results have been OK. I don't think the tech is quite there yet, and the speech-to-text obviously struggles with multiple speakers.
ovos-stt-plugin-vosk
- Slow responses from Picroft
For STT, streaming support should improve things. Google Cloud is supported in mycroft-core, and there are also plugins that support streaming, such as Vosk.
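To illustrate why streaming can reduce response time, here is a minimal sketch of a streaming loop: audio is fed to the recognizer chunk by chunk, so partial text is available before the speaker finishes. It relies only on the method names exposed by `vosk.KaldiRecognizer` (`AcceptWaveform`, `PartialResult`, `Result`, `FinalResult`). The `stream_transcribe` helper and the `FakeRecognizer` stand-in are hypothetical, added so the example runs without a Vosk model, and this is not a description of how the plugin itself is implemented.

```python
import json


def stream_transcribe(recognizer, chunks):
    """Yield ("partial" | "final", text) tuples as audio chunks arrive.

    `recognizer` is any object with AcceptWaveform, PartialResult, Result
    and FinalResult methods, as on vosk.KaldiRecognizer.
    """
    for chunk in chunks:
        if recognizer.AcceptWaveform(chunk):
            # End of an utterance detected: emit the finalized text.
            text = json.loads(recognizer.Result()).get("text", "")
            if text:
                yield ("final", text)
        else:
            # Still mid-utterance: emit the running hypothesis.
            partial = json.loads(recognizer.PartialResult()).get("partial", "")
            if partial:
                yield ("partial", partial)
    # Flush whatever is still buffered once the stream ends.
    text = json.loads(recognizer.FinalResult()).get("text", "")
    if text:
        yield ("final", text)


class FakeRecognizer:
    """Hypothetical stand-in: each chunk is one word, b"" ends an utterance."""

    def __init__(self):
        self._words = []

    def AcceptWaveform(self, chunk):
        if chunk == b"":
            return True
        self._words.append(chunk.decode("utf-8"))
        return False

    def PartialResult(self):
        return json.dumps({"partial": " ".join(self._words)})

    def Result(self):
        text, self._words = " ".join(self._words), []
        return json.dumps({"text": text})

    FinalResult = Result


for kind, text in stream_transcribe(FakeRecognizer(), [b"turn", b"on", b"", b"lights"]):
    print(kind, text)
```

With the real library, the same loop would presumably receive a `KaldiRecognizer` built from a downloaded model and 16-bit PCM chunks read from a microphone or file; the model location and sample rate depend on the setup.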
What are some alternatives?
cheetah - On-device streaming speech-to-text engine powered by deep learning
vosk-server - WebSocket, gRPC and WebRTC speech recognition server based on Vosk and Kaldi libraries
vosk-api - Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
wenet - Production First and Production Ready End-to-End Speech Recognition Toolkit
pykaldi - A Python wrapper for Kaldi
react-native-vosk - Speech recognition module for react native using Vosk library
werpy - 🐍📦 Rapidly calculate and analyze the Word Error Rate (WER) with this powerful yet lightweight Python package.
haven - Haven is for people who need a way to protect their personal spaces and possessions without compromising their own privacy, through an Android app and on-device sensors
mock-backend - A Flask personal backend alternative for running your own version of https://home.mycroft.ai
whisper - Robust Speech Recognition via Large-Scale Weak Supervision
elograf - Utility for launching and configuring nerd-dictation