eardrum vs vosk-browser
Metric | eardrum | vosk-browser
---|---|---
Mentions | 2 | 3
Stars | 12 | 322
Growth | - | -
Activity | 0.0 | 0.0
Last commit | about 4 years ago | 3 months ago
Language | Kotlin | JavaScript
License | The Unlicense | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
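The exact formula behind the activity number is not given here, so the following is only a minimal sketch of the idea, assuming an exponential recency weight with a hypothetical 30-day half-life and a percentile rank scaled to 0-10. The names `activity_weight` and `activity_score` are illustrative and not part of any real API.

```python
from bisect import bisect_left

HALF_LIFE_DAYS = 30.0  # assumption: a commit's weight halves every 30 days


def activity_weight(commit_ages_days):
    """Sum of recency weights: newer commits contribute more than older ones."""
    return sum(0.5 ** (age / HALF_LIFE_DAYS) for age in commit_ages_days)


def activity_score(project_weight, all_weights):
    """Scale a project's percentile rank among tracked projects to 0-10."""
    ranked = sorted(all_weights)
    below = bisect_left(ranked, project_weight)  # projects strictly less active
    return round(10.0 * below / len(ranked), 1)
```

Under these assumptions, a score of 9.0 means 90% of the tracked projects have a strictly lower weight, which corresponds to the "top 10%" reading above.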
eardrum
- Show HN: I record myself on audio 24x7 and use an AI to process the information
Here's a 24/7 background audio recorder app I made for Android. The impact on battery and storage is surprisingly reasonable.
https://github.com/miguelrochefort/eardrum
- Timestamped Audio Capture
vosk-browser
- Show HN: I record myself on audio 24x7 and use an AI to process the information
Not the OP, but I've been tinkering with the same concept (24/7 processing).
I'm using vosk-browser (https://github.com/ccoreilly/vosk-browser) to do speech-to-text locally, and it works very well for English.
- Speech-to-Text Client-Side?
- On-device browser translations with Firefox Translations
I believe this is called the Bergamot project; more can be found here: https://browser.mt/
The GitHub repo for it is here: https://github.com/browsermt/bergamot-translator
The repo contains some details about how to run it in WASM, which is quite interesting for embedding it in pages. I've been playing around with using WASM to capture speech to text (https://github.com/ccoreilly/vosk-browser) and automatically translating it using Bergamot.
Results have been OK. I don't think the tech is quite there yet, and the speech-to-text obviously struggles with multiple speakers.
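
The pipeline described in this comment (local speech-to-text feeding a translator) can be sketched roughly as follows. This is an illustration in Python, not the commenter's browser code: `recognize` and `translate` are hypothetical stand-ins for whatever engines are plugged in (for example vosk-browser and Bergamot), and none of the names below come from either project's API.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Tuple


@dataclass
class Segment:
    start_s: float     # offset of the audio chunk from the start of capture
    text: str          # recognized source-language text
    translation: str   # translated text


def transcribe_and_translate(
    chunks: Iterable[Tuple[float, bytes]],
    recognize: Callable[[bytes], str],  # hypothetical speech-to-text hook
    translate: Callable[[str], str],    # hypothetical translation hook
) -> Iterator[Segment]:
    """Run each timestamped audio chunk through recognition, then translation."""
    for start_s, audio in chunks:
        text = recognize(audio).strip()
        if not text:  # skip silence or audio the recognizer returned nothing for
            continue
        yield Segment(start_s, text, translate(text))
```

Keeping the two engines behind plain callables means either one can be swapped out without touching the loop, and the timestamp carried on each segment lets the output be lined up with the original recording.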
What are some alternatives?
haven - Haven is for people who need a way to protect their personal spaces and possessions without compromising their own privacy, through an Android app and on-device sensors
cheetah - On-device streaming speech-to-text engine powered by deep learning
openhab-addons - Add-ons for openHAB
vosk-api - Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
dicio-android - Dicio assistant app for Android
vosk-server - WebSocket, gRPC and WebRTC speech recognition server based on Vosk and Kaldi libraries
ovos-stt-plugin-vosk - vosk STT plugin for mycroft
react-native-vosk - Speech recognition module for react native using Vosk library
whisper - Robust Speech Recognition via Large-Scale Weak Supervision
languagetool - Style and Grammar Checker for 25+ Languages
bergamot-translator - Cross platform C++ library focusing on optimized machine translation on the consumer-grade device.