| | pocketsphinx | native-messaging-espeak-ng |
|---|---|---|
| Mentions | 6 | 21 |
| Stars | 3,745 | 4 |
| Growth | 0.9% | - |
| Activity | 7.4 | 6.7 |
| Latest commit | about 1 month ago | 10 months ago |
| Language | C | JavaScript |
| License | GNU General Public License v3.0 or later | - |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
pocketsphinx
- [Discussion] Looking for an Open-Source Speech to Text model (english) that captures filler words, pauses and also records timestamps for each word.
- I Created A Web Speech API NPM Package Called SpeechKit
There are espeak-ng https://github.com/espeak-ng/espeak-ng and pocketsphinx https://github.com/cmusphinx/pocketsphinx which can be used locally without making external requests.
- "Why not just transcribe the audio?" I thought
And so I installed PocketSphinx, "one of Carnegie Mellon University's open source large vocabulary, speaker-independent continuous speech recognition engines."
- How to train large deep learning models as a startup
- https://github.com/cmusphinx/pocketsphinx
This avoids having to stream audio 24x7 to a cloud model, which would be super expensive. That said, I'm pretty sure what Alexa does, for example, is send any positive wake-word detection to a bigger, more accurate cloud model to verify the local model's prediction.
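The two-stage pattern described above (a cheap local detector on every chunk, an expensive remote verifier only on positives) can be sketched as follows; `localDetect` and `cloudVerify` are hypothetical stand-ins for the actual models:

```javascript
// Two-stage wake-word pipeline: only audio chunks that the cheap local
// detector flags are forwarded to the (more accurate, more expensive)
// remote verifier, so audio is never streamed to the cloud 24x7.
async function wakeWordPipeline(chunks, localDetect, cloudVerify) {
  const confirmed = [];
  for (const chunk of chunks) {
    if (!localDetect(chunk)) continue; // cheap, runs on every chunk
    if (await cloudVerify(chunk)) {    // expensive, runs only on positives
      confirmed.push(chunk);
    }
  }
  return confirmed;
}
```

The design trade-off is exactly the one the quote describes: the local model's false positives cost one cloud call each, while its false negatives are simply missed, so the local threshold is tuned permissive.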
- Speech recognition library for financial markets
- Speech recognition
PocketSphinx is generally regarded in voice assistant communities as a less reliable, but straight-OOTB, alternative to a robust listener. It's a good solution when you want multiple hotwords (or just aren't in a position to train even one word).
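For the multiple-hotwords case, PocketSphinx's keyword-spotting mode takes a keyword list file, one `phrase /threshold/` per line, passed to `pocketsphinx_continuous` via `-kws`. A small sketch that generates such a file and the corresponding arguments (the phrases and thresholds here are illustrative, not recommendations):

```javascript
// Generate a PocketSphinx keyword list: one "phrase /threshold/" per line.
// Lower thresholds fire more easily; each phrase is tuned individually.
function kwsFile(keywords) {
  return keywords
    .map(({ phrase, threshold }) => `${phrase} /${threshold}/`)
    .join("\n") + "\n";
}

function pocketsphinxArgs(kwsPath) {
  // -inmic yes: listen on the microphone; -kws: path to the keyword list
  return ["-inmic", "yes", "-kws", kwsPath];
}

const list = kwsFile([
  { phrase: "hey computer", threshold: "1e-40" },
  { phrase: "stop listening", threshold: "1e-30" },
]);
// list === "hey computer /1e-40/\nstop listening /1e-30/\n"
```

Writing `list` to a file and spawning `pocketsphinx_continuous` with `pocketsphinxArgs(path)` gives multi-hotword spotting with no per-word training.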
native-messaging-espeak-ng
- Have we reached a point of no return on managing software dependencies?
I'm just trying to use coqui-ai/TTS so I can stream speech synthesis output to the browser as I do with eSpeak NG https://github.com/guest271314/native-messaging-espeak-ng. I think the issue has been brought up before on GitHub. I have not read a solution. I am ready to try again if you can suggest a minimal build process.
- Deno should target the browser officially
You can use a Native Messaging host to run local code controlled from the browser. See native-messaging-deno for a general-purpose and extensible solution, and deno-server, where Deno's serveTls is dynamically started to run a local application, stream stdout from the application to the browser, then stop the local server.
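The Native Messaging wire format itself is simple: each JSON message is preceded by a 32-bit message length in native byte order. A minimal Node.js encoder/decoder sketch (little-endian assumed, as on x86):

```javascript
// Encode/decode Chromium Native Messaging frames:
// a 4-byte (uint32, native-endian) length prefix followed by UTF-8 JSON.
function encodeMessage(obj) {
  const json = Buffer.from(JSON.stringify(obj), "utf8");
  const header = Buffer.alloc(4);
  header.writeUInt32LE(json.length, 0);
  return Buffer.concat([header, json]);
}

function decodeMessage(buf) {
  const length = buf.readUInt32LE(0);
  return JSON.parse(buf.subarray(4, 4 + length).toString("utf8"));
}

// A host reads frames like this from stdin and writes replies to stdout;
// the browser side just calls port.postMessage()/port.onMessage.
```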
- Streaming speech synthesis output to the browser using Bash with GNU head and Native Messaging
- Execute Terminal Commands and Receive Live Output with React SSE
A single-page Deno server can be found here: https://github.com/guest271314/native-messaging-espeak-ng/blob/deno-server/local_server.js. I used this source code https://github.com/chcunningham/atomics-post-message/blob/main/server.js, renamed to server.mjs and modified to use ECMAScript modules instead of CommonJS.
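Streaming live command output to the browser, as the post title describes, comes down to framing each chunk in the `text/event-stream` format: every line of the payload gets a `data: ` prefix, and a blank line terminates the event. A minimal formatter sketch (`sseEvent` is an illustrative name, not an API from the linked server):

```javascript
// Format a text chunk as a server-sent event per the text/event-stream
// format: each payload line gets a "data: " prefix; a blank line ends
// the event. Optional "event:" and "id:" fields precede the data.
function sseEvent(chunk, { event, id } = {}) {
  let out = "";
  if (event) out += `event: ${event}\n`;
  if (id) out += `id: ${id}\n`;
  for (const line of String(chunk).split("\n")) {
    out += `data: ${line}\n`;
  }
  return out + "\n";
}
```

On the client, `new EventSource(url)` reassembles multi-line `data:` fields back into a single message.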
- IAMA senior javascript dev, ask me anything
I've already achieved the requirement multiple ways: from using Native Messaging https://github.com/guest271314/native-messaging-espeak-ng, to using GNU Core Utilities' tail, to Deno.watchFs() https://github.com/guest271314/fs, among other approaches; see captureSystemAudio. The one approach I have not yet achieved is compiling with Emscripten, with SSML support.
- How to fix these errors when trying to request from a REST API?
Create a self-signed certificate. If you are on Chromium or Chrome, launch with --ignore-certificate-errors-spki-list=.... Read this: https://github.com/GoogleChrome/samples/blob/gh-pages/webtransport/webtransport_server.py#L42-L72. This is how I use HTTPS for Deno and Node local servers, and WebTransport: https://github.com/guest271314/native-messaging-espeak-ng/tree/deno-server.
- Which backend JavaScript framework is the one you use?
I use the source code for Deno's serveTls https://github.com/guest271314/native-messaging-espeak-ng/blob/deno-server/local_server.js and wrote a Web server module for QuickJS https://github.com/guest271314/webserver-c/tree/quickjs-webserver.
- [Express] - How to have a self-updating display in browser window? Template Engines sufficient? Or use Vue/Angular/React?
This https://github.com/guest271314/native-messaging-espeak-ng/tree/deno-server is what I do using Deno.
- Web Speech API is (still) broken on Linux circa 2023
I created https://github.com/guest271314/native-messaging-espeak-ng which provides a means to send text or SSML to the eSpeak NG speech synthesis engine and parse the generated WAV in the browser. That bypasses waiting around another N years for Google to prioritize Web Speech API, which I see no evidence of Google doing - except for its cloud service.
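Parsing the WAV that eSpeak NG generates only requires reading the RIFF header. A minimal sketch, assuming the canonical 44-byte PCM layout where the "fmt " chunk immediately follows "WAVE" (true for simple writers; production code should scan for chunks instead):

```javascript
// Parse the canonical 44-byte PCM WAV header from an ArrayBuffer.
// Assumes the "fmt " chunk immediately follows "WAVE"; a robust parser
// would walk the chunk list instead of using fixed offsets.
function parseWavHeader(buffer) {
  const view = new DataView(buffer);
  const tag = (o) =>
    String.fromCharCode(view.getUint8(o), view.getUint8(o + 1),
                        view.getUint8(o + 2), view.getUint8(o + 3));
  if (tag(0) !== "RIFF" || tag(8) !== "WAVE") throw new Error("not a WAV file");
  return {
    channels: view.getUint16(22, true),      // mono = 1
    sampleRate: view.getUint32(24, true),    // Hz
    bitsPerSample: view.getUint16(34, true), // PCM bit depth
  };
}
```

In the browser, the same function works on the ArrayBuffer of the WAV bytes received from the Native Messaging host before handing the PCM samples to the Web Audio API.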
- Build a Text-to-Speech component in React
I merely read the article to see how the author implemented "Text-to-Speech", and compared that with native-messaging-espeak-ng, which overcomes or avoids the multiple issues and limitations of using the Web Speech API in the browser.
What are some alternatives?
vosk - VOSK Speech Recognition Toolkit
GoogleNetworkSpeechSynthesis - Google's Network Speech Synthesis: Bring your own Google API key and proxy
snowboy - Future versions with model training module will be maintained through a forked version here: https://github.com/seasalt-ai/snowboy
DeepSpeech - DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.
vosk-api - Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
speechd - Common high-level interface to speech synthesis
Spoken-Keyword-Spotting - In this repository, we explore using a hybrid system consisting of a Convolutional Neural Network and a Support Vector Machine for the keyword spotting task.
speech-api - Web Speech API
localcroft - Bits for locally-served Mycroft instances
AudioWorkletStream - fetch() => ReadableStream => AudioWorklet
C_to_Python_translator - Using file I/O, converts C code written in one text file to Python code in another, applying multiple functions that identify and process specific keywords and formats of the C language.
webserver-c - A simple HTTP webserver written in C.