STT
LocalSTT
 | STT | LocalSTT
---|---|---
Mentions | 11 | 5
Stars | 2,131 | 85
Growth | 2.7% | -
Activity | 0.6 | 0.0
Last commit | about 2 months ago | over 2 years ago
Language | C++ | Java
License | Mozilla Public License 2.0 | GNU General Public License v3.0 only
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
STT
- Rest in Peas: The Unrecognized Death of Speech Recognition (2010)
What has happened since then? I know Common Voice has come and gone https://en.wikipedia.org/wiki/Common_Voice https://github.com/coqui-ai/STT
And I've seen some neural approaches too
No idea where to look for comparisons though.
- Numen - FOSS voice control for handsfree computing
I basically just used Coqui STT: https://github.com/coqui-ai/STT
- Are there any OCR and Speech-to-Text services that are privacy friendly?
This speech-to-text engine works well: https://github.com/coqui-ai/STT. OpenAI's "whisper" is probably better, but I haven't tried it: https://towardsdatascience.com/transcribe-audio-files-with-openais-whisper-e973ae348aa7
- Introducing Whisper
I use two STT tools to live-translate audio that I listen to, so I can look back (in paragraph form) at things that I or the YouTube video has previously said: https://github.com/coqui-ai/STT https://github.com/ratwithacompiler/OBS-captions-plugin
- You can now tether any prod Vector to Wire's Open Source Escape Pod • thedroidyouarelookingfor
I did have to install Coqui STT and go-asticoqui manually before I was able to run Chipper.
- Currently working on a custom Virtual Assistant ('Randy') to help automate things in my shed (mainly CNC equipment) and also perform basic tasks. This morning I was able to get it to publish events on my Google Calendar.
What do you use as STT? I have heard good things about Coqui (https://github.com/coqui-ai/STT) and will use it for my assistant build.
- Speech to Text Best Resource
- I put together a tutorial and overview on how to use DeepSpeech to do Speech Recognition in Python
If anyone is looking for a maintained version of DeepSpeech, check out Coqui's repositories for STT and TTS. Coqui is led by the engineers who used to work on DeepSpeech at Mozilla.
- CoquiTTS: 🐸💬 - Open Source Text-to-Speech framework.
Link: https://github.com/coqui-ai/STT
- Mozilla Common Voice Adds 16 New Languages and 4,600 New Hours of Speech
LocalSTT
- Known drawbacks to CalyxOS?
There is already a way to piece it all together so that Kõnele, which normally needs a remote backend, just acts as a frontend for Vosk instead. I actually have this working on my phone, but, err, I think I'm forgetting one piece of the puzzle that needs downloading. Actually, I think you need to install a fork of this that supports English, unless you're fine with speaking Catalan. I forget which fork that is, however.
- How long did it take you to adjust to a de-googled phone?... Big hurdles I am struggling with.
For experimental speech-to-text, you can search for LocalSTT on GitHub and use it with Kõnele (F-Droid), then set it up as the voice assistant in the default apps settings. Not a great alternative, but something to fall back on.
- Is possible to replace Google Speech To Text and TTS with VOSK and LARYNX ?
- Interesting Project: LocalSTT
- Coqui, a startup providing open speech tech for everyone
You might want something like LocalSTT if it's on mobile: https://github.com/ccoreilly/LocalSTT
What are some alternatives?
DeepSpeech - DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.
vosk-android-demo - Offline speech recognition for Android with Vosk library.
TTS - 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
NeMo - A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
STT-examples - 🐸STT integration examples
vosk-api - Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
deepspeech-playbook - A crash course for training speech recognition models using DeepSpeech.
TTS - 🤖💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
K6nele-service - Kõnele service is an Android app that offers a speech-to-text service to other apps, in particular to Kõnele. It implements SpeechRecognizer, backed by an open source speech recognition server software https://github.com/alumae/kaldi-gstreamer-server.
OBS-captions-plugin - Closed Captioning OBS plugin using Google Speech Recognition