Top 3 Shell speech-to-text Projects
-
Kaldi Speech Recognition Toolkit
Project mention: Amazon plans to charge for Alexa in June–unless internal conflict delays revamp | news.ycombinator.com | 2024-01-20
Yeah, Whisper is the closest thing we have, but even it requires more processing power than most of these edge devices have in order to feel smooth. I've started a voice interface project on a Raspberry Pi 4, and it takes about 3 seconds to produce a result. That's impressive, but not fast enough for Alexa.
From what I gather, a Pi 5 can do it in 1.5 seconds, which is closer, so I suspect it's only a matter of time before we have fully local STT running directly on speakers.
> Probably anathema to the space, but if the devices leaned into the ~five tasks people use them for (timers, weather, todo list?) could probably tighten up the AI models to be more accurate and/or resource efficient.
Yes, this is the approach taken by a lot of streaming STT systems, like Kaldi [0]. Rather than use a fully capable model, you train a specialized one that knows what kinds of things people are likely to say to it.
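To make the "specialized rather than fully capable" idea concrete, here is a toy sketch in Python. It is not how Kaldi works internally. It only illustrates the payoff of a closed command set: a noisy transcript can be snapped to the nearest phrase the device actually supports. The command list, the `snap_to_command` name, and the 0.6 similarity cutoff are all assumptions made up for this illustration. The matching uses `difflib` from the standard library.

```python
import difflib

# Hypothetical closed set of commands the device supports (assumption for illustration).
COMMANDS = [
    "set a timer",
    "what is the weather",
    "add to my todo list",
    "stop the timer",
]


def snap_to_command(transcript, cutoff=0.6):
    """Return the known command closest to the transcript, or None if none is similar enough."""
    matches = difflib.get_close_matches(
        transcript.lower(), COMMANDS, n=1, cutoff=cutoff
    )
    return matches[0] if matches else None


if __name__ == "__main__":
    # A slightly garbled transcript still resolves to a supported command.
    print(snap_to_command("set a time"))
    # Something outside the supported set is rejected rather than guessed at.
    print(snap_to_command("play some jazz"))
```

A real streaming recognizer would constrain the decoder itself (for example with a restricted grammar or language model) rather than post-process free-form text, but the trade-off is the same: a smaller hypothesis space in exchange for accuracy and speed.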
-
Phonetisaurus
Project mention: Is there a software/site that would transcribe English texts into IPA (relatively long texts)? | /r/LanguageTechnology | 2023-04-20
Some of the more popular tools for this include Sequitur and Phonetisaurus. It would also probably not be too hard to train something in PyTorch/Keras.
-
NoteWhispers
Voice memos recorded from the microphone, transcribed offline to text and converted to Joplin notes
Project mention: Joplin – open-source note-taking and to-do application with sync | news.ycombinator.com | 2023-07-05
Index
What are some of the best open-source speech-to-text projects in Shell? This list will help you:
# | Project | Stars |
---|---|---|
1 | Kaldi Speech Recognition Toolkit | 13,685 |
2 | Phonetisaurus | 430 |
3 | NoteWhispers | 22 |