Shell Kaldi Projects
Project mention: Amazon plans to charge for Alexa in June–unless internal conflict delays revamp | news.ycombinator.com | 2024-01-20
Yeah, whisper is the closest thing we have, but even it requires more processing power than is present in most of these edge devices in order to feel smooth. I've started a voice interface project on a Raspberry Pi 4, and it takes about 3 seconds to produce a result. That's impressive, but not fast enough for Alexa.
From what I gather a Pi 5 can do it in 1.5 seconds, which is closer, so I suspect it's only a matter of time before we do have fully local STT running directly on speakers.
> Probably anathema to the space, but if the devices leaned into the ~five tasks people use them for (timers, weather, todo list?) could probably tighten up the AI models to be more accurate and/or resource efficient.
Yes, this is the approach taken by a lot of streaming STT systems, like Kaldi [0]. Rather than use a fully capable model, you train a specialized one that knows what kinds of things people are likely to say to it.
[0] http://kaldi-asr.org/
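As a rough illustration of why a closed set of expected phrases helps, here is a minimal Python sketch that snaps a noisy transcript to the nearest known command using the standard library's `difflib`. This is not how Kaldi works internally (Kaldi constrains the decoder itself with a grammar or language model); it is only a toy post-processing step, and the command list, the `snap_to_command` name, and the `cutoff` value are assumptions made up for the example.

```python
import difflib

# Hypothetical closed set of phrases the device is expected to handle.
COMMANDS = [
    "set a timer",
    "what is the weather",
    "add to my todo list",
    "stop",
]


def snap_to_command(transcript, commands=COMMANDS, cutoff=0.6):
    """Return the known command closest to the transcript, or None.

    The transcript is normalized (lowercased, stripped) and compared
    against every command; anything below the similarity cutoff is
    rejected instead of being forced onto a command.
    """
    normalized = transcript.lower().strip()
    matches = difflib.get_close_matches(normalized, commands, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    for heard in ["Set a timer", "set a time", "zzzzqqqq"]:
        print(repr(heard), "->", snap_to_command(heard))
```

Because the set of candidates is tiny, a slightly wrong transcript can still resolve to the intended command, and input that resembles nothing in the list is rejected.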
Shell Kaldi related posts
- Amazon plans to charge for Alexa in June–unless internal conflict delays revamp
- Unsupervised (Semi-Supervised) ASR/STT training recipes
- Steve's Explanation of the Viterbi Algorithm
- The Advantages and disadvantages of In-House Speech Acknowledgment
- xbp-src to only cross compile 32-bit
- Machine Learning with Unix Pipes
- Lexicap: Lex Fridman Podcast Whisper Captions by Andrej Karpathy
Index
# | Project | Stars |
---|---|---|
1 | Kaldi Speech Recognition Toolkit | 13,685 |
2 | vosk-build-model | 53 |