DeepSpeech
DeepSpeech is an open-source, embedded (offline, on-device) speech-to-text engine that can run in real time on devices ranging from a Raspberry Pi 4 to high-power GPU servers.
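As a rough sketch of what "on-device" use looks like: DeepSpeech's Python binding expects 16-bit mono PCM at 16 kHz, and `Model.stt()` transcribes a buffer of such samples. The model/scorer file names below are assumptions (they match the 0.9.3 release naming, but check the releases page), and a synthetic sine tone stands in for real speech here.

```python
# Sketch: preparing 16-bit mono 16 kHz PCM, the format DeepSpeech expects.
# Model and scorer filenames are assumptions; a sine tone stands in for speech.
import array
import math
import wave

def write_test_wav(path, seconds=1, rate=16000):
    """Write a 440 Hz sine tone as 16-bit mono PCM (stand-in for real speech)."""
    samples = array.array("h", (
        int(32767 * 0.3 * math.sin(2 * math.pi * 440 * n / rate))
        for n in range(rate * seconds)))
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)     # 2 bytes = 16-bit samples
        w.setframerate(rate)
        w.writeframes(samples.tobytes())

def read_pcm16(path):
    """Return (sample_rate, int16 samples) from a mono 16-bit WAV file."""
    with wave.open(path, "rb") as w:
        assert w.getnchannels() == 1 and w.getsampwidth() == 2
        samples = array.array("h")
        samples.frombytes(w.readframes(w.getnframes()))
        return w.getframerate(), samples

write_test_wav("test.wav")
rate, samples = read_pcm16("test.wav")

try:
    import deepspeech  # pip install deepspeech
    import numpy as np
    model = deepspeech.Model("deepspeech-0.9.3-models.pbmm")       # assumed filename
    model.enableExternalScorer("deepspeech-0.9.3-models.scorer")   # assumed filename
    print(model.stt(np.frombuffer(samples.tobytes(), dtype=np.int16)))
except ImportError:
    print(f"deepspeech not installed; prepared {len(samples)} samples at {rate} Hz")
```

The guarded import keeps the audio-preparation part runnable even without the model files downloaded.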
You mean like https://mycroft.ai/ ?
Consider looking at https://github.com/mozilla/DeepSpeech : you can get pre-compiled versions of it, and there are builds that will run on a Raspberry Pi. And yes, it's all local, but your mileage may vary. There is also https://picovoice.ai/ ; they run everything locally on the machine, but again, each uses a constrained local language model/syntax. The other real question, as https://www.reddit.com/user/eduncan911/ correctly states, is the wake-word. Most systems process all sound, i.e. they are listening all the time. A number of the Alexa, Google Assistant, and similar devices embed a smaller model, or use dedicated hardware/neural networks, to recognize the wake-word before passing sound on for further processing. So think of most of these devices as always listening and processing and you'd be right; factor that into power usage etc.