DeepSpeech
DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.
You could use something like Mozilla DeepSpeech for this purpose (https://github.com/mozilla/DeepSpeech). It's open source, relies solely on Mozilla's Common Voice data, and as far as I know it's compatible with Mycroft.ai. The only problem for the outlined workflow is that the model is very bad at recognizing names (e.g. of bands or musicians), since AFAIK it only knows dictionary vocabulary.
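One possible way to soften the names problem, as a rough sketch rather than a tested recipe: the DeepSpeech 0.9.x Python bindings expose hot-word boosting (`addHotWord`, `clearHotWords`) alongside `Model`, `enableExternalScorer` and `stt`. The helper names, the file paths and the boost value of 10.0 below are my own assumptions for illustration, and I haven't measured how much this helps with band or artist names.

```python
import wave


def load_model(model_path, scorer_path):
    """Load a DeepSpeech acoustic model plus its external scorer."""
    # Imported lazily so the helper at the bottom can be tried without the package.
    import deepspeech

    model = deepspeech.Model(model_path)
    model.enableExternalScorer(scorer_path)
    return model


def read_wav(path, expected_rate):
    """Read a 16-bit mono WAV file into the int16 array that stt() expects."""
    import numpy as np

    with wave.open(path, "rb") as wav:
        if wav.getframerate() != expected_rate:
            raise ValueError("WAV sample rate does not match the model")
        frames = wav.readframes(wav.getnframes())
    return np.frombuffer(frames, dtype=np.int16)


def transcribe_with_hot_words(model, audio, names, boost=10.0):
    """Boost each word of the given names, transcribe, then reset the boosts.

    Hot words are registered one word at a time, so a name like
    "Daft Punk" is split and lowercased first. The boost value is a guess.
    """
    for name in names:
        for word in name.lower().split():
            model.addHotWord(word, boost)
    try:
        return model.stt(audio)
    finally:
        # Do not leak boosts into later, unrelated transcriptions.
        model.clearHotWords()
```

Usage would look roughly like `model = load_model("model.pbmm", "model.scorer")`, then `audio = read_wav("request.wav", model.sampleRate())`, then `transcribe_with_hot_words(model, audio, ["Daft Punk"])`, where the three file names are placeholders. Words the scorer has never seen may still be missed, so this is a mitigation at best.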