kaldi-gstreamer-server
Real-time full-duplex speech recognition server, based on the Kaldi toolkit and the GStreamer framework.
It is kind of difficult to find something like this free of charge (and open source), since the ASR service needs to be hosted somewhere. If you are really interested in the topic, you could take a look at Kaldi and its pretrained models (though Kaldi is hard to learn, so I don't really recommend it if you want something quick). You could then combine it with kaldi-gstreamer-server to set up a server that you can turn on and off whenever you like.
There is also Rhasspy, which you can host locally (e.g. in a Docker container or on a Raspberry Pi), and it will work similarly to the above. It may be overkill, since it does a lot of other things. In general, what you are trying to do is called 'dictation', so if you search GitHub for that keyword you may find something better. Personally, I am not aware of a simple CLI tool that does this, but I bet someone has created one, since the technology and the pretrained models exist.
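For what it's worth, if you do go the kaldi-gstreamer-server route, the client side can be quite small. Below is a rough sketch, not a tested tool. It assumes a server is already running on `localhost:8888`, that it accepts raw audio uploads at `/client/dynamic/recognize`, and that it answers with JSON containing a `status` field and a `hypotheses` list with `utterance` entries. That is how I remember the project's README describing its HTTP API, so verify it against your own setup. The function names `transcribe_file` and `extract_utterances` are made up for this example.

```python
import http.client
import json

# Assumptions (check against your own server): host, port and path of the
# HTTP recognition endpoint of a locally running kaldi-gstreamer-server.
HOST = "localhost"
PORT = 8888
PATH = "/client/dynamic/recognize"


def extract_utterances(response_body):
    """Pull the recognized text out of a JSON response body.

    Assumes a response shaped like:
    {"status": 0, "hypotheses": [{"utterance": "..."}], "id": "..."}
    """
    payload = json.loads(response_body)
    if payload.get("status") != 0:
        raise RuntimeError(f"recognition failed: status={payload.get('status')}")
    return [h["utterance"] for h in payload.get("hypotheses", []) if "utterance" in h]


def transcribe_file(wav_path, host=HOST, port=PORT, path=PATH, timeout=60):
    """Upload one audio file and return the list of recognized utterances."""
    with open(wav_path, "rb") as f:
        audio = f.read()
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        # Send the raw audio bytes as the request body.
        conn.request("PUT", path, body=audio)
        body = conn.getresponse().read().decode("utf-8")
    finally:
        conn.close()
    return extract_utterances(body)
```

With a server up, something like `print(transcribe_file("recording.wav"))` would be the whole "dictation" script (the file name is just a placeholder). Since the server is a separate process, you can start it only when you need it and shut it down afterwards.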