DeepSpeech
DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine that can run in real time on devices ranging from a Raspberry Pi 4 to high-power GPU servers.
https://github.com/mozilla/DeepSpeech (no longer actively supported by Mozilla, but still a pretty good library: relatively easy to use, with decent out-of-the-box accuracy)
https://kaldi-asr.org/ (best out-of-the-box accuracy, but it is a complicated toolkit and not beginner-friendly)
https://github.com/espnet/espnet (kind of like a newer Kaldi, and also not beginner-friendly)
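
To give a feel for the "relatively easy to use" part, here is a rough sketch of offline transcription with the DeepSpeech Python package (0.9.x API). It assumes you have run `pip install deepspeech numpy` and downloaded a model file and, optionally, a scorer file from the DeepSpeech releases page. The helper names `read_pcm16_mono` and `transcribe`, and all file paths, are made up for illustration.

```python
import wave


def read_pcm16_mono(path):
    """Return (raw_frames, sample_rate) for a 16-bit mono WAV file."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono PCM audio")
        return wav.readframes(wav.getnframes()), wav.getframerate()


def transcribe(wav_path, model_path, scorer_path=None):
    """Transcribe one WAV file with a DeepSpeech acoustic model."""
    # Imported here so the WAV helper above works without these packages.
    import numpy as np
    from deepspeech import Model

    model = Model(model_path)
    if scorer_path:
        # The external scorer (language model) is optional.
        model.enableExternalScorer(scorer_path)

    frames, rate = read_pcm16_mono(wav_path)
    if rate != model.sampleRate():
        raise ValueError(
            f"audio is {rate} Hz but the model expects {model.sampleRate()} Hz"
        )

    # DeepSpeech takes the audio as a NumPy array of int16 samples.
    return model.stt(np.frombuffer(frames, dtype=np.int16))
```

You would call it as something like `transcribe("audio.wav", "model.pbmm", "model.scorer")`, where the three paths are placeholders for your own files. The sample rate check is there because this sketch does no resampling, so audio recorded at a different rate has to be converted first.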