Help picking a good speech recognition library

DeepSpeech

67 24,212 0.0 C++

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.

https://github.com/mozilla/DeepSpeech (no longer actively supported by Mozilla, but still a pretty good library: relatively easy to use, with decent out-of-the-box accuracy)
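
To give a sense of what "relatively easy to use" looks like, here is a minimal sketch of offline transcription with the DeepSpeech Python bindings. It assumes the `deepspeech` pip package (0.7 or later) and `numpy` are installed, and that you have downloaded a released acoustic model and, optionally, a scorer. The helper names `read_pcm16_mono` and `transcribe` and the file paths are made up for illustration and are not part of DeepSpeech.

```python
import wave


def read_pcm16_mono(path, expected_rate=16000):
    """Return raw 16-bit mono PCM bytes from a WAV file, checking its format."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono audio")
        if wav.getframerate() != expected_rate:
            raise ValueError(f"expected {expected_rate} Hz audio")
        return wav.readframes(wav.getnframes())


def transcribe(model_path, wav_path, scorer_path=None):
    """Transcribe one WAV file (hypothetical helper, not a DeepSpeech API)."""
    # Imported lazily so the WAV helper above works without these packages.
    import numpy as np
    from deepspeech import Model

    model = Model(model_path)
    if scorer_path is not None:
        # The external scorer is optional; it adds a language model.
        model.enableExternalScorer(scorer_path)

    # The model reports the sample rate it was trained on.
    pcm = read_pcm16_mono(wav_path, expected_rate=model.sampleRate())
    return model.stt(np.frombuffer(pcm, dtype=np.int16))
```

Usage would be along the lines of `transcribe("model.pbmm", "audio.wav", "model.scorer")`, with those file names standing in for whichever release files you downloaded. Audio in another format or sample rate would need converting first.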

Kaldi Speech Recognition Toolkit

22 13,685 7.4 Shell

kaldi-asr/kaldi is the official location of the Kaldi project.

https://kaldi-asr.org/ (best out-of-the-box accuracy, but it is a complicated toolkit and not beginner-friendly)

espnet

15 7,852 10.0 Python

End-to-End Speech Processing Toolkit

https://github.com/espnet/espnet (kind of like a newer Kaldi, but also not beginner-friendly)

NOTE: The number of mentions shown for each project counts mentions in common posts plus user-suggested alternatives, so a higher number means a more popular project.
