Speech Recognition Training Data Tools?

This page summarizes the projects mentioned and recommended in the original post on /r/LanguageTechnology.

  • aeneas

    aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)

    Say you have a 20-minute excerpt from an audiobook and the sentences separately in a txt file, and you want to cut the sentences out of the audio without doing it by hand: a tool like aeneas is worth a look. If you still have to annotate all your data yourself, I do not really know a tool for that :/

  • DeepSpeech

    DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.

    I'd also recommend going to the Mozilla Discourse for DeepSpeech. It is basically a little forum with the developers and other STT enthusiasts. Before you open a topic, go through the documentation first and have a look at their GitHub, because people sometimes ask things that are already explained there, and in that case the people on the forum will just ask whether you've read the documentation. You can also ask for general advice over there.

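For the aeneas use case described above (cutting sentences out of a long recording), a rough sketch of the step after alignment might look like the following. It assumes you have already run aeneas and asked for a JSON sync map, and that the map has a top-level "fragments" list whose entries carry "begin", "end" (seconds, as strings) and "lines". The helper names, the example data, and the output file names are all made up for illustration. The script only builds ffmpeg argument lists and does not run them.

```python
import json


def load_fragments(sync_map_json):
    """Parse an aeneas-style JSON sync map into (begin, end, text) tuples.

    Assumption: each fragment has "begin" and "end" as strings holding
    seconds, and "lines" as a list of text lines.
    """
    data = json.loads(sync_map_json)
    segments = []
    for frag in data["fragments"]:
        begin = float(frag["begin"])
        end = float(frag["end"])
        text = " ".join(frag["lines"]).strip()
        # Skip empty or zero-length fragments; they are useless as training clips.
        if end > begin and text:
            segments.append((begin, end, text))
    return segments


def ffmpeg_cut_commands(audio_path, segments, out_prefix="clip"):
    """Build one ffmpeg argument list per segment (hypothetical file naming)."""
    commands = []
    for i, (begin, end, _text) in enumerate(segments, start=1):
        out_path = f"{out_prefix}_{i:04d}.wav"
        commands.append(
            ["ffmpeg", "-i", audio_path,
             "-ss", f"{begin:.3f}", "-to", f"{end:.3f}", out_path]
        )
    return commands


# Made-up example data in the assumed sync map layout.
EXAMPLE = """{"fragments": [
  {"id": "f000001", "begin": "0.000", "end": "2.680", "lines": ["First sentence."]},
  {"id": "f000002", "begin": "2.680", "end": "5.480", "lines": ["Second sentence."]}
]}"""

if __name__ == "__main__":
    segs = load_fragments(EXAMPLE)
    for cmd, (_b, _e, text) in zip(ffmpeg_cut_commands("book.mp3", segs), segs):
        print(" ".join(cmd), "#", text)
```

Each argument list could be passed to subprocess.run once you have checked a few clips by ear; forced alignment is not perfect, so spot-checking the boundaries is probably worth the time.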

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.
