Python and Speech recognition

This page summarizes the projects mentioned and recommended in the original post on /r/learnpython.

  • forced-alignment-tools

    A collection of links and notes on forced alignment tools

    Since you know that each recording contains one or two phonemes (one for a vowel, two for a consonant), you will be able to find where in the recording the utterance takes place. This is a simplified form of "forced alignment".
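Because each recording holds only one short utterance, locating it can be approximated by finding where the signal is not silent. The sketch below is one possible way to do that with a short-time energy threshold. It is not what the forced-alignment tools do (those align a known transcript to audio with an acoustic model), and the function name, frame size, and threshold ratio are illustrative assumptions.

```python
import math


def find_utterance(samples, sample_rate, frame_ms=20, threshold_ratio=0.1):
    """Return (start_s, end_s) of the non-silent region, or None if silent.

    samples: sequence of floats (mono amplitudes).
    A frame counts as speech when its RMS energy exceeds
    threshold_ratio * (largest frame RMS). Both defaults are guesses
    that would need tuning on real recordings.
    """
    frame_len = max(1, int(sample_rate * frame_ms / 1000))

    # RMS energy of each non-overlapping frame
    rms = []
    for i in range(0, len(samples), frame_len):
        frame = samples[i:i + frame_len]
        rms.append(math.sqrt(sum(x * x for x in frame) / len(frame)))

    if not rms or max(rms) == 0:
        return None

    threshold = threshold_ratio * max(rms)
    active = [idx for idx, energy in enumerate(rms) if energy > threshold]

    # Convert first/last active frame indices back to seconds
    start = active[0] * frame_len / sample_rate
    end = min(len(samples), (active[-1] + 1) * frame_len) / sample_rate
    return start, end
```

With noisy recordings a fixed ratio will likely misfire, so a smoothed energy curve or a proper voice activity detector may be a better fit.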

  • allosaurus

    Allosaurus is a pretrained universal phone recognizer for more than 2000 languages

    And for phoneme recognition:
    - This looks like it could be useful (I'm sure you won't mind if it's "phones" instead of "phonemes"): https://github.com/xinjli/allosaurus
    - On using standard speech recognition tools: https://cmusphinx.github.io/wiki/phonemerecognition/
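A minimal sketch of calling Allosaurus from Python, based on the usage shown in the project's README (`read_recognizer` and `recognize`). The wrapper names `recognize_phones` and `split_phones` are hypothetical helpers added here for illustration, and the code assumes the recognizer returns phones as a space-separated string.

```python
def split_phones(output):
    """Split a space-separated phone string into a list of phone symbols."""
    return output.split()


def recognize_phones(wav_path):
    """Run Allosaurus on a WAV file and return the recognized phones as a list."""
    # Imported lazily so split_phones can be used without the package installed.
    from allosaurus.app import read_recognizer

    model = read_recognizer()  # loads the default pretrained model
    return split_phones(model.recognize(wav_path))
```

For recordings known to contain a single vowel or consonant, the length of the returned list gives a quick sanity check on the recognizer's output.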

  • SpeechRecognition

    Speech recognition module for Python, supporting several engines and APIs, online and offline.

    I’m not who you replied to, but I saw that the Sphinx integration has a keyword recognizer API: https://github.com/Uberi/speech_recognition/blob/master/examples/special_recognizer_features.py
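A rough sketch of the keyword recognizer mentioned above, using the `keyword_entries` argument of `recognize_sphinx` (a list of `(keyword, sensitivity)` pairs with sensitivity between 0 and 1). It assumes the `SpeechRecognition` and `pocketsphinx` packages are installed; `keyword_entries` and `spot_keywords` are hypothetical helper names, not part of the library.

```python
def keyword_entries(words, sensitivity=1.0):
    """Build the (keyword, sensitivity) pairs expected by recognize_sphinx."""
    if not 0.0 <= sensitivity <= 1.0:
        raise ValueError("sensitivity must be between 0 and 1")
    return [(word, sensitivity) for word in words]


def spot_keywords(wav_path, words, sensitivity=1.0):
    """Return the keywords Sphinx detects in a WAV file, as a string."""
    # Imported lazily so keyword_entries works without the package installed.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(wav_path) as source:
        audio = recognizer.record(source)  # read the whole file

    try:
        return recognizer.recognize_sphinx(
            audio, keyword_entries=keyword_entries(words, sensitivity)
        )
    except sr.UnknownValueError:
        # Sphinx could not match any of the keywords
        return ""
```

Higher sensitivity values tend to produce more false positives, so the value probably needs tuning per keyword.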

  • common-voice

    Common Voice is part of Mozilla's initiative to help teach machines how real people speak.

    Check out Mozilla's Common Voice. It's a great project: it's easy to participate in and the data is easy to use. (BTW they've also released DeepSpeech for speech recognition.)
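As an illustration of using the data, here is a small sketch that reads clip metadata from a Common Voice TSV file. It assumes a tab-separated file with a header row containing `path` and `sentence` columns (as in the `validated.tsv` files shipped with the dataset downloads); check the columns of the release you actually download. `read_clips` is a hypothetical helper name.

```python
import csv


def read_clips(lines):
    """Return (audio_path, sentence) pairs from Common Voice TSV lines.

    lines: any iterable of text lines, e.g. an open file handle.
    Assumes a header row with 'path' and 'sentence' columns.
    """
    # QUOTE_NONE keeps quotation marks inside sentences untouched
    reader = csv.DictReader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [(row["path"], row["sentence"]) for row in reader]
```

Typical usage would be `with open("validated.tsv", encoding="utf-8", newline="") as f: clips = read_clips(f)`, where the file name is an assumption about the local download.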
