lhotse
pykaldi
Metric | lhotse | pykaldi
---|---|---
Mentions | 1 | 2
Stars | 866 | 978
Growth | 4.5% | 0.4%
Activity | 9.0 | 5.4
Last commit | 1 day ago | 4 days ago
Language | Python | Python
License | Apache License 2.0 | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
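The Growth and Activity descriptions above can be sketched in a few lines of Python. This is only an illustration: the page does not state the exact formulas, so the exponential half-life weighting, the percentile ranking, the function names, and all sample numbers below are assumptions.

```python
from datetime import date


def month_over_month_growth(stars_last_month: int, stars_now: int) -> float:
    """Percentage change in stars relative to last month's count."""
    return (stars_now - stars_last_month) / stars_last_month * 100.0


def recency_weighted_activity(commit_dates, today, half_life_days=30.0):
    """Sum of per-commit weights; a commit's weight halves every
    `half_life_days` (assumed weighting, not the site's formula)."""
    return sum(0.5 ** ((today - d).days / half_life_days) for d in commit_dates)


def percentile_rank(score, all_scores):
    """Fraction of tracked projects whose score is at or below `score`."""
    return sum(s <= score for s in all_scores) / len(all_scores)


# Hypothetical inputs, chosen only to exercise the functions.
growth = month_over_month_growth(stars_last_month=200, stars_now=209)
activity = recency_weighted_activity(
    [date(2024, 1, 31), date(2024, 1, 1)], today=date(2024, 1, 31)
)
rank = percentile_rank(9, [1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
```

Under these assumptions a commit made today counts fully and a commit from 30 days ago counts half, which is one simple way to give recent commits a higher weight than older ones.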
- Speech recognition using Python
It's possible, but it's going to be tough. Your best bet at the moment would probably be looking at things like PyKaldi (a Python wrapper for Kaldi) and speech recognition tutorials for TensorFlow. You can get into WebRTC a little later--if I understand what your project is about, it'll mainly be for embedding the speech recognition stuff into the larger app you mention.
- Speech recognition project using Python
What are some alternatives?
comcrawl - A python utility for downloading Common Crawl data
speechpy - SpeechPy - A Library for Speech Processing and Recognition: http://speechpy.readthedocs.io/en/latest/
EmotiVoice - EmotiVoice: a Multi-Voice and Prompt-Controlled TTS Engine
lingvo - Lingvo
thunder-speech - A Hackable speech recognition library.
ovos-stt-plugin-vosk - vosk STT plugin for mycroft
SALMONN - SALMONN: Speech Audio Language Music Open Neural Network
zeroth - Kaldi-based Korean ASR (한국어 음성인식) open-source project
image_feature_extraction - A collection of Python classes for feature extraction. The features are calculated inside a Region of Interest (ROI) and not for the whole image: the image is truly a polygon!
vosk-server - WebSocket, gRPC and WebRTC speech recognition server based on Vosk and Kaldi libraries