pykaldi
image_feature_extraction
| | pykaldi | image_feature_extraction |
|---|---|---|
| Mentions | 2 | 1 |
| Stars | 978 | 0 |
| Growth | 0.4% | - |
| Activity | 5.4 | 0.0 |
| Last commit | 4 days ago | almost 3 years ago |
| Language | Python | Python |
| License | Apache License 2.0 | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
pykaldi
- Speech recognition using Python
It's possible, but it's going to be tough. Your best bet at the moment would probably be looking at things like PyKaldi (a Python wrapper for Kaldi) and speech recognition tutorials for TensorFlow. You can get into WebRTC a little later. If I understand what your project is about, it'll mainly be for embedding the speech recognition stuff into the larger app you mention.
- Speech recognition project using Python
image_feature_extraction
- Class or Functions?
It’s a little of both. You can check out https://github.com/giakou4/image_feature_extraction for an example. Whichever way I approach the issue, both seem correct to me. Classes seem a good way to organise, but functions are easy for debugging.
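To make the trade-off concrete, here is a minimal sketch of the two styles side by side. It is not taken from the linked repository: `first_order_features`, `FeatureExtractor`, and the `normalize` and `max_value` parameters are hypothetical names chosen for illustration, and the "image" is just a flat list of pixel values.

```python
from statistics import mean, pstdev


def first_order_features(pixels):
    """Function style: stateless, so it is easy to call and debug in isolation."""
    return {"mean": mean(pixels), "std": pstdev(pixels)}


class FeatureExtractor:
    """Class style: keeps configuration together with the extraction steps.

    Hypothetical example class, not part of any library.
    """

    def __init__(self, normalize=False, max_value=255):
        self.normalize = normalize
        self.max_value = max_value

    def extract(self, pixels):
        # Optionally rescale the pixel values to the range [0, 1].
        if self.normalize:
            pixels = [p / self.max_value for p in pixels]
        # The class delegates to the plain function, so both styles coexist.
        return first_order_features(pixels)


pixels = [0, 255, 255, 0]
print(first_order_features(pixels))
print(FeatureExtractor(normalize=True).extract(pixels))
```

One way to get "a little of both" is the pattern above: keep the computations as plain functions that can be tested one at a time, and use a thin class only to hold shared settings.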
What are some alternatives?
speechpy - SpeechPy - A Library for Speech Processing and Recognition: http://speechpy.readthedocs.io/en/latest/
fishington.io-bot - Fishington.io bot with OpenCV and NumPy
lingvo - Lingvo
eulerian-remote-heartrate-detection - Remote heart rate detection through Eulerian magnification of face videos
ovos-stt-plugin-vosk - vosk STT plugin for mycroft
GPT4RoI - GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest
zeroth - Kaldi-based Korean ASR (한국어 음성인식) open-source project
Fast-Poisson-Image-Editing - A fast poisson image editing implementation that can utilize multi-core CPU or GPU to handle a high-resolution image input.
lhotse - Tools for handling speech data in machine learning projects.
towhee - Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.