Metric | lingvo | pykaldi
---|---|---
Mentions | 1 | 2
Stars | 2,780 | 978
Growth | 0.2% | 0.6%
Activity | 8.7 | 5.7
Last commit | 15 days ago | about 1 month ago
Language | Python | Python
License | Apache License 2.0 | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
lingvo
- Voice assistant that can be taught how to swear (Part 1)
To calculate the Word Error Rate I took a Python script from the tensorflow/lingvo project and rewrote it in JavaScript. In essence, it is just a simple solution of the Edit Distance task, with an added error count for each of the three types: deletion, insertion, and replacement. In the end, I did not use the most intelligent method of comparing texts, and yet it was sufficient to later on add parameters to queries to Speech-to-Text.
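The idea described in this post (edit distance over words, with separate counts for deletions, insertions, and replacements) can be sketched roughly as follows. This is a minimal illustration, not the lingvo script and not the author's JavaScript port: the names `WerResult` and `word_error_rate` are hypothetical, and tokenization by whitespace is an assumption.

```python
from dataclasses import dataclass


@dataclass
class WerResult:
    """Hypothetical container for the per-type error counts."""
    substitutions: int
    deletions: int
    insertions: int
    ref_words: int

    @property
    def errors(self) -> int:
        return self.substitutions + self.deletions + self.insertions

    @property
    def wer(self) -> float:
        # WER = (S + D + I) / number of reference words.
        if self.ref_words == 0:
            return 0.0 if self.errors == 0 else float("inf")
        return self.errors / self.ref_words


def word_error_rate(reference: str, hypothesis: str) -> WerResult:
    """Word-level Levenshtein distance that also tracks error types."""
    ref = reference.split()  # assumption: plain whitespace tokenization
    hyp = hypothesis.split()

    # dp[i][j] = (total, subs, dels, ins) for ref[:i] versus hyp[:j]
    dp = [[(0, 0, 0, 0)] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(1, len(ref) + 1):
        dp[i][0] = (i, 0, i, 0)  # every reference word deleted
    for j in range(1, len(hyp) + 1):
        dp[0][j] = (j, 0, 0, j)  # every hypothesis word inserted

    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            if ref[i - 1] == hyp[j - 1]:
                dp[i][j] = dp[i - 1][j - 1]  # match, no extra cost
                continue
            t, s, d, n = dp[i - 1][j - 1]
            substitution = (t + 1, s + 1, d, n)
            t, s, d, n = dp[i - 1][j]
            deletion = (t + 1, s, d + 1, n)
            t, s, d, n = dp[i][j - 1]
            insertion = (t + 1, s, d, n + 1)
            # Keep whichever candidate has the fewest total edits.
            dp[i][j] = min(substitution, deletion, insertion,
                           key=lambda cand: cand[0])

    _, subs, dels, ins = dp[len(ref)][len(hyp)]
    return WerResult(subs, dels, ins, len(ref))


if __name__ == "__main__":
    result = word_error_rate("the cat sat on the mat", "the cat sit on mat")
    print(result, result.wer)
```

When several alignments have the same total cost, the split between error types can differ between implementations; only the total (and therefore the WER itself) is guaranteed to be the same.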
pykaldi
- Speech recognition using Python
It's possible, but it's going to be tough. Your best bet at the moment would probably be looking at things like PyKaldi (a Python wrapper for Kaldi) and speech recognition tutorials for TensorFlow. You can get into WebRTC a little later--if I understand what your project is about, it'll mainly be for embedding the speech recognition stuff into the larger app you mention.
- Speech recognition project using Python
What are some alternatives?
TTS-Voice-Wizard - Speech to Text to Speech. Song now playing. Sends text as OSC messages to VRChat to display on avatar. (STTTS) (Speech to TTS) (VRC STT System) (VTuber TTS)
speechpy - SpeechPy - A Library for Speech Processing and Recognition: http://speechpy.readthedocs.io/en/latest/
seq2seq - A general-purpose encoder-decoder framework for Tensorflow
ovos-stt-plugin-vosk - vosk STT plugin for mycroft
allosaurus - Allosaurus is a pretrained universal phone recognizer for more than 2000 languages
zeroth - Kaldi-based Korean ASR (한국어 음성인식) open-source project
awesome-speech-recognition-speech-synthesis-papers - Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)
lhotse - Tools for handling speech data in machine learning projects.
Mava - 🦁 A research-friendly codebase for fast experimentation of multi-agent reinforcement learning in JAX
image_feature_extraction - A collection of Python classes for feature extraction. The features are calculated inside a Region of Interest (ROI) and not for the whole image: the image is truly a polygon!
deepspeech-playbook - A crash course for training speech recognition models using DeepSpeech.
vosk-server - WebSocket, gRPC and WebRTC speech recognition server based on Vosk and Kaldi libraries