speech-recognition-uk
dc_tts
Metric | speech-recognition-uk | dc_tts
---|---|---
Mentions | 1 | 4
Stars | 297 | 1,150
Growth | - | -
Activity | 6.3 | 0.0
Last commit | 4 months ago | about 1 year ago
Language | Python | Python
License | - | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
- Recommendation: This sub should have a Wiki with resources to help noobs get started
I've been trying to find out what the most popular tools are for vocal synthesis. I've stumbled upon https://github.com/NVIDIA/tacotron2 and https://github.com/Kyubyong/dc_tts but those git repos haven't been updated since June 2020 and April 2018 respectively. Does anyone know what the most common tools are that folks use to do vocal synthesis?
- [D] Why does a relatively small batch number and neural network use up so much memory?
- [D] Making a text to speech model from scratch? (Deep learning)
- NVIDIA Jarvis and its text-to-speech pipeline
For custom voices, you will need a dataset. My fav custom TTS is still: https://github.com/Kyubyong/dc_tts
What are some alternatives?
react-native-spokestack - Spokestack: give your React Native app a voice interface!
tacotron2 - Tacotron 2 - PyTorch implementation with faster-than-realtime inference
lingvo - Lingvo
Voice-Cloning-App - A Python/Pytorch app for easily synthesising human voices
NeMo - A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
ukrainian-onnx-model - An ONNX model for speech recognition of the Ukrainian language
aeneas - aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)
silero-models - Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
TTS - 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
spokestack-python - Spokestack is a library that allows a user to easily incorporate a voice interface into any Python application with a focus on embedded systems.
MockingBird - 🚀 AI voice cloning: clone a voice in 5 seconds to generate arbitrary speech in real-time