tts-tortoise-gradio
lingvo
Metric | tts-tortoise-gradio | lingvo
---|---|---
Mentions | 1 | 1
Stars | 42 | 2,777
Growth | - | -0.1%
Activity | 1.4 | 8.5
Last commit | 12 months ago | 5 days ago
Language | Python | Python
License | Apache License 2.0 | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
Voice assistant that can be taught how to swear (Part 1)
To calculate the Word Error Rate, I took a Python script from the tensorflow/lingvo project and rewrote it in JavaScript. In essence, it is just a simple solution to the Edit Distance problem, with the errors additionally counted for each of the three types: deletion, insertion, and replacement. In the end, I ended up with a method of comparing texts that was not the most intelligent, yet it was sufficient to later add parameters to the Speech-to-Text queries.
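The approach described in the post can be illustrated with a minimal Python sketch. This is not the lingvo script or the author's JavaScript port; it is a generic word-level edit distance that also tracks how many deletions, insertions, and replacements one optimal alignment uses. The names `WerResult` and `word_error_rate` are hypothetical, and whitespace tokenization is an assumption.

```python
from dataclasses import dataclass


@dataclass
class WerResult:
    """Error counts for one optimal alignment (hypothetical helper type)."""
    replacements: int
    deletions: int
    insertions: int
    ref_len: int

    @property
    def wer(self) -> float:
        # WER = (replacements + deletions + insertions) / reference length
        return (self.replacements + self.deletions + self.insertions) / self.ref_len


def word_error_rate(reference: str, hypothesis: str) -> WerResult:
    """Word-level edit distance with per-type error counts.

    Assumption: words are separated by whitespace, with no normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # dp[i][j] = (total_errors, replacements, deletions, insertions)
    # for aligning ref[:i] with hyp[:j].
    dp = [[(0, 0, 0, 0)] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(1, len(ref) + 1):
        dp[i][0] = (i, 0, i, 0)  # every reference word deleted
    for j in range(1, len(hyp) + 1):
        dp[0][j] = (j, 0, 0, j)  # every hypothesis word inserted

    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            if ref[i - 1] == hyp[j - 1]:
                dp[i][j] = dp[i - 1][j - 1]  # match, no new error
                continue
            c, r, d, n = dp[i - 1][j - 1]
            replace = (c + 1, r + 1, d, n)
            c, r, d, n = dp[i - 1][j]
            delete = (c + 1, r, d + 1, n)
            c, r, d, n = dp[i][j - 1]
            insert = (c + 1, r, d, n + 1)
            # Tuples compare on total_errors first, so min() picks a cheapest path.
            dp[i][j] = min(replace, delete, insert)

    _, r, d, n = dp[len(ref)][len(hyp)]
    return WerResult(replacements=r, deletions=d, insertions=n, ref_len=len(ref))


if __name__ == "__main__":
    result = word_error_rate("the cat sat on the mat", "the cat sit on mat")
    print(result, round(result.wer, 3))
```

The total error count is always minimal, but when several alignments tie, the split between the three error types depends on the tie-breaking rule, so a different implementation may report a different breakdown for the same pair of texts.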
What are some alternatives?
wavegrad - A fast, high-quality neural vocoder.
TTS-Voice-Wizard - Speech to Text to Speech. Song now playing. Sends text as OSC messages to VRChat to display on avatar. (STTTS) (Speech to TTS) (VRC STT System) (VTuber TTS)
gTTS - Python library and CLI tool to interface with Google Translate's text-to-speech API
seq2seq - A general-purpose encoder-decoder framework for Tensorflow
NATSpeech - A Non-Autoregressive Text-to-Speech (NAR-TTS) framework, including official PyTorch implementation of PortaSpeech (NeurIPS 2021) and DiffSpeech (AAAI 2022)
allosaurus - Allosaurus is a pretrained universal phone recognizer for more than 2000 languages
diffwave - DiffWave is a fast, high-quality neural vocoder and waveform synthesizer.
awesome-speech-recognition-speech-synthesis-papers - Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)
voice100 - Voice100 includes neural TTS/ASR models. Inference of Voice100 is low cost as its models are tiny and only depend on CNN without autoregression.
Mava - 🦁 A research-friendly codebase for fast experimentation of multi-agent reinforcement learning in JAX
aeneas - aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)
deepspeech-playbook - A crash course for training speech recognition models using DeepSpeech.