tacotron2
RHVoice
Metric | tacotron2 | RHVoice
---|---|---
Mentions | 28 | 13
Stars | 4,890 | 1,425
Growth | 1.2% | 2.9%
Activity | 0.0 | 8.1
Last commit | 4 months ago | 12 days ago
Language | Jupyter Notebook | C++
License | BSD 3-clause "New" or "Revised" License | GNU General Public License v3.0 only
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
tacotron2
- [D] What is the best open source text to speech model?
-
[D] The model used in the AI generated Jay-z vocals
Which might use https://github.com/NVIDIA/tacotron2 in their backend
-
Can anyone recommend any free voice cloning software/websites, even if it provides limited word options?
One option is uberduck.ai, but I think it's freemium (it's free, but some features are premium). There's also Tacotron 2 and its PyTorch page. Many other programs are mentioned on the sub, but Tacotron gave this and this and this.
-
Sauron be spitting bars
Maybe we can use AI to hear this rapped by a famous rapper?
-
Kerfuś
Sadly, GothicBot, the TTS I knew, doesn't exist anymore, but here is an alternative. It works in Polish, from what I heard.
-
How far are we from being able to clone a singers voice?
From what I’ve seen, NVIDIA’s Tacotron2 can already be used to create some pretty convincing singing.
-
Is it possible to make compelling synthesized speech with fairly low-quality recordings?
You might want to try something like Tacotron 2 by Nvidia to experiment with your current data.
-
What voice-changing apps are available right now?
We have the TorToiSe repo, the SV2TTS repo, and from here you have other models like Tacotron 2, FastSpeech 2, and such. A lot goes into training a baseline for these models on the LJSpeech and LibriTTS datasets. Fine-tuning is left up to the user.
- The OG (OC)
-
XQC Falls for dono thinking it's Adept
I had tried tacotron2 + waveglow and it's quite easy to get very good results. The hardest part is collecting clean data.
RHVoice
- StyleTTS2 – open-source Eleven Labs quality Text To Speech
-
⟳ 4 apps added, 28 updated at f-droid.org
RHVoice - a free and open source speech synthesizer (version 1.8.0): TTS engine with extended language support (incl. Russian)
-
Balacoon: python package for text-to-speech
Interesting. Some random questions: How easy is it to make a new voice? What about a new voice in a new language? Have you ever looked at SAPI? Is it possible to make a SAPI bridge for this on Windows? How does it fit with other systems, like Coqui and RHVoice? https://github.com/RHVoice/RHVoice
-
Extra voices for Windows?
I like the voices from RHVoice: https://rhvoice.org
-
Translate app with speech to text, text to speech?
I use this: https://rhvoice.org/
-
Major Text to Speech upgrades for 64 bit devices
I have tried RHVoice on Android, and it works okay.
-
TTS engine that allows me to add my own MSI files
Try these and see which works: https://github.com/RHVoice/RHVoice
-
⟳ 2 apps added, 55 updated at f-droid.org
RHVoice - a free and open source speech synthesizer (version 1.6.0): TTS engine with extended language support (incl. Russian)
- Dicio: Free and open source voice assistant for Android
-
HERE WeGo apparently no longer supports voice navigation on devices without Google, and OSM isn't good enough in my area. Do I have any options besides switching to iOS, getting a standalone GPS, or using Google products?
Also, there are instructions on how to create new voices for RHVoice, if you're interested. It'll sound really unnatural, but at least it's not Google! https://github.com/RHVoice/RHVoice/wiki
What are some alternatives?
tortoise-tts - A multi-voice TTS system trained with an emphasis on quality
espeak-ng - eSpeak NG is an open source speech synthesizer that supports more than a hundred languages and accents.
Voice-Cloning-App - A Python/Pytorch app for easily synthesising human voices
Real-Time-Voice-Cloning - Clone a voice in 5 seconds to generate arbitrary speech in real-time
TensorVox - Desktop application for neural speech synthesis written in C++
TTS - 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
NeMo - A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
TTS - 🤖💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
waveglow - A Flow-based Generative Network for Speech Synthesis
luci - LuCI - OpenWrt Configuration Interface