Voice-Cloning-App vs razdel
Metric | Voice-Cloning-App | razdel
---|---|---
Mentions | 18 | 1
Stars | 1,247 | 244
Growth | - | 0.4%
Activity | 0.0 | 2.1
Last commit | about 1 year ago | 10 months ago
Language | Python | Python
License | BSD 3-clause "New" or "Revised" License | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
Voice-Cloning-App
- AI-generated Politseikroonika (Police Chronicle)
- Making Voices For System Members
- [Discussion] Is there any open-source alternative to voice.ai? Looking for open-source speech-to-speech AI
- Voice actor I need died a decade ago. Is there a program which can create text-to-voice with the voice of a specific person through providing the software voice samples to work from?
  I then feed the audio and transcriptions into this tool, which handles assembling the dataset and training a voice from that dataset.
- Trying to get it working
  This Voice Cloning App is much easier to use than the others; maybe you can give it a try?
- What's the most effective voice-cloning tool these days?
  Although I had to fix some errors before I got it to work (mainly due to my data containing characters that do not appear in the alphabet), I achieved exceptional results with https://github.com/BenAAndrew/Voice-Cloning-App. It uses Tacotron2 and has a useful web GUI.
- Silero V3: fast high-quality text-to-speech in 20 languages with 173 voices
  Nice to see this here - Silero is also the engine that powers the "dataset builder" for Voice-Cloning-App (https://github.com/BenAAndrew/Voice-Cloning-App), a GUI TTS system that modifies Tacotron2 slightly.
  Just sharing the links in case others are new to the space and keen to tinker on some solid open-source offerings.
- Show HN: Voice Clones for Creators
- Is there a guide to train talknet locally?
  If you have an Nvidia card you can try this one
- Accel World Post Anime Fan Adaptation Progress Update #2: DeepFake AI Voice Creation Going Well. Looking for help.
  Voice cloning app GitHub: https://github.com/BenAAndrew/Voice-Cloning-App
razdel
- Silero V3: fast high-quality text-to-speech in 20 languages with 173 voices
  Also, we have currently abandoned batching, so GPUs are not really required at all.
  > the quality (as in: what I'm hearing, not a formally measured metric) is good but (YMMV) not as good as turtle.
  I believe the compute required during training and inference … may differ by 3 or 4 orders of magnitude (!).
  Also note that some speakers and languages just sound better due to the high quality of the source material and the amount of work and polish invested.
  > it breaks with strange error messages if the text you feed it is too long
  Well, there should be a warning somewhere, but it only works with text no longer than 512-1024 characters.
  > there is mention of "a model for text repunctuation and recapitalization", which I wonder if it could be used to break a very long text (eg a book) into pieces that can be digested by the tts engine
  This model only restores some punctuation marks and capital letters.
  For splitting a long text into sentences, there are libraries like razdel: https://github.com/natasha/razdel
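The reply above suggests splitting long input into sentences before passing it to the TTS engine, which reportedly handles only about 512-1024 characters at a time. Below is a minimal sketch of that idea, not code from Silero or razdel. The names `chunk_text` and `naive_sentences` are hypothetical, the 512-character default is an assumption taken from the lower end of the quoted range, and the regex splitter is only a stand-in so the example runs without extra dependencies.

```python
import re
from typing import Callable, List


def naive_sentences(text: str) -> List[str]:
    """Stand-in splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def chunk_text(
    text: str,
    max_chars: int = 512,  # assumed limit, lower end of the 512-1024 range
    split: Callable[[str], List[str]] = naive_sentences,
) -> List[str]:
    """Greedily pack whole sentences into chunks of at most max_chars."""
    chunks: List[str] = []
    current = ""
    for sentence in split(text):
        if not sentence:
            continue
        # A single sentence longer than the limit is hard-wrapped.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    for piece in chunk_text("First sentence. Second one! Third?", max_chars=20):
        print(repr(piece))
```

With razdel installed, its rule-based splitter (designed for Russian text) can replace the stand-in by passing `split=lambda t: [s.text for s in razdel.sentenize(t)]`. Each resulting chunk can then be synthesized separately and the audio concatenated.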
What are some alternatives?
tacotron2 - Tacotron 2 - PyTorch implementation with faster-than-realtime inference
silero-models - Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
Voice_cloner - A guide to clone anyone's voice and use it as a text-to-speech with android
ttsprech - Simple text2speech for the command line
vall-e - An unofficial PyTorch implementation of the audio LM VALL-E
wtpsplit - Code for Where's the Point? Self-Supervised Multilingual Punctuation-Agnostic Sentence Segmentation
Awesome-DeepFake-Learning - The approach I work on DeepFake.
spaCy - 💫 Industrial-strength Natural Language Processing (NLP) in Python
add-stress-to-epub - A program that sets the stress and the letter ё of Russian text and ebooks using Wiktionary data and grammar analysis.
TTS - 🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
pySBD - 🐍💯 pySBD (Python Sentence Boundary Disambiguation) is a rule-based sentence boundary detection that works out-of-the-box.