WhisperSpeech
Retrieval-based-Voice-Conversion-WebUI
Metric | WhisperSpeech | Retrieval-based-Voice-Conversion-WebUI
---|---|---
Mentions | 5 | 56
Stars | 3,417 | 19,460
Growth | 4.7% | 7.9%
Activity | 9.2 | 9.6
Last commit | 7 days ago | 3 days ago
Language | Jupyter Notebook | Python
License | MIT License | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
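As a rough illustration of the Growth figure, the sketch below back-computes how many stars each project would have gained over the last month. It assumes growth is defined as (current - previous) / previous, which this page does not state explicitly, and the function name `implied_monthly_gain` is made up for this example.

```python
def implied_monthly_gain(current_stars: int, growth_percent: float) -> float:
    """Stars gained last month, ASSUMING growth = (current - previous) / previous."""
    # Invert the assumed definition to recover last month's star count.
    previous_stars = current_stars / (1 + growth_percent / 100)
    return current_stars - previous_stars


# Star counts and growth percentages taken from the comparison table above.
projects = {
    "WhisperSpeech": (3417, 4.7),
    "Retrieval-based-Voice-Conversion-WebUI": (19460, 7.9),
}

for name, (stars, growth) in projects.items():
    gain = implied_monthly_gain(stars, growth)
    print(f"{name}: about {gain:.0f} stars gained last month")
```

Under that assumption, the higher growth percentage for Retrieval-based-Voice-Conversion-WebUI also corresponds to a much larger absolute gain, because it starts from a larger base.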
WhisperSpeech
- OpenVoice: Versatile Instant Voice Cloning
I haven't tried OpenVoice, but I did try WhisperSpeech and it will do the same thing. You can optionally pass in a file with a reference voice, and the TTS uses it.
https://github.com/collabora/whisperspeech
I found it kind of creepy hearing it in my own voice. I also tried it with a friend of mine who has a French Canadian accent, and strangely the output didn't have his accent.
- Show HN: WhisperFusion – Ultra-low latency conversations with an AI chatbot
- WhisperSpeech for the text-to-speech - https://github.com/collabora/WhisperSpeech
and an LLM (phi-2, Mistral, etc.) in between
- WhisperFusion: Ultra-low latency conversations with an AI chatbot
Hi, I used the [WhisperSpeech](https://github.com/collabora/WhisperSpeech) model for the TTS part after I did some serious torch.compile optimizations to bring the latency down. The Whisper speech recognition and the LLM were optimized through TensorRT-LLM by Marcus and Vineet.
It's not perfect but I am still extremely proud of how it came out. :)
- WhisperSpeech – An Open Source text-to-speech system built by inverting Whisper
- StyleTTS2 – open-source Eleven Labs quality Text To Speech
I think you’re talking about just using Whisper to annotate audio for a TTS pipeline, but someone from Collabora actually created a TTS model directly from Whisper embeddings: https://github.com/collabora/WhisperSpeech
Retrieval-based-Voice-Conversion-WebUI
- OpenVoice: Versatile Instant Voice Cloning
RVC does live voice changing with a little latency: https://github.com/RVC-Project/Retrieval-based-Voice-Convers...
The product isn't exactly spectacular, but most of the work seems to have been done. It just needs someone to go over the UI and make it less unstable, really.
- I made a theme song for Vito Loses
Retrieval-based-Voice-Conversion-WebUI. Nearly destroyed a hard drive in the process of getting the fucking thing to train on Vito's voice but it came together eventually.
- Spider-Man: The Animated Series in REAL
Yeah, man, all the AI tools are fun. RVC is fun to play with as well.
- Ask HN: AI Voice Reverse
Would it be possible to reverse an AI-generated voice if they spoke themselves[0] instead of using TTS[1]?
Since the AI voice is trained, shouldn't a reversing AI also be able to separate the trained data?
[0] https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI
- RIAA Reports AI Vocal Cloning Site 'Voicify' to the U.S. Government
Well, fortunately, most people I see making AI covers use open source tools to do that (https://github.com/RVC-Project/Retrieval-based-Voice-Convers...)
- Open Source Libraries
RVC-Project/Retrieval-based-Voice-Conversion-WebUI: Singing Voice Conversion
- RVC WebUI and training on Intel ARC
- Lyrebird the Linux voice changer now supports PipeWire
At least that's what https://github.com/RVC-Project/Retrieval-based-Voice-Convers... links to
Realtime Voice Conversion Software using RVC : w-okada/voice-changer
- The next FF record, will have different versions with Burton's vocals...
I don't know how to use it yet, but the program is here. https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/blob/main/docs/README.en.md
- Retrieval Based Voice Conversion (WebUI)
What are some alternatives?
piper - A fast, local neural text to speech system
RVC-GUI - Just a fork of RVC for easy audio file voice conversion locally
WhisperFusion - WhisperFusion builds upon the capabilities of WhisperLive and WhisperSpeech to provide seamless conversations with an AI.
Mangio-RVC-Fork - *CREPE+HYBRID TRAINING* A very experimental fork of the Retrieval-based-Voice-Conversion-WebUI repo that incorporates a variety of other f0 methods, along with a hybrid f0 nanmedian method.
whisper-ctranslate2 - Whisper command line client compatible with original OpenAI client based on CTranslate2.
bark - 🔊 Text-Prompted Generative Audio Model
monotonic_align - Monotonic Alignment Search
ultimatevocalremovergui - GUI for a Vocal Remover that uses Deep Neural Networks.
VoiceCraft - Zero-Shot Speech Editing and Text-to-Speech in the Wild
voice-changer - Realtime Voice Changer (リアルタイムボイスチェンジャー)
whisper - Robust Speech Recognition via Large-Scale Weak Supervision
so-vits-svc-fork - so-vits-svc fork with realtime support, improved interface and more features. [Moved to: https://github.com/voicepaw/so-vits-svc-fork]