Real-Time-Voice-Cloning vs FastSpeech2

Real-Time-Voice-Cloning

Clone a voice in 5 seconds to generate arbitrary speech in real-time (by CorentinJ)


FastSpeech2

An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech" (by ming024)



Project	Real-Time-Voice-Cloning	FastSpeech2
Mentions	96	4
Stars	50,738	1,612
Growth	-	-
Activity	0.0	0.0
Latest Commit	about 1 month ago	6 months ago
Language	Python	Python
License	GNU General Public License v3.0 or later	MIT License

Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub.
Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
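
The page does not publish the formulas behind Growth and Activity, so the following is only a rough sketch of how such numbers could be computed. Every function name, the 30-day half-life, and the rank-to-score mapping are assumptions made for illustration, not the site's actual method.

```python
from bisect import bisect_left


def month_over_month_growth(stars_prev: int, stars_now: int) -> float:
    """Relative change in stars between two consecutive monthly snapshots."""
    if stars_prev <= 0:
        raise ValueError("previous star count must be positive")
    return (stars_now - stars_prev) / stars_prev


def weighted_commit_score(commit_ages_days, half_life_days: float = 30.0) -> float:
    """Sum of per-commit weights; a commit's weight halves every half_life_days.

    The exponential decay and the 30-day half-life are assumptions: the page
    only says that recent commits weigh more than older ones.
    """
    return sum(0.5 ** (age / half_life_days) for age in commit_ages_days)


def activity_from_rank(score: float, all_scores) -> float:
    """Map a raw score to a 0-10 scale by the share of projects scoring lower.

    Assumed mapping: a project that outscores 90% of tracked projects
    (that is, sits in the top 10%) gets an activity of 9.0.
    """
    ranked = sorted(all_scores)
    below = bisect_left(ranked, score)  # number of projects with a strictly lower score
    return round(10.0 * below / len(ranked), 1)
```

Under these assumptions, a project whose weighted score beats 90 of 100 tracked projects maps to 9.0, which matches the "top 10%" reading given above.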

Real-Time-Voice-Cloning

Posts with mentions or reviews of Real-Time-Voice-Cloning. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-02-12.

FLaNK Stack Weekly 12 February 2024
52 projects | dev.to | 12 Feb 2024
Voice Cloning
1 project | /r/WebmasterValley | 28 Sep 2023
Show HN: Real Time Voice Cloning – Instant DeepFake Audio
1 project | news.ycombinator.com | 22 Sep 2023
I couldn't find the AI mentioned at this minute mark, can you help?
1 project | /r/veYakinEvren | 1 Jul 2023
Alarming Rise of Voice Cloning Fraud Targeting the Elderly through AI
1 project | /r/hacking | 25 Jun 2023
Dark Brandon going hard
2 projects | /r/LivestreamFail | 8 Jun 2023
TRF-4 council removes Lava Jato judge
1 project | /r/brasil | 22 May 2023
What Photoshop Can't Do, DragGAN Can! See How! Paper Explained, Along with Additional Supplementary Video Footage
2 projects | /r/singularity | 20 May 2023

Oh maybe it is available: https://github.com/CorentinJ/Real-Time-Voice-Cloning
Regarding recent posts about AI voice generation
2 projects | /r/skyrimmods | 19 Apr 2023

I know this isn't the bulk of your argument, but if the concern is uploading, offline voice cloners have existed for years (albeit not as good as elevenlabs) but will presumably get far better in the years to come now that everyone has seen what's possible and as PC compute power continues to improve. https://github.com/CorentinJ/Real-Time-Voice-Cloning
'He Would Still Be Here': Man Dies by Suicide After Talking with AI Chatbot, Widow Says | The incident raises concerns about guardrails around quickly-proliferating conversational AI models.
1 project | /r/technology | 31 Mar 2023

FastSpeech2

Posts with mentions or reviews of FastSpeech2. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-04-13.

[D] What is the best open source text to speech model?
15 projects | /r/MachineLearning | 13 Apr 2023

FastSpeech2 submitted: Jun 8, 2020 paper: https://arxiv.org/pdf/2006.04558.pdf github: https://github.com/ming024/FastSpeech2 (Not the official implementation, but it is the one cited the most)
What voice-changing apps are available right now?
4 projects | /r/artificial | 29 Jun 2022

We have the TorToiSe repo, the SV2TTS repo, and from here you have the other models like Tacotron 2, FastSpeech 2, and such. There is a lot that goes into training a baseline for these models on the LJSpeech and LibriTTS datasets. Fine-tuning is left up to the user.
I'm looking for something self-hosted, preferably linux-based (though win or mac will work too), that will allow me to train a 'voice model' with pre-recorded speech, and then replicate it from text of my choice.
3 projects | /r/VocalSynthesis | 20 Feb 2022
Voice-cloning library for conlangs?
3 projects | /r/conlangs | 9 Nov 2021

As for synthesizing text in your own voice, you can dig into Real Time Voice Cloning or maybe FastSpeech2, but I am not sure whether you can use them with conlangs (and, given the nature of ML, you need a great deal of training data to get anything interesting).

What are some alternatives?

When comparing Real-Time-Voice-Cloning and FastSpeech2 you can also consider the following projects:

TTS - 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Parallel-Tacotron2 - PyTorch Implementation of Google's Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling

DeepFaceLab - DeepFaceLab is the leading software for creating deepfakes.

tacotron2 - Tacotron 2 - PyTorch implementation with faster-than-realtime inference

tortoise-tts - A multi-voice TTS system trained with an emphasis on quality

voice100 - Voice100 includes neural TTS/ASR models. Inference of Voice100 is low cost as its models are tiny and only depend on CNN without autoregression.

MockingBird - 🚀AI拟声: 5秒内克隆您的声音并生成任意语音内容 Clone a voice in 5 seconds to generate arbitrary speech in real-time

tacotron - A TensorFlow implementation of Google's Tacotron speech synthesis with pre-trained model (unofficial)

silero-models - Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple

glados-voice-assistant - DIY Voice Assistant based on the GLaDOS character from Portal video game series. Works with home assistant!

vits - VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

