Mellotron: a multispeaker voice synthesis model based on Tacotron 2 GST that can make a voice emote and sing without emotive or singing training data
Why do you think that https://github.com/CorentinJ/Real-Time-Voice-Cloning is a good alternative to Mellotron?