[P] TorToiSe - a true zero-shot multi-voice TTS engine

This page summarizes the projects mentioned and recommended in the original post on reddit.com/r/MachineLearning

  • tortoise-tts

    A multi-voice TTS system trained with an emphasis on quality

    I'd like to show off a TTS system I have been working on for the past year. I've open-sourced all the code and the trained model weights: https://github.com/neonbjb/tortoise-tts

  • yt-dlp

    A youtube-dl fork with additional features and fixes

    Had a lot of fun using this. Mimicked some friends' voices, as well as some semi-well-known streamers from YouTube via yt-dlp.

  • espnet

    End-to-End Speech Processing Toolkit

    CMU WavLab has ESPnet (https://espnet.github.io/espnet/), which includes a number of high-quality TTS models, including VITS (which, in my subjective experience, is just as good as what is demonstrated here). Inference speed on the various ESPnet pretrained TTS models is also reasonable: on my thoroughly mid-range PC, generating the waveform takes about 5 seconds per word on average.

