Voice-cloning library for conlangs?

This page summarizes the projects mentioned and recommended in the original post on /r/conlangs.

  • Real-Time-Voice-Cloning

    Clone a voice in 5 seconds to generate arbitrary speech in real-time

  • As for synthesizing text in your own voice: you could dig into Real Time Voice Cloning or maybe FastSpeech2, but I am not sure whether they can be used with conlangs (and, because these are machine-learning models, you need a great deal of training data to get anything interesting).

  • FastSpeech2

    An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"

  • voice100

    Voice100 includes neural TTS/ASR models. Inference with Voice100 is low-cost because its models are tiny and rely only on CNNs, without autoregression.

  • If you would like to use your romanization, then yes, you first need some way to perform grapheme-to-phoneme transcription. I dug around for a bit and found something that looks pretty basic, where you can easily write your own phonemizer: https://github.com/kaiidams/voice100. I am not sure how good this model is, since it is made to run on small devices, but you may want to play with it.
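
To make the grapheme-to-phoneme step above concrete, here is a minimal sketch of a rule-based phonemizer for a romanized conlang. It does not use voice100's own phonemizer interface. The `G2P_RULES` table, the `phonemize` function, and every romanization-to-IPA mapping in it are hypothetical and would need to be replaced with the rules of your own language. The sketch matches the longest romanized chunk first, so digraphs such as "sh" win over single letters.

```python
# Hypothetical romanization-to-IPA table for an invented language.
# Every mapping here is an illustrative assumption, not taken from voice100.
G2P_RULES = {
    "sh": "ʃ", "th": "θ", "ng": "ŋ", "aa": "aː",
    "a": "a", "e": "e", "i": "i", "o": "o", "u": "u",
    "k": "k", "l": "l", "m": "m", "n": "n",
    "r": "ɾ", "s": "s", "t": "t",
}


def phonemize(text, rules=G2P_RULES):
    """Convert romanized text to a list of phoneme symbols.

    Uses greedy longest-match: at each position, try the longest
    grapheme chunk first so that digraphs beat single letters.
    Words are separated by a single " " token in the output.
    """
    max_len = max(len(grapheme) for grapheme in rules)
    phonemes = []
    for word in text.lower().split():
        i = 0
        while i < len(word):
            # Try chunks from longest to shortest at position i.
            for size in range(min(max_len, len(word) - i), 0, -1):
                chunk = word[i:i + size]
                if chunk in rules:
                    phonemes.append(rules[chunk])
                    i += size
                    break
            else:
                # No rule matched: fail loudly rather than guess a sound.
                raise ValueError(f"no rule for {word[i]!r} in {word!r}")
        phonemes.append(" ")
    return phonemes[:-1]  # drop the trailing word separator


if __name__ == "__main__":
    print(phonemize("shaan ti"))
```

The resulting phoneme list is what you would hand to whatever text encoder a TTS model expects. A lookup table like this only works for regular orthographies, which most conlang romanizations are; context-dependent rules (for example, a letter that changes sound between vowels) would need an extra pass.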
