
-
> Last time I looked into TTS systems for German, Google was the only game in town. What I wouldn't give for a viable alternative! It doesn't even need to be open source, I'd be quite ready to pay top dollar.
Will you still pay top dollar if it is open source though? :D
Piper TTS[0] (MIT licensed; developed by the main dev of Larynx TTS, Mimic3 TTS & the Rhasspy voice assistant) supports ~30 languages, at least some of which have multiple voices available, in a range of quality levels & data licenses.
And, potentially fortuitous for your needs, there's at least one German voice that was recorded[1] specifically for Piper[2] (with emotion variants and CC0-licensed, no less :) )...
Check out `thorsten` & `thorsten_emotional` on the samples page: https://rhasspy.github.io/piper-samples/
I can't speak to the quality of the German voice specifically but for English at least I've found Piper's quality & range of voices of use[3].
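If you want to try a voice locally rather than just listening to the samples page, here's a minimal sketch that pipes text to the `piper` command-line tool from Python. It assumes the `piper` executable is on your PATH and that you've already downloaded a voice model (plus its `.json` config) next to the script. The file name `de_DE-thorsten-medium.onnx` in the usage note is my assumption about how the voice file is named, so check the voices repo for the actual name. The helper names are just made up for illustration.

```python
import shutil
import subprocess


def build_piper_command(model_path: str, output_path: str) -> list[str]:
    """Return the argv for a Piper CLI call that writes a WAV file."""
    return ["piper", "--model", model_path, "--output_file", output_path]


def synthesize(text: str, model_path: str, output_path: str) -> None:
    """Send text to Piper on stdin; raise if the binary is missing or fails."""
    if shutil.which("piper") is None:
        raise FileNotFoundError("piper executable not found on PATH")
    subprocess.run(
        build_piper_command(model_path, output_path),
        input=text.encode("utf-8"),  # Piper reads the text to speak from stdin
        check=True,                  # surface a non-zero exit as an exception
    )
```

Usage would be something like `synthesize("Guten Morgen!", "de_DE-thorsten-medium.onnx", "morgen.wav")`, with the model path swapped for whichever voice you actually downloaded.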
---- footnotes ----
[0] https://github.com/rhasspy/piper
[1] https://www.youtube.com/playlist?list=PL19C7uchWZeojjI5FUk3q...
[2] In addition to other German voices based on other sources: https://huggingface.co/rhasspy/piper-voices/tree/main/de/de_...
[3] Somewhat of an understatement.
-
You can try running the ChatTTS inference notebook:
https://github.com/2noise/ChatTTS/blob/main/infer.ipynb
-
My interest in TTS is around "indie" game creation, animation and "radio plays".
A couple of years ago I started development of a tool to help with the generation of game audio such as NPC dialogue, "barks" or narration for those without access to/budget for human voice actors: https://rancidbacon.itch.io/dialogue-tool-for-larynx-text-to...
One thing I found interesting is that writing a small "scene" and then hearing dialogue being spoken by a variety of voices often prompted the writing of further lines of dialogue in response to perceived emotion contained in voices in the generated output. Plus it was just fun. :)
The version of the tool on that page is based on Larynx TTS which has continued development more recently as Piper TTS: https://github.com/rhasspy/piper
I have yet to publish my port which uses Piper TTS, though: https://gitlab.com/RancidBacon/larynx-dialogue/-/tree/featur...
Though I did upload some sample output (including some "radio announcer" samples in response to a HN comment :) ): https://rancidbacon.gitlab.io/piper-tts-demos/
Obviously there are variations in voice quality, and the ability to control expression is currently limited, but it beats hearing my own voice. :D
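To give a rough idea of the "scene with a cast of voices" workflow, here's a small sketch of how a script could be turned into per-line synthesis jobs. To be clear, this is not how the dialogue tool above is implemented. The `SPEAKER: line` script format, the `Job` tuple, `plan_scene` and the voice file names are all hypothetical, just one plausible way to map speakers to voices before handing each line to a TTS engine.

```python
import re
from typing import NamedTuple


class Job(NamedTuple):
    voice: str        # path or name of the voice model for this speaker
    text: str         # the line of dialogue to synthesize
    output_path: str  # numbered so files sort in scene order


def plan_scene(script: str, cast: dict[str, str]) -> list[Job]:
    """Turn 'SPEAKER: line' text into an ordered list of synthesis jobs."""
    jobs: list[Job] = []
    for raw in script.splitlines():
        line = raw.strip()
        if not line or ":" not in line:
            continue  # skip blank lines and anything that isn't dialogue
        speaker, text = line.split(":", 1)
        speaker = speaker.strip()
        voice = cast[speaker]  # raises KeyError for a speaker with no voice
        slug = re.sub(r"[^a-z0-9]+", "-", speaker.lower()).strip("-")
        jobs.append(Job(voice, text.strip(), f"{len(jobs) + 1:03d}_{slug}.wav"))
    return jobs
```

Each `Job` can then be passed to whatever engine you use. Keeping the planning step separate makes it easy to re-cast a scene with different voices and listen again.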