Balacoon: Python package for text-to-speech

RHVoice

13 1,425 8.1 C++

a free and open source speech synthesizer for Russian and other languages

Interesting. Some random questions:
- How easy is it to make a new voice? What about a new voice in a new language?
- Have you ever looked at SAPI? Is it possible to make a SAPI bridge for this on Windows?
- How does it fit with other systems, like Coqui and RHVoice? https://github.com/RHVoice/RHVoice

en_us_normalization

1 0 10.0 Python

Grammars for en_us text normalization

I did not release the training parts needed to build voices. I am considering it, but there are already so many packages (coqui, espnet, piper, nemo, fairseq, to name a few) that I focused on usability for now. Support for new languages is a different question. Everyone wants to train fancy neural nets, but supporting a new language is about writing rules and having language expertise. I did it for English (https://github.com/balacoon/en_us_normalization/tree/c1019cf878aa6baf25d6fff719cf418cca5a3107/production/classify). Doing it for all the other languages would probably take me a lifetime. Other speech synthesis solutions use the 17-year-old espeak for this purpose (https://github.com/espeak-ng/espeak-ng/blob/master/docs/languages.md). I introduced a fallback to it in Balacoon too. But generally, it is outdated technology, and I believe we should do better.
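
To illustrate what "writing rules" for a language can look like, here is a minimal sketch of rule-based text normalization in plain Python. It is not how en_us_normalization is implemented; it swaps in simple regular expressions and lookup tables purely for illustration. The names `ABBREVIATIONS`, `number_to_words`, and `normalize` are hypothetical, and the rules cover only a tiny slice of English (two abbreviations and integers from 0 to 999).

```python
import re

# Hypothetical rule tables for illustration only; not taken from any real package.
ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]
ABBREVIATIONS = {"Dr.": "doctor", "etc.": "et cetera"}


def number_to_words(n: int) -> str:
    """Spell out an integer in the range 0..999."""
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("" if ones == 0 else " " + ONES[ones])
    hundreds, rest = divmod(n, 100)
    words = ONES[hundreds] + " hundred"
    return words if rest == 0 else words + " " + number_to_words(rest)


def normalize(text: str) -> str:
    """Apply the hand-written rules in a fixed order."""
    # Rule 1: expand known abbreviations.
    for abbreviation, expansion in ABBREVIATIONS.items():
        text = text.replace(abbreviation, expansion)
    # Rule 2: spell out standalone integers of up to three digits.
    return re.sub(
        r"\b\d{1,3}\b",
        lambda match: number_to_words(int(match.group())),
        text,
    )


print(normalize("Dr. Smith has 42 cats"))
```

Even this toy version shows why the work is language-specific: every table and every pattern encodes knowledge about English, and dates, currencies, ordinals, and decimals would each need further rules.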

espeak-ng

25 2,858 7.2 C

eSpeak NG is an open source speech synthesizer that supports more than a hundred languages and accents.

piper

33 3,902 8.9 C++

A fast, local neural text to speech system (by rhasspy)

My own favorites other than Balacoon would be Piper (https://github.com/rhasspy/piper) and espnet (https://espnet.github.io/espnet/notebook/espnet2_tts_realtime_demo.html).

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.
