Top 23 TTS Open-Source Projects

Real-Time-Voice-Cloning

96 50,738 0.0 Python

Clone a voice in 5 seconds to generate arbitrary speech in real-time

Project mention: FLaNK Stack Weekly 12 February 2024 | dev.to | 2024-02-12

MockingBird

9 33,796 5.8 Python

🚀 AI voice cloning: Clone a voice in 5 seconds to generate arbitrary speech in real-time
TTS

231 29,174 9.5 Python

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Project mention: OpenAI deems its voice cloning tool too risky for general release | news.ycombinator.com | 2024-03-31

lol this marketing technique is getting very old. https://github.com/coqui-ai/TTS is already amazing and open source.

LocalAI

82 19,593 9.9 C++

🤖 The free, open-source OpenAI alternative. Self-hosted, community-driven and local-first. Drop-in replacement for OpenAI running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more model architectures. It can generate text, audio, video, and images, and also has voice cloning capabilities.

Project mention: Drop-In Replacement for ChatGPT API | news.ycombinator.com | 2024-01-24

OpenVoice

12 17,263 8.8 Python

Instant voice cloning by MyShell.

Project mention: Ask HN: Voice ID adoption at financial institutions | news.ycombinator.com | 2024-04-03

Given the inevitability of easy voice cloning[1], it seems irresponsible to be using voice as a positive authentication signal.
Unfortunately, major US financial institutions seem to be ramping up adoption of this technology[2].
Am I missing something?
[1] https://github.com/myshell-ai/OpenVoice

PaddleSpeech

6 10,120 7.6 Python

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

Project mention: Open Source Libraries | /r/AudioAI | 2023-10-02

PaddlePaddle/PaddleSpeech

NeMo

29 10,021 9.8 Python

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Project mention: [P] Making a TTS voice, HK-47 from Kotor using Tortoise (Ideally WaveRNN) | /r/MachineLearning | 2023-07-06

I haven't tested WaveRNN, but of the ones I know, the best open-source option is FastPitch. It's also easy to use, and there is a tutorial for voice cloning.

TTS

62 8,784 0.0 Jupyter Notebook

🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts) (by mozilla)

Project mention: Coqui.ai Is Shutting Down | news.ycombinator.com | 2024-01-03

Coqui-ai was a commercial continuation of Mozilla TTS and STT (https://github.com/mozilla/TTS).
At the time (2018-ish), it was really impressive for on-device voice synthesis (with a quality approaching the Google and Azure cloud-based voice synthesis options) and open source, so a lot of people in the FOSS community were hoping it could be used for a privacy-respecting home assistant, Linux speech synthesis that doesn't suck, etc.
After Mozilla abandoned the project, Coqui continued development and had some really impressive one-shot voice cloning, but pivoted to marketing speech synthesis for game developers. They were probably having trouble monetizing it, and it doesn't surprise me that they shut down.
An equivalent project that's still in active development and doing really well is Piper TTS (https://github.com/rhasspy/piper).

VALL-E-X

2 7,138 8.8 Python

An open source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io

Project mention: FLaNK Stack Weekly 12 February 2024 | dev.to | 2024-02-12

EmotiVoice

5 6,270 8.9 Python

EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

Project mention: FLaNK Stack Weekly 12 February 2024 | dev.to | 2024-02-12

vits

6 6,230 0.0 Python

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

Project mention: [D] TTS systems to download & run offline | /r/MachineLearning | 2023-05-14

And the voice encapsulation system VITS https://github.com/jaywalnut310/vits

silero-models

32 4,534 4.7 Jupyter Notebook

Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple

Project mention: Weird A.I. Yankovic, a cursed deep dive into the world of voice cloning | news.ycombinator.com | 2023-10-02

I doubt it's currently actually "the best open source text to speech", but the answer I came up with when throwing a couple of hours at the problem some months ago was "Silero" [0, 1].
Following the "standalone" guide [2], it was pretty trivial to make the model render my sample text in about 100 English "voices" (many of which were similar to each other, and in varying quality). Sampling those, I got about 10 that were pretty "good". And maybe 6 that were the "best ones" (pretty natural, not annoying to listen to).
IIRC the license was free for noncommercial use only. I'm not sure exactly "how open source" they are, but it was simple to install the dependencies and write the basic Python to try it out; I had to write a for loop to try all the voices like I wanted. I ended up using something else for the project for other reasons, but this could still be a fairly good backup option for some use cases IMO.
  [0] https://github.com/snakers4/silero-models#text-to-speech
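
The "for loop to try all the voices" mentioned in the comment above could look roughly like the sketch below. It is an illustration, not the commenter's code. `render_all_voices` and `silero_backend` are hypothetical names. The `torch.hub.load` arguments, `model.speakers`, and `model.save_wav` follow the Silero README as best understood here and should be treated as assumptions to check against the linked repository.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def render_all_voices(
    synthesize: Callable[[str, str, Path], None],
    voices: Iterable[str],
    text: str,
    out_dir: Path,
) -> List[Path]:
    """Render `text` once per voice and return the files that were requested."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written: List[Path] = []
    for voice in voices:
        target = out_dir / f"{voice}.wav"
        synthesize(text, voice, target)  # backend-specific call
        written.append(target)
    return written


def silero_backend():
    """Hypothetical adapter for Silero TTS (assumed API, needs torch and network)."""
    import torch  # third-party; imported lazily so the loop above stays dependency-free

    # Assumption: model name, language and speaker pack as shown in the Silero README.
    model, _example_text = torch.hub.load(
        repo_or_dir="snakers4/silero-models",
        model="silero_tts",
        language="en",
        speaker="v3_en",
    )

    def synthesize(text: str, voice: str, target: Path) -> None:
        # Assumption: save_wav accepts these keyword arguments.
        model.save_wav(text=text, speaker=voice, sample_rate=48000, audio_path=str(target))

    return synthesize, list(model.speakers)
```

With the adapter in place, `synthesize, voices = silero_backend()` followed by `render_all_voices(synthesize, voices, "Sample text", Path("out"))` would produce one file per voice for side-by-side listening.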

DiffSinger

1 4,102 2.5 Python

DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code
piper

33 3,902 8.9 C++

A fast, local neural text to speech system (by rhasspy)

Project mention: WhisperSpeech – An Open Source text-to-speech system built by inverting Whisper | news.ycombinator.com | 2024-01-17

If you're not already aware, the primary developer of Mimic 3 (and its non-Mimic predecessor Larynx) continued TTS-related development with Larynx and the renamed project Piper: https://github.com/rhasspy/piper
Last year Piper development was supported by Nabu Casa for their "Year of Voice" project for Home Assistant and it sounds like Mike Hansen is going to continue on it with their support this year.

TensorFlowTTS

6 3,697 0.0 Python

😝 TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2 (supports English, French, Korean, Chinese, and German, and is easy to adapt to other languages)

Project mention: Ask HN: On-Device Text to Speech | news.ycombinator.com | 2023-08-31

Hey HN, has anyone found a viable solution for doing this locally and offline on iOS? I'd like to offer a privacy-friendly text to speech feature to my App, and Apple's speech synthesis sounds awful compared to some newer models and TTS engines. The only thing I've found is an older TensorflowTTS example here: https://github.com/TensorSpeech/TensorFlowTTS/tree/master/examples/ios
Any pointers or tips appreciated.

edge-tts

4 3,503 6.4 Python

Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key

Project mention: [discussion] text to voice generation for textbooks (non-math part) | /r/MachineLearning | 2023-12-01

i would very much like to use it to turn the text parts of a book into audio that i could listen to while reading. i used edge's tts for speech by copying a paragraph to the clipboard and passing it to edge-tts in order to listen to the text, but it causes two problems: 1. you need an internet connection and have to keep the book open 2. it can only go paragraph by paragraph, and is prone to errors, or sometimes if you use it too much it won't convert the full text afterwards.
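
One way to ease the paragraph-by-paragraph workflow described above is to batch paragraphs into larger chunks before sending them to the service. The sketch below is a minimal illustration under stated assumptions: `chunk_paragraphs` and `speak_chunks` are hypothetical helpers, the 2000-character limit is an arbitrary choice rather than a documented service limit, and the voice name is only an example. It does not remove the need for an internet connection.

```python
import re
from typing import List


def chunk_paragraphs(text: str, max_chars: int = 2000) -> List[str]:
    """Group blank-line-separated paragraphs into chunks of at most `max_chars`.

    A single paragraph longer than `max_chars` is kept whole rather than split.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks: List[str] = []
    current = ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chars:
            chunks.append(current)  # current chunk is full, start a new one
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


async def speak_chunks(chunks: List[str], voice: str = "en-US-AriaNeural", prefix: str = "part") -> None:
    """Save one MP3 per chunk with edge-tts (third-party, online service)."""
    import edge_tts  # imported lazily so chunk_paragraphs works without it

    for index, chunk in enumerate(chunks):
        await edge_tts.Communicate(chunk, voice).save(f"{prefix}_{index:03d}.mp3")
```

Running `asyncio.run(speak_chunks(chunk_paragraphs(book_text)))` would then write numbered audio files that can be played back in order.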

WhisperSpeech

5 3,329 9.2 Jupyter Notebook

An Open Source text-to-speech system built by inverting Whisper.

Project mention: OpenVoice: Versatile Instant Voice Cloning | news.ycombinator.com | 2024-03-29

I haven't tried openvoice, but I did try whisperspeech and it will do the same thing. You can optionally pass in a file with a reference voice, and the tts uses it.
https://github.com/collabora/whisperspeech
I found it to be kind of creepy hearing it in my own voice. I also tried a friend of mine who had a French Canadian accent and strangely the output didn't have his accent.

tacotron

3 2,921 0.0 Python

A TensorFlow implementation of Google's Tacotron speech synthesis with pre-trained model (unofficial)
vall-e

3 2,868 0.0 Python

An unofficial PyTorch implementation of the audio LM VALL-E
awesome-speech-recognition-speech-synthesis-papers

0 2,870 3.5

Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)
lingvo

1 2,780 8.7 Python

Lingvo
polyglot

1 2,487 7.1 TypeScript

🤖️ Cross-platform AI language practice app (by liou666)

Project mention: What Chinese-speaking chatbots are available? | /r/ChineseLanguage | 2023-05-08

polyglot, downloadable for Mac and Windows.

aeneas

4 2,379 0.0 Python

aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

TTS related posts

Using Groq to Build a Real-Time Language Translation App
3 projects | dev.to | 5 Apr 2024
Ask HN: Voice ID adoption at financial institutions
1 project | news.ycombinator.com | 3 Apr 2024
OpenVoice: Versatile Instant Voice Cloning
7 projects | news.ycombinator.com | 29 Mar 2024
OpenAI: Navigating the Challenges and Opportunities of Synthetic Voices
1 project | news.ycombinator.com | 29 Mar 2024
WhisperFusion: Ultra-low latency conversations with an AI chatbot
2 projects | news.ycombinator.com | 25 Jan 2024
WhisperSpeech – An Open Source text-to-speech system built by inverting Whisper
9 projects | news.ycombinator.com | 17 Jan 2024
Building a local AI smart Home Assistant
11 projects | news.ycombinator.com | 13 Jan 2024

Index

What are some of the best open-source TTS projects? This list will help you:

#	Project	Stars
1	Real-Time-Voice-Cloning	50,738
2	MockingBird	33,796
3	TTS	29,174
4	LocalAI	19,593
5	OpenVoice	17,263
6	PaddleSpeech	10,120
7	NeMo	10,021
8	TTS	8,784
9	VALL-E-X	7,138
10	EmotiVoice	6,270
11	vits	6,230
12	silero-models	4,534
13	DiffSinger	4,102
14	piper	3,902
15	TensorFlowTTS	3,697
16	edge-tts	3,503
17	WhisperSpeech	3,329
18	tacotron	2,921
19	vall-e	2,868
20	awesome-speech-recognition-speech-synthesis-papers	2,870
21	lingvo	2,780
22	polyglot	2,487
23	aeneas	2,379