Top 23 Python speech-synthesis Projects

TTS

1 242 42,018 8.1 Python

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Project mention: Build Your Own Clone: Best Open-Source AI Tools | dev.to | 2025-06-27

2. Coqui AI
NeMo

2 31 15,552 9.9 Python

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Project mention: FFmpeg 8.0 adds Whisper support | news.ycombinator.com | 2025-08-13

git clone https://github.com/NVIDIA/NeMo.git nemo
PaddleSpeech

3 6 12,189 8.8 Python

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
espnet

4 15 9,408 9.9 Python

End-to-End Speech Processing Toolkit
Amphion

5 6 9,311 7.6 Python

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

Project mention: AIM Weekly for 04Nov2024 | dev.to | 2024-11-04

🌐 Composed Image Retrieval 📎 Intro to Multimodal LLama 3.2 🛠️ Multi Agent Concierge 💻 RAG with Langchain Granite, Milvus 🫶 Download content ✅ Transformer Replacement? 🤖 vLLM for running models 🌐 Amphion 📝 Autogluon 🚙 Notebook LLama like Google's Notebook LLM 🫶 Monocle2ai for tracing GenAI app code LFA&D Project 🤖 Bee Agent Framework ✅ LLama RFP Response ▶️ GenAI Script 👽 Simular AI Agent S 🦾 DrawDB with AI ✨ Ollama with LLama 3.2 Vision!!!! Preview 🚕 Powerful RAG Checker 📊 SQL Generator 💻 Role of LLMs 🐍 Document Extraction 🕶️ Open Source Vector DB Reddit 🍔 The Practical Guide to Self Hosting LLM 🦾 Stagehand Controller 🕶️ Understanding HNSWLIB 🐍 Best practices in RAG 💻 Enigma Agent 📝 Langchain, Ollama, Phi3 for Function Calling 🔋 Compass Judger 📝 Princeton NLP SimPO 🍔 Princeton NLP ProLong 🔋 Princeton NLP HELMET 🧐 Ollama Cheatsheet 🚕 Princeton NLP CopyCat 📊 Princeton NLP Shp 🕶️ Can LLM Solve Hard Github Issues 📝 Enabling Large Language Models to Generate Text with Citations 🔋 Princeton NLP CharXiv 📊 Awesome AI Agents List 🦾 Nomic’s Matryoshka text embedding model
so-vits-svc-fork

6 16 9,101 2.4 Python

so-vits-svc fork with realtime support, improved interface and more features.
edge-tts

7 9 8,956 7.6 Python

Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key

Project mention: Show HN: Voice Cloning and Multilingual TTS in One Click (Windows) | news.ycombinator.com | 2025-01-26

There is an MIT license in the repo. In that sense it's open source.
It's using "Edge TTS", which I believe means use API keys stolen [1] from Microsoft Edge and hope Microsoft doesn't sue you, non jolly-roger flying internet users beware.
Can't speak to other models and their licenses, I stopped looking after I saw this since I don't feel the need to use this.
[1] https://github.com/rany2/edge-tts/blob/ac41fb85ab2b2b48fef8a...
EmotiVoice

8 5 8,124 7.9 Python

EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
vits

9 6 7,466 0.0 Python

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
StyleTTS2

10 7 5,876 7.7 Python

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
DiffSinger

11 1 4,592 2.1 Python

DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code
voice-pro

12 11 4,406 8.6 Python

Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal isolation, and multilingual translation.

Project mention: Voice-Pro: Ultimate AI Voice Conversion and Multilingual Translation Tool 🔊 | dev.to | 2025-02-10

GitHub: https://github.com/abus-aikorea/voice-pro
speech-to-speech

13 3 4,152 8.7 Python

Speech To Speech: an effort for an open-sourced and modular GPT4-o

Project mention: Moonshine, the new state of the art for speech to text | news.ycombinator.com | 2024-10-27

I gave this a shot using speech-to-speech¹ modified so that it skips the LLM/AI assistant part and just repeats back what it thinks I said and displays the text.
For longer sentences my perception is that Moonshine performs at 80-90% of what Whisper² could do, while using considerably less resources. When trying shorter, two-word utterances it nosedived for some reason.
These numbers don't mean much, but when paired with MeloTTS, Moonshine and Whisper² ate up 1.2 and 2.5 GB of my GPU's memory, respectively.
¹ https://github.com/huggingface/speech-to-speech
metavoice-src

14 5 4,142 7.8 Python

Foundational model for human-like, expressive TTS

Project mention: Ask HN: What is the state of OSS voice cloning? | news.ycombinator.com | 2024-09-30
TensorFlowTTS

15 6 3,948 0.0 Python

😝 TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2 (supports English, French, Korean, Chinese, and German, and is easy to adapt to other languages)
RealtimeTTS

16 1 3,426 9.1 Python

Converts text to speech in realtime
tacotron

17 3 2,980 0.0 Python

A TensorFlow implementation of Google's Tacotron speech synthesis with pre-trained model (unofficial)
abogen

18 2 2,991 9.5 Python

Generate audiobooks from EPUBs, PDFs and text with synchronized captions.

Project mention: Abogen – Generate audiobooks from EPUBs, PDFs and text | news.ycombinator.com | 2025-08-09

It's probably due to the unusual sound format, 24kHz PCM, and the fact that it was somehow forced into a WebM container, which only supports the Vorbis and Opus formats.
It looks like they created it using the "higher quality" ffmpeg command line, except for the "webm" final extension, producing the opposite of what's described as "an MP4 file that's compatible with more devices".
https://github.com/denizsafak/abogen/tree/main/demo#for-high...
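The container point in the comment above can be sketched as a small check. This is a minimal illustration, assuming only what the comment states (WebM accepts Vorbis and Opus audio); the name `webm_accepts` is hypothetical, and `pcm_s16le` is used as one example of a raw PCM codec name.

```python
# Audio codecs a WebM container accepts, per the comment above.
WEBM_AUDIO_CODECS = frozenset({"vorbis", "opus"})


def webm_accepts(codec: str) -> bool:
    """Return True if the named audio codec is allowed in a WebM container."""
    return codec.strip().lower() in WEBM_AUDIO_CODECS


if __name__ == "__main__":
    # Raw PCM is not in the allowed set, so a PCM stream placed in a .webm
    # file is the kind of mismatch the comment describes.
    for name in ("opus", "vorbis", "pcm_s16le"):
        print(f"{name}: {'ok' if webm_accepts(name) else 'not valid in WebM'}")
```

Under this assumption, a PCM track would need to be re-encoded to Opus or Vorbis before it fits the container.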
lingvo

19 1 2,853 6.9 Python

Lingvo
Tacotron-2

20 1 2,309 0.0 Python

DeepMind's Tacotron-2 Tensorflow implementation
hifi-gan

21 5 2,189 0.0 Python

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
WaveRNN

22 5 2,166 0.0 Python

WaveRNN Vocoder + TTS
kalliope

23 4 1,737 0.0 Python

Kalliope is a framework that will help you to create your own personal assistant.

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python speech-synthesis related posts

Kitten TTS: 25MB CPU-Only, Open-Source Voice Model

19 projects | news.ycombinator.com | 5 Aug 2025
Show HN: Voice Cloning and Multilingual TTS in One Click (Windows)

2 projects | news.ycombinator.com | 26 Jan 2025
Edge TTS

4 projects | news.ycombinator.com | 22 Jan 2025
Show HN: Voice-Pro – AI Voice Cloning Magic: Transform Any Voice in 15 Seconds

10 projects | news.ycombinator.com | 27 Nov 2024
Play 3.0 mini – A lightweight, reliable, cost-efficient Multilingual TTS model

5 projects | news.ycombinator.com | 14 Oct 2024
Show HN: Offline audiobook from any format with one CLI command

7 projects | news.ycombinator.com | 6 Oct 2024
Ask HN: What is the state of OSS voice cloning?

6 projects | news.ycombinator.com | 30 Sep 2024

Index

What are some of the best open-source speech-synthesis projects in Python? This list will help you:

#	Project	Stars
1	TTS	42,018
2	NeMo	15,552
3	PaddleSpeech	12,189
4	espnet	9,408
5	Amphion	9,311
6	so-vits-svc-fork	9,101
7	edge-tts	8,956
8	EmotiVoice	8,124
9	vits	7,466
10	StyleTTS2	5,876
11	DiffSinger	4,592
12	voice-pro	4,406
13	speech-to-speech	4,152
14	metavoice-src	4,142
15	TensorFlowTTS	3,948
16	RealtimeTTS	3,426
17	tacotron	2,980
18	abogen	2,991
19	lingvo	2,853
20	Tacotron-2	2,309
21	hifi-gan	2,189
22	WaveRNN	2,166
23	kalliope	1,737
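
The list is described as ordered by GitHub stars. A minimal sketch for checking that claim against the table, assuming the rows are copied exactly as shown above; the helper name `out_of_order` is hypothetical.

```python
# Rows copied from the Index table: (rank, project, stars).
ROWS = [
    (1, "TTS", 42018), (2, "NeMo", 15552), (3, "PaddleSpeech", 12189),
    (4, "espnet", 9408), (5, "Amphion", 9311), (6, "so-vits-svc-fork", 9101),
    (7, "edge-tts", 8956), (8, "EmotiVoice", 8124), (9, "vits", 7466),
    (10, "StyleTTS2", 5876), (11, "DiffSinger", 4592), (12, "voice-pro", 4406),
    (13, "speech-to-speech", 4152), (14, "metavoice-src", 4142),
    (15, "TensorFlowTTS", 3948), (16, "RealtimeTTS", 3426),
    (17, "tacotron", 2980), (18, "abogen", 2991), (19, "lingvo", 2853),
    (20, "Tacotron-2", 2309), (21, "hifi-gan", 2189), (22, "WaveRNN", 2166),
    (23, "kalliope", 1737),
]


def out_of_order(rows):
    """Return adjacent row pairs where the lower-ranked row has more stars."""
    return [(a, b) for a, b in zip(rows, rows[1:]) if a[2] < b[2]]


if __name__ == "__main__":
    for higher, lower in out_of_order(ROWS):
        print(f"rank {higher[0]} ({higher[1]}, {higher[2]:,}) has fewer stars "
              f"than rank {lower[0]} ({lower[1]}, {lower[2]:,})")
```

With the figures as listed, the only pair this flags is ranks 17 and 18 (tacotron at 2,980 and abogen at 2,991), which suggests the star counts and ranks were captured at slightly different times.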

Did you know that Python is the 2nd most popular programming language based on number of references?