Python text-to-speech

Open-source Python projects categorized as text-to-speech

Top 23 Python text-to-speech Projects

  1. unsloth

    Unsloth Studio is a web UI for training and running open models like Gemma 4, Qwen3.6, DeepSeek, gpt-oss locally.

    Project mention: I Trained an LLM on 75K of My Own Messages So It Would Stop Writing Like a Chatbot | dev.to | 2026-05-08

    Training: unsloth + trl (SFTTrainer). Unsloth handles the 4-bit quantization and gradient checkpointing; trl handles the training loop.

  3. GPT-SoVITS

    1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

  4. TTS

    🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

    Project mention: My Journey to a reliable and enjoyable locally hosted voice assistant | news.ycombinator.com | 2026-03-16

    actually the hardest part of a locally hosted voice assistant isn't the llm. it's making the tts tolerable to actually talk to every day.

    the core issue is prosody: kokoro and piper are trained on read speech, but conversational responses have shorter breath groups and different stress patterns on function words. that's why numbers, addresses, and hedged phrases sound off even when everything else works.

    the fix is training data composition. conversational and read speech have different prosody distributions and models don't generalize across them. for self-hosted, coqui xtts-v2 [1] is worth trying if you want more natural english output than kokoro.

    btw i'm lily, cofounder of rime [2]. we're solving this for business voice agents at scale, not really the personal home assistant use case, but the underlying problem is the same.

    [1] https://github.com/coqui-ai/TTS

  5. ChatTTS

    A generative speech model for daily dialogue.

  6. MockingBird

    🚀 Clone a voice in 5 seconds to generate arbitrary speech in real-time

  7. OpenVoice

    Instant voice cloning by MIT and MyShell. Audio foundation model.

    Project mention: 5 must know open-source repositories to build cool AI apps | dev.to | 2025-10-29

    Star the Open Voice repository ⭐

  8. VoxCPM

    VoxCPM2: Tokenizer-Free TTS for Multilingual Speech Generation, Creative Voice Design, and True-to-Life Cloning

    Project mention: Rust RAG, Tokenizer-Free TTS (VoxCPM2), & Project NOMAD: Local AI & Offline Deployments | dev.to | 2026-05-30

    Source: https://github.com/OpenBMB/VoxCPM

  9. CosyVoice

    Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

    Project mention: CosyVoice 2025 Complete Guide: The Ultimate Multi-lingual Text-to-Speech Solution | dev.to | 2025-12-15

    git clone --recursive https://github.com/FunAudioLLM/CosyVoice.git
    cd CosyVoice
    # If submodule cloning fails due to network issues
    git submodule update --init --recursive

  10. index-tts

    An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System

    Project mention: IndexTTS2 Comprehensive Review: In-Depth Analysis of 2025's Most Powerful Emotional Speech Synthesis Model | dev.to | 2025-09-11

    # 1. Clone repository
    git clone https://github.com/index-tts/index-tts.git
    cd index-tts
    # 2. Install dependencies
    uv sync --all-extras
    # 3. Download model
    hf download IndexTeam/IndexTTS-2 --local-dir=checkpoints
    # 4. Launch web interface
    uv run webui.py

  11. dia

    A TTS model capable of generating ultra-realistic dialogue in one pass.

    Project mention: Kitten TTS: 25MB CPU-Only, Open-Source Voice Model | news.ycombinator.com | 2025-08-05

    The best open one I've found so far is Dia - https://github.com/nari-labs/dia - it has some limitations, but i think it's really impressive and I can run it on my laptop.

  12. pyvideotrans

    Translate the video from one language to another and embed dubbing & subtitles.

  13. edge-tts

    Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key

    Project mention: I Made a Product Demo Video Entirely with AI | dev.to | 2026-03-02

    Each scene has a narration line. edge-tts turns them into MP3 files using Microsoft's neural TTS (free, no API key, and surprisingly natural).
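
The per-scene narration step described in this mention could look roughly like the sketch below. It is a minimal illustration, not the post author's script: the helper names `plan_narration` and `synthesize`, the output folder, and the `en-US-AriaNeural` voice are all assumptions, and the synthesis call needs the third-party `edge-tts` package plus network access.

```python
import asyncio
from pathlib import Path


def plan_narration(scenes, out_dir="narration"):
    """Pair each non-empty narration line with a numbered MP3 path.

    Hypothetical helper: scene numbers follow the position in `scenes`,
    so a skipped blank line leaves a gap in the numbering.
    """
    out = Path(out_dir)
    plan = []
    for index, line in enumerate(scenes, start=1):
        text = line.strip()
        if text:
            plan.append((text, out / f"scene_{index:02d}.mp3"))
    return plan


async def synthesize(plan, voice="en-US-AriaNeural"):
    """Render every planned line to MP3 with edge-tts (requires network)."""
    # Imported lazily so the planning helper works without the package.
    import edge_tts  # third-party: pip install edge-tts

    for text, path in plan:
        path.parent.mkdir(parents=True, exist_ok=True)
        # Communicate(...).save(...) streams the audio and writes it to disk.
        await edge_tts.Communicate(text, voice).save(str(path))
```

To run it, something like `asyncio.run(synthesize(plan_narration(["Welcome to the demo.", "Here is the dashboard."])))` should write one MP3 per scene. Keeping the planning step separate from the network call makes the file layout easy to check offline.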

  14. voice-pro

    Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal isolation, and multilingual translation.

    Project mention: Show HN: Likes/day as fake profile → built my own dating app in 100 days | news.ycombinator.com | 2025-12-16

  15. espnet

    End-to-End Speech Processing Toolkit

  16. Amphion

    Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

  17. EmotiVoice

    EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

  18. vits

    VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

  19. mlx-audio

    A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon.

    Project mention: The Free, Open-Source Alternative to ElevenLabs Is Finally Here | dev.to | 2026-05-24

    uv pip install "git+https://github.com/Blaizzy/mlx-audio" --prerelease=allow
    uv pip install soundfile

  20. StyleTTS2

    StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

  21. DiffSinger

    DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code

  22. abogen

    Generate audiobooks from EPUBs, PDFs and text with synchronized captions.

    Project mention: Abogen – Generate audiobooks from EPUBs, PDFs and text | news.ycombinator.com | 2025-08-09

    It's probably due to the unusual sound format, 24kHz PCM, and the fact that it was somehow forced into a WebM container, which only supports the Vorbis and Opus formats.

    It looks like they created it using the "higher quality" ffmpeg command line, except for the "webm" final extension, producing the opposite of what's described as "an MP4 file that's compatible with more devices".

    https://github.com/denizsafak/abogen/tree/main/demo#for-high...
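
One way to check the kind of container and codec mismatch this comment describes is to ask ffprobe what the first audio stream actually contains. The sketch below is only an illustration under assumptions: the helper names are hypothetical, the codec set simply encodes the comment's claim that WebM carries Vorbis and Opus, and `probe_audio` needs an `ffprobe` binary on the PATH.

```python
import json
import subprocess

# Audio codecs WebM is expected to carry, per the comment above.
WEBM_AUDIO_CODECS = {"vorbis", "opus"}


def ffprobe_audio_cmd(path):
    """Build an ffprobe command that reports the first audio stream as JSON."""
    return [
        "ffprobe", "-v", "error",
        "-select_streams", "a:0",
        "-show_entries", "stream=codec_name,sample_rate",
        "-of", "json",
        path,
    ]


def fits_webm(codec_name):
    """Return True if the codec name is one of the WebM audio codecs."""
    return codec_name.lower() in WEBM_AUDIO_CODECS


def probe_audio(path):
    """Run ffprobe and return the first audio stream's info, or None."""
    result = subprocess.run(
        ffprobe_audio_cmd(path), capture_output=True, text=True, check=True
    )
    streams = json.loads(result.stdout).get("streams", [])
    return streams[0] if streams else None
```

For a file like the one discussed, `probe_audio("demo.webm")` would be expected to report a PCM codec name such as `pcm_s16le`, for which `fits_webm` returns False. The file name here is a placeholder.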

  23. metavoice-src

    Foundational model for human-like, expressive TTS

  24. WhisperLive

    A nearly-live implementation of OpenAI's Whisper.

    Project mention: OpenBrief Review: Local-First Video AI Summarizer 2026 | dev.to | 2026-05-27

    You need real-time meeting transcription: OpenBrief is post-hoc, not live (use Whisper-Live or Fireflies for that)

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python text-to-speech related posts

  • The Free, Open-Source Alternative to ElevenLabs Is Finally Here

    2 projects | dev.to | 24 May 2026
  • My Journey to a reliable and enjoyable locally hosted voice assistant

    3 projects | news.ycombinator.com | 16 Mar 2026
  • I Made a Product Demo Video Entirely with AI

    1 project | dev.to | 2 Mar 2026
  • Ask HN: What's the current best local/open speech-to-speech setup?

    11 projects | news.ycombinator.com | 23 Jan 2026
  • IMS Toucan – Text-to-Speech for over 7000 Languages

    1 project | news.ycombinator.com | 2 Jan 2026
  • AI Twin – Voice Cloning with Text-to-Speech

    2 projects | dev.to | 16 Dec 2025
  • CosyVoice 2025 Complete Guide: The Ultimate Multi-lingual Text-to-Speech Solution

    5 projects | dev.to | 15 Dec 2025

Index

What are some of the best open-source text-to-speech projects in Python? This list will help you:

# Project Stars
1 unsloth 65,373
2 GPT-SoVITS 58,362
3 TTS 45,294
4 ChatTTS 39,392
5 MockingBird 36,899
6 OpenVoice 36,234
7 VoxCPM 22,525
8 CosyVoice 21,440
9 index-tts 20,970
10 dia 19,305
11 pyvideotrans 17,787
12 edge-tts 11,178
13 voice-pro 10,489
14 espnet 9,853
15 Amphion 9,836
16 EmotiVoice 8,479
17 vits 7,860
18 mlx-audio 7,144
19 StyleTTS2 6,245
20 DiffSinger 4,747
21 abogen 4,692
22 metavoice-src 4,194
23 WhisperLive 4,054
