Python TTS

Open-source Python projects categorized as TTS

Top 23 Python TTS Projects

  1. Real-Time-Voice-Cloning

    Clone a voice in 5 seconds to generate arbitrary speech in real-time

    Project mention: ChatGPT unexpectedly began speaking in a user's cloned voice during testing | news.ycombinator.com | 2024-08-11
  3. GPT-SoVITS

    Just 1 minute of voice data is enough to train a good TTS model (few-shot voice cloning)

    Project mention: A CC-By Open-Source TTS Model with Voice Cloning | news.ycombinator.com | 2024-11-09

    I've had great luck so far with GPT-SoVITS. With a custom trained Japanese model and clean reference audio the quality is outstanding. It is quite finicky to set up and use though.

    https://github.com/RVC-Boss/GPT-SoVITS

  4. TTS

    ๐Ÿธ๐Ÿ’ฌ - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

    Project mention: Show HN: Voice-Pro – AI Voice Cloning Magic: Transform Any Voice in 15 Seconds | news.ycombinator.com | 2024-11-27

    It's really easy for a technical person to do as well.

    I use Coqui TTS[0] as part of my home automation, I wrote a small Python script that lets me upload a voice clip for it to clone after I got the idea from HeyWillow[1], and a small shim that lets me send the output to a Home Assistant media player instead of using their standard output device. I run the TTS container on a VM with a Tesla P4 (~£100 to buy) and get about 1x-2x (roughly the same time it'd take to say it, to process) using the large model.

    Just for a giggle, I uploaded a few 3s-5s second clip of myself speaking and cloned my voice, then executed a command to our living room media player to call my wife into the room; from another room, she was 100% convinced it was myself speaking words I'd never spoken.

    I tried playing with a variety of sentences for a few hours and overall, it sounded almost exactly like me, to me, with the exception of some "attitude" and "intonation" I know I wouldn't use in my speech. I didn't notice much of an improvement using much longer clips; the short ones were "good enough".

    Tangentially, it really bugs me that most phone providers in the UK insist you record a "personal greeting" now before they'll let you check your voice mail box, I just record silence, because the last thing I want/need is a voicemail greeting in my voice confirming to some randomer I didn't want calling me, who I am and that my number is active, even more so knowing how I can

    [0] https://github.com/coqui-ai/TTS

  5. MockingBird

    🚀 AI voice cloning: clone a voice in 5 seconds to generate arbitrary speech in real-time

  6. OpenVoice

    Instant voice cloning by MIT and MyShell. Audio foundation model.

    Project mention: Car brands are collecting and sharing your data with third parties | news.ycombinator.com | 2024-10-08

    That's already possible with a pretty minimal sample (a few seconds). Not worth getting too twisted up about the potential for a data breach setting your voice free.

    https://research.myshell.ai/open-voice

  7. fish-speech

    SOTA Open Source TTS

    Project mention: Generating audiobooks from E-books with Kokoro-82M | news.ycombinator.com | 2025-01-15
  8. NeMo

    A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

    Project mention: Speaker Diarization in Python | dev.to | 2024-08-22

    NVIDIA NeMo: To perform speaker diarization using NVIDIA NeMo, follow these steps:

  10. PaddleSpeech

    Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

  11. ebook2audiobook

    Convert ebooks to audiobooks with chapters and metadata using dynamic AI models and voice cloning. Supports 1,107+ languages!

    Project mention: Generating audiobooks from E-books with Kokoro-82M | news.ycombinator.com | 2025-01-15
  12. edge-tts

    Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key

    Project mention: Show HN: Voice Cloning and Multilingual TTS in One Click (Windows) | news.ycombinator.com | 2025-01-26

    There is a MIT license in the repo. In that sense it's open source.

    It's using "Edge TTS", which I believe means use API keys stolen [1] from Microsoft Edge and hope Microsoft doesn't sue you, non jolly-roger flying internet users beware.

    Can't speak to other models and their licenses, I stopped looking after I saw this since I don't feel the need to use this.

    [1] https://github.com/rany2/edge-tts/blob/ac41fb85ab2b2b48fef8a...

  13. EmotiVoice

    EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

  14. VALL-E-X

    An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io/vallex/

  15. vits

    VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

  16. StyleTTS2

    StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

  17. nexa-sdk

    Nexa SDK is a comprehensive toolkit for supporting GGML and ONNX models. It supports text generation, image generation, vision-language models (VLM), audio language models, automatic speech recognition (ASR), and text-to-speech (TTS) capabilities.

    Project mention: Benchmark GGUF models with a one line of code | news.ycombinator.com | 2024-11-01
  18. Orpheus-TTS

    Towards Human-Sounding Speech

    Project mention: Orpheus TTS: The Next Generation Open-Source Text-to-Speech System | dev.to | 2025-03-23

    # Clone the repository
    git clone https://github.com/canopyai/Orpheus-TTS.git
    cd Orpheus-TTS
    # Install dependencies
    pip install -e .

  19. DiffSinger

    DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code

  20. metavoice-src

    Foundational model for human-like, expressive TTS

    Project mention: Ask HN: What is the state of OSS voice cloning? | news.ycombinator.com | 2024-09-30
  21. TensorFlowTTS

    😝 TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2 (supports English, French, Korean, Chinese, and German, and is easy to adapt to other languages)

  22. voice-pro

    Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal isolation, and multilingual translation.

    Project mention: Voice-Pro: Ultimate AI Voice Conversion and Multilingual Translation Tool 🔊 | dev.to | 2025-02-10

    GitHub: https://github.com/abus-aikorea/voice-pro

  23. vall-e

    An unofficial PyTorch implementation of the audio LM VALL-E

  24. tacotron

    A TensorFlow implementation of Google's Tacotron speech synthesis with pre-trained model (unofficial)

  25. lingvo

    Lingvo

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python TTS related posts

  • Orpheus-3B – Emotive TTS by Canopy Labs

    7 projects | news.ycombinator.com | 19 Mar 2025
  • Orpheus TTS: The Next Generation Open-Source Text-to-Speech System

    1 project | dev.to | 23 Mar 2025
  • Voice-Pro: Ultimate AI Voice Conversion and Multilingual Translation Tool 🔊

    1 project | dev.to | 10 Feb 2025
  • Show HN: Voice Cloning and Multilingual TTS in One Click (Windows)

    2 projects | news.ycombinator.com | 26 Jan 2025
  • Show HN: Eleven Labs Alternative – Voice Cloning with RVC and Multilingual TTS

    1 project | news.ycombinator.com | 25 Jan 2025
  • Show HN: Voice-Pro – Now More Powerful and Easier to Use

    1 project | news.ycombinator.com | 23 Jan 2025
  • Edge TTS

    4 projects | news.ycombinator.com | 22 Jan 2025

Index

What are some of the best open-source TTS projects in Python? This list will help you:

# Project Stars
1 Real-Time-Voice-Cloning 54,104
2 GPT-SoVITS 45,592
3 TTS 39,540
4 MockingBird 36,186
5 OpenVoice 31,984
6 fish-speech 20,874
7 NeMo 13,734
8 PaddleSpeech 11,833
9 ebook2audiobook 9,584
10 edge-tts 8,060
11 EmotiVoice 7,929
12 VALL-E-X 7,769
13 vits 7,180
14 StyleTTS2 5,683
15 nexa-sdk 4,516
16 Orpheus-TTS 4,488
17 DiffSinger 4,474
18 metavoice-src 4,101
19 TensorFlowTTS 3,920
20 voice-pro 3,632
21 vall-e 2,982
22 tacotron 2,972
23 lingvo 2,837
