Python speech-synthesis

Open-source Python projects categorized as speech-synthesis

Top 23 Python speech-synthesis Projects

  • TTS

    🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

  • Project mention: Show HN: Pi-C.A.R.D, a Raspberry Pi Voice Assistant | 2024-05-13

    When I did a similar thing (but with less LLM) I liked but back then I needed to cut out the conversion step from tensor to a list of numbers to make it work really nicely.

  • PaddleSpeech

    Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

  • Project mention: Open Source Libraries | /r/AudioAI | 2023-10-02


  • NeMo

    A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

  • Project mention: [P] Making a TTS voice, HK-47 from Kotor using Tortoise (Ideally WaveRNN) | /r/MachineLearning | 2023-07-06

    I haven't tested WaveRNN, but of the open-source options I know, the best is FastPitch. It's also easy to use; here is the tutorial for voice cloning.

  • so-vits-svc-fork

    so-vits-svc fork with realtime support, improved interface and more features.

  • Project mention: Zade - Çaresizim | /r/zfam | 2023-06-21
  • espnet

    End-to-End Speech Processing Toolkit

  • Project mention: WhisperSpeech – An Open Source text-to-speech system built by inverting Whisper | 2024-01-17

    You might check out this list from espnet. They list the different corpuses they use to train their models sorted by language and task (ASR, TTS etc):

  • EmotiVoice

    EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

  • Project mention: FLaNK Stack Weekly 12 February 2024 | 2024-02-12
  • vits

    VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

  • DiffSinger

    DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code

  • Amphion

    Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

  • Project mention: FLaNK Stack Weekly 11 Dec 2023 | 2023-12-11
  • TensorFlowTTS

    😝 TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2 (supports English, French, Korean, Chinese, and German, and is easy to adapt to other languages)

  • Project mention: Ask HN: On-Device Text to Speech | 2023-08-31

    Hey HN, has anyone found a viable solution for doing this locally and offline on iOS? I'd like to offer a privacy-friendly text to speech feature to my App, and Apple's speech synthesis sounds awful compared to some newer models and TTS engines. The only thing I've found is an older TensorflowTTS example here:

    Any pointers or tips appreciated.

  • edge-tts

    Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key

  • Project mention: [discussion] text to voice generation for textbooks (non-math part) | /r/MachineLearning | 2023-12-01

    I would very much like to use it to turn the text parts of a book into audio that I could listen to while reading. I used Edge's TTS by copying a paragraph to the clipboard and passing it to edge-tts, but this causes two problems: 1. you need an internet connection and must have the book open; 2. it can only go paragraph by paragraph and is prone to errors, and if you use it too much it sometimes won't convert the full text afterwards.

  • tacotron

    A TensorFlow implementation of Google's Tacotron speech synthesis with pre-trained model (unofficial)

  • lingvo


  • Tacotron-2

    DeepMind's Tacotron-2 Tensorflow implementation

  • WaveRNN

    WaveRNN Vocoder + TTS

  • hifi-gan

    HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

  • kalliope

    Kalliope is a framework that will help you to create your own personal assistant.

  • naturalspeech2-pytorch

    Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in Pytorch

  • SpeechT5

    Unified-Modal Speech-Text Pre-Training for Spoken Language Processing

  • Project mention: [HELP] Speech2Speech translator with speaker voice preservation | /r/learnmachinelearning | 2023-05-20

    Hey! I’m doing a somewhat similar project but for TTS / voice cloning. This might not be too relevant for you, but it might be one way to solve your problem. We based our project on SpeechT5, which is a multimodal setup that can take in audio or text and output audio or text. It uses speaker embeddings to handle multiple speakers, so you could use Meta's S2ST to translate audio and this model to preserve the voice by doing audio-to-audio speech conversion. Here’s a hugging tutorial which mentions speech conversion with SpeechT5

  • autovc

    AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss

  • NATSpeech

    A Non-Autoregressive Text-to-Speech (NAR-TTS) framework, including official PyTorch implementation of PortaSpeech (NeurIPS 2021) and DiffSpeech (AAAI 2022)

  • voicefixer

    General Speech Restoration

  • Project mention: Linux Audio Noise suppression using deep filtering in Rust | 2023-06-06
  • diffwave

    DiffWave is a fast, high-quality neural vocoder and waveform synthesizer.

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python speech-synthesis related posts

  • Ask HN: Open-source, local Text-to-Speech (TTS) generators

    2 projects | 7 May 2024
  • Show HN: WhisperFusion – Ultra-low latency conversations with an AI chatbot

    7 projects | 29 Jan 2024
  • WhisperSpeech – An Open Source text-to-speech system built by inverting Whisper

    9 projects | 17 Jan 2024
  • Microsoft releases Windows AI studio to run and fine tune models locally

    4 projects | 13 Dec 2023
  • [D] What offline TTS Model is good enough for a realistic real-time task?

    2 projects | /r/MachineLearning | 10 Dec 2023
  • [discussion] text to voice generation for textbooks (non-math part)

    1 project | /r/MachineLearning | 1 Dec 2023
  • StyleTTS2 – open-source Eleven Labs quality Text To Speech

    10 projects | 19 Nov 2023

What are some of the best open-source speech-synthesis projects in Python? This list will help you:

Rank  Project                  Stars
1     TTS                      29,831
2     PaddleSpeech             10,233
3     NeMo                     10,227
4     so-vits-svc-fork         8,378
5     espnet                   7,932
6     EmotiVoice               6,405
7     vits                     6,324
8     DiffSinger               4,107
9     Amphion                  3,975
10    TensorFlowTTS            3,714
11    edge-tts                 3,758
12    tacotron                 2,928
13    lingvo                   2,779
14    Tacotron-2               2,235
15    WaveRNN                  2,086
16    hifi-gan                 1,774
17    kalliope                 1,699
18    naturalspeech2-pytorch   1,209
19    SpeechT5                 1,044
20    autovc                   961
21    NATSpeech                944
22    voicefixer               919
23    diffwave                 727
