[D] What is the best open source text to speech model?

tortoise-tts

Mentions: 144 | Stars: 11,755 | Activity: 8.2 | Language: Jupyter Notebook

A multi-voice TTS system trained with an emphasis on quality

Tortoise TTS is supposed to be good. However, inference can take a while without a GPU, so it might not give you the real-time text-to-speech effect you want.
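One rough way to check whether a model is fast enough for real-time use is the real-time factor (RTF): synthesis time divided by the duration of the audio produced. An RTF below 1 means audio is generated faster than it plays back. The sketch below is only an illustration: `real_time_factor` and `fake_synthesize` are hypothetical names, and `fake_synthesize` is a stand-in that returns silence, not any real model's API. To measure an actual model, you would pass in its own synthesis call and sample rate.

```python
import time
from typing import Callable, Sequence


def real_time_factor(
    synthesize: Callable[[str], Sequence[float]],
    text: str,
    sample_rate: int,
) -> float:
    """Return synthesis time divided by the duration of the generated audio."""
    start = time.perf_counter()
    samples = synthesize(text)
    elapsed = time.perf_counter() - start

    audio_seconds = len(samples) / sample_rate
    if audio_seconds == 0:
        raise ValueError("synthesize() returned no audio")
    return elapsed / audio_seconds


def fake_synthesize(text: str) -> list[float]:
    """Hypothetical stand-in for a TTS call: 2205 silent samples per character."""
    return [0.0] * (len(text) * 2205)


# Assumed sample rate of 22,050 Hz, so the stand-in yields 0.1 s per character.
rtf = real_time_factor(fake_synthesize, "Hello, world.", sample_rate=22050)
verdict = "real-time capable" if rtf < 1 else "too slow for real time"
print(f"RTF = {rtf:.4f} ({verdict})")
```

Measure on the hardware you actually plan to deploy on, since the same model can land on either side of 1 depending on whether a GPU is available.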

speechbrain

Mentions: 26 | Stars: 7,869 | Activity: 9.8 | Language: Python

A PyTorch-based Speech Toolkit

I don't know if it's the best, but SpeechBrain is supposed to be state of the art.

tacotron

Mentions: 3 | Stars: 2,927 | Activity: 0.0 | Language: Python

A TensorFlow implementation of Google's Tacotron speech synthesis with pre-trained model (unofficial)

Tacotron, submitted Mar 29, 2017
Paper: https://arxiv.org/pdf/1703.10135.pdf
GitHub: https://github.com/keithito/tacotron (not the official implementation, but the one cited the most)

tacotron2

Mentions: 28 | Stars: 4,890 | Activity: 0.0 | Language: Jupyter Notebook

Tacotron 2 - PyTorch implementation with faster-than-realtime inference

flowtron

Mentions: 6 | Stars: 881 | Activity: 0.0 | Language: Jupyter Notebook

Flowtron is an auto-regressive flow-based generative network for text to speech synthesis with control over speech variation and style transfer

FastSpeech2

Mentions: 4 | Stars: 1,612 | Activity: 0.0 | Language: Python

An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"

FastSpeech2, submitted Jun 8, 2020
Paper: https://arxiv.org/pdf/2006.04558.pdf
GitHub: https://github.com/ming024/FastSpeech2 (not the official implementation, but the one cited the most)

NeMo

Mentions: 29 | Stars: 10,021 | Activity: 9.8 | Language: Python

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

waveglow

Mentions: 2 | Stars: 2,218 | Activity: 0.0 | Language: Python

A Flow-based Generative Network for Speech Synthesis

hifi-gan

Mentions: 5 | Stars: 1,757 | Activity: 0.0 | Language: Python

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

radtts

Mentions: 1 | Stars: 270 | Activity: 0.0 | Language: Roff

Provides training, inference, and voice conversion recipes for RADTTS and RADTTS++: Flow-based TTS models with Robust Alignment Learning, Diverse Synthesis, and Generative Modeling and Fine-Grained Control over Low Dimensional (F0 and Energy) Speech Attributes.

RadTTS, submitted Aug 18, 2021 (NVIDIA page, not arXiv)
Paper: https://openreview.net/pdf?id=0NQwnnwAORi
GitHub: https://github.com/NVIDIA/radtts

Speech-Backbones

Mentions: 1 | Stars: 523 | Activity: 0.0 | Language: Jupyter Notebook

This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.

vits

Mentions: 6 | Stars: 6,230 | Activity: 0.0 | Language: Python

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

STYLER

Mentions: 3 | Stars: 150 | Activity: 1.8 | Language: Python

Official repository of STYLER: Style Factor Modeling with Rapidity and Robustness via Speech Decomposition for Expressive and Controllable Neural Text to Speech, INTERSPEECH 2021 (by keonlee9420)

DiffSinger

Mentions: 1 | Stars: 223 | Activity: 10.0 | Language: Python

PyTorch implementation of DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (focused on DiffSpeech) (by keonlee9420)

DiffTTS (DiffSinger), submitted Apr 3, 2021
Paper: https://arxiv.org/pdf/2104.01409v1.pdf
GitHub: https://github.com/keonlee9420/DiffSinger

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.

This page summarizes the projects mentioned and recommended in the original post on /r/MachineLearning.
Tags: speech-synthesis, TTS, Deep Learning, text-to-speech, PyTorch
Post date: 13 Apr 2023
