Open-source projects categorized as text-to-speech

Top 23 text-to-speech Open-Source Projects

  • MockingBird

    🚀 AI voice cloning: clone a voice in 5 seconds to generate arbitrary speech in real-time

  • TTS

    🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

  • Project mention: AIM Weekly 17 June 2024 | dev.to | 2024-06-17
  • OpenVoice

    Instant voice cloning by MyShell.

  • Project mention: OpenVoice: Instant Voice Cloning | news.ycombinator.com | 2024-04-26
  • Leon

    🧠 Leon is your open-source personal assistant.

  • Project mention: Rabbit R1, Designed by Teenage Engineering | news.ycombinator.com | 2024-01-09

    It's indeed suspicious. You're sending your voice samples, your various services accounts, your location and more private data to some proprietary black box in some public cloud. Sorry, but this is a privacy nightmare. It should be open source and self-hosted like Mycroft (https://mycroft.ai) or Leon (https://getleon.ai) to be trustworthy.

  • TTS

    :robot: :speech_balloon: Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts) (by mozilla)

  • Project mention: Coqui.ai Is Shutting Down | news.ycombinator.com | 2024-01-03

    Coqui-ai was a commercial continuation of Mozilla TTS and STT (https://github.com/mozilla/TTS).

    At the time (2018-ish), it was really impressive for on-device voice synthesis (with a quality approaching the Google and Azure cloud-based voice synthesis options) and open source, so a lot of people in the FOSS community were hoping it could be used for a privacy-respecting home assistant, Linux speech synthesis that doesn't suck, etc.

    After Mozilla abandoned the project, Coqui continued development and had some really impressive one-shot voice cloning, but pivoted to marketing speech synthesis for game developers. They were probably having trouble monetizing it, and it doesn't surprise me that they shut down.

    An equivalent project that's still in active development and doing really well is Piper TTS (https://github.com/rhasspy/piper).

  • VALL-E-X

    An open source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io

  • Project mention: FLaNK Stack Weekly 12 February 2024 | dev.to | 2024-02-12
  • pyvideotrans

    Translate a video from one language to another and add dubbing.

  • Project mention: FLaNK Stack Weekly 06 Nov 2023 | dev.to | 2023-11-06
  • EmotiVoice

    EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

  • Project mention: FLaNK Stack Weekly 12 February 2024 | dev.to | 2024-02-12
  • vits

    VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

  • piper

    A fast, local neural text to speech system

  • Project mention: Coqui.ai TTS: A Deep Learning Toolkit for Text-to-Speech | news.ycombinator.com | 2024-06-11
  • silero-models

    Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple

  • Project mention: Weird A.I. Yankovic, a cursed deep dive into the world of voice cloning | news.ycombinator.com | 2023-10-02

    I doubt it's currently actually "the best open source text to speech", but the answer I came up with when throwing a couple of hours at the problem some months ago was "Silero" [0, 1].

    Following the "standalone" guide [2], it was pretty trivial to make the model render my sample text in about 100 English "voices" (many of which were similar to each other, and in varying quality). Sampling those, I got about 10 that were pretty "good". And maybe 6 that were the "best ones" (pretty natural, not annoying to listen to).

    IIRC the license was free for noncommercial use only. I'm not sure exactly "how open source" they are, but it was simple to install the dependencies and write the basic Python to try it out; I had to write a for loop to try all the voices like I wanted. I ended up using something else for the project for other reasons, but this could still be a fairly good backup option for some use cases IMO.

      [0] https://github.com/snakers4/silero-models#text-to-speech
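
The "for loop to try all the voices" mentioned in the comment above might look roughly like the sketch below. It assumes the `torch.hub` entry point described in the silero-models README (`silero_tts` with `language="en"` and `speaker="v3_en"`) and a model object exposing `speakers` and `save_wav`; the exact `save_wav` keyword arguments are an assumption to check against the README, and the helper names `render_all_voices` and `load_silero_english` are hypothetical.

```python
from pathlib import Path


def render_all_voices(model, text, out_dir, sample_rate=48000):
    """Render `text` once per voice the model exposes; return the written paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for speaker in model.speakers:  # one entry per available voice
        target = out_dir / f"{speaker}.wav"
        # Assumption: save_wav accepts these keyword arguments (see the README).
        model.save_wav(
            text=text,
            speaker=speaker,
            sample_rate=sample_rate,
            audio_path=str(target),
        )
        written.append(target)
    return written


def load_silero_english():
    """Hypothetical loader; downloads the model on first use (needs torch and network)."""
    import torch  # third-party, imported lazily so the loop above stays testable

    model, _example_text = torch.hub.load(
        repo_or_dir="snakers4/silero-models",
        model="silero_tts",
        language="en",
        speaker="v3_en",
    )
    return model
```

Calling `render_all_voices(load_silero_english(), "Sample text.", "voices")` would then leave one WAV file per voice in `voices/` for side-by-side listening. Check the license terms before using the output commercially.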

  • DiffSinger

    DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code

  • Amphion

    Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

  • Project mention: FLaNK Stack Weekly 11 Dec 2023 | dev.to | 2023-12-11
  • edge-tts

    Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key

  • Project mention: [discussion] text to voice generation for textbooks (non-math part) | /r/MachineLearning | 2023-12-01

    I would very much like to use it to turn the text parts of a book into audio that I could listen to while reading. I used Edge's TTS by copying a paragraph to the clipboard and passing it to edge-tts to listen to the text, but this causes two problems: 1. you need an internet connection and must have the book open; 2. it can only go paragraph by paragraph, and it is prone to errors, or sometimes, if you use it too much, it won't convert the full text afterwards.
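
One way to avoid the clipboard round trip described above is to split the book text into paragraphs up front and send each one to edge-tts in a loop. The sketch below assumes the `edge_tts.Communicate(text, voice)` class and its `save()` coroutine from the edge-tts package; the helper names, the `en-US-AriaNeural` voice choice, and the output file naming are illustrative assumptions.

```python
import re
from pathlib import Path


def split_paragraphs(text):
    """Split on blank lines and collapse hard-wrapped lines inside each paragraph."""
    blocks = re.split(r"\n\s*\n", text)
    return [" ".join(block.split()) for block in blocks if block.strip()]


async def narrate(paragraphs, out_dir, voice="en-US-AriaNeural"):
    """Write one MP3 per paragraph; requires the edge-tts package and network access."""
    import edge_tts  # third-party, imported lazily so split_paragraphs works without it

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for index, paragraph in enumerate(paragraphs, start=1):
        target = out_dir / f"{index:04d}.mp3"
        await edge_tts.Communicate(paragraph, voice).save(str(target))
```

With `import asyncio`, running `asyncio.run(narrate(split_paragraphs(book_text), "audio"))` would produce numbered files that can be played back later without the book open. It does not remove the need for an internet connection during conversion, since edge-tts calls an online service.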

  • espeak-ng

    eSpeak NG is an open source speech synthesizer that supports more than a hundred languages and accents.

  • Project mention: ESpeak-ng: speech synthesizer with more than one hundred languages and accents | news.ycombinator.com | 2024-05-01

    After some brief research, it seems the issue you're seeing may be a known bug in at least some versions/releases of espeak-ng.

    Here are some potentially related links if you'd like to dig deeper:

    * "questions about mandarin data packet #1044": https://github.com/espeak-ng/espeak-ng/issues/1044

    * "ESpeak NJ-1.51’s Mandarin pronunciation is corrupted #12952": https://github.com/nvaccess/nvda/issues/12952

    * "The pronunciation of Mandarin Chinese using ESpeak NJ in NVDA is not normal #1028": https://github.com/espeak-ng/espeak-ng/issues/1028

    * "When espeak-ng translates Chinese (cmn), IPA tone symbols are not output correctly #305": https://github.com/rhasspy/piper/issues/305

    * "Please default ESpeak NG's voice role to 'Chinese (Mandarin, latin as Pinyin)' for Chinese to fix #12952 #13572": https://github.com/nvaccess/nvda/issues/13572

    * "Cmn voice not correctly translated #1370": https://github.com/espeak-ng/espeak-ng/issues/1370

  • TensorFlowTTS

    :stuck_out_tongue_closed_eyes: TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for Tensorflow 2 (supports English, French, Korean, Chinese, and German, and is easy to adapt to other languages)

  • Project mention: Ask HN: On-Device Text to Speech | news.ycombinator.com | 2023-08-31

    Hey HN, has anyone found a viable solution for doing this locally and offline on iOS? I'd like to offer a privacy-friendly text to speech feature to my App, and Apple's speech synthesis sounds awful compared to some newer models and TTS engines. The only thing I've found is an older TensorflowTTS example here: https://github.com/TensorSpeech/TensorFlowTTS/tree/master/examples/ios

    Any pointers or tips appreciated.

  • Awesome-Prompt-Engineering

    This repository contains hand-curated resources for Prompt Engineering with a focus on Generative Pre-trained Transformer (GPT), ChatGPT, PaLM etc

  • bark-with-voice-clone

    🔊 Text-prompted Generative Audio Model - With the ability to clone voices

  • Project mention: I've open sourced my Flutter plugin to run on-device LLMs on any platform. TestFlight builds available now. | /r/FlutterDev | 2023-12-08

    And more stuff I’m often checking back on: - https://github.com/staghado/vit.cpp - https://github.com/serp-ai/bark-with-voice-clone - https://github.com/leejet/stable-diffusion.cpp (generate images) - etc … there’s too much fun stuff out there. Wish I had more free time haha.

  • vall-e

    An unofficial PyTorch implementation of the audio LM VALL-E

  • aeneas

    aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)

  • Tacotron-2

    DeepMind's Tacotron-2 Tensorflow implementation

  • marytts

    MARY TTS -- an open-source, multilingual text-to-speech synthesis system written in pure java

  • gTTS

    Python library and CLI tool to interface with Google Translate's text-to-speech API

  • Project mention: Using Groq to Build a Real-Time Language Translation App | dev.to | 2024-04-05

    For our real-time TTS needs, we'll employ the fantastic library called gTTS.
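
As a rough illustration of what using gTTS for this step can look like, the sketch below turns a translated string into an MP3 file. It assumes the `gTTS(text=..., lang=...)` constructor and its `save()` method from the gtts package; the function name, the default language, the output path, and the injectable `engine` parameter are hypothetical and are not taken from the linked article.

```python
def synthesize_translation(text, lang="es", out_path="translation.mp3", engine=None):
    """Convert `text` to speech in `lang` and write it to `out_path`."""
    if engine is None:
        # Third-party package; it calls Google Translate's TTS endpoint over the network.
        from gtts import gTTS

        engine = gTTS
    engine(text=text, lang=lang).save(out_path)
    return out_path
```

The `engine` parameter exists only so the function can be exercised offline with a stand-in class; in normal use it is left at its default.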

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

What are some of the best open-source text-to-speech projects? This list will help you:

Rank  Project                      Stars
   1  MockingBird                 34,282
   2  TTS                         31,005
   3  OpenVoice                   26,751
   4  Leon                        14,837
   5  TTS                          8,951
   6  VALL-E-X                     7,361
   7  pyvideotrans                 6,612
   8  EmotiVoice                   6,565
   9  vits                         6,431
  10  piper                        4,693
  11  silero-models                4,650
  12  DiffSinger                   4,107
  13  Amphion                      4,084
  14  edge-tts                     4,004
  15  espeak-ng                    3,850
  16  TensorFlowTTS                3,750
  17  Awesome-Prompt-Engineering   3,389
  18  bark-with-voice-clone        2,919
  19  vall-e                       2,875
  20  aeneas                       2,379
  21  Tacotron-2                   2,244
  22  marytts                      2,208
  23  gTTS                         2,176
