speech-synthesis

Open-source projects categorized as speech-synthesis

Top 23 speech-synthesis Open-Source Projects

  • TTS

    🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

    Project mention: What things are happening in ML that we can't hear over the din of LLMs? | news.ycombinator.com | 2024-03-28

    Not sure how relevant this is but note that Coqui TTS (the realistic TTS) has already shut down

    https://coqui.ai

  • Leon

    🧠 Leon is your open-source personal assistant.

    Project mention: Rabbit R1, Designed by Teenage Engineering | news.ycombinator.com | 2024-01-09

    It's indeed suspicious. You're sending your voice samples, your various services accounts, your location and more private data to some proprietary black box in some public cloud. Sorry, but this is a privacy nightmare. It should be open source and self-hosted like Mycroft (https://mycroft.ai) or Leon (https://getleon.ai) to be trustworthy.

  • DeepLearningExamples

    State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.

  • PaddleSpeech

    Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

    Project mention: Open Source Libraries | /r/AudioAI | 2023-10-02

    PaddlePaddle/PaddleSpeech

  • NeMo

    NeMo: a framework for generative AI

    Project mention: [P] Making a TTS voice, HK-47 from Kotor using Tortoise (Ideally WaveRNN) | /r/MachineLearning | 2023-07-06

    I haven't tested WaveRNN, but of the ones I know, the best open-source option is FastPitch. It's also easy to use; here is the tutorial for voice cloning.

  • so-vits-svc-fork

    so-vits-svc fork with realtime support, improved interface and more features.

    Project mention: Zade - Çaresizim | /r/zfam | 2023-06-21

  • espnet

    End-to-End Speech Processing Toolkit

    Project mention: WhisperSpeech – An Open Source text-to-speech system built by inverting Whisper | news.ycombinator.com | 2024-01-17

    You might check out this list from espnet. They list the different corpuses they use to train their models sorted by language and task (ASR, TTS etc):

    https://github.com/espnet/espnet/blob/master/egs2/README.md

  • EmotiVoice

    EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

    Project mention: FLaNK Stack Weekly 12 February 2024 | dev.to | 2024-02-12

  • vits

    VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

    Project mention: [D] TTS systems to download & run offline | /r/MachineLearning | 2023-05-14

    And the voice encapsulation system VITS https://github.com/jaywalnut310/vits

  • silero-models

    Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple

    Project mention: Weird A.I. Yankovic, a cursed deep dive into the world of voice cloning | news.ycombinator.com | 2023-10-02

    I doubt it's currently actually "the best open source text to speech", but the answer I came up with when throwing a couple of hours at the problem some months ago was "Silero" [0, 1].

    Following the "standalone" guide [2], it was pretty trivial to make the model render my sample text in about 100 English "voices" (many of which were similar to each other, and in varying quality). Sampling those, I got about 10 that were pretty "good". And maybe 6 that were the "best ones" (pretty natural, not annoying to listen to).

    IIRC the license was free for noncommercial use only. I'm not sure exactly "how open source" they are, but it was simple to install the dependencies and write the basic Python to try it out; I had to write a for loop to try all the voices like I wanted. I ended up using something else for the project for other reasons, but this could still be a fairly good backup option for some use cases IMO.

      [0] https://github.com/snakers4/silero-models#text-to-speech

  • DiffSinger

    DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code

  • Amphion

    Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

    Project mention: FLaNK Stack Weekly 11 Dec 2023 | dev.to | 2023-12-11

  • TensorFlowTTS

    😝 TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2 (supports English, French, Korean, Chinese, and German, and is easy to adapt to other languages)

    Project mention: Ask HN: On-Device Text to Speech | news.ycombinator.com | 2023-08-31

    Hey HN, has anyone found a viable solution for doing this locally and offline on iOS? I'd like to offer a privacy-friendly text-to-speech feature in my app, and Apple's speech synthesis sounds awful compared to some newer models and TTS engines. The only thing I've found is an older TensorFlowTTS example here: https://github.com/TensorSpeech/TensorFlowTTS/tree/master/examples/ios

    Any pointers or tips appreciated.

  • piper

    A fast, local neural text to speech system (by rhasspy)

    Project mention: WhisperSpeech – An Open Source text-to-speech system built by inverting Whisper | news.ycombinator.com | 2024-01-17

    If you're not already aware, the primary developer of Mimic 3 (and its non-Mimic predecessor Larynx) continued TTS-related development with Larynx and the renamed project Piper: https://github.com/rhasspy/piper

    Last year Piper development was supported by Nabu Casa for their "Year of Voice" project for Home Assistant and it sounds like Mike Hansen is going to continue on it with their support this year.

  • edge-tts

    Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key

    Project mention: [discussion] text to voice generation for textbooks (non-math part) | /r/MachineLearning | 2023-12-01

    I would very much like to use it to turn the text parts of a book into audio that I can listen to while reading. I used Edge's TTS by copying a paragraph to the clipboard and passing it to edge-tts, but this causes two problems: 1. you need an internet connection and must have the book open; 2. it can only go paragraph by paragraph, and it is prone to errors, or sometimes, if you use it too much, it won't convert the full text afterwards.
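The paragraph-by-paragraph workflow described in this mention can be scripted rather than driven from the clipboard. The following is a minimal sketch, not the poster's method: it assumes the `edge-tts` package is installed, that the voice name `en-US-AriaNeural` is available, and that paragraphs are separated by blank lines. The helper names `split_paragraphs`, `synthesize_book`, and `run`, and the output file naming, are hypothetical. It still requires an internet connection, so it addresses only the second problem raised above.

```python
import asyncio
import re


def split_paragraphs(text: str) -> list[str]:
    """Split text on blank lines and collapse whitespace inside each paragraph."""
    chunks = re.split(r"\n\s*\n", text)
    return [" ".join(chunk.split()) for chunk in chunks if chunk.strip()]


async def synthesize_book(
    text: str,
    voice: str = "en-US-AriaNeural",  # assumed voice name
    prefix: str = "part",             # hypothetical output file prefix
) -> list[str]:
    """Write one MP3 file per paragraph and return the file names in order."""
    # Imported lazily so the text helper above works without the package.
    import edge_tts  # third-party: pip install edge-tts (needs network access)

    paths = []
    for index, paragraph in enumerate(split_paragraphs(text), start=1):
        path = f"{prefix}_{index:04d}.mp3"
        # Communicate(text, voice).save(path) streams the audio to a file.
        await edge_tts.Communicate(paragraph, voice).save(path)
        paths.append(path)
    return paths


def run(text: str) -> list[str]:
    """Synchronous convenience wrapper around synthesize_book."""
    return asyncio.run(synthesize_book(text))
```

Calling `run(book_text)` on the contents of a plain-text file would produce `part_0001.mp3`, `part_0002.mp3`, and so on. Keeping one file per paragraph means a failed request only needs that paragraph to be retried, not the whole book.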

  • WhisperSpeech

    An Open Source text-to-speech system built by inverting Whisper.

    Project mention: Show HN: WhisperFusion – Ultra-low latency conversations with an AI chatbot | news.ycombinator.com | 2024-01-29

    - WhisperSpeech for the text-to-speech - https://github.com/collabora/WhisperSpeech

    and an LLM (phi-2, Mistral, etc.) in between

  • tacotron

    A TensorFlow implementation of Google's Tacotron speech synthesis with pre-trained model (unofficial)

    Project mention: [D] What is the best open source text to speech model? | /r/MachineLearning | 2023-04-13

    Tacotron. Submitted: Mar 29, 2017. Paper: https://arxiv.org/pdf/1703.10135.pdf. GitHub: https://github.com/keithito/tacotron (not the official implementation, but the one cited the most).

  • awesome-speech-recognition-speech-synthesis-papers

    Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)

  • espeak-ng

    eSpeak NG is an open source speech synthesizer that supports more than a hundred languages and accents.

    Project mention: IAMA senior javascript dev, ask me anything | /r/learnjavascript | 2023-07-01

    I'm skeptical about a senior JavaScript developer claiming to be bored. Nonetheless, let's see. How would you go about modifying [this](ng/blob/master/emscripten/espeakng_glue.idl) IDL file, this C++ glue code, and the relevant Make file to compile eSpeak NG to JavaScript with Emscripten with SSML support enabled?

  • lingvo

    Lingvo

  • chat-with-gpt

    An open-source ChatGPT app with a voice

    Project mention: Different chatGPT 4 API Interface? | /r/ChatGPTPro | 2023-08-18

  • Tacotron-2

    DeepMind's Tacotron-2 Tensorflow implementation

  • marytts

    MARY TTS -- an open-source, multilingual text-to-speech synthesis system written in pure Java

    Project mention: Breton voice generation from text? | /r/NewToReddit | 2023-06-08

    I did find one but I am not sure if the link is still reliable https://marytts.github.io/.

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2024-03-28.

Index

What are some of the best open-source speech-synthesis projects? This list will help you:

Rank Project Stars
1 TTS 28,249
2 Leon 14,415
3 DeepLearningExamples 12,490
4 PaddleSpeech 9,957
5 NeMo 9,714
6 so-vits-svc-fork 8,193
7 espnet 7,769
8 EmotiVoice 6,131
9 vits 6,124
10 silero-models 4,460
11 DiffSinger 4,051
12 Amphion 3,715
13 TensorFlowTTS 3,670
14 piper 3,506
15 edge-tts 3,243
16 WhisperSpeech 3,234
17 tacotron 2,913
18 awesome-speech-recognition-speech-synthesis-papers 2,848
19 espeak-ng 2,780
20 lingvo 2,780
21 chat-with-gpt 2,242
22 Tacotron-2 2,230
23 marytts 2,193