Fish Speech TTS: clone OpenAI TTS in 30 minutes

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

Scout Monitoring - Free Django app performance insights with Scout Monitoring
Get Scout setup in minutes, and let us sweat the small stuff. A couple lines in settings.py is all you need to start monitoring your apps. Sign up for our free tier today.
www.scoutapm.com
featured
InfluxDB - Power Real-Time Data Analytics at Scale
Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.
www.influxdata.com
featured
  • fish-speech

    Brand new TTS solution

  • While we are still figuring out ways to improve the agent's emotional response to OpenAI GPT-4 level, we have already made significant progress in aligning OpenAI's TTS performance. To begin this experiment, we collected 10 hours of OpenAI TTS data to perform supervised fine-tuning (SFT) on both the LLM and VITS models, which took approximately 30 minutes. After that, we used 15 seconds of audio as a prompt during inference.

    Demos Available: https://firefly-ai.notion.site/OpenAI-Examples-34975ae263a9496c84e89fb7b1ea25a4?pvs=4

    As you can see, the model's emotion, rhythm, accent, and timbre match the OpenAI speakers, though there is some degradation in audio quality, which we are working on. To avoid any legal issues, we are unable to release the fine-tuned model, but I believe everyone can tune Fish Speech to this level within hours and for around $20.

    Our experiment shows that with only 25 seconds of prompts (few-shot learning), without any fine-tuning, the model can mimic most behaviors except for how it reads numbers. To the best of our knowledge, you can clone how someone speaks in English, Chinese, and Japanese with 30 minutes of data using this framework.

    Repo: https://github.com/fishaudio/fish-speech

  • Scout Monitoring

    Free Django app performance insights with Scout Monitoring. Get Scout setup in minutes, and let us sweat the small stuff. A couple lines in settings.py is all you need to start monitoring your apps. Sign up for our free tier today.

    Scout Monitoring logo
NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Suggest a related project

Related posts

  • [D] Prepared a Deep Voice Cloning tutorial by using TorToiSe TTS. Do you thin it is best available open source at the moment?

    4 projects | /r/MachineLearning | 14 May 2023
  • I made Lisa-nee TTS (Imai Lisa)

    2 projects | /r/BanGDream | 4 Feb 2023
  • VALL-E unoffical implementation (text to speech synthesis)

    1 project | news.ycombinator.com | 21 Jan 2023
  • Do you think Vall-E will ever be Open Source?

    1 project | /r/ValleAI | 16 Jan 2023
  • Ollama v0.1.45

    5 projects | news.ycombinator.com | 15 Jun 2024