While we are still exploring ways to bring the agent's emotional response up to the level of OpenAI's GPT-4, we have already made significant progress in matching the performance of OpenAI's TTS. For this experiment, we collected 10 hours of OpenAI TTS output and used it for supervised fine-tuning (SFT) of both the LLM and the VITS model, which took approximately 30 minutes. During inference, we then used 15 seconds of audio as a prompt.
Demos Available: https://firefly-ai.notion.site/OpenAI-Examples-34975ae263a9496c84e89fb7b1ea25a4?pvs=4
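The fine-tuning setup above can be sketched roughly as follows. This is a minimal, hypothetical illustration of the two supervised objectives involved (text → semantic tokens for the LLM, tokens → waveform for the VITS model); all names and loss values here are placeholders, not the actual fish-speech training code.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Sample:
    """One (transcript, audio) pair harvested from the collected TTS output."""
    text: str
    audio_seconds: float

def total_hours(dataset: List[Sample]) -> float:
    """Data-budget check; the experiment above used roughly 10 h of audio."""
    return sum(s.audio_seconds for s in dataset) / 3600.0

def sft_step(sample: Sample) -> float:
    """One supervised step. In the real pipeline the LLM is trained on
    text -> semantic tokens and the VITS model on tokens -> waveform;
    here both losses are placeholder constants so the sketch runs."""
    llm_loss = 0.5   # placeholder for the LLM's token-prediction loss
    vits_loss = 0.3  # placeholder for the decoder's reconstruction loss
    return llm_loss + vits_loss

dataset = [Sample("hello world", 2.5), Sample("goodbye", 1.5)]
print(f"dataset: {total_hours(dataset) * 3600:.1f} s of audio")
print(f"step loss: {sft_step(dataset[0]):.1f}")
```

In the actual run, both models were tuned jointly on the 10-hour corpus in about 30 minutes, so each "step" here stands in for a pass that updates both components.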
As you can see, the model's emotion, rhythm, accent, and timbre closely match the OpenAI speakers, though there is some degradation in audio quality, which we are working to fix. To avoid legal issues, we cannot release the fine-tuned model, but we believe anyone can tune Fish Speech to this level within a few hours and for around $20.
Our experiments show that with only 25 seconds of prompt audio (few-shot, without any fine-tuning), the model can mimic most behaviors except for how the speaker reads numbers. To the best of our knowledge, this framework lets you clone how someone speaks in English, Chinese, and Japanese from about 30 minutes of data.
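The few-shot path can be sketched as in-context conditioning: encode a short reference clip into discrete tokens, prepend those tokens (plus the reference transcript) to the target text, and generate new tokens from that context. The sketch below is purely illustrative; `encode_audio`, `generate`, and the token mapping are hypothetical stand-ins, not the fish-speech API.

```python
from typing import List

def encode_audio(clip: str) -> List[int]:
    """Stand-in for a codec/VQ encoder: in a real system this turns a
    waveform into discrete acoustic tokens; here, one token per character."""
    return [ord(c) % 64 for c in clip]

def generate(prompt_tokens: List[int], prompt_text: str,
             target_text: str) -> List[int]:
    """Prompt-conditioned generation. A real model would autoregressively
    produce semantic tokens; we use a deterministic mapping so the control
    flow (prompt -> context -> output) is runnable and testable."""
    # the reference clip and its transcript form the in-context prompt
    context = prompt_tokens + encode_audio(prompt_text)
    # crude "speaker statistic" derived from the prompt, mimicking how the
    # prompt steers timbre/rhythm in the real model
    speaker_bias = sum(context) % 7
    return [(ord(c) + speaker_bias) % 64 for c in target_text]

# ~25 s reference clip stands in as a short string here
ref_tokens = encode_audio("reference-clip")
out = generate(ref_tokens, "This is the reference transcript.", "Hello!")
print(f"{len(out)} tokens generated")  # one token per output character
```

A decoder (the VITS model in this framework) would then turn the generated tokens back into a waveform; that stage is omitted here.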
Repo: https://github.com/fishaudio/fish-speech