PlayHT2.0: State-of-the-Art Generative Voice AI Model for Conversational Speech

tortoise-tts

145 11,819 8.0 Jupyter Notebook

A multi-voice TTS system trained with an emphasis on quality

TortoiseTTS was previously associated with PlayHT in some way, although the exact connection is a bit vague [0].
From the descriptions here, it sounds a lot like AudioLM / SPEAR TTS / some of Meta's recent multilingual TTS approaches. Those models are not open source, but PlayHT's approach sounds like it is in a similar spirit. The discussion of "mel tokens" is closer to what I would call the classic TTS pipeline in many ways... PlayHT has generally been fairly closed about what they use, so it would be interesting to know more.
I assume the key factor here is high-quality, emotive audio combined with good data cleaning, rather than some radically new architectural piece never before seen in the literature. It is probably not even a lot of data, at least by the standards of "a lot" in speech, e.g. ASR (millions of hours) or TTS (hundreds to thousands of hours). There are lots of really nice tools for emotive and expressive TTS buried in recent years of publications.
Tacotron 2 is perfectly capable of this type of thing as well, as shown by Dessa [1] a few years ago (their writeup is a nice intro to TTS concepts). The limit is largely that, at some point, you have not heard certain phonetic sounds in a given voice and need to do something to get plausible outcomes for new voices.
[0] Discussion here https://github.com/neonbjb/tortoise-tts/issues/182#issuecomm...
[1] https://medium.com/dessa-news/realtalk-how-it-works-94c1afda...

Related posts

Machine Learning Models: Linear Regression
2 projects | dev.to | 3 May 2024

Finetune a GPT Model for Spam Detection on Your Laptop in Just 5 Minutes
1 project | news.ycombinator.com | 3 May 2024

AI Agent Notebooks Using LangChain, LlamaIndex, Milvus, and More
1 project | news.ycombinator.com | 3 May 2024

Mastering Dataset Acquisition: A Comprehensive Guide
2 projects | dev.to | 3 May 2024

Simple GitHub Issue Handled(?) By Copilot Workspace
1 project | news.ycombinator.com | 2 May 2024

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com. Post date: 11 Aug 2023