PlayHT2.0: State-of-the-Art Generative Voice AI Model for Conversational Speech

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • tortoise-tts

    A multi-voice TTS system trained with an emphasis on quality

  • Previously TortoiseTTS was associated with PlayHT in some way, although the exact connection is a bit vague [0].

    From the descriptions here it sounds a lot like AudioLM / SPEAR TTS / some of Meta's recent multilingual TTS approaches. Those models are not open source, but it sounds like PlayHT's approach is in a similar spirit. The discussion of "mel tokens" is closer to what I would call the classic TTS pipeline in many ways. PlayHT has generally been fairly closed about what they used, so it would be interesting to know more.

    I assume the key factor here is high-quality, emotive audio with good data cleaning processes. It is probably not even a lot of data, at least on the scale of "a lot" in speech, e.g. ASR (millions of hours) or TTS (hundreds to thousands of hours). That is as opposed to some radically new architectural piece never before seen in the literature; there are lots of really nice tools for emotive and expressive TTS buried in recent years of publications.

    Tacotron 2 is perfectly capable of this type of thing as well, as shown by Dessa [1] a few years ago (their writeup is a nice intro to TTS concepts). The limit is largely that at some point you haven't heard certain phonetic sounds before in a voice, and you need to do something to get plausible outcomes for new voices.

    [0] Discussion here https://github.com/neonbjb/tortoise-tts/issues/182#issuecomm...

    [1] https://medium.com/dessa-news/realtalk-how-it-works-94c1afda...

