Text-to-Audio Generation Using Instruction Tuned LLM and Latent Diffusion Model

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com.

  • SpecVQGAN

    Source code for "Taming Visually Guided Sound Generation" (Oral at the BMVC 2021)

  • Excellent. Some of the theory here goes back to Oct/2021 and beyond [1].

    The riffusion.com [2] guys made this practical. Also, here is my video with a high-level overview and examples [3].

    1. SpecVQGAN: https://github.com/v-iashin/SpecVQGAN

    2. Riffusion: https://www.riffusion.com/

    3. Riffusion high-level overview: https://youtu.be/olkLVGcvib8

