Top 12 audio-generation Open-Source Projects
-
LocalAI
The free, open-source OpenAI alternative. Self-hosted, community-driven, and local-first. A drop-in replacement for OpenAI that runs on consumer-grade hardware, with no GPU required. Runs gguf, transformers, diffusers, and many other model architectures. It can generate text, audio, video, and images, and also supports voice cloning.
-
Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
-
Project mention: Want to know if there's an ai for text (prompt) to sound effects like stable diffusion | /r/StableDiffusion | 2023-05-19
-
tts-generation-webui
TTS Generation Web UI (Bark, MusicGen + AudioGen, Tortoise, RVC, Vocos, Demucs, SeamlessM4T, MAGNet, StyleTTS2, MMS)
Project mention: [D] Open-source SOTA Audio-to-Audio: how do I sound like a famous actor? | /r/MachineLearning | 2023-10-27
I'd use the TTS web UI with RVC. I'll link to the UI. If you are looking for an individual project, you should check in the Readme for RVC. https://github.com/rsxdalv/tts-generation-webui
-
soundstorm-pytorch
Implementation of SoundStorm, Efficient Parallel Audio Generation from Google DeepMind, in PyTorch
Project mention: Meta introduces Voicebox: state-of-the-art generative AI model for speech | news.ycombinator.com | 2023-06-19
got a response here https://github.com/lucidrains/soundstorm-pytorch/discussions...
-
tango
Code and models for the paper "Text-to-Audio Generation using Instruction Tuned LLM and Latent Diffusion Model" (by declare-lab)
Project mention: [Research] [Project] Text-to-Audio Generation using Instruction-Tuned LLM and Latent Diffusion Model | /r/MachineLearning | 2023-05-04
Found relevant code at https://github.com/declare-lab/tango + all code implementations here
-
Project mention: Text-to-Audio Generation Using Instruction Tuned LLM and Latent Diffusion Model | news.ycombinator.com | 2023-04-28
Excellent. Some of the theory here goes back to Oct/2021 and beyond [1].
The riffusion.com [2] guys made this practical. Also, my video of high-level overview and examples [3].
1. SpecVQGAN: https://github.com/v-iashin/SpecVQGAN
2. Riffusion: https://www.riffusion.com/
3. Riffusion high-level overview: https://youtu.be/olkLVGcvib8
-
Project mention: I Built a Modular Python Library for Designing and Training Diffusion Models from Scratch | /r/SideProject | 2023-09-06
Last week, I released a project I've been working on for months: Modular Diffusion. It's a modular Python library for designing and training your own Diffusion Models in just a few lines of code. I also wrote a documentation page. The project has already gotten some great community feedback and I'm hoping you guys like it too!
-
word2wave
Word2Wave: a framework for generating short audio samples from a text prompt using WaveGAN and COALA.
-
🎙️ rsxdalv/bark-speaker-directory
audio-generation related posts
- Technique makes Taylor Swift to sing perfect Mandarin Chinese song
- [Research] [Project] Text-to-Audio Generation using Instruction-Tuned LLM and Latent Diffusion Model
- Text-to-Audio Generation Using Instruction Tuned LLM and Latent Diffusion Model
- AI Enhancement of classical music - would you notice the difference?
- AI (not my company) vs Music vs Modular - your thoughts?
Index
What are some of the best open-source audio-generation projects? This list will help you:
# | Project | Stars |
---|---|---|
1 | LocalAI | 18,205 |
2 | Amphion | 3,715 |
3 | AudioLDM | 2,179 |
4 | audio-ai-timeline | 1,852 |
5 | audio-diffusion-pytorch | 1,748 |
6 | tts-generation-webui | 1,165 |
7 | soundstorm-pytorch | 1,099 |
8 | tango | 864 |
9 | SpecVQGAN | 313 |
10 | modular-diffusion | 251 |
11 | word2wave | 116 |
12 | bark-speaker-directory | 45 |