AudioLDM
AudioGPT
Metric | AudioLDM | AudioGPT
---|---|---
Mentions | 10 | 4
Stars | 2,238 | 9,788
Growth | - | 0.7%
Activity | 6.0 | 3.7
Last commit | 6 months ago | about 1 month ago
Language | Python | Python
License | GNU General Public License v3.0 or later | GNU General Public License v3.0 or later
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
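The site does not publish the exact formulas behind these numbers, so the following is only a minimal sketch of how such metrics could be computed. The function names, the 30-day half-life, and the sample values are all assumptions made for illustration, not the tracker's actual method.

```python
from bisect import bisect_left


def month_over_month_growth(stars_last_month: int, stars_now: int) -> float:
    """Percent change in stars between two monthly snapshots."""
    if stars_last_month <= 0:
        raise ValueError("baseline star count must be positive")
    return (stars_now - stars_last_month) / stars_last_month * 100.0


def weighted_commits(commit_ages_days: list[float], half_life_days: float = 30.0) -> float:
    """Raw activity value: each commit counts less the older it is.

    Assumption: exponential decay with a hypothetical 30-day half-life.
    """
    return sum(0.5 ** (age / half_life_days) for age in commit_ages_days)


def activity_score(raw: float, all_raw: list[float]) -> float:
    """Map a raw value to a 0-10 scale by its percentile rank among tracked projects."""
    ranked = sorted(all_raw)
    below = bisect_left(ranked, raw)  # number of projects with a strictly lower raw value
    return round(10.0 * below / len(ranked), 1)


# Hypothetical data: ten tracked projects with raw activity values 1..10.
tracked = [float(v) for v in range(1, 11)]
top_project_score = activity_score(10.0, tracked)
growth = month_over_month_growth(1000, 1007)
```

Under these assumptions, the most active of the ten hypothetical projects gets a score of 9.0, which matches the reading that 9.0 places a project in the top 10%. A project going from 1,000 to 1,007 stars in a month shows roughly 0.7% growth.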
AudioLDM
- Want to know if there's an ai for text (prompt) to sound effects like stable diffusion
- GitHub - haoheliu/AudioLDM: AudioLDM: Generate speech, sound effects, music and beyond, with text.
- AudioLDM: Text-to-Audio Generation with Latent Diffusion Models
- AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
  Take a look at AudioLDM (https://github.com/haoheliu/AudioLDM), it might be more what you expected:
  * Text-to-Audio Generation: Generate audio given text input.
- Are you digital or traditional artist or student and use Stable Diffusion?
  As a part time filmmaker, there's no way that I could be this close to being done after a week worth of work. AudioLDM (https://audioldm.github.io/) saved me so much time bc instead of looking for sonic textures or futzing around with a synth, I was able to prompt my way to a 30s audio output.
- [N] AudioLM now available on GitHub and HF with demo and checkpoint
  GitHub: https://github.com/haoheliu/AudioLDM
AudioGPT
- FLiPN-FLaNK Stack Weekly May 8 2023
- AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
  Large language models (LLMs) have exhibited remarkable capabilities across a variety of domains and tasks, challenging our understanding of learning and cognition. Despite the recent success, current LLMs are not capable of processing complex audio information or conducting spoken conversations (like Siri or Alexa). In this work, we propose a multi-modal AI system named AudioGPT, which complements LLMs (i.e., ChatGPT) with 1) foundation models to process complex audio information and solve numerous understanding and generation tasks; and 2) the input/output interface (ASR, TTS) to support spoken dialogue. With an increasing demand to evaluate multi-modal LLMs of human intention understanding and cooperation with foundation models, we outline the principles and processes and test AudioGPT in terms of consistency, capability, and robustness. Experimental results demonstrate the capabilities of AudioGPT in solving AI tasks with speech, music, sound, and talking head understanding and generation in multi-round dialogues, which empower humans to create rich and diverse audio content with unprecedented ease. Our system is publicly available at https://github.com/AIGC-Audio/AudioGPT.
What are some alternatives?
highstorm - Open Source Event Monitoring
thinkgpt - Agent techniques to augment your LLM and push it beyond its limits
Discord-Chatbot-Gpt4Free - This is a Discord Chatbot with image detection, OCR, internet access and DALL-E image generation for free [Moved to: https://github.com/mishalhossin/Discord-AI-Chatbot]
CML_AMP_LLM_Chatbot_Augmented_with_Enterprise_Data
vscode-openai-code-analyzer - Analyze code with OpenAI
pranadb
browsr - 🗂️ a pleasant file explorer in your terminal supporting all filesystems
glow - Render markdown on the CLI, with pizzazz! 💅🏻
123elf - A native port of Lotus 1-2-3 to Linux.
roadmapper - Roadmapper - A Roadmap as Code (Rac) python library. Generate professional roadmap diagram using python code.
frogmouth - A Markdown browser for your terminal