Metric | ai-audio-startups | AudioGPT |
---|---|---|
Mentions | 1 | 4 |
Stars | 1,468 | 9,839 |
Growth | - | 0.6% |
Activity | 6.0 | 3.7 |
Last commit | 4 days ago | 2 months ago |
Language | Python | |
License | Apache License 2.0 | GNU General Public License v3.0 or later |
Stars: the number of stars a project has on GitHub. Growth: month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
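The formulas behind these numbers are not given here, so the following is only a minimal sketch of how such metrics could be computed. The function names, the 30-day half-life, and the percentile-to-score mapping are all assumptions chosen to match the description above (recent commits weigh more; 9.0 means ahead of roughly 90% of tracked projects), not the tracker's actual method.

```python
from typing import Iterable, Sequence


def month_over_month_growth(stars_now: int, stars_month_ago: int) -> float:
    """Percentage change in star count over one month."""
    if stars_month_ago <= 0:
        raise ValueError("stars_month_ago must be positive")
    return (stars_now - stars_month_ago) / stars_month_ago * 100.0


def recency_weighted_activity(
    commit_ages_days: Iterable[float], half_life_days: float = 30.0
) -> float:
    """Raw activity: each commit counts less the older it is.

    Assumption: exponential decay with a hypothetical 30-day half-life,
    so a commit made today weighs 1.0 and one made 30 days ago weighs 0.5.
    """
    return sum(0.5 ** (age / half_life_days) for age in commit_ages_days)


def activity_score(raw: float, all_raw: Sequence[float]) -> float:
    """Relative score on a 0-10 scale.

    Assumption: the score is 10 times the fraction of tracked projects
    with a strictly lower raw activity, so 9.0 means the project is ahead
    of 90% of them, i.e. in the top 10%.
    """
    if not all_raw:
        raise ValueError("all_raw must not be empty")
    below = sum(1 for other in all_raw if other < raw)
    return round(10.0 * below / len(all_raw), 1)
```

With hypothetical inputs, `month_over_month_growth(1006, 1000)` gives about 0.6 (percent), and a project whose raw activity exceeds 90 of 100 tracked projects gets `activity_score` 9.0.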
ai-audio-startups
AudioGPT
- FLiPN-FLaNK Stack Weekly May 8 2023
- AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
Large language models (LLMs) have exhibited remarkable capabilities across a variety of domains and tasks, challenging our understanding of learning and cognition. Despite the recent success, current LLMs are not capable of processing complex audio information or conducting spoken conversations (like Siri or Alexa). In this work, we propose a multi-modal AI system named AudioGPT, which complements LLMs (i.e., ChatGPT) with 1) foundation models to process complex audio information and solve numerous understanding and generation tasks; and 2) the input/output interface (ASR, TTS) to support spoken dialogue. With an increasing demand to evaluate multi-modal LLMs of human intention understanding and cooperation with foundation models, we outline the principles and processes and test AudioGPT in terms of consistency, capability, and robustness. Experimental results demonstrate the capabilities of AudioGPT in solving AI tasks with speech, music, sound, and talking head understanding and generation in multi-round dialogues, which empower humans to create rich and diverse audio content with unprecedented ease. Our system is publicly available at https://github.com/AIGC-Audio/AudioGPT.
What are some alternatives?
AudioLDM - AudioLDM: Generate speech, sound effects, music and beyond, with text.
highstorm - Open Source Event Monitoring
thinkgpt - Agent techniques to augment your LLM and push it beyond its limits
Discord-Chatbot-Gpt4Free - This is a Discord Chatbot with image detection, OCR, internet access and DALL-E image generation for free [Moved to: https://github.com/mishalhossin/Discord-AI-Chatbot]
CML_AMP_LLM_Chatbot_Augmented_with_Enterprise_Data
vscode-openai-code-analyzer - Analyze code with OpenAI
pranadb
browsr - 🗂️ a pleasant file explorer in your terminal supporting all filesystems
glow - Render markdown on the CLI, with pizzazz! 💅🏻
123elf - A native port of Lotus 1-2-3 to Linux.
roadmapper - Roadmapper - A Roadmap as Code (RaC) Python library. Generate professional roadmap diagrams using Python code.
frogmouth - A Markdown browser for your terminal