super-gradients
AudioGPT
| | super-gradients | AudioGPT |
|---|---|---|
| Mentions | 8 | 4 |
| Stars | 4,343 | 9,788 |
| Growth | 1.6% | 0.7% |
| Activity | 9.5 | 3.7 |
| Last commit | 7 days ago | about 1 month ago |
| Language | Jupyter Notebook | Python |
| License | Apache License 2.0 | GNU General Public License v3.0 or later |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
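As a rough illustration of the Growth figure, the sketch below uses the standard month-over-month percentage formula. The page does not state exactly how it computes growth, so the formula, the function names, and the derived "previous month" counts are assumptions for illustration only.

```python
def month_over_month_growth(previous: int, current: int) -> float:
    """Percentage change in star count from one month to the next."""
    if previous <= 0:
        raise ValueError("previous star count must be positive")
    return (current - previous) / previous * 100.0


def implied_previous_stars(current: int, growth_pct: float) -> float:
    """Invert the formula: estimated star count one month earlier."""
    return current / (1.0 + growth_pct / 100.0)


if __name__ == "__main__":
    # Star counts and growth rates are taken from the comparison table;
    # the earlier counts are derived estimates, not reported values.
    projects = [("super-gradients", 4343, 1.6), ("AudioGPT", 9788, 0.7)]
    for name, stars, growth in projects:
        previous = implied_previous_stars(stars, growth)
        print(f"{name}: roughly {stars - previous:.0f} new stars in the last month")
```

Under this assumption, both projects gained a similar absolute number of stars over the month, even though the percentage for super-gradients is higher because its base is smaller.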
super-gradients
- Zero-Shot Prediction Plugin for FiftyOne
  Most computer vision models are trained to predict on a preset list of label classes. In object detection, for instance, many of the most popular models like YOLOv8 and YOLO-NAS are pretrained with the classes from the MS COCO dataset. If you download the weights checkpoints for these models and run prediction on your dataset, you will generate object detection bounding boxes for the 80 COCO classes.
- Open Source Advent Fun Wraps Up!
  23. SuperGradients | GitHub | tutorial
- FLaNK Stack Weekly 06 Nov 2023
- Autodistill: A new way to create CV models
  And the target models include YOLOv8 (You Only Look Once), YOLO-NAS, YOLOv5, and DETR.
- FLaNK Stack for 15 May 2023
- GitHub - Deci-AI/super-gradients: Easily train or fine-tune SOTA co...
- Meet YOLO-NAS: An Open-Sourced YOLO-based Architecture Redefining State-of-the-Art in Object Detection
- FLiPN-FLaNK Stack Weekly May 8 2023
AudioGPT
- FLiPN-FLaNK Stack Weekly May 8 2023
- AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
  Large language models (LLMs) have exhibited remarkable capabilities across a variety of domains and tasks, challenging our understanding of learning and cognition. Despite the recent success, current LLMs are not capable of processing complex audio information or conducting spoken conversations (like Siri or Alexa). In this work, we propose a multi-modal AI system named AudioGPT, which complements LLMs (i.e., ChatGPT) with 1) foundation models to process complex audio information and solve numerous understanding and generation tasks; and 2) the input/output interface (ASR, TTS) to support spoken dialogue. With an increasing demand to evaluate multi-modal LLMs of human intention understanding and cooperation with foundation models, we outline the principles and processes and test AudioGPT in terms of consistency, capability, and robustness. Experimental results demonstrate the capabilities of AudioGPT in solving AI tasks with speech, music, sound, and talking head understanding and generation in multi-round dialogues, which empower humans to create rich and diverse audio content with unprecedented ease. Our system is publicly available at https://github.com/AIGC-Audio/AudioGPT.
What are some alternatives?
ultralytics - NEW - YOLOv8 🚀 in PyTorch > ONNX > OpenVINO > CoreML > TFLite
AudioLDM - AudioLDM: Generate speech, sound effects, music and beyond, with text.
SegGradCAM - SEG-GRAD-CAM: Interpretable Semantic Segmentation via Gradient-Weighted Class Activation Mapping
highstorm - Open Source Event Monitoring
thinkgpt - Agent techniques to augment your LLM and push it beyond its limits
openvino_notebooks - 📚 Jupyter notebook tutorials for OpenVINO™
Discord-Chatbot-Gpt4Free - This is a Discord Chatbot with image detection, OCR, internet access and DALL-E image generation for free [Moved to: https://github.com/mishalhossin/Discord-AI-Chatbot]
pyvideotrans - Translate the video from one language to another and add dubbing.
CML_AMP_LLM_Chatbot_Augmented_with_Enterprise_Data
Detic - Code release for "Detecting Twenty-thousand Classes using Image-level Supervision".
vscode-openai-code-analyzer - Analyze code with OpenAI