| | NExT-GPT | Otter |
|---|---|---|
| Mentions | 1 | 4 |
| Stars | 2,882 | 3,463 |
| Growth | - | - |
| Activity | 9.3 | 9.1 |
| Latest Commit | 4 months ago | 2 months ago |
| Language | Python | Python |
| License | BSD 3-clause "New" or "Revised" License | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
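The recency-weighted activity score described above can be illustrated with a toy metric in which each commit's contribution decays exponentially with its age. This is only a sketch of the general idea, not the site's actual formula; the function name and half-life parameter are illustrative:

```python
def activity_score(commit_ages_days, half_life_days=30.0):
    """Toy activity metric: each commit contributes a weight that
    halves every `half_life_days`, so recent commits count more
    than older ones (illustrative only, not the real formula)."""
    return sum(0.5 ** (age / half_life_days) for age in commit_ages_days)

# Four recent commits outweigh four much older ones.
recent = activity_score([1, 3, 7, 10])
stale = activity_score([90, 120, 150, 180])
```

Under this scheme, two projects with the same total commit count can have very different scores depending on how recently those commits landed.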
- OpenAI vs Google, Detect ChatGPT Content with 99% accuracy, Navigating AI compute costs
  - 👀 Video-LLaMA - Empower large language models with video and audio understanding capability. (link)
  - 🦦 Otter - Multi-modal model with improved instruction-following and in-context learning ability.
  - 🔗 Linkly.AI - AI-powered lead analytics and management platform that helps you track, analyze, and streamline your leads in one place.
  - 🎬 Jet Cut Ready - AI plugin for Adobe Premiere Pro that automatically removes silent parts in videos. (link)
  - 💬 HeyGen's ChatGPT Plugin - Convert text into high-quality videos using AI text and video generation.
- Multimodal models and "active" learning
- Otter: A Multi-Modal Model with In-Context Instruction Tuning
Otter is a multi-modal model built on OpenFlamingo (an open-source version of DeepMind's Flamingo) and trained on a dataset of multi-modal instruction-response pairs. Otter demonstrates strong proficiency in multi-modal perception, reasoning, and in-context learning.
The GitHub repo includes Hugging Face links to the model: https://github.com/Luodian/Otter
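To make "multi-modal instruction-response pairs" concrete, here is a hypothetical sketch of what one training example might look like. The field names are illustrative assumptions, not the actual schema of Otter's training data:

```python
# Hypothetical shape of one multi-modal instruction-response training
# example (field names are illustrative, not the dataset's real schema).
example = {
    "images": ["frame_001.jpg", "frame_002.jpg"],  # visual context
    "instruction": "What is the animal in the clip doing?",
    "response": "It is floating on its back while eating.",
    # Few-shot demonstrations like these are what enable the model's
    # in-context learning ability at inference time.
    "in_context_examples": [
        {
            "instruction": "How many animals are visible?",
            "response": "Two.",
        }
    ],
}
```

Pairing each query with a handful of such demonstrations is what distinguishes in-context instruction tuning from plain instruction tuning.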
What are some alternatives?
mPLUG-Owl - mPLUG-Owl & mPLUG-Owl2: Modularized Multimodal Large Language Model
LLaMA-Adapter - [ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters
gpt_academic - Provides a practical interactive interface for LLMs such as GPT/GLM, specially optimized for reading, polishing, and writing academic papers. Modular design with support for custom shortcut buttons and function plugins; can analyze and self-document Python and C++ projects; translates and summarizes PDF/LaTeX papers; supports querying multiple LLMs in parallel and local models such as chatglm3. Integrates Qwen (Tongyi Qianwen), deepseekcoder, iFlytek Spark, ERNIE Bot, llama2, rwkv, claude2, moss, and more.
Video-LLaMA - [EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
InternChat - InternGPT / InternChat allows you to interact with ChatGPT by clicking, dragging and drawing using a pointing device. [Moved to: https://github.com/OpenGVLab/InternGPT]
Sophia - Effortless plug-and-play optimizer to cut model training costs by 50%. A new optimizer that is 2x faster than Adam on LLMs.
Awesome-Multimodal-Large-Language-Models - :sparkles::sparkles:Latest Papers and Datasets on Multimodal Large Language Models, and Their Evaluation.
InternGPT - InternGPT (iGPT) is an open source demo platform where you can easily showcase your AI models. Now it supports DragGAN, ChatGPT, ImageBind, multimodal chat like GPT-4, SAM, interactive image editing, etc. Try it at igpt.opengvlab.com (an online demo system supporting DragGAN, ChatGPT, ImageBind, and SAM).
LinkedInGPT - Skynet
unilm - Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
squeezelite-esp32 - ESP32 Music streaming based on Squeezelite, with support for multi-room sync, AirPlay, Bluetooth, Hardware buttons, display and more