prismer
The implementation of "Prismer: A Vision-Language Model with Multi-Task Experts". (by NVlabs)
CLIP-Caption-Reward
PyTorch code for "Fine-grained Image Captioning with CLIP Reward" (Findings of NAACL 2022) (by j-min)
Metric | prismer | CLIP-Caption-Reward
---|---|---
Mentions | 5 | 2
Stars | 1,285 | 225
Growth | -0.2% | -
Activity | 5.2 | 0.0
Last commit | 5 months ago | almost 2 years ago
Language | Python | Python
License | GNU General Public License v3.0 or later | GNU General Public License v3.0 or later
Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
prismer
Posts with mentions or reviews of prismer.
We have used some of these posts to build our list of alternatives
and similar projects.
- [D] Tracking Dancing People
  tracking with An Ensemble of Experts similar to this https://github.com/NVlabs/Prismer
- Meet Prismer: An Open Source Vision-Language Model with An Ensemble of Experts
  Quick Read: https://www.marktechpost.com/2023/03/11/meet-prismer-an-open-source-vision-language-model-with-an-ensemble-of-experts/ Paper: https://arxiv.org/pdf/2303.02506.pdf Code: https://github.com/nvlabs/prismer
- Prismer: A Vision-Language Model with Multi-Modal Experts
- [R] Prismer: An Open Source Vision-Language Model with An Ensemble of Experts.
  Code and Models - https://github.com/NVlabs/prismer
CLIP-Caption-Reward
Posts with mentions or reviews of CLIP-Caption-Reward.
We have used some of these posts to build our list of alternatives
and similar projects.
- is there any "image to text" ai?
  Look for 'image captioning'. Here's an on-line example: https://vision-explorer.allenai.org/image_captioning . Here's a recent one that was open sourced: https://github.com/j-min/CLIP-Caption-Reward
- Adobe AI Researchers Open-Source Image Captioning AI CLIP-S: An Image-Captioning AI Model That Produces Fine-Grained Descriptions of Images
  Continue reading | Checkout the paper, github
What are some alternatives?
When comparing prismer and CLIP-Caption-Reward you can also consider the following projects:
Oscar - Oscar and VinVL
LAVIS - LAVIS - A One-stop Library for Language-Vision Intelligence
InternGPT - InternGPT (iGPT) is an open source demo platform where you can easily showcase your AI models. Now it supports DragGAN, ChatGPT, ImageBind, multimodal chat like GPT-4, SAM, interactive image editing, etc. Try it at igpt.opengvlab.com (an online demo system supporting DragGAN, ChatGPT, ImageBind, and SAM)
Qwen-VL - The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
VLDet - [ICLR 2023] PyTorch implementation of VLDet (https://arxiv.org/abs/2211.14843)
VehicleFinder-CTIM
MAGIC - Language Models Can See: Plugging Visual Controls in Text Generation