Top 22 image-captioning Open-Source Projects
-
BLIP
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
I suggest trying BLIP for this. I've had really good results from that.
https://github.com/salesforce/BLIP
I built a tiny Python CLI wrapper for it to make it easier to try: https://github.com/simonw/blip-caption
-
InternGPT
InternGPT (iGPT) is an open source demo platform where you can easily showcase your AI models. Now it supports DragGAN, ChatGPT, ImageBind, multimodal chat like GPT-4, SAM, interactive image editing, etc. Try it at igpt.opengvlab.com (an online demo system supporting DragGAN, ChatGPT, ImageBind, and SAM)
You can also create an issue and ask the developers for help.
-
a-PyTorch-Tutorial-to-Image-Captioning
Show, Attend, and Tell | a PyTorch Tutorial to Image Captioning
-
OFA
Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
-
CameraManager
Simple Swift class to provide all the configurations you need to create custom camera view in your app
-
awesome-foundation-and-multimodal-models
👁️ + 💬 + 🎧 = 🤖 Curated list of top foundation and multimodal models! [Paper + Code + Examples + Tutorials]
-
taggui
Project mention: Show HN: I scraped 200M Shopify products to build a search engine | news.ycombinator.com | 2024-02-22
I found some things on Github you could use, I'm not a dev myself and I'm not sure how scalable these are, but have a look, maybe there's something useful. https://github.com/jhc13/taggui
The category filtering is what I wanted to get at, I think the search would improve a lot.
-
DataTurks
ML data annotations made super easy for teams. Just upload data, add your team and build training/evaluation dataset in hours.
-
CLIP-Caption-Reward
PyTorch code for "Fine-grained Image Captioning with CLIP Reward" (Findings of NAACL 2022)
-
UPop
[ICML 2023] UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers.
Project mention: Show HN: Compress vision-language and unimodal AI models by structured pruning | news.ycombinator.com | 2023-07-31
-
Project mention: ByteDetective (first rust project | feedback appreciated) - MacOS Tauri app that let you search for images on your computer by describing them | /r/rust | 2023-07-14
-
perturb-predict-paraphrase
Implementation of Perturb, Predict & Paraphrase: Semi-supervised Learning using Noisy Student for Image Captioning
image-captioning related posts
- [D] Why is most Open Source AI happening outside the USA?
- Is there a website where you can upload a photo and get the description in a paragraph?
- Need help for a colab notebook running Lavis blip2_instruct_vicuna13b?
- most sane web3 job listing
- I work at a non-tech company and have been asked to make software that is impossible. How do I explain it to my boss?
- Two-minute Daily AI Update (Date: 5/15/2023)
- InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning
Index
What are some of the best open-source image-captioning projects? This list will help you:
# | Project | Stars |
---|---|---|
1 | LAVIS | 8,634 |
2 | BLIP | 4,222 |
3 | InternGPT | 3,111 |
4 | a-PyTorch-Tutorial-to-Image-Captioning | 2,591 |
5 | OFA | 2,318 |
6 | CameraManager | 1,348 |
7 | prismer | 1,285 |
8 | Oscar | 1,024 |
9 | virtex | 555 |
10 | meshed-memory-transformer | 497 |
11 | awesome-foundation-and-multimodal-models | 495 |
12 | taggui | 298 |
13 | DataTurks | 255 |
14 | MAGIC | 245 |
15 | catr | 242 |
16 | CLIP-Caption-Reward | 220 |
17 | UPop | 82 |
18 | image-captioning | 28 |
19 | ByteDetective | 25 |
20 | perturb-predict-paraphrase | 5 |
21 | fiftyone-image-captioning-plugin | 5 |
22 | inscriptor | 3 |