image-captioning

Open-source projects categorized as image-captioning

Top 22 image-captioning Open-Source Projects

  • LAVIS

    LAVIS - A One-stop Library for Language-Vision Intelligence

    Project mention: FLaNK AI for 11 March 2024 | dev.to | 2024-03-11

  • BLIP

    PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

    Project mention: MetaCLIP – Meta AI Research | news.ycombinator.com | 2023-10-26

    I suggest trying BLIP for this. I've had really good results from that.

    https://github.com/salesforce/BLIP

    I built a tiny Python CLI wrapper for it to make it easier to try: https://github.com/simonw/blip-caption

  • InternGPT

    InternGPT (iGPT) is an open source demo platform where you can easily showcase your AI models. Now it supports DragGAN, ChatGPT, ImageBind, multimodal chat like GPT-4, SAM, interactive image editing, etc. Try it at igpt.opengvlab.com (an online demo system supporting DragGAN, ChatGPT, ImageBind, and SAM)

    Project mention: How do I use the programs on Github? | /r/github | 2023-06-16

    You can also create an issue and ask the developers for help.

  • a-PyTorch-Tutorial-to-Image-Captioning

    Show, Attend, and Tell | a PyTorch Tutorial to Image Captioning

  • OFA

    Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

  • CameraManager

    Simple Swift class to provide all the configurations you need to create custom camera view in your app

  • prismer

    The implementation of "Prismer: A Vision-Language Model with Multi-Task Experts".

  • Oscar

    Oscar and VinVL

  • virtex

    [CVPR 2021] VirTex: Learning Visual Representations from Textual Annotations

  • meshed-memory-transformer

    Meshed-Memory Transformer for Image Captioning. CVPR 2020

  • awesome-foundation-and-multimodal-models

    👁️ + 💬 + 🎧 = 🤖 Curated list of top foundation and multimodal models! [Paper + Code + Examples + Tutorials]

    Project mention: Foundation Multimodal Models | news.ycombinator.com | 2024-03-01

  • taggui

    Tag manager and captioner for image datasets

    Project mention: Show HN: I scraped 200M Shopify products to build a search engine | news.ycombinator.com | 2024-02-22

    I found some things on GitHub you could use. I'm not a dev myself and I'm not sure how scalable these are, but have a look; maybe there's something useful. https://github.com/jhc13/taggui

    The category filtering is what I wanted to get at, I think the search would improve a lot.

  • DataTurks

    ML data annotations made super easy for teams. Just upload data, add your team and build training/evaluation dataset in hours.

  • MAGIC

    Language Models Can See: Plugging Visual Controls in Text Generation (by yxuansu)

  • catr

    Image Captioning Using Transformer

  • CLIP-Caption-Reward

    PyTorch code for "Fine-grained Image Captioning with CLIP Reward" (Findings of NAACL 2022)

  • UPop

    [ICML 2023] UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers.

    Project mention: Show HN: Compress vision-language and unimodal AI models by structured pruning | news.ycombinator.com | 2023-07-31

  • image-captioning

    Image captioning using Python and BLIP

  • ByteDetective

    The easiest way to search for images on your desktop 🔎

    Project mention: ByteDetective (first rust project | feedback appreciated) - MacOS Tauri app that let you search for images on your computer by describing them | /r/rust | 2023-07-14

  • perturb-predict-paraphrase

    Implementation of Perturb, Predict & Paraphrase: Semi-supervised Learning using Noisy Student for Image Captioning

  • fiftyone-image-captioning-plugin

    Caption images across your datasets with state of the art models from Hugging Face and Replicate!

    Project mention: How to Cluster Images | dev.to | 2024-04-09

    Concept Modeling Techniques: the built-in concept modeling technique in this walkthrough uses GPT-4V and some light prompting to identify each cluster's core concept. This is but one way to approach an open-ended problem. Try using image captioning and topic modeling, or create your own technique!

  • inscriptor

    BLIP-2 captioning, mass captioning, question answering, and other tools.

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2024-04-09.

Index

What are some of the best open-source image-captioning projects? This list will help you:

Rank  Project                                   Stars
1     LAVIS                                     8,634
2     BLIP                                      4,222
3     InternGPT                                 3,111
4     a-PyTorch-Tutorial-to-Image-Captioning    2,591
5     OFA                                       2,318
6     CameraManager                             1,348
7     prismer                                   1,285
8     Oscar                                     1,024
9     virtex                                      555
10    meshed-memory-transformer                   497
11    awesome-foundation-and-multimodal-models    495
12    taggui                                      298
13    DataTurks                                   255
14    MAGIC                                       245
15    catr                                        242
16    CLIP-Caption-Reward                         220
17    UPop                                         82
18    image-captioning                             28
19    ByteDetective                                25
20    perturb-predict-paraphrase                    5
21    fiftyone-image-captioning-plugin              5
22    inscriptor                                    3