blip-caption vs MetaCLIP

| | blip-caption | MetaCLIP |
|---|---|---|
| Mentions | 2 | 5 |
| Stars | 101 | 1,019 |
| Growth | - | 4.6% |
| Activity | 4.0 | 7.5 |
| Last Commit | 8 months ago | 13 days ago |
| Language | Python | Python |
| License | - | GNU General Public License v3.0 or later |
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed; recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is among the top 10% of the most actively developed projects we are tracking.
blip-caption
- Bash One-Liners for LLMs
I've been gleefully exploring the intersection of LLMs and CLI utilities for a few months now - they are such a great fit for each other! The Unix philosophy of piping things together maps naturally onto how LLMs consume and produce text.
I've mostly been exploring this with my https://llm.datasette.io/ CLI tool, but I have a few other one-off tools as well: https://github.com/simonw/blip-caption and https://github.com/simonw/ospeak
I'm puzzled that more people aren't loudly exploring this space (LLM+CLI) - it's really fun.
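To make the pipe-friendliness concrete, here is a minimal sketch of a stdin-to-LLM filter built on llm's Python API - assuming the llm package is installed and an API key is configured; the model name and prompt are illustrative choices, not taken from the post:

```python
# pipeable.py - read text from stdin, send it to an LLM, print the reply.
# A sketch assuming llm is installed and a key is configured; the model
# name and the summarization prompt are illustrative assumptions.
import sys

import llm

text = sys.stdin.read()
model = llm.get_model("gpt-4o-mini")
response = model.prompt("Summarize this in one sentence:\n\n" + text)
print(response.text())
```

Invoked as, say, `cat notes.txt | python pipeable.py`, it composes with other utilities in exactly the way the comment describes.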
- MetaCLIP – Meta AI Research
I suggest trying BLIP for this - I've had really good results with it.
https://github.com/salesforce/BLIP
I built a tiny Python CLI wrapper for it to make it easier to try: https://github.com/simonw/blip-caption
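For context, blip-caption wraps the BLIP captioning model; the sketch below shows roughly what such a wrapper does internally, using the Hugging Face transformers port of BLIP (the image path and generation length are assumptions, and this is not the wrapper's actual code):

```python
# Caption a local image with BLIP via Hugging Face transformers.
# A rough sketch of what a wrapper like blip-caption does internally;
# "photo.jpg" and max_new_tokens are illustrative assumptions.
from PIL import Image
from transformers import BlipForConditionalGeneration, BlipProcessor

model_id = "Salesforce/blip-image-captioning-base"
processor = BlipProcessor.from_pretrained(model_id)
model = BlipForConditionalGeneration.from_pretrained(model_id)

image = Image.open("photo.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=30)
print(processor.decode(output_ids[0], skip_special_tokens=True))
```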
MetaCLIP
- A History of CLIP Model Training Data Advances
(GitHub Repo | Most Popular Model | Paper)
- How to Build a Semantic Search Engine for Emojis
Whenever I’m working on semantic search applications that connect images and text, I start with a family of models known as contrastive language image pre-training (CLIP). These models are trained on image-text pairs to generate similar vector representations or embeddings for images and their captions, and dissimilar vectors when images are paired with other text strings. There are multiple CLIP-style models, including OpenCLIP and MetaCLIP, but for simplicity we’ll focus on the original CLIP model from OpenAI. No model is perfect, and at a fundamental level there is no right way to compare images and text, but CLIP certainly provides a good starting point.
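As a concrete illustration of the contrastive setup the excerpt describes, here is a minimal sketch that scores one image against a few candidate captions with the original OpenAI CLIP checkpoint via Hugging Face transformers (the image path and caption strings are assumptions for the example):

```python
# Score an image against candidate captions with CLIP. Because the model
# embeds images and text in a shared space, the matching caption should
# receive the highest similarity. "photo.jpg" and the captions are
# illustrative assumptions.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model_id = "openai/clip-vit-base-patch32"
model = CLIPModel.from_pretrained(model_id)
processor = CLIPProcessor.from_pretrained(model_id)

image = Image.open("photo.jpg").convert("RGB")
captions = ["a smiling face emoji", "a red sports car", "a bowl of ramen"]

inputs = processor(text=captions, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(**inputs).logits_per_image  # image-to-text similarity
for caption, prob in zip(captions, logits.softmax(dim=-1)[0].tolist()):
    print(f"{prob:.3f}  {caption}")
```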
- MetaCLIP by Meta AI Research
- MetaCLIP – Meta AI Research
What are some alternatives?
NumPyCLIP - Pure NumPy implementation of https://github.com/openai/CLIP
BLIP - PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
autodistill-metaclip - MetaCLIP module for use with Autodistill.
open_clip - An open source implementation of CLIP.
sgpt - SGPT is a command-line tool that provides a convenient way to interact with OpenAI models, enabling users to run queries, generate shell commands and produce code directly from the terminal.
emoji-search-plugin - Semantic Emoji Search Plugin for FiftyOne
geppetto - Golang GPT-3 tooling