Open_clip Alternatives
Similar projects and alternatives to open_clip
-
CLIP
CLIP (Contrastive Language-Image Pre-Training): predict the most relevant text snippet given an image
-
stablediffusion
High-Resolution Image Synthesis with Latent Diffusion Models
-
DALLE-pytorch
Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch
-
bitsandbytes
Accessible large language models via k-bit quantization for PyTorch.
-
taming-transformers
Taming Transformers for High-Resolution Image Synthesis
-
Dreambooth-Stable-Diffusion
Implementation of Dreambooth (https://arxiv.org/abs/2208.12242) with Stable Diffusion
-
clip-retrieval
Easily compute clip embeddings and build a clip retrieval system with them
-
clip-retrieval
Easily compute CLIP embeddings and build a CLIP retrieval system with them
-
InvokeAI
InvokeAI is a leading creative engine for Stable Diffusion models, empowering professionals, artists, and enthusiasts to generate and create visual media using the latest AI-driven technologies. The solution offers an industry-leading WebUI, supports terminal use through a CLI, and serves as the foundation for multiple commercial products.
-
MetaCLIP
ICLR 2024 Spotlight: curation/training code, metadata, distribution, and pre-trained models for MetaCLIP.
-
dalle-lightning
Refactoring dalle-pytorch and taming-transformers for TPU VM
-
openpilot
openpilot is an open source driver assistance system. openpilot performs the functions of Automated Lane Centering and Adaptive Cruise Control for 250+ supported car makes and models.
-
RWKV-LM
RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable), combining the best of RNNs and transformers: great performance, fast inference, low VRAM usage, fast training, "infinite" ctx_len, and free sentence embeddings.
-
xformers
Hackable and optimized Transformers building blocks, supporting a composable construction.
-
MiDaS
Code for robust monocular depth estimation described in "Ranftl et al., Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer, TPAMI 2022"
-
Real-ESRGAN
Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.
open_clip reviews and mentions
-
A History of CLIP Model Training Data Advances
While OpenAI’s CLIP model has garnered a lot of attention, it is far from the only game in town—and far from the best! On the OpenCLIP leaderboard, for instance, the largest and most capable CLIP model from OpenAI ranks just 41st(!) in its average zero-shot accuracy across 38 datasets.
-
How to Build a Semantic Search Engine for Emojis
Whenever I’m working on semantic search applications that connect images and text, I start with a family of models known as contrastive language image pre-training (CLIP). These models are trained on image-text pairs to generate similar vector representations or embeddings for images and their captions, and dissimilar vectors when images are paired with other text strings. There are multiple CLIP-style models, including OpenCLIP and MetaCLIP, but for simplicity we’ll focus on the original CLIP model from OpenAI. No model is perfect, and at a fundamental level there is no right way to compare images and text, but CLIP certainly provides a good starting point.
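The retrieval step described above boils down to cosine similarity between a text embedding and a set of image embeddings. A minimal sketch, using random NumPy vectors as stand-ins for real CLIP outputs (in practice, the embeddings would come from a CLIP model's image and text encoders):

```python
import numpy as np

def cosine_rank(query_emb, image_embs):
    """Rank images by cosine similarity to a query embedding."""
    q = query_emb / np.linalg.norm(query_emb)
    imgs = image_embs / np.linalg.norm(image_embs, axis=1, keepdims=True)
    sims = imgs @ q                  # one similarity score per image
    return np.argsort(-sims), sims   # indices sorted best-match-first

# Toy stand-ins for CLIP embeddings (512-dim, like ViT-B/32 outputs).
rng = np.random.default_rng(0)
images = rng.normal(size=(5, 512))
query = images[3] + 0.01 * rng.normal(size=512)  # near-duplicate of image 3

order, scores = cosine_rank(query, images)
print(order[0])  # index of the best-matching image
```

Because CLIP places matching images and captions near each other in the shared embedding space, the same ranking works whether the query embedding comes from a caption, an emoji description, or another image.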
- MetaCLIP – Meta AI Research
-
COMFYUI SDXL WORKFLOW INBOUND! Q&A NOW OPEN! (WIP EARLY ACCESS WORKFLOW INCLUDED!)
In the model card it says: pretrained text encoders (OpenCLIP-ViT/G and CLIP-ViT/L).
-
What's up in the Python community? – April 2023
https://replicate.com/pharmapsychotic/clip-interrogator
using:
cfg.apply_low_vram_defaults()
interrogate_fast()
I tried lighter models like vit32/laion400 and others, but they are all very slow to load and use (model list: https://github.com/mlfoundations/open_clip).
I'm desperately looking for something more modest and lightweight.
-
Alternate LLM's
They have a great track record on similar-scale projects. They've partnered with /r/datahoarders and volunteers on the creation of training sets, including their 5.8-billion image/text-pair dataset, which they used to train a better version of CLIP.
- Does anyone have recommendations for GPT3 like performance for open-source models? It seems flan-t5 and its variants are the way to go - any other ones?
- 🐍 5 Awesome Python Projects People Don’t Know About
-
Some notes on porting SD2 over to iPhone (or other platforms)
The text encoder uses a new vocabulary set; make sure you copy it from the open_clip repo: https://github.com/mlfoundations/open_clip (I also have these available at: https://github.com/liuliu/swift-diffusion/tree/liu/unet/examples/open_clip)
-
Stable Diffusion 2.0 Release
> Writing a training loop for CLIP manually wound up with me banging against all sorts of strange roadblocks and missing bits of documentation, and I still don't have it working.
There is working training code for OpenCLIP: https://github.com/mlfoundations/open_clip
But training multi-modal text-to-image models is still a _very_ new thing, in terms of the software world. Given that, my experience has been that it's never been easier to get to work on this stuff from the software POV. The hardware is the tricky bit (and preventing bandwidth issues on distributed systems).
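For context, the core of such a CLIP training loop is a symmetric cross-entropy loss over an image-text similarity matrix, where the i-th image and i-th caption in a batch form the positive pair. A minimal NumPy sketch of this contrastive objective (toy random embeddings stand in for real encoder outputs; this illustrates the loss, not the open_clip implementation):

```python
import numpy as np

def clip_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric contrastive loss: matching image/text pairs share a batch index."""
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature       # (batch, batch) similarity matrix
    labels = np.arange(len(logits))          # diagonal entries are positives

    def xent(l):
        # Row-wise softmax cross-entropy against the diagonal targets.
        l = l - l.max(axis=1, keepdims=True)         # numerical stability
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[labels, labels].mean()

    # Average the image->text and text->image directions.
    return (xent(logits) + xent(logits.T)) / 2

rng = np.random.default_rng(0)
img = rng.normal(size=(4, 64))
loss_random = clip_loss(img, rng.normal(size=(4, 64)))       # unrelated pairs
loss_aligned = clip_loss(img, img + 0.01 * rng.normal(size=(4, 64)))
```

Aligned pairs should yield a much lower loss than random pairings; the practical difficulty the commenter points to is not this loss but scaling it across distributed hardware with very large batches.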
-
Stats
mlfoundations/open_clip is an open source project licensed under the GNU General Public License v3.0 or later, which is an OSI-approved license.
The primary programming language of open_clip is Jupyter Notebook.