-
MetaCLIP
ICLR2024 Spotlight: curation/training code, metadata, distribution and pre-trained models for MetaCLIP; CVPR 2024: MoDE: CLIP Data Experts via Clustering
-
BLIP
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
I have been playing with MetaCLIP this afternoon and made https://github.com/autodistill/autodistill-metaclip as a pip-installable version. The Facebook repository has some guidance, but you have to pull the weights yourself, save them, etc.
My inference function (model.predict("image.png")) returns an sv.Classifications object that you can load into supervision for processing (e.g. to get the top k) [1].
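To make the "top k" step concrete, here is a minimal sketch of what it amounts to, assuming the prediction boils down to parallel lists of class ids and confidences. It is plain Python with made-up values, not the supervision API itself; the `top_k` helper, the labels, and the numbers are all hypothetical, and supervision has its own way of doing this.

```python
def top_k(class_ids, confidences, k):
    """Return the k (class_id, confidence) pairs with the highest confidence."""
    pairs = sorted(zip(class_ids, confidences), key=lambda p: p[1], reverse=True)
    return pairs[:k]


# Hypothetical classifier output: one confidence per class id.
labels = ["cat", "dog", "car", "tree"]
class_ids = [0, 1, 2, 3]
confidences = [0.10, 0.62, 0.05, 0.23]

top = top_k(class_ids, confidences, k=2)
for class_id, confidence in top:
    print(labels[class_id], confidence)
```

Sorting the whole list is fine for a handful of classes; for very large label sets a partial sort would be the usual choice.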
The paper [2] notes the following in terms of performance:
> In Table 4, we observe that MetaCLIP outperforms OpenAI CLIP on ImageNet and average accuracy across 26 tasks, for 3 model scales. With 400 million training data points on ViT-B/32, MetaCLIP outperforms CLIP by +2.1% on ImageNet and by +1.6% on average. On ViT-B/16, MetaCLIP outperforms CLIP by +2.5% on ImageNet and by +1.5% on average. On ViT-L/14, MetaCLIP outperforms CLIP by +0.7% on ImageNet and by +1.4% on average across the 26 tasks.
[1] https://github.com/autodistill/autodistill-metaclip
https://github.com/mlfoundations/open_clip/blob/main/docs/op...
I found CLIP to be _amazing_ for all kinds of image search, like search-by-text or search-by-image. I even ported it to NumPy to understand it better. The whole thing is less than 500 lines of Python: https://github.com/99991/NumPyCLIP
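For anyone wondering how search-by-text works once the embeddings exist, here is a rough sketch: rank the image embeddings by cosine similarity to the query embedding. The vectors below are tiny made-up stand-ins (real CLIP embeddings have hundreds of dimensions and come from the model), and `search` is a hypothetical helper, not something from the linked repo.

```python
import math


def normalize(v):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def search(query_embedding, image_embeddings, k=3):
    """Return the k (name, similarity) pairs most similar to the query."""
    q = normalize(query_embedding)
    scored = []
    for name, embedding in image_embeddings.items():
        e = normalize(embedding)
        scored.append((name, sum(a * b for a, b in zip(q, e))))
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Toy 3-dimensional embeddings standing in for precomputed image embeddings.
image_embeddings = {
    "beach.jpg": [0.9, 0.1, 0.0],
    "forest.jpg": [0.1, 0.9, 0.1],
    "city.jpg": [0.0, 0.2, 0.9],
}
# Stand-in for the embedding of a text query (or of a query image).
query = [0.8, 0.2, 0.1]

results = search(query, image_embeddings, k=2)
for name, score in results:
    print(name, round(score, 3))
```

Search-by-image is the same ranking with an image embedding as the query, since text and images share one embedding space.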
I suggest trying BLIP for this. I've had really good results from that.
https://github.com/salesforce/BLIP
I built a tiny Python CLI wrapper for it to make it easier to try: https://github.com/simonw/blip-caption