BLIP
nix-stable-diffusion
Metric | BLIP | nix-stable-diffusion |
---|---|---|
Mentions | 14 | 1 |
Stars | 4,242 | 7 |
Growth | 5.5% | - |
Activity | 0.0 | 10.0 |
Last commit | 7 months ago | over 1 year ago |
Language | Jupyter Notebook | Jupyter Notebook |
License | BSD 3-clause "New" or "Revised" License | GNU General Public License v3.0 or later |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
BLIP
-
MetaCLIP – Meta AI Research
I suggest trying BLIP for this. I've had really good results from that.
https://github.com/salesforce/BLIP
I built a tiny Python CLI wrapper for it to make it easier to try: https://github.com/simonw/blip-caption
-
Is there a website where you can upload a photo and get the description in a paragraph?
You can download the source and run it yourself from here: https://github.com/salesforce/BLIP
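For anyone who wants to try captioning locally, here is a minimal sketch. It is an assumption-heavy shortcut: instead of the scripts in the salesforce/BLIP repository, it uses the Hugging Face `transformers` port of BLIP and the `Salesforce/blip-image-captioning-base` checkpoint, neither of which is mentioned in the post above. The function name `caption_image` is made up for illustration.

```python
def caption_image(path, model_name="Salesforce/blip-image-captioning-base"):
    """Return a short caption for the image at `path`.

    Assumption: uses the Hugging Face port of BLIP, not the original
    salesforce/BLIP code. Requires `pip install transformers torch pillow`.
    """
    # Imported lazily because these packages are large; the first call also
    # downloads the checkpoint from the Hugging Face Hub.
    from PIL import Image
    from transformers import BlipForConditionalGeneration, BlipProcessor

    processor = BlipProcessor.from_pretrained(model_name)
    model = BlipForConditionalGeneration.from_pretrained(model_name)

    # BLIP expects RGB input; convert in case the file is grayscale or RGBA.
    image = Image.open(path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")

    # Generate token ids for the caption and decode them back to text.
    output_ids = model.generate(**inputs, max_new_tokens=40)
    return processor.decode(output_ids[0], skip_special_tokens=True)
```

Calling `caption_image("photo.jpg")` (a placeholder file name) should return a single short sentence rather than a full paragraph, so longer descriptions would need a different model or extra prompting.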
-
Stable Diffusion v2-1-unCLIP model released
Then there's also BLIP (Bootstrapping Language-Image Pre-training).
-
GPT-4 shows emergent Theory of Mind on par with an adult. It scored in the 85+ percentile for a lot of major college exams. It can also do taxes and create functional websites from a simple drawing
Or BLIP
-
meme
GitHub - salesforce/BLIP: PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
-
Object Recognition for Photo Metadata
From what I understand, what matters most to you is having a model that's already trained on something, rather than the architecture. YOLO is probably fine, as would be some of the older ones. You should be able to find a model that's been pretrained on COCO, and you can check which classes it includes. I don't know if there are other broadly trained models available that will serve your purpose. What I'd do is just run your picture through a COCO-trained object detection model and see if the annotations do what you want.
Though backing up a bit, there are also image captioning models that may do a better job of organizing your photos. I'm not really familiar with any, though I did come across BLIP the other day (I haven't used it): https://github.com/salesforce/BLIP
This may be a better way to get at what you want.
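To judge whether detector annotations are useful as photo metadata, it can help to reduce them to keywords first. The sketch below assumes the detector output has already been converted to `(label, score)` pairs; that shape, the 0.5 threshold, and the function name `detections_to_keywords` are illustrative choices, not something a particular model guarantees.

```python
from collections import Counter


def detections_to_keywords(detections, min_score=0.5):
    """Collapse (label, score) pairs into sorted keywords such as 'person x2'."""
    # Keep only confident detections and count how often each label appears.
    counts = Counter(label for label, score in detections if score >= min_score)
    # One keyword per label, with a count suffix when it appears more than once.
    return [label if n == 1 else f"{label} x{n}" for label, n in sorted(counts.items())]


# Hypothetical detector output for a single photo (labels are COCO classes).
example = [("person", 0.98), ("dog", 0.91), ("person", 0.87), ("frisbee", 0.32)]
print(detections_to_keywords(example))  # ['dog', 'person x2']
```

If the resulting keywords are too coarse for organizing photos, that is a hint that a captioning model may fit the task better than a fixed set of classes.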
-
I have a problem with the "interrogate" function of Automatic1111's fork. Can someone help me?
git clone https://github.com/salesforce/BLIP.git repositories/BLIP
-
Stable-diffusion in Nix
    # Copy models as described in README
    cp ~/Downloads/model.ckpt .
    cp ~/Downloads/GFPGANv1.3.pth .
    # Clone other repos as mentioned in README
    mkdir repositories
    git clone https://github.com/CompVis/stable-diffusion.git repositories/stable-diffusion
    git clone https://github.com/CompVis/taming-transformers.git repositories/taming-transformers
    git clone https://github.com/sczhou/CodeFormer.git repositories/CodeFormer
    git clone https://github.com/salesforce/BLIP.git repositories/BLIP
    export NIXPKGS_ALLOW_UNFREE=1
    nix-shell default.nix
    pip install torch --extra-index-url https://download.pytorch.org/whl/cu113 # Also from linux instructions. Can probably be added to default.nix
    python webui.py
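Before launching `webui.py`, a quick pre-flight check can confirm that the copied files and cloned repositories from the commands above are in place. This is only a sketch: the file and directory names are taken from those commands, while the helper `missing_paths` is a hypothetical name and not part of the web UI.

```python
from pathlib import Path

# Names taken from the setup commands; adjust if your layout differs.
REQUIRED_FILES = ["model.ckpt", "GFPGANv1.3.pth"]
REQUIRED_REPOS = ["stable-diffusion", "taming-transformers", "CodeFormer", "BLIP"]


def missing_paths(root="."):
    """Return the expected files and repositories that are absent under `root`."""
    root = Path(root)
    # Model files are expected directly in the working directory.
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    # Cloned repositories are expected under repositories/.
    missing += [
        f"repositories/{name}"
        for name in REQUIRED_REPOS
        if not (root / "repositories" / name).is_dir()
    ]
    return missing
```

An empty list from `missing_paths()` means every expected path exists; it does not verify that the checkpoints are valid or that the clones are complete.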
-
My easy-to-install Windows GUI for Stable Diffusion is ready for a beta release! It supports img2img as well, various samplers, can run multiple scales per image automatically, and more!
Also check img2text (basically to prompt): https://github.com/salesforce/BLIP
-
[D] Author Interview - BLIP: Bootstrapping Language-Image Pre-training (Video)
nix-stable-diffusion
-
Stable-diffusion in Nix
I have been playing around with a memory-efficient version of [stable-diffusion](https://github.com/basujindal/stable-diffusion), an AI model that outputs images based on input text. I think it is rather neat, and [this post](https://xeiaso.net/blog/stable-diffusion-nixos) by Xeiaso (good blog for nix resources and rants) inspired me to try to make the installation process more nix-friendly. So I have forked the original [stable-diffusion repo](https://github.com/CompVis/stable-diffusion) and added a [`default.nix`](https://github.com/Danielhp95/nix-stable-diffusion/blob/main/default.nix) which drops you into a `nix-shell` with all dependencies: CUDA (NVIDIA) stuff, Python dependencies and C++ libraries for PyTorch (deep learning framework). **Note** that in order to run it you must first download the model weights from [this repo](https://huggingface.co/CompVis/stable-diffusion-v1-4) and place them at `models/ldm/stable-diffusion-v1/model.ckpt`.
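Placing the weights can be scripted. The sketch below copies a downloaded checkpoint to the path named in the post, creating the intermediate directories if needed. The helper name `install_weights` and its arguments are hypothetical; the name of the file you download from Hugging Face is left as a parameter because the post does not state it.

```python
import shutil
from pathlib import Path

# Target location named in the post, relative to the repository root.
WEIGHTS_RELATIVE_PATH = Path("models/ldm/stable-diffusion-v1/model.ckpt")


def install_weights(downloaded, repo_root="."):
    """Copy the downloaded checkpoint to the location the repository expects."""
    target = Path(repo_root) / WEIGHTS_RELATIVE_PATH
    # Create models/ldm/stable-diffusion-v1/ if it does not exist yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(downloaded, target)
    return target
```

Copying rather than moving keeps the original download intact, at the cost of extra disk space for a large checkpoint file.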
What are some alternatives?
CLIP - CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image
CodeFormer - [NeurIPS 2022] Towards Robust Blind Face Restoration with Codebook Lookup Transformer
a-PyTorch-Tutorial-to-Image-Captioning - Show, Attend, and Tell | a PyTorch Tutorial to Image Captioning
taming-transformers - Taming Transformers for High-Resolution Image Synthesis
stable-diffusion - A latent text-to-image diffusion model
virtex - [CVPR 2021] VirTex: Learning Visual Representations from Textual Annotations
stable-diffusion-webui - Stable Diffusion web UI
stable-diffusion - Optimized Stable Diffusion modified to run on lower GPU VRAM
rtic-gcn-pytorch - Official PyTorch Implementation of RTIC
ghci-ng