BLIP
stable-diffusion-webui
| | BLIP | stable-diffusion-webui |
|---|---|---|
| Mentions | 14 | 75 |
| Stars | 4,242 | 2,208 |
| Growth | 5.5% | - |
| Activity | 0.0 | 9.8 |
| Latest commit | 7 months ago | over 1 year ago |
| Language | Jupyter Notebook | Python |
| License | BSD 3-clause "New" or "Revised" License | GNU Affero General Public License v3.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
BLIP
-
MetaCLIP – Meta AI Research
I suggest trying BLIP for this. I've had really good results from that.
https://github.com/salesforce/BLIP
I built a tiny Python CLI wrapper for it to make it easier to try: https://github.com/simonw/blip-caption
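As a rough illustration of the kind of captioning these comments recommend, here is a minimal sketch using the Hugging Face `transformers` port of BLIP with the `Salesforce/blip-image-captioning-base` checkpoint. The function name and defaults are my own, not part of either linked repo, and the heavy imports are deferred so the file loads even before the model dependencies are installed:

```python
def caption_image(image_path, model_name="Salesforce/blip-image-captioning-base"):
    """Generate a one-line caption for an image with BLIP.

    Imports are deferred so this helper can be defined without
    torch/transformers installed; calling it downloads the checkpoint
    on first use.
    """
    from PIL import Image
    from transformers import BlipProcessor, BlipForConditionalGeneration

    processor = BlipProcessor.from_pretrained(model_name)
    model = BlipForConditionalGeneration.from_pretrained(model_name)

    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=30)
    return processor.decode(output_ids[0], skip_special_tokens=True)
```

Calling `caption_image("photo.jpg")` returns a short free-text description; simonw's blip-caption CLI linked above wraps essentially this workflow.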
-
Is there a website where you can upload a photo and get the description in a paragraph?
You can download the source and run it yourself from here: https://github.com/salesforce/BLIP
-
Stable Diffusion v2-1-unCLIP model released
Then there's also BLIP (Bootstrapping Language-Image Pre-training).
-
GPT-4 shows emergent Theory of Mind on par with an adult. It scored in the 85+ percentile for a lot of major college exams. It can also do taxes and create functional websites from a simple drawing
Or BLIP
-
meme
GitHub - salesforce/BLIP: PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
-
Object Recognition for Photo Metadata
From what I understand, what's most important to you is having a model that's already been trained on something, rather than the architecture. YOLO is probably fine, as would be some of the older ones. You should be able to find a model that's been pretrained on COCO - you can look and see which classes are included. I don't know if there are other broadly trained models available that will serve your purpose. What I'd do is just run your picture through a COCO-trained object detection model and see if the annotations do what you want.
Though backing up a bit, there are also image captioning models that may be better suited to organizing your photos. I'm not really familiar with any, though I did come across BLIP the other day; I haven't used it: https://github.com/salesforce/BLIP
This may be a better way to get at what you want.
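A sketch of the post-processing step the detection suggestion implies: run a COCO-pretrained detector, then keep only confident detections and turn them into metadata keywords. The `(class_id, score, box)` tuple shape and the deliberately truncated class-id table below are assumptions for illustration, not any particular library's output format:

```python
# Small excerpt of the 80 COCO class ids -> names; a real mapping
# would cover every class the detector was trained on.
COCO_NAMES = {1: "person", 3: "car", 16: "bird", 17: "cat", 18: "dog"}

def detections_to_keywords(detections, min_score=0.5):
    """Reduce raw (class_id, score, box) detections to photo keywords.

    Drops low-confidence hits and duplicates, so a photo with three
    confident 'dog' boxes yields the single keyword 'dog'.
    """
    keywords = set()
    for class_id, score, _box in detections:
        if score >= min_score and class_id in COCO_NAMES:
            keywords.add(COCO_NAMES[class_id])
    return sorted(keywords)
```

For example, `detections_to_keywords([(1, 0.92, None), (18, 0.81, None), (18, 0.77, None), (3, 0.31, None)])` returns `["dog", "person"]`: the duplicate dog boxes collapse and the low-confidence car is dropped.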
-
I have a problem with the "interrogate" function of Automatic1111's fork. Can someone help me?
git clone https://github.com/salesforce/BLIP.git repositories/BLIP
-
Stable-diffusion in Nix
# Copy models as described in README
cp ~/Downloads/model.ckpt .
cp ~/Downloads/GFPGANv1.3.pth .
# Clone other repos as mentioned in README
mkdir repositories
git clone https://github.com/CompVis/stable-diffusion.git repositories/stable-diffusion
git clone https://github.com/CompVis/taming-transformers.git repositories/taming-transformers
git clone https://github.com/sczhou/CodeFormer.git repositories/CodeFormer
git clone https://github.com/salesforce/BLIP.git repositories/BLIP
export NIXPKGS_ALLOW_UNFREE=1
nix-shell default.nix
# Also from the Linux instructions; can probably be added to default.nix
pip install torch --extra-index-url https://download.pytorch.org/whl/cu113
python webui.py
-
My easy-to-install Windows GUI for Stable Diffusion is ready for a beta release! It supports img2img as well, various samplers, can run multiple scales per image automatically, and more!
Also check img2text (basically to prompt): https://github.com/salesforce/BLIP
- [D] Author Interview - BLIP: Bootstrapping Language-Image Pre-training (Video)
stable-diffusion-webui
- [Stablediffusion] Stable Diffusion web UI
- Generating game concept art
-
../../workspace/imgs/txt2img
I am using this one : https://github.com/hlky/stable-diffusion-webui
-
How to generate a similar images to an input image *without* a prompt?
Not sure about the script but you can try using this web-ui's img2img tab.
-
Enhancing local detail and cohesion by mosaicing
https://github.com/hlky/stable-diffusion-webui now redirects to /sd-webui/stable-diffusion-webui, as though they're the "true" sd-webui.
-
Reinstalled new hlky update & img2img returns errors (not just where you have to click on mask & back on crop)
As an update, in case anyone else has this issue: after getting some help (thanks u/vedroboev) I installed from here. Not sure what the difference is, but I got it working.
- Is anyone else unable to use the site?
-
Fixing SD images with img2img, am I misunderstanding the concept?
I would pick a version from 8/31 of the stable diffusion repo on GitHub, then follow step 2a in this guide (https://rentry.org/GUItard) to transfer the files from https://github.com/hlky/stable-diffusion-webui/tree/96aba4b36d59803f3817ee60e96a097f54962ae4
-
Can't seem to get img2img up and running
This is a bug with the newest UI version. See this.
- Stable Diffusion Img2Img Help
What are some alternatives?
CLIP - CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image
GFPGAN - GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration.
a-PyTorch-Tutorial-to-Image-Captioning - Show, Attend, and Tell | a PyTorch Tutorial to Image Captioning
onnx - Open standard for machine learning interoperability
CodeFormer - [NeurIPS 2022] Towards Robust Blind Face Restoration with Codebook Lookup Transformer
waifu-diffusion - stable diffusion finetuned on weeb stuff
virtex - [CVPR 2021] VirTex: Learning Visual Representations from Textual Annotations
diffusers-uncensored - Uncensored fork of diffusers
nix-stable-diffusion - Nix-friendly fork of: Optimized Stable Diffusion modified to run on lower GPU VRAM
txt2imghd - A port of GOBIG for Stable Diffusion
taming-transformers - Taming Transformers for High-Resolution Image Synthesis