BLIP
CodeFormer
| | BLIP | CodeFormer |
|---|---|---|
| Mentions | 14 | 28 |
| Stars | 4,242 | 13,396 |
| Growth | 5.5% | - |
| Activity | 0.0 | 2.0 |
| Last commit | 7 months ago | 28 days ago |
| Language | Jupyter Notebook | Python |
| License | BSD 3-clause "New" or "Revised" License | GNU General Public License v3.0 or later |
Stars - the number of stars a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
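The exact formula behind the activity number isn't given here; purely as an illustration of the idea that "recent commits have higher weight than older ones," a toy recency-weighted score could look like the following (the exponential decay and the 30-day half-life are assumptions, not the site's actual method):

```python
def activity_score(commit_ages_days, half_life=30.0):
    """Toy recency-weighted activity score: each commit contributes
    a weight that halves every `half_life` days, so a commit from
    yesterday counts for almost 1.0 and one from a year ago for ~0."""
    return sum(0.5 ** (age / half_life) for age in commit_ages_days)

# A project with a handful of recent commits outscores one with
# only months-old commits, even if the old project has more of them:
recent = activity_score([1, 2, 3, 5, 8])       # commits in the last week
stale = activity_score([200, 250, 300, 400])   # commits months ago
```

Here `recent` comes out close to the raw commit count while `stale` is near zero, matching the intuition that activity measures *current* development pace rather than total history.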
BLIP
-
MetaCLIP – Meta AI Research
I suggest trying BLIP for this. I've had really good results from that.
https://github.com/salesforce/BLIP
I built a tiny Python CLI wrapper for it to make it easier to try: https://github.com/simonw/blip-caption
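For readers who would rather call BLIP directly than go through a CLI wrapper: BLIP checkpoints are also published on the Hugging Face Hub, and a minimal captioning sketch with the `transformers` library might look like this. The model name and image path are assumptions, the weights are downloaded on first use, and the heavy imports are deferred so the function can be read and defined without torch/transformers installed:

```python
def caption_image(path, model_name="Salesforce/blip-image-captioning-base"):
    """Generate a one-line caption for the image at `path`.

    Imports are deferred so this sketch can be inspected without
    the (large) torch/transformers dependencies present.
    """
    from PIL import Image
    from transformers import BlipProcessor, BlipForConditionalGeneration

    processor = BlipProcessor.from_pretrained(model_name)
    model = BlipForConditionalGeneration.from_pretrained(model_name)

    image = Image.open(path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=30)
    return processor.decode(out[0], skip_special_tokens=True)
```

Usage is just `caption_image("photo.jpg")`, which returns a short natural-language description of the image.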
-
Is there a website where you can upload a photo and get the description in a paragraph?
You can download the source and run it yourself from here: https://github.com/salesforce/BLIP
-
Stable Diffusion v2-1-unCLIP model released
Then there's also BLIP (Bootstrapping Language-Image Pre-training).
-
GPT-4 shows emergent Theory of Mind on par with an adult. It scored in the 85+ percentile for a lot of major college exams. It can also do taxes and create functional websites from a simple drawing
Or BLIP
-
meme
GitHub - salesforce/BLIP: PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
-
Object Recognition for Photo Metadata
From what I understand, what's most important to you is having a model that's already trained on something useful, rather than the architecture itself. YOLO is probably fine, as would be some of the older ones. You should be able to find a model pretrained on COCO - look at the dataset to see which classes are included. I don't know if there are other broadly trained models available that will serve your purpose. What I'd do is just run your pictures through a COCO-trained object detection model and see if the annotations do what you want.
Though backing up a bit, there are also image captioning models that may do a better job of organizing your photos. I'm not really familiar with any, though I did come across BLIP the other day; I haven't used it: https://github.com/salesforce/BLIP
This may be a better way to get at what you want.
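The workflow suggested above - run a photo through a COCO-trained detector, then keep only the annotations you care about - can be sketched without committing to any particular model. In this sketch the raw detections are hypothetical `(label_id, score)` pairs, and the assumption is that the detector uses the standard COCO label ordering:

```python
# A few of the 80 COCO class names, keyed by label id (assumption:
# the detector emits standard COCO label ids).
COCO_CLASSES = {1: "person", 3: "car", 17: "cat", 18: "dog", 44: "bottle"}

def keep_labels(detections, wanted, min_score=0.5):
    """Keep detections whose class name is in `wanted` and whose
    confidence clears `min_score`; return readable annotations."""
    kept = []
    for label_id, score in detections:
        name = COCO_CLASSES.get(label_id)
        if name in wanted and score >= min_score:
            kept.append((name, round(score, 2)))
    return kept

# Hypothetical raw detector output: (COCO label id, confidence)
raw = [(1, 0.92), (3, 0.40), (18, 0.81), (44, 0.55)]
print(keep_labels(raw, wanted={"person", "dog"}))
# → [('person', 0.92), ('dog', 0.81)]
```

The surviving `(class, score)` pairs are exactly what you would attach to a photo as metadata tags; the car detection is dropped both for not being wanted and for its low confidence.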
-
I have a problem with the "interrogate" function of Automatic1111's fork. Can someone help me?
```shell
git clone https://github.com/salesforce/BLIP.git repositories/BLIP
```
-
Stable-diffusion in Nix
```shell
# Copy models as described in README
cp ~/Downloads/model.ckpt .
cp ~/Downloads/GFPGANv1.3.pth .

# Clone other repos as mentioned in README
mkdir repositories
git clone https://github.com/CompVis/stable-diffusion.git repositories/stable-diffusion
git clone https://github.com/CompVis/taming-transformers.git repositories/taming-transformers
git clone https://github.com/sczhou/CodeFormer.git repositories/CodeFormer
git clone https://github.com/salesforce/BLIP.git repositories/BLIP

export NIXPKGS_ALLOW_UNFREE=1
nix-shell default.nix

# Also from the Linux instructions. Can probably be added to default.nix
pip install torch --extra-index-url https://download.pytorch.org/whl/cu113

python webui.py
```
-
My easy-to-install Windows GUI for Stable Diffusion is ready for a beta release! It supports img2img as well, various samplers, can run multiple scales per image automatically, and more!
Also check img2text (basically image-to-prompt): https://github.com/salesforce/BLIP
- [D] Author Interview - BLIP: Bootstrapping Language-Image Pre-training (Video)
CodeFormer
-
Automatic1111 for Intel Arc (A380 Tested)
CodeFormer
-
Working with a prompt someone posted earlier ( workflow in comments)
Like this: https://github.com/sczhou/CodeFormer - did you install anything?
- Robust Blind Face Restoration with Codebook Lookup Transformer
-
Images created in Automatic1111 on M1 Mac - Blue tint
Download https://github.com/sczhou/CodeFormer/releases/download/v0.1.0/codeformer.pth and save it to /stable-diffusion-webui/models/Codeformer/codeformer-v0.1.0.pth
-
How can I make this command run?
I just watched a YouTube video from Two Minute Papers and was impressed by this face restoration AI.
- Towards Robust Blind Face Restoration with Codebook Lookup TransFormer | high quality faces!
-
I tried restoring this REALLY old photo of my wife's great great great great Grandpa.
Have you tried this instead? https://github.com/sczhou/CodeFormer , you can try it at https://replicate.com/sczhou/codeformer
- 12 best AI websites to make your life easier [save 100s of hours]
-
Hoping to get this 1890's photo of my great grandmother and her 3 sisters restored as a Christmas present for my father.
CodeFormer is another good alternative to GFPGAN if you're not pleased with its results.
- new ai upscale tech
What are some alternatives?
CLIP - CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image
GFPGAN - GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration.
a-PyTorch-Tutorial-to-Image-Captioning - Show, Attend, and Tell | a PyTorch Tutorial to Image Captioning
stable-diffusion-webui - Stable Diffusion web UI
virtex - [CVPR 2021] VirTex: Learning Visual Representations from Textual Annotations
GPEN
nix-stable-diffusion - Nix-friendly fork of: Optimized Stable Diffusion modified to run on lower GPU VRAM
Real-ESRGAN-ncnn-vulkan - NCNN implementation of Real-ESRGAN. Real-ESRGAN aims at developing Practical Algorithms for General Image Restoration.
taming-transformers - Taming Transformers for High-Resolution Image Synthesis
MidJourney-Styles-and-Keywords-Reference - A reference containing Styles and Keywords that you can use with MidJourney AI. There are also pages showing resolution comparison, image weights, and much more!
rtic-gcn-pytorch - Official PyTorch Implementation of RTIC
stable-diffusion-ui - Easiest 1-click way to install and use Stable Diffusion on your computer. Provides a browser UI for generating images from text prompts and images. Just enter your text prompt, and see the generated image. [Moved to: https://github.com/easydiffusion/easydiffusion]