CLIP-Guided-Diffusion vs mindall-e

| | CLIP-Guided-Diffusion | mindall-e |
|---|---|---|
| Mentions | 4 | 8 |
| Stars | 377 | 630 |
| Growth | - | -0.2% |
| Activity | 0.0 | 0.0 |
| Latest commit | over 1 year ago | almost 2 years ago |
| Language | Python | Python |
| License | GNU General Public License v3.0 or later | GNU General Public License v3.0 or later |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
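The site does not publish its exact formula, but a recency-weighted score of this kind could be computed along the following lines. This is a minimal sketch; the half-life and the percentile scaling are assumptions, not the tracker's actual method:

```python
# Hypothetical recency-weighted activity score: recent commits count more
# than older ones via exponential decay. Illustrative only; not the
# tracker's actual formula.
def activity_score(commit_ages_in_days, half_life_days=30.0):
    """Sum of commit weights, where a commit's weight halves every `half_life_days`."""
    return sum(0.5 ** (age / half_life_days) for age in commit_ages_in_days)

def relative_activity(score, all_scores):
    """Percentile rank scaled to 0-10, so 9.0 means roughly the top 10%."""
    rank = sum(s <= score for s in all_scores) / len(all_scores)
    return round(rank * 10, 1)
```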
CLIP-Guided-Diffusion
-
Which is your favorite text to image model overall?
Runners-up are Craiyon (for being more "creative" than SD), Disco Diffusion, minDALL-E, and CLIP Guided Diffusion.
-
Once you have access, do you run it on your computer or over the internet on OpenAI's computers?
- CLIP Guided Diffusion: https://github.com/nerdyrodent/CLIP-Guided-Diffusion
-
How would I go about running Disco Diffusion locally?
Nerdy Rodent has a Github repo for this; it should work fine from the Anaconda command line: https://github.com/nerdyrodent/CLIP-Guided-Diffusion
-
PLAYING AGAIN (CLIP GUIDED DIFFUSION) (VQGAN + CLIP) (Beksinski)
As far as I understand, VQGAN is not a guided diffusion model. I've been using a slightly tweaked version of https://github.com/nerdyrodent/CLIP-Guided-Diffusion for diffusion. Once you get it set up the interface is pretty much what you might expect:
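The excerpt breaks off at the colon. The repo itself is driven from the command line, but the core idea it implements can be sketched in a few lines of Python: at each denoising step, the image is nudged along the gradient of CLIP's image-text similarity. Everything below (the CLIP model choice, the guidance scale, the update rule in the final comment) is an illustrative assumption, not the repo's actual code:

```python
# Minimal sketch of CLIP guidance for a diffusion sampler.
# Assumes OpenAI's `clip` package (pip install git+https://github.com/openai/CLIP);
# the diffusion model and its noise schedule are left out.
import torch
import clip

device = "cuda" if torch.cuda.is_available() else "cpu"
clip_model, _ = clip.load("ViT-B/32", device=device)

prompt = "a surreal landscape in the style of Beksinski"
with torch.no_grad():
    tokens = clip.tokenize([prompt]).to(device)
    text_feat = clip_model.encode_text(tokens)
    text_feat = text_feat / text_feat.norm(dim=-1, keepdim=True)

def clip_grad(x, guidance_scale=500.0):
    """Gradient of CLIP image-text similarity w.r.t. the current image estimate x."""
    x = x.detach().requires_grad_(True)
    # CLIP expects 224x224 inputs; a real implementation would also apply
    # CLIP's pixel normalization here.
    x_in = torch.nn.functional.interpolate(x, size=224, mode="bilinear", align_corners=False)
    img_feat = clip_model.encode_image(x_in)
    img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)
    sim = (img_feat * text_feat).sum()
    return guidance_scale * torch.autograd.grad(sim, x)[0]

# Inside the diffusion sampling loop, this gradient nudges each denoising
# step toward images CLIP scores as matching the prompt, e.g.:
#   x = x + step_size * clip_grad(x)   # hypothetical update; schedules vary
```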
mindall-e
-
Which is your favorite text to image model overall?
Runners-up are Craiyon (for being more "creative" than SD), Disco Diffusion, minDALL-E, and CLIP Guided Diffusion.
-
minDALL-E on Conceptual Captions
minDALL-E at replicate.com. (Found here.)
GitHub: https://github.com/kakaobrain/minDALL-E Colab demo: https://colab.research.google.com/drive/1Gg7-c7LrUTNfQ-Fk-BVNCe9kvedZZsAh?usp=sharing
-
We got OpenAI's DALL-E
For those wondering, this is minDALL-E as u/DEATH_STAR_EXTRACTOR mentioned
-
[P] minDALL-E: PyTorch implementation of a 1.3B text-to-image generation model trained on 14 million image-text pairs
Hello. I'd like to introduce an open-source project that has released a trained checkpoint for the text-to-image generation model DALL-E. Link: https://github.com/kakaobrain/minDALL-E
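For context, generating images with this repo looks roughly like the sketch below, loosely paraphrasing the usage shown in the kakaobrain/minDALL-E README. Exact argument names and defaults may differ from the current version of the repo, so treat them as assumptions:

```python
# Sketch of sampling from minDALL-E, loosely following its README.
import numpy as np
from PIL import Image
from dalle.models import Dalle  # provided by the minDALL-E repository

model = Dalle.from_pretrained("minDALL-E/1.3B")  # downloads the 1.3B checkpoint
model.to(device="cuda:0")

# Autoregressively sample candidate images for a text prompt.
images = (
    model.sampling(
        prompt="a Christmas tree",
        top_k=256,
        softmax_temperature=1.0,
        num_candidates=4,
        device="cuda:0",
    )
    .cpu()
    .numpy()
)

# Convert from NCHW floats in [0, 1] to savable RGB images.
for i, img in enumerate(np.transpose(images, (0, 2, 3, 1))):
    Image.fromarray((img * 255).astype(np.uint8)).save(f"sample_{i}.png")
```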
-
Release: 602M-parameter CLIP-conditioned diffusion model trained on Conceptual 12M (v-diffusion-pytorch)
See also the much chonkier minDALL-E: https://github.com/kakaobrain/minDALL-E Wonder which one is better? Diffusion models are pretty good with CLIP.
-
Kakao Brain releases 1.3 billion parameter text-to-image model minDALL-E. Details in a comment. Example: "a Christmas tree".
According to its GitHub repo, minDALL-E was trained on 14 million image+text pairs from the Conceptual Captions and Conceptual Captions 12M datasets.
What are some alternatives?
VQGAN-CLIP - Just playing with getting VQGAN+CLIP running locally, rather than having to use Colab.
dalle-mini - DALL·E Mini - Generate images from a text prompt
DALLE-mtf - OpenAI's DALL-E for large-scale training in mesh-tensorflow.
disco-diffusion
artroom-stable-diffusion
big-sleep - A simple command line tool for text-to-image generation, using OpenAI's CLIP and a BigGAN. The technique was originally created by https://twitter.com/advadnoun
vqgan-clip-app - Local image generation using VQGAN-CLIP or CLIP guided diffusion
feed_forward_vqgan_clip - Feed forward VQGAN-CLIP model, where the goal is to eliminate the need for optimizing the latent space of VQGAN for each input prompt
zero-shot-object-tracking - Object tracking implemented with the Roboflow Inference API, DeepSort, and OpenAI CLIP.