feed_forward_vqgan_clip
Feed forward VQGAN-CLIP model, where the goal is to eliminate the need for optimizing the latent space of VQGAN for each input prompt (by mehdidc)
CLIP-Guided-Diffusion
Just playing with getting CLIP Guided Diffusion running locally, rather than having to use colab. (by nerdyrodent)
Metric | feed_forward_vqgan_clip | CLIP-Guided-Diffusion
---|---|---
Mentions | 4 | 4
Stars | 136 | 377
Growth | - | -
Activity | 3.7 | 0.0
Last commit | 4 months ago | over 1 year ago
Language | Python | Python
License | MIT License | GNU General Public License v3.0 or later
Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
feed_forward_vqgan_clip
Posts with mentions or reviews of feed_forward_vqgan_clip.
We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2021-09-11.
- [D] Hosting AI Art Generative ML Model
WOMBO I suspect uses the feed forward inferential approach to VQGAN + CLIP (instead of finetuning, predict the final z latent vector for a given text input) which is why their outputs are less sophisticated: as a result there are many deployment optimizations you can do to speed that up, which may be complicated.
- A small experiment on how changes in a text prompt may affect output image in a CLIP-based system
The system used to produce these images is unlike most other VQGAN+CLIP systems because it uses a neural network trained by the developer(s) instead of an iterative process. This system is known to have a "formula" for image layout.
- Get a VQGAN output image for a given text description almost instantly (not including time for one-time setup) using Colab notebook "Feed Forward VQGAN CLIP - Using a pretrained model" from mehdidc. Here are 20 non-cherry picked images from the notebook. Details in a comment.
Hello, some news. For those who are interested, I released new models (release 0.2) that you could try and you might find them better (depending on the prompt) than the current one(s), also the problem that was mentioned by /u/Wiskkey is less visible (object parts appearing systematically on top-left), but still not 100% solved, there is still a common global structure that can be identified, but it's more centered on the image. The Colab notebook was updated to use the new models.
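The posts above contrast two ways of getting a latent for a prompt: optimizing it iteratively for each prompt, or training a network once and predicting the latent in a single pass. The following is only a toy sketch of that contrast, not code from mehdidc's repository. It assumes a made-up two-dimensional "generator" (`decode`) and a plain list of numbers standing in for a CLIP text embedding; every name in it (`SCALE`, `decode`, `optimize_latent`, `train_feed_forward`, `predict_latent`) is hypothetical.

```python
import random

# Toy stand-in for a generator: decode(z)[i] = SCALE[i] * z[i].
# A real system would use VQGAN here; this is an assumption for illustration.
SCALE = [2.0, 0.5]


def decode(z):
    return [s * zi for s, zi in zip(SCALE, z)]


def loss(z, text):
    # Squared distance between the decoded latent and the "text embedding".
    return sum((d - t) ** 2 for d, t in zip(decode(z), text))


def optimize_latent(text, steps=200, lr=0.1):
    """Iterative approach: run gradient descent on z for every new prompt."""
    z = [0.0] * len(text)
    for _ in range(steps):
        # d/dz of (s*z - t)^2 is 2*s*(s*z - t)
        z = [zi - lr * 2 * s * (s * zi - t) for zi, s, t in zip(z, SCALE, text)]
    return z


def train_feed_forward(num_prompts=2000, lr=0.1, seed=0):
    """Feed-forward approach: learn weights w once so that z = w * text."""
    rng = random.Random(seed)
    w = [0.0] * len(SCALE)
    for _ in range(num_prompts):
        text = [rng.uniform(-1, 1) for _ in SCALE]
        # d/dw of (s*w*t - t)^2 is 2*s*t*(s*w*t - t)
        w = [wi - lr * 2 * s * t * (s * wi * t - t)
             for wi, s, t in zip(w, SCALE, text)]
    return w


def predict_latent(w, text):
    # A single pass per prompt, with no per-prompt optimization loop.
    return [wi * t for wi, t in zip(w, text)]


prompt = [0.6, -0.3]
z_iterative = optimize_latent(prompt)       # 200 update steps for this prompt
weights = train_feed_forward()              # one-time training cost
z_predicted = predict_latent(weights, prompt)
print(loss(z_iterative, prompt), loss(z_predicted, prompt))
```

In this toy setting both routes reach nearly the same latent; the difference is where the cost is paid: per prompt for the iterative loop, once up front for the feed-forward model.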
CLIP-Guided-Diffusion
Posts with mentions or reviews of CLIP-Guided-Diffusion.
We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-09-23.
- Which is your favorite text to image model overall?
Runner-ups are Craiyon (for being more "creative" than SD), Disco Diffusion, minDALL-E, and CLIP Guided Diffusion.
- Once have access, do you run it on your computer or over the internet on Open-AI's computers?
-clip guided diffusion https://github.com/nerdyrodent/CLIP-Guided-Diffusion
- how would i go about running disco diffusion locally?
Nerdy Rodent has a Github repo for this; it should work fine from the Anaconda command line: https://github.com/nerdyrodent/CLIP-Guided-Diffusion
- PLAYING AGAIN (CLIP GUIDED DIFFUSION) (VQGAN + CLIP) (Beksinski)
As far as I understand, VQGAN is not a guided diffusion model. I've been using a slightly tweaked version of https://github.com/nerdyrodent/CLIP-Guided-Diffusion for diffusion. Once you get it set up the interface is pretty much what you might expect:
What are some alternatives?
When comparing feed_forward_vqgan_clip and CLIP-Guided-Diffusion, you can also consider the following projects:
VQGAN-CLIP - Just playing with getting VQGAN+CLIP running locally, rather than having to use colab.
big-sleep - A simple command line tool for text to image generation, using OpenAI's CLIP and a BigGAN. Technique was originally created by https://twitter.com/advadnoun
DALLE-mtf - Open-AI's DALL-E for large scale training in mesh-tensorflow.
deep-daze - Simple command line tool for text to image generation using OpenAI's CLIP and Siren (Implicit neural representation network). Technique was originally created by https://twitter.com/advadnoun
disco-diffusion
Text-to-Image-Synthesis - Pytorch implementation of Generative Adversarial Text-to-Image Synthesis paper