Feed-forward VQGAN-CLIP model, whose goal is to eliminate the need to optimize the VQGAN latent space separately for each input prompt.
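To make the distinction concrete, here is a minimal toy sketch in plain Python. It does not use VQGAN or CLIP: `score` is an assumed stand-in for a CLIP similarity, the "latent" is a short list of floats, and every function name is hypothetical rather than taken from either repository. It only contrasts running an optimization loop for each prompt with training a mapping once and then predicting a latent in a single pass.

```python
import random

def score(latent, text_emb):
    # Toy stand-in for a CLIP similarity (assumption): higher is better,
    # and the maximum of 0 is reached when latent == text_emb.
    return -sum((z - t) ** 2 for z, t in zip(latent, text_emb))

def optimize_latent(text_emb, steps=200, lr=0.1):
    # Per-prompt approach: gradient ascent on the latent itself,
    # repeated from scratch for every new prompt.
    latent = [0.0] * len(text_emb)
    for _ in range(steps):
        # d(score)/d(z_i) = -2 * (z_i - t_i)
        latent = [z + lr * (-2.0 * (z - t)) for z, t in zip(latent, text_emb)]
    return latent

def train_feed_forward(train_embs, epochs=200, lr=0.05):
    # Feed-forward approach: learn, once, a mapping from text embedding
    # to latent. Here it is a per-dimension affine map z_i = w_i * t_i + b_i.
    dim = len(train_embs[0])
    w, b = [0.0] * dim, [0.0] * dim
    for _ in range(epochs):
        for emb in train_embs:
            for i, t in enumerate(emb):
                z = w[i] * t + b[i]
                grad = -2.0 * (z - t)  # d(score)/d(z_i)
                w[i] += lr * grad * t  # chain rule through z_i = w_i * t_i + b_i
                b[i] += lr * grad
    return w, b

def predict_latent(model, text_emb):
    # Single forward pass: no optimization loop at inference time.
    w, b = model
    return [wi * t + bi for wi, bi, t in zip(w, b, text_emb)]

rng = random.Random(0)
train_embs = [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(50)]
model = train_feed_forward(train_embs)

new_prompt = [0.3, -0.7, 0.1, 0.9]  # hypothetical embedding of an unseen prompt
slow_latent = optimize_latent(new_prompt)        # 200 update steps for this prompt
fast_latent = predict_latent(model, new_prompt)  # one pass through the trained map
```

In this toy setting both routes reach a near-optimal latent, but only the first one pays the optimization cost again for each prompt.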
Why do you think that https://github.com/nerdyrodent/CLIP-Guided-Diffusion is a good alternative to feed_forward_vqgan_clip?