DALLE-pytorch
Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch (by Jack000)
link to images and code: https://github.com/Jack000/DALLE-pytorch/
link to diffusion model: https://github.com/Jack000/guided-diffusion
If, in general, DDPM > GAN > VAE, why do transformer image generators all use a VQVAE to decode images? Wouldn't it be better to use a diffusion model? I was wondering about this and started experimenting with different ways to decode vector-quantized embeddings with a diffusion model (see discussion here). After a lot of trial and error, I got something that works pretty well.
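To make the idea concrete, here is a minimal sketch of how the inputs for one conditional DDPM training step could be assembled when the condition is a grid of vector-quantized tokens. It is not the code from the linked repositories, and the conditioning scheme there may differ. All function names, the toy codebook, and the choice of concatenating upsampled embeddings onto the noisy image are assumptions made for illustration. Plain Python lists are used only to keep it dependency-free; a real version would use PyTorch tensors.

```python
import math
import random

def alpha_bar_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear beta schedule."""
    alpha_bars, running = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def lookup_embeddings(tokens, codebook):
    """Map an HxW grid of token indices to an HxW grid of embedding vectors."""
    return [[codebook[idx] for idx in row] for row in tokens]

def upsample_nearest(grid, factor):
    """Nearest-neighbour upsampling so the token grid matches the pixel grid."""
    out = []
    for row in grid:
        wide = [cell for cell in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out

def q_sample(x0, alpha_bar_t, rng):
    """Forward diffusion: x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps."""
    a, b = math.sqrt(alpha_bar_t), math.sqrt(1.0 - alpha_bar_t)
    eps = [[[rng.gauss(0.0, 1.0) for _ in px] for px in row] for row in x0]
    x_t = [[[a * v + b * e for v, e in zip(px, epx)]
            for px, epx in zip(row, erow)]
           for row, erow in zip(x0, eps)]
    return x_t, eps

def denoiser_input(x_t, cond):
    """Concatenate noisy pixel channels with conditioning channels, per pixel."""
    return [[px + c for px, c in zip(row, crow)] for row, crow in zip(x_t, cond)]

# Toy data (all hypothetical): 8 codes of dimension 4, a 2x2 token grid,
# and a 4x4 RGB image with values in [-1, 1].
rng = random.Random(0)
codebook = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(8)]
tokens = [[0, 3], [5, 7]]
image = [[[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
         for _ in range(4)]

alpha_bars = alpha_bar_schedule(1000)
t = rng.randrange(1000)                      # random timestep for this step
cond = upsample_nearest(lookup_embeddings(tokens, codebook), factor=2)
x_t, eps = q_sample(image, alpha_bars[t], rng)
net_in = denoiser_input(x_t, cond)           # 4x4 grid, 3 + 4 channels per pixel
```

In this setup, a denoising network would take `net_in` and `t` and be trained to predict `eps`, typically with a mean-squared-error loss. At sampling time the same token embeddings would be supplied at every denoising step, so the diffusion model takes the place of the VQVAE decoder.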