-
DALLE2-pytorch
Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in PyTorch
Not a complete answer to your question but you may find this discussion interesting:
https://github.com/lucidrains/DALLE2-pytorch/discussions/10
Inference cost and scale seem to be much more favourable than for large language models (for now).
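For a feel of the moving parts, here is a condensed sketch along the lines of the repo's README: a CLIP, a diffusion prior (text embedding to CLIP image embedding), and a decoder (image embedding to pixels), followed by a single sampling call. The configuration values are illustrative, nothing here is trained, and argument names may have drifted since, so treat it as a shape-of-the-API sketch rather than a recipe:

```python
import torch
from dalle2_pytorch import DALLE2, DiffusionPriorNetwork, DiffusionPrior, Unet, Decoder, CLIP

# Toy CLIP (in practice, trained or adapted from a pretrained checkpoint)
clip = CLIP(
    dim_text = 512, dim_image = 512, dim_latent = 512,
    num_text_tokens = 49408, text_enc_depth = 6, text_seq_len = 256, text_heads = 8,
    visual_enc_depth = 6, visual_image_size = 256, visual_patch_size = 32, visual_heads = 8
)

# Diffusion prior: maps CLIP text embeddings to CLIP image embeddings
prior_network = DiffusionPriorNetwork(dim = 512, depth = 6, dim_head = 64, heads = 8)
diffusion_prior = DiffusionPrior(net = prior_network, clip = clip, timesteps = 100, cond_drop_prob = 0.2)

# Decoder: turns an image embedding into pixels (cascaded U-Nets in the full model)
unet = Unet(dim = 128, image_embed_dim = 512, cond_dim = 128, channels = 3, dim_mults = (1, 2, 4, 8))
decoder = Decoder(unet = unet, clip = clip, timesteps = 100,
                  image_cond_drop_prob = 0.1, text_cond_drop_prob = 0.5)

# After training the prior and decoder, sampling is one call
dalle2 = DALLE2(prior = diffusion_prior, decoder = decoder)
images = dalle2(['a corgi surfing a wave'], cond_scale = 2.)  # classifier-free guidance strength
```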
-
CLIP
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
This uses OpenAI’s CLIP model, which is open source: https://github.com/openai/clip
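The zero-shot scoring that repo exposes is only a few lines; this follows the openai/CLIP README example (the image path is a placeholder):

```python
import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# Score an image against candidate captions; the highest probability wins
image = preprocess(Image.open("example.png")).unsqueeze(0).to(device)
text = clip.tokenize(["a diagram", "a dog", "a cat"]).to(device)

with torch.no_grad():
    logits_per_image, logits_per_text = model(image, text)
    probs = logits_per_image.softmax(dim=-1).cpu().numpy()

print(probs)  # e.g. [[0.99, 0.005, 0.005]] if the image is a diagram
```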
-
An older, but similar and still impressive alternative is available here: https://github.com/CompVis/latent-diffusion
If you have a decent amount of VRAM, you can use it to start generating images with their pre-trained models. The results are nowhere near as impressive as DALL-E 2's, but they're still pretty damn cool. I don't know what the exact memory requirements are, but I've gotten it to run on a 1080 Ti with 11 GB.
-
Also very interested in this. AFAIK, the best alternative to DALLE-type generation is CLIP-guided generation (such as Disco Diffusion [1] and MidJourney [2]), which can take anywhere from 1 to 20 minutes on an RTX A5000; the core guidance idea is sketched below.
[1]: https://github.com/alembics/disco-diffusion
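CLIP-guided generation boils down to nudging each sampling step with the gradient of CLIP's image-text similarity. A minimal sketch of that guidance term, assuming a hypothetical one-step denoiser `denoise(x, t)`, a step-size schedule `step_size(t)`, and 224x224 RGB samples (none of these names come from the thread; real pipelines like Disco Diffusion also resize, normalize, and score augmented "cutouts" rather than the raw sample):

```python
import torch
import clip

device = "cuda" if torch.cuda.is_available() else "cpu"
clip_model, _ = clip.load("ViT-B/32", device=device)
clip_model = clip_model.float()  # keep fp32 so gradients flow cleanly

# Encode the prompt once
text = clip.tokenize(["a lighthouse at dusk, oil painting"]).to(device)
with torch.no_grad():
    text_feat = clip_model.encode_text(text)
    text_feat = text_feat / text_feat.norm(dim=-1, keepdim=True)

def clip_grad(x, scale=500.0):
    """Gradient of CLIP image-text similarity w.r.t. the sample x of shape (B, 3, 224, 224)."""
    x = x.detach().requires_grad_(True)
    img_feat = clip_model.encode_image(x)
    img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)
    sim = (img_feat * text_feat).sum()
    return torch.autograd.grad(scale * sim, x)[0]

# Schematic sampling loop -- denoise() and step_size() are stand-ins for a real diffusion model:
# x = torch.randn(1, 3, 224, 224, device=device)
# for t in reversed(range(num_steps)):
#     x = denoise(x, t) + step_size(t) * clip_grad(x)
```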
-
In case anyone else is put off by the link pointing to an answer that in turn links to something else with likely higher (but unstated) hardware requirements, the end of the rabbit hole seems to be here: https://github.com/openai/dalle-2-preview/issues/6#issuecomm...
TL;DR: a single NVIDIA A100 is most likely sufficient; with a lot of optimization and stepwise execution, a single 3090 Ti might also be within the realm of possibility.
-
big-sleep
A simple command line tool for text-to-image generation, using OpenAI's CLIP and a BigGAN (usage sketched below). The technique was originally created by https://twitter.com/advadnoun
I tried it and after a few hours got this: https://i.imgur.com/FxdfdmV.png
Not nearly as cool as the real DALL-E, but maybe I'm missing something.
[1] https://github.com/lucidrains/big-sleep
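Under the hood, big-sleep optimizes BigGAN's latent inputs so that CLIP scores the output highly against the prompt. Basic usage looks roughly like this (argument names follow my reading of the repo README and may have changed, so double-check there):

```python
from big_sleep import Imagine

# Optimizes BigGAN latents against CLIP similarity to the prompt,
# periodically saving intermediate images to the working directory.
dream = Imagine(
    text = "a pyramid made of ice",
    lr = 0.05,             # learning rate for the latent optimization
    save_every = 25,       # save an image every N iterations
    save_progress = True   # keep the intermediate frames instead of overwriting
)
dream()
```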