OFA
clip-guided-diffusion
| | OFA | clip-guided-diffusion |
|---|---|---|
| Mentions | 3 | 5 |
| Stars | 2,323 | 440 |
| Growth | 2.4% | - |
| Activity | 2.8 | 1.8 |
| Latest commit | 4 days ago | about 2 years ago |
| Language | Python | Python |
| License | Apache License 2.0 | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
OFA
-
[R][P] Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework + VQA Hugging Face Spaces Demo
github: https://github.com/OFA-Sys/OFA
-
OFA: a model that does text-to-image generation as well as other tasks
From this:
- [R] Paper: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework. Shocking performance in text-to-image synthesis and open-domain tasks.
clip-guided-diffusion
-
[D] Which GAN is Jon Rafman using?
According to his bio he uses "CLIP-guided diffusion". Never heard of it before, but it appears not to use GANs; it pairs a text model with an image classifier (CLIP) instead.
-
Someone posted my art on this subreddit and it reached the front page without credit, so I thought I'd post something myself
But yeah, this software generates similar (though, to be fair, not nearly as "aesthetic") GIFs with a single terminal command and literally zero Photoshop.
-
AI-generated image for "ghost town at night"
I used CLIP-guided diffusion to generate the image (see OpenAI CLIP).
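The idea behind CLIP guidance is that each denoising step of a diffusion model is nudged by the gradient of a CLIP similarity score between the current image and the text prompt. Below is a minimal toy sketch of that update rule; the quadratic `clip_similarity_grad` is a stand-in for a real CLIP model's gradient, and all names here are illustrative, not the repository's actual code.

```python
import numpy as np

def clip_similarity_grad(image, text_embedding):
    """Stand-in gradient of a similarity score -||image - text_embedding||^2.
    A real implementation would backpropagate through CLIP instead."""
    return -2.0 * (image - text_embedding)

def guided_denoise_step(x, text_embedding, guidance_scale=0.1,
                        noise_scale=0.01, rng=None):
    """One toy denoising step nudged toward the prompt by the similarity gradient."""
    rng = rng if rng is not None else np.random.default_rng(0)
    x = x + guidance_scale * clip_similarity_grad(x, text_embedding)  # steer toward prompt
    x = x + noise_scale * rng.standard_normal(x.shape)                # residual diffusion noise
    return x

rng = np.random.default_rng(42)
target = np.ones(4)            # pretend this is the prompt's CLIP embedding
x = rng.standard_normal(4)     # start from pure noise
for _ in range(200):
    x = guided_denoise_step(x, target, rng=rng)
print(np.round(x, 2))          # ends up close to the "prompt" embedding
```

In a real sampler the gradient is taken through the full CLIP image encoder at each timestep, which is why generation is slow and VRAM-hungry, as the posts below note.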
-
Smoggy place. By AI
I used this: https://github.com/afiaka87/clip-guided-diffusion. No reference image at all, only the prompt "Steampunk town".
-
Trying out new method of generating pixels from text
I used this method. It consumes about 8 GB of VRAM and takes about 20 minutes to generate one image. You can also run it in Colab. And if you get an unlucky seed, you have to start the 20-minute wait all over again.
What are some alternatives?
ImageNet21K - Official PyTorch implementation of the paper "ImageNet-21K Pretraining for the Masses" (NeurIPS 2021)
stylegan2-ada - StyleGAN2 with adaptive discriminator augmentation (ADA) - Official TensorFlow implementation
GroundingDINO - Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
discoart - 🪩 Create Disco Diffusion artworks in one line
ONE-PEACE - A general representation model across vision, audio, language modalities. Paper: ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities
big-sleep - A simple command line tool for text to image generation, using OpenAI's CLIP and a BigGAN. Technique was originally created by https://twitter.com/advadnoun
MAGIC - Language Models Can See: Plugging Visual Controls in Text Generation
blended-diffusion - Official implementation for "Blended Diffusion for Text-driven Editing of Natural Images" [CVPR 2022]
UPop - [ICML 2023] UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers.