zero123
ComfyUI_examples
Metric | zero123 | ComfyUI_examples
---|---|---
Mentions | 6 | 6
Stars | 2,545 | 1,154
Growth | 2.2% | -
Activity | 6.9 | 6.8
Last commit | 6 months ago | 10 days ago
Language | Python | HTML
License | MIT License | -
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
zero123
-
Stable Cascade
Someone with resources will have to train Zero123 [1] with this backbone.
[1] https://zero123.cs.columbia.edu/
-
Stable Zero123: Quality 3D Object Generation from Single Images
This looks like a fine-tune of the classic zero123 (https://github.com/cvlab-columbia/zero123). I’m excited to check out the quality improvements.
Though 3D model synthesis is one use case, I found the less advertised base reprojection model to be more useful for gamedev at the moment. You can generate a multiview spritesheet from an image, and it’s fast enough for synthesis during a gameplay session. I couldn’t get a good quality/time balance to do the same with the 3D models, and the lack of mesh rigging or animation, combined with imperfections in a fully 3D model, tends to break the suspension of disbelief compared to what players are used to. I’m sure this will change as the tech develops and we layer more AI on top (automatic animation synthesis is an active research area).
If you’re interested in this, you might also want to check out deforum (https://github.com/deforum-art/deforum-stable-diffusion), which provides even more powerful camera controls on top of Stable Diffusion and is designed for full scenes rather than single objects.
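The multiview spritesheet idea in the comment above can be sketched roughly as follows. This is only an illustration of the bookkeeping, not the commenter's pipeline: the function names, frame size, and column count are assumptions made for the example, and the actual rendering of each view by a novel-view model is left out entirely.

```python
from typing import Dict, List, Tuple


def view_azimuths(num_views: int) -> List[float]:
    """Evenly spaced camera azimuths in degrees, starting at 0."""
    if num_views <= 0:
        raise ValueError("num_views must be positive")
    step = 360.0 / num_views
    return [i * step for i in range(num_views)]


def sheet_layout(
    num_views: int, frame_w: int, frame_h: int, columns: int
) -> Tuple[Tuple[int, int], List[Tuple[int, int]]]:
    """Return the sheet size and the top-left offset of each frame (row-major)."""
    if columns <= 0:
        raise ValueError("columns must be positive")
    rows = -(-num_views // columns)  # ceiling division
    offsets = [
        ((i % columns) * frame_w, (i // columns) * frame_h)
        for i in range(num_views)
    ]
    return (columns * frame_w, rows * frame_h), offsets


def plan_spritesheet(
    num_views: int, frame_w: int, frame_h: int, columns: int
) -> Dict[str, object]:
    """Pair each azimuth with the sheet cell its rendered view would go into."""
    size, offsets = sheet_layout(num_views, frame_w, frame_h, columns)
    frames = [
        {"azimuth": azimuth, "offset": offset}
        for azimuth, offset in zip(view_azimuths(num_views), offsets)
    ]
    return {"size": size, "frames": frames}


if __name__ == "__main__":
    # Hypothetical settings: 8 views of 256x256 pixels, 4 per row.
    plan = plan_spritesheet(8, 256, 256, 4)
    print(plan["size"])
    for frame in plan["frames"]:
        print(frame)
```

Each planned frame would then be filled by asking the view-synthesis model for the source image rotated to that azimuth and pasting the result at the given offset.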
-
Text-to-image-to-3D on 16GB GPU after stable-dreamfusion repo update
I followed the image-to-3D instructions in the stable-dreamfusion repo, which use the zero123 model (you can read more about that in their repo here: https://github.com/cvlab-columbia/zero123). I used the 105000 checkpoint of zero123. It took about an hour to go through their initial NeRF generation and cleanup steps to get the model output.
-
NVIDIA presents GeNVS: Generative Novel View Synthesis with 3D-Aware Diffusion Models
Until then, https://github.com/cvlab-columbia/zero123 was kinda okay, but practical results often left something to be desired, from the imprecision of the view angles to the at times fanciful re-imaginations of the source object.
-
Zero-1-to-3: Zero-shot One Image to 3D Object
For anyone else who tried to download the weights and got Google Drive throwing a quota error at you, they're working on it: https://github.com/cvlab-columbia/zero123/issues/2
ComfyUI_examples
-
Stable Cascade
ComfyUI is similar to Houdini in complexity, but immensely powerful. It's a joy to use.
There are also a large number of resources available for it on YouTube, GitHub (https://github.com/comfyanonymous/ComfyUI_examples), reddit (https://old.reddit.com/r/comfyui), CivitAI, Comfy Workflows (https://comfyworkflows.com/), and OpenArt Flow (https://openart.ai/workflows/).
I still use AUTO1111 (https://github.com/AUTOMATIC1111/stable-diffusion-webui) and the recently released and heavily modified fork of AUTO1111 called Forge (https://github.com/lllyasviel/stable-diffusion-webui-forge).
-
A comparison Stable Diffusion XL 1.0 using Fooocus, Automatic1111 and ComfyUI
For ComfyUI, the workflow was sdxl_refiner_prompt_example.
-
including workflows
GitHub - comfyanonymous/ComfyUI_examples: Examples of ComfyUI workflows
-
PromptONLY for 3 Celebs
Here are examples of Noisy Latent Composition. Noisy latent composition is when latents are composited together while still noisy, before the image is fully denoised. Since general shapes like poses and subjects are denoised in the first sampling steps, this lets us, for example, position subjects with specific poses anywhere on the image while keeping a great amount of consistency. The subjects are then composited (pasted) onto the background with some feathering applied. The rest of the sampling steps are then run on this composited image. (ComfyUI_examples/noisy_latent_composition at master · comfyanonymous/ComfyUI_examples · GitHub) I don't use AUTO1111, so I'm unsure of its equivalent, but the answer from ImpossibleAds sounds pretty similar.
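The compositing step described above (pasting a subject onto a background with feathering) can be sketched in a few lines. This is a toy illustration on a single-channel grid of floats, not ComfyUI's implementation: real latents are multi-channel tensors, and the linear feather ramp and the function names here are assumptions chosen for the example.

```python
from typing import List

Grid = List[List[float]]


def feather_mask(height: int, width: int, feather: int) -> Grid:
    """Mask that is 1.0 in the interior and ramps linearly down toward the edges."""
    mask: Grid = []
    for y in range(height):
        row = []
        for x in range(width):
            # Distance to the nearest edge, counting the border cell itself as 1.
            edge = min(x, y, width - 1 - x, height - 1 - y) + 1
            if feather <= 0:
                row.append(1.0)
            else:
                row.append(min(1.0, edge / (feather + 1)))
        mask.append(row)
    return mask


def composite(background: Grid, subject: Grid, top: int, left: int, feather: int) -> Grid:
    """Blend `subject` into a copy of `background` at (top, left) with soft edges."""
    out = [row[:] for row in background]
    height, width = len(subject), len(subject[0])
    mask = feather_mask(height, width, feather)
    for y in range(height):
        for x in range(width):
            by, bx = top + y, left + x
            # Skip any part of the subject that falls outside the background.
            if 0 <= by < len(out) and 0 <= bx < len(out[0]):
                m = mask[y][x]
                out[by][bx] = m * subject[y][x] + (1.0 - m) * out[by][bx]
    return out


if __name__ == "__main__":
    background = [[0.0] * 6 for _ in range(6)]  # stand-in for a noisy background latent
    subject = [[1.0] * 4 for _ in range(4)]     # stand-in for a partly denoised subject
    for row in composite(background, subject, top=1, left=1, feather=1):
        print(row)
```

In the workflow the comment describes, the remaining sampling steps would then run on the blended result, which is what smooths over the seams.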
-
Is Stable Diffusion able to process multiple image prompts at the same time?
From what I've seen, this guy has done some composition stuff, though I'm not sure what exactly: https://github.com/comfyanonymous/ComfyUI_examples/tree/master/noisy_latent_composition
-
Landscape Combination with ComfyUI
This is just a slightly modified ComfyUI workflow from an example provided in the examples repo. The idea is that it creates a tall canvas and renders 4 vertical sections separately, combining them as they go. Then there's a full render of the image with a prompt that describes the whole thing. And there's the addition of an astronaut subject standing in the scene. ComfyUI looks daunting at first, and I didn't try it the first or second time I came across it, but it's really just a good way to get more control over your creation while learning more about how SD works. Also, the image IS the workflow. Install ComfyUI and run it. Then drag the image onto it and it'll build out the workflow for you.
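The "image IS the workflow" remark works because a PNG file can carry text metadata alongside the pixels. The sketch below reads such metadata using only the standard library. It rests on assumptions: that the workflow is stored as JSON in an uncompressed `tEXt` chunk under the keyword `workflow` (which is how ComfyUI output is commonly described), and the helper names are made up for this example. Compressed or international text chunks are not handled.

```python
import json
import struct
import zlib
from typing import Dict

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_text_chunks(data: bytes) -> Dict[str, str]:
    """Collect keyword -> text from the tEXt chunks of a PNG byte string."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    texts: Dict[str, str] = {}
    pos = 8
    while pos + 8 <= len(data):
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if chunk_type == b"tEXt":
            # A tEXt body is: keyword, a null byte, then Latin-1 text.
            keyword, _, text = body.partition(b"\x00")
            texts[keyword.decode("latin-1")] = text.decode("latin-1")
        elif chunk_type == b"IEND":
            break
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC
    return texts


def make_chunk(chunk_type: bytes, body: bytes) -> bytes:
    """Build one PNG chunk (used here only to create a demo file in memory)."""
    crc = zlib.crc32(chunk_type + body) & 0xFFFFFFFF
    return struct.pack(">I", len(body)) + chunk_type + body + struct.pack(">I", crc)


def demo_png(workflow: dict) -> bytes:
    """A 1x1 grayscale PNG carrying a 'workflow' text chunk, for demonstration."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    idat = zlib.compress(b"\x00\x00")  # filter byte plus one pixel
    text = b"workflow\x00" + json.dumps(workflow).encode("latin-1")
    return (
        PNG_SIGNATURE
        + make_chunk(b"IHDR", ihdr)
        + make_chunk(b"tEXt", text)
        + make_chunk(b"IDAT", idat)
        + make_chunk(b"IEND", b"")
    )


if __name__ == "__main__":
    png = demo_png({"nodes": ["KSampler", "VAEDecode"]})
    chunks = read_text_chunks(png)
    print(json.loads(chunks["workflow"]))
```

To inspect a real file, the same function could be given the bytes of a saved image, for example `read_text_chunks(open("some_image.png", "rb").read())`, where the file name is a placeholder.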
What are some alternatives?
stable-diffusion-webui-forge
stable-dreamfusion - Text-to-3D & Image-to-3D & Mesh Exportation with NeRF + Diffusion.
StableCascade - Official Code for Stable Cascade
ComfyUI - The most powerful and modular stable diffusion GUI, api and backend with a graph/nodes interface.
ComfyUI-DiffusersStableCascade - Simple inference with StableCascade using diffusers in ComfyUI
Fooocus - Focus on prompting and generating
genvs
deforum-stable-diffusion