MultiDiffusion vs sd-dynamic-thresholding

| | MultiDiffusion | sd-dynamic-thresholding |
|---|---|---|
| Mentions | 13 | 26 |
| Stars | 911 | 1,019 |
| Growth | - | 4.8% |
| Activity | 4.8 | 7.2 |
| Last commit | 8 months ago | 19 days ago |
| Language | Jupyter Notebook | Python |
| License | - | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
MultiDiffusion
- Opendream: A Non-Destructive UI for Stable Diffusion
For composing, this approach works pretty well:
https://multidiffusion.github.io/
- Messing with the denoising loop can let you reach new places in latent space. Ideas from 8+ different research papers/Auto1111 extensions in a single pipe: load once and do lots of different things (SD 2.1 or 1.5)
So I've continued to experiment with how many papers I can fit into a single pipe and have them play nicely together. The images below were created by combining the panorama code from omerbt/MultiDiffusion with the ideas from albarji/mixture-of-diffusers. It also turns out that nateraw/stable-diffusion-videos can be seen as a special case of a panorama (in latent space rather than prompt space).
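For readers who want to poke at the same internals, here is a minimal sketch of the fusion idea those two projects share: each denoising step runs the UNet on overlapping windows of the panorama latent and averages the overlapping outputs back together (mixture-of-diffusers swaps the uniform average for Gaussian weights). The `unet_denoise` callable is a hypothetical stand-in for a single UNet + scheduler step, and the window sizes are illustrative.

```python
# Sketch of the MultiDiffusion fusion step: denoise overlapping latent
# windows independently, then average the predictions where they overlap.
# `unet_denoise` is a hypothetical stand-in for one UNet + scheduler step.
import torch

def fuse_step(latent: torch.Tensor, unet_denoise, window: int = 64, stride: int = 32):
    # latent: (B, C, H, W) panorama latent; assumes (W - window) divides by stride
    out = torch.zeros_like(latent)
    count = torch.zeros_like(latent)
    _, _, H, W = latent.shape
    for y in range(0, max(H - window, 0) + 1, stride):
        for x in range(0, max(W - window, 0) + 1, stride):
            crop = latent[:, :, y:y + window, x:x + window]
            out[:, :, y:y + window, x:x + window] += unet_denoise(crop)
            count[:, :, y:y + window, x:x + window] += 1
    return out / count.clamp(min=1)  # uniform average over overlapping views
```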
- MultiDiffusion Region Control, a prompt on each mask webui extension is out.
- Hubble Diffusion with MultiDiffusion
Essentially, I fine-tuned the Stable Diffusion 2.1 base (512x512) model on the ESA Hubble Deep Space Images & Captions dataset I collected from public Hubble images and captions. After around 33,000 training steps I saved the model and was really impressed by the results. But I really wanted to be able to generate wallpaper-quality space images, and then I stumbled upon MultiDiffusion: a new project for generating massive panorama images using Stable Diffusion models. I then used hubble-diffusion-2 along with MultiDiffusion to generate each of these amazing 2560x1536 images. Each image took a little over an hour to generate on a Google Colab T4 GPU. I used the following prompts for each of these images:
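Hugging Face diffusers ships the MultiDiffusion panorama method as StableDiffusionPanoramaPipeline, so the workflow above can be sketched roughly as follows. The model id for the fine-tuned Hubble checkpoint is an assumption, and the prompt is illustrative.

```python
# Rough sketch of the workflow above using the MultiDiffusion panorama
# pipeline bundled with diffusers. The model id is assumed; swap in the
# actual fine-tuned Hubble checkpoint.
import torch
from diffusers import DDIMScheduler, StableDiffusionPanoramaPipeline

model_id = "Supermaxman/hubble-diffusion-2"  # assumed hub id
scheduler = DDIMScheduler.from_pretrained(model_id, subfolder="scheduler")
pipe = StableDiffusionPanoramaPipeline.from_pretrained(
    model_id, scheduler=scheduler, torch_dtype=torch.float16
).to("cuda")

image = pipe(
    "a stunning deep-space photograph of a colorful nebula",  # illustrative prompt
    height=1536, width=2560,  # the wallpaper size from the post
    num_inference_steps=50,
).images[0]
image.save("hubble_panorama.png")
```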
- MultiDiffusion: Fusing Diffusion Paths for Controlled Image Generation
- What is the maximum size a 3090 24GB can produce?
If you need natively generated (not upscaled) 4K for some reason, try something like https://github.com/omerbt/MultiDiffusion
- [R] [N] "MultiDiffusion: Fusing Diffusion Paths for Controlled Image Generation" enables controllable image generation without any further training or finetuning of diffusion models.
Project: https://multidiffusion.github.io/ Paper: https://arxiv.org/abs/2302.08113 GitHub: https://github.com/omerbt/MultiDiffusion
- Meet MultiDiffusion: A Unified AI Framework That Enables Versatile And Controllable Image Generation Using A Pre-Trained Text-to-Image Diffusion Model
Quick Read: https://www.marktechpost.com/2023/02/24/meet-multidiffusion-a-unified-ai-framework-that-enables-versatile-and-controllable-image-generation-using-a-pre-trained-text-to-image-diffusion-model/ Paper: https://arxiv.org/abs/2302.08113 Github: https://github.com/omerbt/MultiDiffusion Project: https://multidiffusion.github.io/
- You too can create panorama images of 512x10240+ (not a typo) using less than 6GB VRAM (vertorama works too). A modification of the MultiDiffusion code passes the image through the VAE in slices, then reassembles it. Potato computers of the world, rejoice.
So I haven't made many images with Stable Diffusion despite using it heavily. The reason is that I've been messing with the internals of the diffusion pipe, to interfere with the diffusion process in different ways. Today's fun result is based on omerbt/MultiDiffusion for making panoramas.
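The VAE-slicing trick described in that post can be sketched roughly like this; the slice width and overlap are illustrative, and a real implementation would blend the overlaps rather than simply crop them.

```python
# Rough sketch of sliced VAE decoding for very wide panoramas: decode the
# latent in overlapping vertical strips and stitch them, so the full-width
# image never goes through the decoder at once. Values are illustrative.
import torch

@torch.no_grad()
def sliced_vae_decode(vae, latent: torch.Tensor, slice_w: int = 64, overlap: int = 8):
    # latent: (B, 4, h, w); the SD VAE upsamples 8x, giving (B, 3, 8h, 8w)
    pieces = []
    for x in range(0, latent.shape[-1], slice_w - overlap):
        sl = latent[:, :, :, x:x + slice_w]
        img = vae.decode(sl / vae.config.scaling_factor).sample
        # crop the duplicated overlap off every slice after the first
        pieces.append(img if x == 0 else img[:, :, :, overlap * 8:])
    return torch.cat(pieces, dim=-1)
```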
- First version of Stable Diffusion was released on August 22, 2022
If we combine Mixture of Diffusers + MultiDiffusion + Composer + cross-domain-compositing, and probably some more I'm not thinking of.
sd-dynamic-thresholding
- ZeroDiffusion -- a 1.5 base model cleanly trained with zero terminal SNR, plus an experimental inpainting model
For outputs to look right, you will need some form of CFG rescale or dynamic thresholding to correct for overexposure (A1111 extensions are linked -- I am told that ComfyUI has nodes available for these functions). A good starting point for CFG rescale is 0.7, as recommended in the paper. I strongly suspect that CFG rescale is not an ideal solution and leaves a substantial training-inference gap; when using zero terminal SNR models, I find that Dynamic Thresholding can give better outputs that are closer to what I expect from the data, without the brownout often caused by CFG rescale. A potential starting point for Dynamic Thresholding: Restart sampler, CFG scale 15, Mimic CFG scale 7.5, Sawtooth on both scale schedulers, minimum value 6 for both, scheduler value 4, do not separate feature channels, ZERO, STD. You will likely have to experiment a lot with Dynamic Thresholding.
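For reference, CFG rescale as described in the zero terminal SNR paper ("Common Diffusion Noise Schedules and Sample Steps Are Flawed") boils down to a few lines. This is a minimal sketch, with phi = 0.7 matching the recommendation above.

```python
# Minimal sketch of CFG rescale from the zero terminal SNR paper: pull the
# std of the guided prediction back toward the conditional prediction's
# std, blended by a factor phi (0.7, the paper's recommended value).
import torch

def cfg_rescale(pred_cond: torch.Tensor, pred_uncond: torch.Tensor,
                guidance_scale: float = 7.5, phi: float = 0.7) -> torch.Tensor:
    # standard classifier-free guidance
    pred_cfg = pred_uncond + guidance_scale * (pred_cond - pred_uncond)
    # per-sample stds over all non-batch dimensions
    dims = list(range(1, pred_cond.ndim))
    std_cond = pred_cond.std(dim=dims, keepdim=True)
    std_cfg = pred_cfg.std(dim=dims, keepdim=True)
    rescaled = pred_cfg * (std_cond / std_cfg)  # undo the CFG-induced std blow-up
    # blend with the plain CFG output so detail isn't over-flattened
    return phi * rescaled + (1.0 - phi) * pred_cfg
```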
- Dynamic Thresholding for ComfyUI?
Recently switched from A1111 and I love it so far; the flexibility to orchestrate complex workflows automatically instead of doing manual operations is a life changer. Anyhow, one extension I liked on A1111 was this one: https://github.com/mcmonkeyprojects/sd-dynamic-thresholding
- How do I implement Dynamic Thresholding (CFG scale fix) in ComfyUI?
In the Automatic1111 webui, there is a Dynamic Thresholding (CFG scale fix) extension that:
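In rough terms, the trick works like this: compute classifier-free guidance at both the high "real" CFG scale and a lower "mimic" scale, then clamp and rescale the high-scale prediction into the dynamic range the mimic-scale prediction would have. The sketch below is a loose simplification of mcmonkeyprojects/sd-dynamic-thresholding, not its exact algorithm; the parameter values mirror those quoted in these threads.

```python
# Loose sketch of dynamic thresholding (mimicked CFG): clamp outliers in
# the high-scale guidance result and squeeze it into the value range a
# lower "mimic" scale would produce. Simplified; not the exact extension.
import torch

def dynamic_threshold(pred_cond, pred_uncond, cfg_scale=12.0,
                      mimic_scale=7.0, percentile=0.995):
    real = pred_uncond + cfg_scale * (pred_cond - pred_uncond)
    mimic = pred_uncond + mimic_scale * (pred_cond - pred_uncond)
    shape = [-1] + [1] * (real.ndim - 1)
    # outlier cutoff for the real-scale result, peak of the mimic result
    real_s = torch.quantile(real.flatten(1).abs().float(), percentile, dim=1).view(shape)
    mimic_s = mimic.flatten(1).abs().amax(dim=1).view(shape)
    clamped = real.clamp(min=-real_s, max=real_s)  # cut extreme values
    return clamped * (mimic_s / real_s)            # rescale into mimic range
```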
- How to diffuse better faces?
I've found that using ADetailer (https://github.com/Bing-su/adetailer, with their recommended advanced settings and face_yolov8n.pt) and Dynamic Thresholding (CFG set to 12 and Mimic to 7) has vastly improved my face renders. (https://github.com/mcmonkeyprojects/sd-dynamic-thresholding) GL!
- Kohya UI settings as asked (style+character training)
The output LoRA works best with CFG at 4, because at 7 it gets those gasoline colors and the contrast of overbaking, but I guess this is a tradeoff of that many steps in total (5,200), since the earlier snapshots were not that good in style and character details. You can use a workaround like the Dynamic Thresholding extension: https://github.com/mcmonkeyprojects/sd-dynamic-thresholding.git - it helps a lot in many cases when you want a high CFG but the model/LoRA overbakes; it mimics a lower CFG while keeping the high-CFG details and prompt alignment.
- Does anyone know how to create this type of hyper realistic pic?
Use sd-dynamic-thresholding extension (set CFG scale to 12 or more and mimic CFG scale to 7): https://github.com/mcmonkeyprojects/sd-dynamic-thresholding
- ControlNet Reference-Only problems
- What's your favorite small tweaks to make? I'll go first
Tweak this up or down for small changes. Too far and you’ll get a different image. Extensions like Dynamic Thresholding can let you go much higher without the overexposed look.
- Blurred/Low quality/Low details images
Turn the CFG scale down, or maybe use this extension. I've never used Dynamic Thresholding before, but I think it's what you want.
- Dynamic threshold & Offset noise - The answer to oversaturated images?
What are some alternatives?
stable-diffusion-webui-two-shot - Latent Couple extension (two shot diffusion port)
stable-diffusion-webui-anti-burn - Extension for AUTOMATIC1111/stable-diffusion-webui for smoothing generated images by skipping a few very last steps and averaging together some images before them.
sd-webui-controlnet - WebUI extension for ControlNet
Stable-Diffusion - Stable Diffusion, SDXL, LoRA Training, DreamBooth Training, Automatic1111 Web UI, DeepFake, Deep Fakes, TTS, Animation, Text To Video, Tutorials, Guides, Lectures, Courses, ComfyUI, Google Colab, RunPod, NoteBooks, ControlNet, TTS, Voice Cloning, AI, AI News, ML, ML News, News, Tech, Tech News, Kohya LoRA, Kandinsky 2, DeepFloyd IF, Midjourney
mixture-of-diffusers - Mixture of Diffusers for scene composition and high resolution image generation
adetailer - Auto detecting, masking and inpainting with detection model.
Diffusion-Models-Papers-Survey-Taxonomy - Diffusion model papers, survey, and taxonomy
multidiffusion-upscaler-for-automatic1111 - Tiled Diffusion and VAE optimize, licensed under CC BY-NC-SA 4.0
stable-diffusion-videos - Create 🔥 videos with Stable Diffusion by exploring the latent space and morphing between text prompts
sd_webui_SAG
openpose-editor - Openpose Editor for AUTOMATIC1111's stable-diffusion-webui
sd-dynamic-prompts - A custom script for AUTOMATIC1111/stable-diffusion-webui to implement a tiny template language for random prompt generation