MiDaS
stable-diffusion-webui-depthmap-script
| | MiDaS | stable-diffusion-webui-depthmap-script |
|---|---|---|
| Mentions | 27 | 64 |
| Stars | 4,089 | 1,582 |
| Growth | 4.1% | - |
| Activity | 2.4 | 8.3 |
| Latest commit | 3 months ago | about 1 month ago |
| Language | Python | Python |
| License | MIT License | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
MiDaS
-
How to Estimate Depth from a Single Image
The checkpoint below uses MiDaS, which returns the inverse depth map, so we have to invert it back to get a comparable depth map.
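A minimal sketch of that inversion, assuming `inverse_depth` is the raw relative inverse-depth array returned by MiDaS (function and variable names are illustrative):

```python
import numpy as np

def inverse_depth_to_depth(inverse_depth: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    # Invert, guarding against division by zero for far-away (near-zero) pixels.
    depth = 1.0 / np.maximum(inverse_depth, eps)
    # MiDaS output is only defined up to scale and shift, so normalize to [0, 1].
    return (depth - depth.min()) / (depth.max() - depth.min() + eps)
```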
-
Distance estimation from monocular vision using deep learning
Hi, I used the KITTI dataset for this, and yes, it depends on objects of known size. I defined the following classes: Car, Van, Truck, Pedestrian, Person_sitting, Cyclist, Tram, Misc, and DontCare, and the predictions are quite accurate for those classes. Even when an object is not one of these classes, it is still recognized, since I also use the COCO class names together with YOLO for object detection. There are several existing projects that use deep learning models trained on 2D datasets to predict 3D distance; this one was an inspiration for my project: https://blogs.nvidia.com/blog/2019/06/19/drive-labs-distance-to-object-detection/ Furthermore, there are well-documented and well-researched approaches such as DistYOLO and MiDaS that use deep learning for depth estimation. A sketch of the known-size idea follows.
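For intuition, here is a minimal pinhole-camera sketch of distance from a known object size (the heights and focal length below are illustrative values, not the project's actual constants):

```python
# distance = focal_length_px * real_height_m / bbox_height_px
# Assumed typical heights per class; 721.5 px is roughly the KITTI camera focal length.
KNOWN_HEIGHTS_M = {"Car": 1.5, "Pedestrian": 1.7, "Cyclist": 1.8}

def estimate_distance_m(class_name: str, bbox_height_px: float,
                        focal_length_px: float = 721.5) -> float:
    return focal_length_px * KNOWN_HEIGHTS_M[class_name] / bbox_height_px

print(estimate_distance_m("Car", bbox_height_px=60.0))  # ~18 m
```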
-
OMPR V0.6.10 update
- Added an AI image depth generator: create your own depth map image at the click of a button, using the awesome MiDaS 3.1 (https://github.com/isl-org/MiDaS) as the backend and the "dpt_beit_large_512" model for the highest-quality depth maps. Video and GIF depth map generators are coming next, together with the depth movie player feature.
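A minimal sketch of loading that backend via torch.hub, assuming the MiDaS 3.1 hub entry point `DPT_BEiT_L_512` and its matching `beit512_transform` (names taken from the MiDaS repo; treat them as assumptions):

```python
import cv2
import torch

# Load MiDaS 3.1 with the BEiT-Large-512 backbone (assumed hub entry point).
model = torch.hub.load("intel-isl/MiDaS", "DPT_BEiT_L_512")
model.eval()

# Matching input transform for the 512px model (assumed name).
transforms = torch.hub.load("intel-isl/MiDaS", "transforms")
transform = transforms.beit512_transform

img = cv2.cvtColor(cv2.imread("input.png"), cv2.COLOR_BGR2RGB)
with torch.no_grad():
    prediction = model(transform(img))
    # Resize the inverse-depth prediction back to the input resolution.
    prediction = torch.nn.functional.interpolate(
        prediction.unsqueeze(1),
        size=img.shape[:2],
        mode="bicubic",
        align_corners=False,
    ).squeeze()
depth_map = prediction.cpu().numpy()
```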
-
AI that converts a regular 2d image to stereoscopic
It uses MiDaS. That extension may be the most accessible way to use it at home. IDK.
-
Idea: training on magiceye images
Here's the project homepage https://github.com/isl-org/MiDaS
-
MiDaS v3_1 and DiscoDiffusion
The problem came up after MiDaS updated to version 3.1 on Dec 24th. Although the fix works fine, the new version introduces many changes, which for me produce slightly different results. I would like to be able to produce results like before. I still clone the MiDaS repo, but then reset it to the last commit before the December changes, which is 66882994a432727317267145dc3c2e47ec78c38a (see the sketch below).
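A minimal sketch of that pinning step (the commit hash comes from the post above; the clone destination is illustrative):

```python
import subprocess

# Clone MiDaS and check out the last pre-v3.1 commit.
subprocess.run(["git", "clone", "https://github.com/isl-org/MiDaS.git"], check=True)
subprocess.run(
    ["git", "checkout", "66882994a432727317267145dc3c2e47ec78c38a"],
    cwd="MiDaS",
    check=True,
)
```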
-
File not found error
    # Imports needed by this snippet; gitclone() and wget() are helper
    # functions defined elsewhere in the notebook.
    import os
    import shutil
    import sys

    try:
        from midas.dpt_depth import DPTDepthModel
    except ImportError:
        # First run: fetch MiDaS (plus Next-ViT, used by some MiDaS 3.1 backbones).
        if not os.path.exists('MiDaS'):
            gitclone("https://github.com/isl-org/MiDaS.git")
            gitclone("https://github.com/bytedance/Next-ViT.git",
                     f'{PROJECT_DIR}/externals/Next_ViT')
        # MiDaS ships its helpers as utils.py; rename to avoid module name clashes.
        if not os.path.exists('MiDaS/midas_utils.py'):
            shutil.move('MiDaS/utils.py', 'MiDaS/midas_utils.py')
        # Download the DPT-Large checkpoint if it is not cached yet.
        if not os.path.exists(f'{model_path}/dpt_large-midas-2f21e586.pt'):
            wget("https://github.com/intel-isl/DPT/releases/download/1_0/dpt_large-midas-2f21e586.pt",
                 model_path)
        sys.path.append(f'{PROJECT_DIR}/MiDaS')
-
A quick demo to show how structurally coherent depth2img is compared to img2img using Automatic1111.
Cool. The repo for MiDaS is here: https://github.com/isl-org/MiDaS You can see that they partially trained the model on 3D movies; here's a list of the movies that were used to train it. I wonder if they'll train a MiDaS v4.0, as things have moved on quite a bit since it was released in April 2021?
-
Boosting Monocular Depth repo
We present a stand-alone implementation of our merging operator. This new repo allows using any pair of monocular depth estimates in our double-estimation framework. This includes using separate networks for the base and high-resolution estimations, using networks not supported by this repo (such as MiDaS v3), or using manually edited depth maps for artistic purposes. It will also be useful for researchers developing CNN-based monocular depth estimation (MDE) as a way to quickly apply double estimation to their own network. For more details, please take a look here.
-
DepthViewer is now live on Steam :)
I'll add a feature to export only the depth map .png file. If you need the depth map .png right now, you can use the MiDaS Python script.
stable-diffusion-webui-depthmap-script
-
PatchFusion is really impressive: high-resolution depth maps in 16-bit. I've been waiting for this. https://github.com/zhyever/PatchFusion
The guide on the GitHub page for the extension is OK: https://github.com/thygate/stable-diffusion-webui-depthmap-script
-
Extension not showing. Depthmap help 🙏
New to SD. I'm trying to get an extension to work (https://github.com/thygate/stable-diffusion-webui-depthmap-script), but unlike in the tutorials, the "Depth" tab doesn't show up after installation. Can anyone help me locate the problem? Thanks!
-
Is anyone working on stereoscopic 3D SD? Is it even possible?
You can use this extension to generate stereoscopic images . . . I don't (yet) dabble in video, so I don't know what it'll do there. I've done a ton of stereo pics with it. My fascination sort of comes and goes. You can do cross-eyed or parallel view as well as red/cyan anaglyphs. A naive sketch of the depth-shift idea follows.
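An illustrative sketch of depth-based stereo and a red/cyan anaglyph (not the extension's actual algorithm; the shift heuristic is assumed, and occlusion handling and hole filling are omitted):

```python
import numpy as np

def shift_image(img: np.ndarray, depth: np.ndarray, max_shift: int = 8) -> np.ndarray:
    # Shift pixels horizontally in proportion to depth to fake a second eye view.
    h, w = depth.shape
    out = img.copy()
    shifts = (depth / (depth.max() + 1e-6) * max_shift).astype(int)
    for y in range(h):
        for x in range(w):
            nx = min(w - 1, x + shifts[y, x])
            out[y, nx] = img[y, x]
    return out

def anaglyph(left: np.ndarray, right: np.ndarray) -> np.ndarray:
    # Red channel from the left eye; green and blue from the right eye.
    out = right.copy()
    out[..., 0] = left[..., 0]
    return out

# Usage, with img an H x W x 3 uint8 array and depth an H x W float array:
# stereo = anaglyph(img, shift_image(img, depth))
```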
-
GUIDE: Ways to generate consistent environments for comics, novels, etc
Option 8. Use img2img on existing 360° HDRIs and extract their depth maps with the depth extension. Use the depth map as a displacement map on a sphere in Blender, similarly to this, with the refurbished HDRI as an image texture, then take screenshots from a position close to the center of the sphere. You are limited to staying close to the center to avoid distortion, but you now have 360 degrees of consistent freedom for a particular scene. If you have two or more HDRIs of the same place, even better. You could also combine this with the 3D environments of the other options and use 360° renders as bases for img2img. A sketch of the Blender step follows.
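A hypothetical bpy sketch of that Blender setup (the path, modifier strength, and sphere resolution are placeholders):

```python
import bpy

# Sphere to project the scene onto; higher resolution gives smoother displacement.
bpy.ops.mesh.primitive_uv_sphere_add(segments=128, ring_count=64, radius=5.0)
sphere = bpy.context.active_object
bpy.ops.object.shade_smooth()

# Load the depth map extracted by the extension as a displacement texture.
depth_img = bpy.data.images.load("//depth.png")
tex = bpy.data.textures.new("DepthTex", type='IMAGE')
tex.image = depth_img

mod = sphere.modifiers.new("Displace", type='DISPLACE')
mod.texture = tex
mod.texture_coords = 'UV'
mod.strength = -1.0  # sign/magnitude depend on how the depth map is encoded

# Put the camera near the sphere's center, as the guide recommends.
bpy.ops.object.camera_add(location=(0.0, 0.0, 0.0))
```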
-
Another Ai image to 3d
There is another AUTOMATIC1111 extension that also lets you create the 3D file there, but it consumes a lot of VRAM: https://github.com/thygate/stable-diffusion-webui-depthmap-script
-
Get a 16-Bit Controlnet Depth
If you're using the A1111 webui, there is the depthmap2mask extension, which you can install from the Extensions tab. It will add a Depth tab that lets you create 16-bit depth maps, among many other things. A sketch of a 16-bit export follows.
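For reference, a minimal sketch of writing a 16-bit depth PNG in Python (the depth array here is a random placeholder, not the extension's output):

```python
import cv2
import numpy as np

depth = np.random.rand(512, 512).astype(np.float32)  # placeholder depth map
scaled = (depth - depth.min()) / (depth.max() - depth.min() + 1e-6)
# Scale to the full uint16 range; PNG supports 16-bit single-channel images.
cv2.imwrite("depth16.png", (scaled * 65535).astype(np.uint16))
```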
-
180 VR - Blue Techno World - (Stable Diffusion + Deforum) stereo video
Actually, it is very easy to do. You need to install an extension for the Stable Diffusion webUI (https://stable-diffusion-art.com/install-windows/). This extension will generate the stereo view for you automatically. It is named Depth (https://github.com/thygate/stable-diffusion-webui-depthmap-script).
-
Is it possible for me to approximate a depth map from a generated image and make a 3D model?
-
Thanks for loving our Star Wars video! We created a new one for Lord of the Rings. Enjoy this mid-journey to Middle-Earth.
-
Found this site through Twitter that slightly animates images. Throwing Stable Diffusion generations into it is pretty awesome. Site in comments.
You can do this inside of a1111 as well with this extension https://github.com/thygate/stable-diffusion-webui-depthmap-script
What are some alternatives?
DenseDepth - High Quality Monocular Depth Estimation via Transfer Learning
a1111-sd-zoe-depth - a1111 sd WebUI extention version of ZoeDepth
stablediffusion - High-Resolution Image Synthesis with Latent Diffusion Models
multi-subject-render - Generate multiple complex subjects all at once!
deeplearning4j-examples - Deeplearning4j Examples (DL4J, DL4J Spark, DataVec) [Moved to: https://github.com/deeplearning4j/deeplearning4j-examples]
Thin-Plate-Spline-Motion-Model - [CVPR 2022] Thin-Plate Spline Motion Model for Image Animation.
DiverseDepth - The code and data of DiverseDepth
depthmap2mask - Create masks out of depthmaps in img2img
Insta-DM - Learning Monocular Depth in Dynamic Scenes via Instance-Aware Projection Consistency (AAAI 2021)
point-e - Point cloud diffusion for 3D model synthesis
U-2-Net - The code for our newly accepted paper in Pattern Recognition 2020: "U^2-Net: Going Deeper with Nested U-Structure for Salient Object Detection."
stable-diffusion-webui-dataset-tag-editor - Extension to edit dataset captions for SD web UI by AUTOMATIC1111