MiDaS vs InvokeAI

MiDaS

Code for robust monocular depth estimation described in "Ranftl et al., Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer, TPAMI 2022" (by isl-org)

monocular-depth-estimation single-image-depth-prediction Deeplearning

InvokeAI is a leading creative engine for Stable Diffusion models, empowering professionals, artists, and enthusiasts to generate and create visual media using the latest AI-driven technologies. The solution offers an industry leading WebUI, supports terminal use through a CLI, and serves as the foundation for multiple commercial products. (by invoke-ai)

ai-art Artificial intelligence generative-art image-generation img2img inpainting latent-diffusion Linux MacOS outpainting txt2img Windows stable-diffusion

invoke-ai.github.io

Project	MiDaS	InvokeAI
Mentions	27	239
Stars	4,089	21,266
Growth	4.1%	2.3%
Activity	2.4	10.0
Latest Commit	2 months ago	5 days ago
Language	Python	TypeScript
License	MIT License	Apache License 2.0

The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
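The growth figure can be illustrated with a short sketch. The site does not publish its exact formula, so this assumes the usual definition of month-over-month growth (change in stars divided by the earlier star count); both function names are hypothetical.

```python
def month_over_month_growth(previous_stars, current_stars):
    """Percentage growth in stars from one month to the next (assumed definition)."""
    if previous_stars <= 0:
        raise ValueError("previous_stars must be positive")
    return (current_stars - previous_stars) / previous_stars * 100.0


def implied_previous_stars(current_stars, growth_percent):
    """Star count one month earlier implied by the current count and growth rate."""
    return current_stars / (1.0 + growth_percent / 100.0)


# Figures taken from the comparison table: MiDaS has 4,089 stars and 4.1% growth.
midas_previous = implied_previous_stars(4089, 4.1)
print(f"MiDaS had roughly {midas_previous:.0f} stars a month earlier")
```

Under that assumption, MiDaS would have had roughly 3,928 stars a month earlier, a gain of about 161 stars.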

MiDaS

Posts with mentions or reviews of MiDaS. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-04-25.

How to Estimate Depth from a Single Image
8 projects | dev.to | 25 Apr 2024

The checkpoint below uses MiDaS, which returns the inverse depth map, so we have to invert it back to get a comparable depth map.
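The inversion mentioned in this excerpt can be sketched in plain Python. This is a minimal illustration, not the linked article's code: it assumes the prediction is a 2D grid of positive relative inverse-depth values, and the function names and the eps guard are hypothetical. MiDaS predictions are relative (defined only up to an unknown scale and shift), so the result is a relative depth map, not a metric distance.

```python
def invert_inverse_depth(inverse_depth, eps=1e-6):
    """Turn a 2D grid of inverse-depth values into relative depth values.

    Values at or below eps are clamped to eps to avoid division by zero.
    """
    return [[1.0 / max(value, eps) for value in row] for row in inverse_depth]


def normalize(grid):
    """Rescale a 2D grid to the range [0, 1] so two depth maps can be compared."""
    flat = [value for row in grid for value in row]
    low, high = min(flat), max(flat)
    span = high - low
    if span == 0:
        return [[0.0 for _ in row] for row in grid]
    return [[(value - low) / span for value in row] for row in grid]


# Large inverse-depth values are near the camera, small values are far away.
prediction = [[4.0, 2.0], [1.0, 0.5]]
depth = invert_inverse_depth(prediction)
comparable = normalize(depth)
```

After inversion, larger values mean farther from the camera, which matches the convention of most ground-truth depth maps.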
Distance estimation from monocular vision using deep learning
3 projects | /r/computervision | 13 Jun 2023

Hi, I have made use of the KITTI dataset for this, and yes, it depends on objects of known sizes. Here I have defined the following classes: Car, Van, Truck, Pedestrian, Person_sitting, Cyclist, Tram, Misc, or DontCare, and the predictions are pretty accurate for those classes. Even if it's not the same class, it still recognizes the object, since I have made use of the COCO names dataset here, which is used along with YOLO for object detection. There are also several already implemented projects that use deep learning models trained on 2D datasets to predict 3D distance. This was one of my inspirations for this project: https://blogs.nvidia.com/blog/2019/06/19/drive-labs-distance-to-object-detection/ Furthermore, there are well-documented and researched papers like DistYOLO or MiDaS that make use of deep learning for depth estimation.

OMPR V0.6.10 update
2 projects | /r/u_OMPR_App | 14 Mar 2023

- Added AI image depth generator: create your own depth map image at the click of a button, using the awesome MiDaS 3.1 (https://github.com/isl-org/MiDaS) as the backend and the model "dpt_beit_large_512" for the highest-quality depth map. Video and GIF depth map generators are coming out next, together with the Depth movie player feature.

AI that converts a regular 2d image to stereoscopic
1 project | /r/ArtificialInteligence | 9 Feb 2023

It uses MiDaS. That extension may be the most accessible way to use it at home. IDK.

Idea: training on magiceye images
1 project | /r/StableDiffusion | 5 Feb 2023

Here's the project homepage: https://github.com/isl-org/MiDaS

MiDaS v3_1 and DiscoDiffusion
2 projects | /r/DiscoDiffusion | 27 Dec 2022

The problem came up after MiDaS updated to version V3_1 on Dec 24th. Although the fix works fine, the new version brings many changes, which for me produce slightly different results. I would like to be able to produce results like before. I still clone the MiDaS repo, but then set it back to the last commit before the changes in December, which is 66882994a432727317267145dc3c2e47ec78c38a.
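The workaround described in this post (clone the repository, then check out the older commit) could be scripted roughly as follows. This is a sketch rather than the poster's actual code: it assumes git is installed and on the PATH, and the helper names pin_commands and clone_pinned and the default destination directory are hypothetical. Only the repository URL and the commit hash come from the posts quoted here.

```python
import subprocess

MIDAS_REPO = "https://github.com/isl-org/MiDaS.git"
# Last commit before the December changes, as quoted in the post above.
PINNED_COMMIT = "66882994a432727317267145dc3c2e47ec78c38a"


def pin_commands(repo_url, commit, dest="MiDaS"):
    """Build the git commands that clone a repo and check out one commit."""
    return [
        ["git", "clone", repo_url, dest],
        ["git", "-C", dest, "checkout", commit],
    ]


def clone_pinned(repo_url=MIDAS_REPO, commit=PINNED_COMMIT, dest="MiDaS"):
    """Run the commands in order, raising CalledProcessError if any step fails."""
    for command in pin_commands(repo_url, commit, dest):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    clone_pinned()
```

Checking out a bare commit hash leaves the clone in a detached HEAD state, which is fine for a read-only dependency like this.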
File not found error
3 projects | /r/DiscoDiffusion | 27 Dec 2022

try:
    from midas.dpt_depth import DPTDepthModel
except:
    if not os.path.exists('MiDaS'):
        gitclone("https://github.com/isl-org/MiDaS.git")
        gitclone("https://github.com/bytedance/Next-ViT.git", f'{PROJECT_DIR}/externals/Next_ViT')
    if not os.path.exists('MiDaS/midas_utils.py'):
        shutil.move('MiDaS/utils.py', 'MiDaS/midas_utils.py')
    if not os.path.exists(f'{model_path}/dpt_large-midas-2f21e586.pt'):
        wget("https://github.com/intel-isl/DPT/releases/download/1_0/dpt_large-midas-2f21e586.pt", model_path)
    sys.path.append(f'{PROJECT_DIR}/MiDaS')

A quick demo to show how structurally coherent depth2img is compared to img2img using Automatic1111.
2 projects | /r/StableDiffusion | 12 Dec 2022

Cool. The repo for MiDaS is here (https://github.com/isl-org/MiDaS). You can see that they partially trained the model on 3D movies. Here's a list of the movies that were used to train it. I wonder if they'll be training a MiDaS v4.0, as things have moved on quite a bit since it was released in Apr 2021?

Boosting Monocular Depth repo
3 projects | /r/computervision | 9 Dec 2022

We present a stand-alone implementation of our Merging Operator. This new repo allows using any pair of monocular depth estimations in our double estimation. This includes using separate networks for base and high-res estimations, using networks not supported by this repo (such as Midas-v3), or using manually edited depth maps for artistic use. This will also be useful for scientists developing CNN-based MDE as a way to quickly apply double estimation to their own network. For more details please take a look here.

DepthViewer is now live on Steam :)
3 projects | /r/virtualreality | 30 Nov 2022

I'll make the feature to export only the depthmap .png file. If you need the depthmap .png right now you can use the MiDaS python script.

InvokeAI

Posts with mentions or reviews of InvokeAI. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-02-22.

Stable Diffusion 3
3 projects | news.ycombinator.com | 22 Feb 2024

Probably not, since I have no idea what you're talking about. I've just been using the models that InvokeAI (2.3, I only just now saw there's a 3.0) downloads for me [0]. The SD1.5 one is as good as ever, but the SD2 model introduces artifacts on (many, but not all) faces and copyrighted characters.
[0] https://github.com/invoke-ai/InvokeAI

AMD Funded a Drop-In CUDA Implementation Built on ROCm: It's Open-Source
23 projects | news.ycombinator.com | 12 Feb 2024

I actually used the rocm/pytorch image you also linked.
I'm not sure what you're pointing to with your reference to the Fedora-based images. I'm quite happy with my NixOS install and really don't want to switch to anything else. And as long as I have the correct kernel module, my host OS really shouldn't matter to run any of the images.
And I'm sure it can be made to work with many base images, my point was just that the dependency management around pytorch was in a bad state, where it is extremely easy to break.
> Anyways, hopefully this PR fixes the immediate issue: https://github.com/invoke-ai/InvokeAI/pull/5714/files
It does! At least for me. It is my PR after all ;)

Can some expert analyze a github repo and tell us if it's really safe or not?
3 projects | /r/cybersecurity | 7 Dec 2023

The data being flagged is not in that github repo, it's fetched from elsewhere and I don't fancy spending time looking for it. The alert is for 'Sirefef!cfg' which has been reported as a false positive with a bunch of other stable diffusion projects (https://www.reddit.com/r/StableDiffusion/comments/101zjec/trojanwin32sirefefcfg_an_apparently_common_false/, https://www.reddit.com/r/StableDiffusion/comments/xmhukb/trojan_in_waifudiffusion_model_file/, https://github.com/invoke-ai/InvokeAI/issues/2773)

What is the most effcient port of SD to mac?
1 project | /r/StableDiffusion | 6 Dec 2023

I haven’t tried it recently, but InvokeAI runs on Mac. I used to run it on my MacBook, but have since gotten a Win laptop.

Easy Stable Diffusion XL in your device, offline
6 projects | news.ycombinator.com | 1 Dec 2023

There are already a number of local inference options that are (crucially) open-source, with more robust feature sets.
And if the defense here is "but Auto1111 and Comfy don't have as user-friendly a UI", that's also already covered. https://github.com/invoke-ai/InvokeAI

Ask HN: Selfhosted ChatGPT and Stable-diffusion like alternatives?
1 project | news.ycombinator.com | 25 Nov 2023

https://github.com/invoke-ai/InvokeAI should work on your machine. For LLM models, the smaller ones should run using llama.cpp, but I don't think you'll be happy comparing them to ChatGPT.

🚀 InvokeAI 3.4 now supports LCM & LCM-LoRAs and much more!
1 project | /r/StableDiffusion | 12 Nov 2023

Best ai image generator without a nsfw filter?
2 projects | /r/aiArt | 28 Oct 2023

Stable Diffusion. /r/stablediffusion There are many tutorials on how to set it up locally and use it. InvokeAI is the easiest way to set it up. https://github.com/invoke-ai/InvokeAI

What's the best stable diffusion client for base m1 MacBook air?
3 projects | /r/StableDiffusion | 20 Oct 2023

InvokeAI

invoke-ai/InvokeAI
1 project | /r/programming | 30 Aug 2023

What are some alternatives?

When comparing MiDaS and InvokeAI you can also consider the following projects:

stable-diffusion-webui-depthmap-script - High Resolution Depth Maps for Stable Diffusion WebUI

stable-diffusion-webui - Stable Diffusion web UI

DenseDepth - High Quality Monocular Depth Estimation via Transfer Learning

stable-diffusion

stablediffusion - High-Resolution Image Synthesis with Latent Diffusion Models

ControlNet - Let us control diffusion models!

deeplearning4j-examples - Deeplearning4j Examples (DL4J, DL4J Spark, DataVec) [Moved to: https://github.com/deeplearning4j/deeplearning4j-examples]

ComfyUI - The most powerful and modular stable diffusion GUI, api and backend with a graph/nodes interface.

DiverseDepth - The code and data of DiverseDepth

dreambooth-gui

Insta-DM - Learning Monocular Depth in Dynamic Scenes via Instance-Aware Projection Consistency (AAAI 2021)

stable-diffusion - Optimized Stable Diffusion modified to run on lower GPU VRAM
