U-2-Net vs MiDaS

U-2-Net

The code for our newly accepted paper in Pattern Recognition 2020: "U^2-Net: Going Deeper with Nested U-Structure for Salient Object Detection." (by NathanUA)


MiDaS

Code for robust monocular depth estimation described in "Ranftl et al., Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer, TPAMI 2022" (by isl-org)

monocular-depth-estimation single-image-depth-prediction Deeplearning



Project         U-2-Net               MiDaS
Mentions        30                    27
Stars           8,098                 4,074
Growth          -                     3.8%
Activity        3.1                   2.4
Latest Commit   4 months ago          2 months ago
Language        Python                Python
License         Apache License 2.0    MIT License

Mentions is the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars is the number of stars that a project has on GitHub. Growth is the month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is among the top 10% of the most actively developed projects that we are tracking.

U-2-Net

Posts with mentions or reviews of U-2-Net. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-05-14.

I used the ChatGPT API to create a proof-of-concept AI driven video game. Using generative AI for the images and dialogue and GPT-3.5 for narrative and game control. More info in comments.
1 project | /r/ChatGPT | 17 Jun 2023

I use a finetuned custom Stable Diffusion model in combination with a style embedding for the characters for image generation and U²-Net for background removal.
[Help] Meta's segment anything - How can I make smooth border ?
3 projects | /r/computervision | 14 May 2023

Hi :) I am an app/web developer and new to AI. Currently, I am making a photo app which can segment everything in an image. I've used Meta's Segment Anything. I've got all the masks, but the mask boundaries are very bumpy. So I've tried rembg, which uses u2net (salient object detection) and pymatting together. Do I have to use pymatting separately after getting the segments from Segment Anything to improve the boundary quality of my segmented output?
BackgroundRemover 0.2.1 - Remove Background from Video and Images using AI
4 projects | /r/opensource | 5 May 2023

Cool, thanks for sharing. It might be worth clearly attributing the models you're using, and maybe add a models/license file with the U2net license, since that license is different to the one you're using for your project, and since you're distributing the models.
How to do Human Head Segmentation from images?
4 projects | /r/computervision | 27 Mar 2023

Background Removal - I'd use u2net which has a model that's specifically trained on people vs backgrounds. If that didn't work, maybe DIS which is the newer version or rembg. These are pretty easy to get running I found.
Just a reminder that there is a new 'remove background' extension for a1111
11 projects | /r/StableDiffusion | 15 Mar 2023

u2net_human_seg (download, source): A pre-trained model for human segmentation.
OMPR V0.6.10 update
2 projects | /r/u_OMPR_App | 14 Mar 2023

Optimized – AI tweak: Image background remover is now faster and enables trained model (ONNX) swapping. Revamped the Python engine for the background remover; it should run faster than the previous build. Also added the ability for users to replace the pre-trained ONNX model themselves. https://github.com/xuebinqin/U-2-Net
OMPR V0.6.8 update
1 project | /r/OMPR_PCVR | 20 Feb 2023

-Added AI Background remover based on U2Net AI framework for Image projector. Check the Image projection "AI Tweaks" dropdown to toggle between u2net standard, u2netp – portrait, u2net-human_seg, u2net_cloth_seg, or silueta as the background remover AI model. As you can guess from the names, each of the models excels for different image subjects for background removal. For example, the portrait model is good for human portraits, cloth seg for clothing subjects and so on. Default to CUDA (Nvidia) processor, if you have an AMD card or would like to use CPU as the processor, untick the CUDA checkbox. Go here if you want to know more about the mechanics of U2Net -> https://github.com/xuebinqin/U-2-Net
Computer Vision Free Lancer
2 projects | /r/computervision | 17 Jan 2023

Also checkout https://github.com/xuebinqin/U-2-Net. They have a new version in this repo: https://github.com/xuebinqin/DIS
image segmentation using U-nets
2 projects | /r/learnmachinelearning | 27 Oct 2022

There, the author has the same goal as you do, and has a train.py and instructions. You can reach out to the author and ask questions either in the issues section or perhaps email directly. Many times people are very helpful when you show interest in their work. The neural network it is based on (U2-net) is very easy to get running by the way, and has lots of use cases: https://github.com/xuebinqin/U-2-Net
After much experimentation 🤖
2 projects | /r/StableDiffusion | 11 Oct 2022

really any segmentation model could work. "salient object detection" is well suited for "i have a single, obvious subject that I want to isolate from the background". This is the model I had in mind, but it wouldn't have to be this necessarily: https://github.com/xuebinqin/U-2-Net

MiDaS

Posts with mentions or reviews of MiDaS. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-04-25.

How to Estimate Depth from a Single Image
8 projects | dev.to | 25 Apr 2024

The checkpoint below uses MiDaS, which returns the inverse depth map, so we have to invert it back to get a comparable depth map.
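The inversion step mentioned above can be sketched as follows. This is a minimal, dependency-free illustration and not the code from the linked post: the function name invert_inverse_depth, the eps parameter, and the nested-list representation are all assumptions made for the example (in practice the map would usually be a NumPy array or a tensor). MiDaS predictions are relative, so the result is only a relative depth map, rescaled here to [0, 1] for comparison or display.

```python
def invert_inverse_depth(inv_depth, eps=1e-6):
    """Turn a relative inverse-depth map into a relative depth map in [0, 1].

    inv_depth is a list of rows of floats (hypothetical representation).
    Larger inverse-depth values mean "closer", so after inversion
    larger output values mean "farther".
    """
    # Clamp to eps so zero or negative predictions do not divide by zero.
    depth = [[1.0 / max(v, eps) for v in row] for row in inv_depth]

    # Rescale to [0, 1]; the absolute scale of the prediction is unknown anyway.
    flat = [v for row in depth for v in row]
    lo, hi = min(flat), max(flat)
    span = max(hi - lo, eps)
    return [[(v - lo) / span for v in row] for row in depth]


# Toy 2x2 inverse-depth map: the top-left pixel is the closest one.
inverse_map = [[4.0, 2.0], [1.0, 0.5]]
relative_depth = invert_inverse_depth(inverse_map)
```

After inversion the closest pixel gets the smallest value and the farthest pixel the largest, which matches the convention of most depth ground truth and makes the two maps comparable up to scale.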
Distance estimation from monocular vision using deep learning
3 projects | /r/computervision | 13 Jun 2023

Hi, I have made use of the KITTI dataset for this, and yes, it depends on objects of known sizes. Here I have defined the following classes: Car, Van, Truck, Pedestrian, Person_sitting, Cyclist, Tram, Misc, or DontCare, and the predictions are pretty accurate for those classes. Even if it's not the same class, it still recognizes the object, since I have made use of the coco names dataset here and that is used along with YOLO for object detection. And there are several already implemented projects that make use of deep learning models trained on 2D datasets to predict 3D distance. This was one of my inspirations for this project: https://blogs.nvidia.com/blog/2019/06/19/drive-labs-distance-to-object-detection/ Furthermore, there are well-documented and researched papers like DistYOLO or MiDaS that make use of deep learning for depth estimation.
OMPR V0.6.10 update
2 projects | /r/u_OMPR_App | 14 Mar 2023

-Added AI image depth generator. Create your own depth map image at the click of a button. It uses the awesome MIDAS3.1 https://github.com/isl-org/MiDaS as the backend and the model "dpt_beit_large_512" for the highest quality depth map. Video and GIF depth map generators are coming out next, together with the Depth movie player feature.
AI that converts a regular 2d image to stereoscopic
1 project | /r/ArtificialInteligence | 9 Feb 2023

It uses MiDaS. That extension may be the most accessible way to use it at home. IDK.
Idea: training on magiceye images
1 project | /r/StableDiffusion | 5 Feb 2023

Here's the project homepage https://github.com/isl-org/MiDaS
MiDaS v3_1 and DiscoDiffusion
2 projects | /r/DiscoDiffusion | 27 Dec 2022

The problem came up after MiDaS updated to version V3_1 on Dec 24th. Although the fix works fine, the new version brings many changes, which for me produce slightly different results. I would like to be able to produce results like before. I still clone the MiDaS repo, but then set it back to the last commit before the changes in December, which is 66882994a432727317267145dc3c2e47ec78c38a.
File not found error
3 projects | /r/DiscoDiffusion | 27 Dec 2022

try:
    from midas.dpt_depth import DPTDepthModel
except:
    if not os.path.exists('MiDaS'):
        gitclone("https://github.com/isl-org/MiDaS.git")
        gitclone("https://github.com/bytedance/Next-ViT.git", f'{PROJECT_DIR}/externals/Next_ViT')
    if not os.path.exists('MiDaS/midas_utils.py'):
        shutil.move('MiDaS/utils.py', 'MiDaS/midas_utils.py')
    if not os.path.exists(f'{model_path}/dpt_large-midas-2f21e586.pt'):
        wget("https://github.com/intel-isl/DPT/releases/download/1_0/dpt_large-midas-2f21e586.pt", model_path)
    sys.path.append(f'{PROJECT_DIR}/MiDaS')
A quick demo to show how structurally coherent depth2img is compared to img2img using Automatic1111.
2 projects | /r/StableDiffusion | 12 Dec 2022

Cool. The repo for MiDaS is here: https://github.com/isl-org/MiDaS You can see that they partially trained the model on 3D movies. Here's a list of the movies that were used to train it. I wonder if they'll be training a MiDaS v4.0, as things have moved on quite a bit since it was released in Apr 2021?
Boosting Monocular Depth repo
3 projects | /r/computervision | 9 Dec 2022

We present a stand-alone implementation of our Merging Operator. This new repo allows using any pair of monocular depth estimations in our double estimation. This includes using separate networks for base and high-res estimations, using networks not supported by this repo (such as Midas-v3), or using manually edited depth maps for artistic use. This will also be useful for scientists developing CNN-based MDE as a way to quickly apply double estimation to their own network. For more details please take a look here.
DepthViewer is now live on Steam :)
3 projects | /r/virtualreality | 30 Nov 2022

I'll make the feature to export only the depthmap .png file. If you need the depthmap .png right now you can use the MiDaS python script.

What are some alternatives?

When comparing U-2-Net and MiDaS you can also consider the following projects:

detectron2 - Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.

stable-diffusion-webui-depthmap-script - High Resolution Depth Maps for Stable Diffusion WebUI

image-background-remove-tool - ✂️ Automated high-quality background removal framework for an image using neural networks. ✂️

DenseDepth - High Quality Monocular Depth Estimation via Transfer Learning

backgroundremover - Background Remover lets you Remove Background from images and video using AI with a simple command line interface that is free and open source.

stablediffusion - High-Resolution Image Synthesis with Latent Diffusion Models

rembg-greenscreen - Rembg Video Virtual Green Screen Edition

deeplearning4j-examples - Deeplearning4j Examples (DL4J, DL4J Spark, DataVec) [Moved to: https://github.com/deeplearning4j/deeplearning4j-examples]

trt_pose - Real-time pose estimation accelerated with NVIDIA TensorRT

DiverseDepth - The code and data of DiverseDepth

Anime2Sketch - A sketch extractor for anime/illustration.

Insta-DM - Learning Monocular Depth in Dynamic Scenes via Instance-Aware Projection Consistency (AAAI 2021)

