-
MiDaS
Code for robust monocular depth estimation described in "Ranftl et al., Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer, TPAMI 2022"
Hi, I used the KITTI dataset for this, and yes, it depends on objects of known sizes. I defined the following classes: Car, Van, Truck, Pedestrian, Person_sitting, Cyclist, Tram, Misc, and DontCare, and the predictions are pretty accurate for those classes. Even if an object is not one of those classes, it is still recognized, since I use the COCO class names together with YOLO for object detection.

There are also several existing projects that use deep learning models trained on 2D datasets to predict 3D distance. This was one of my inspirations for this project: https://blogs.nvidia.com/blog/2019/06/19/drive-labs-distance-to-object-detection/ Furthermore, there are well-documented and well-researched papers such as DistYOLO and MiDaS that make use of deep learning for depth estimation.
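To illustrate why known object sizes matter, here is a minimal sketch of the classic pinhole-camera estimate, where distance is roughly focal length times real height divided by bounding-box height in pixels. This is not necessarily what the project does internally (it may learn distance directly from KITTI labels). The function name, the per-class heights, and the focal length in the usage line are all assumptions made up for illustration.

```python
# Hypothetical per-class heights in metres. These are illustrative
# assumptions, not values taken from the project or from KITTI.
ASSUMED_HEIGHT_M = {
    "Car": 1.5,
    "Pedestrian": 1.7,
    "Cyclist": 1.7,
}


def estimate_distance_m(class_name: str, bbox_height_px: float,
                        focal_length_px: float) -> float:
    """Pinhole-model distance: focal_length * real_height / pixel_height."""
    if bbox_height_px <= 0:
        raise ValueError("bounding-box height must be positive")
    if class_name not in ASSUMED_HEIGHT_M:
        raise KeyError(f"no assumed height for class {class_name!r}")
    real_height_m = ASSUMED_HEIGHT_M[class_name]
    return focal_length_px * real_height_m / bbox_height_px


# Usage: a detected car whose box is 75 px tall, seen by a camera with an
# assumed focal length of 750 px.
distance = estimate_distance_m("Car", bbox_height_px=75.0, focal_length_px=750.0)
```

The estimate is only as good as the assumed height, which is why classes with a consistent real-world size (cars, pedestrians) tend to work better than a catch-all class like Misc.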