-
Sounds like this is a 3D to 2D correspondence estimation problem. So is it correct that you are trying to find the pose of the object from observed 2D images? First you need to define a canonical reference frame for the object. This object reference frame is essentially glued to the object, and you want to estimate the object-to-camera transformation matrix, which gives you the pose of the object relative to how you are viewing it from a given frame. To achieve this, most of the literature uses some form of 3D-to-2D feature correspondence search, from which a transformation matrix is obtained using projective geometry. Features like SIFT can be used to find correspondences between features seen in the 2D image and features on the 3D object. This is an active area of research in computer vision, and the state of the art uses learned deep features. You can check out https://github.com/cvg/Hierarchical-Localization, which is the state of the art in camera 6DOF pose estimation from known 3D models of the world. For your scenario, you just need to define the object coordinate system, and you can obtain the pose once you know the object-to-camera transformation. You should also first look into the classical approaches, which use some variant of the PnP + RANSAC algorithm to estimate the pose from 2D-3D correspondences. Since you also know the relative poses of the cameras, you can run a refinement step like bundle adjustment to improve the estimated pose. Let me know if you find any good tutorials or resources online.
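To make the classical route concrete, here is a minimal sketch of the PnP + RANSAC step using OpenCV's solvePnPRansac. It assumes you already have 2D-3D correspondences (e.g. from SIFT matching against the object model) and calibrated camera intrinsics; the point arrays and intrinsics values below are placeholder assumptions, not anything from the thread.

```python
import cv2
import numpy as np

# Hypothetical inputs -- replace with real correspondences from feature matching.
object_points = np.random.rand(50, 3).astype(np.float32)  # 3D points in the object frame
image_points = np.random.rand(50, 2).astype(np.float32)   # matching 2D detections (pixels)
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])                      # assumed camera intrinsics
dist_coeffs = np.zeros(5)                                   # assume no lens distortion

# RANSAC-robust PnP: estimates the object-to-camera rotation and translation
# while rejecting outlier correspondences.
success, rvec, tvec, inliers = cv2.solvePnPRansac(
    object_points, image_points, K, dist_coeffs,
    iterationsCount=100, reprojectionError=3.0)

if success:
    R, _ = cv2.Rodrigues(rvec)   # rotation vector -> 3x3 rotation matrix
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = tvec.ravel()      # 4x4 object-to-camera transform = object pose
    print("Object pose (object -> camera):\n", T)
```

With multiple calibrated cameras you would run this per view and then refine the result jointly (e.g. bundle adjustment) rather than trusting a single-view estimate.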
-
I've been doing some research in this area and there are a few deep learning solutions to this problem. For example, NVIDIA's Deep Object Pose Estimation (DOPE) will estimate the 6DOF pose of a known object, but you'll have to train the network if you want to detect a new object. PoseCNN, which someone else mentioned, does a similar thing. CenterPose is more interesting, as it can estimate the pose of an object from a known category, e.g. sneakers or laptops, rather than one specific object (as DOPE and PoseCNN do).
-
CenterPose
Single-Stage Keypoint-based Category-level Object Pose Estimation from an RGB Image (ICRA 2022)
-
The animation on the page says it all. They also released the code: https://github.com/yenchenlin/iNeRF-public