DISP6D: Disentangled Implicit Shape And Pose Learning For Scalable 6D Pose Estimation
2021 · Yilin Wen, Xiangyu Li, Hao Pan, et al.
Abstract
Scalable 6D pose estimation for rigid objects from RGB images aims at handling multiple objects and generalizing to novel objects. Building on a well-known auto-encoding framework to cope with object symmetry and the lack of labeled training data, we achieve scalability by disentangling the latent representation of the auto-encoder into shape and pose sub-spaces. The latent shape space models the similarity of different objects through contrastive metric learning, and the latent pose code is compared with canonical rotations for rotation retrieval. Because different object symmetries induce inconsistent latent pose spaces, we re-entangle the shape representation with canonical rotations to generate shape-dependent pose codebooks for rotation retrieval. We show state-of-the-art performance on two benchmarks, one containing textureless CAD objects without categories and the other containing daily objects with categories, and further demonstrate improved scalability by extending to a more challenging settin
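The rotation retrieval step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the pose codebook for the object's shape is already available as a list of vectors, one per canonical rotation, and that cosine similarity is the comparison measure; the function names, the toy 3-dimensional codes, and the choice of similarity are illustrative assumptions, not details taken from the paper.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve_rotation(pose_code, codebook):
    """Return the index and score of the codebook entry closest to pose_code.

    Each codebook row is assumed to correspond to one canonical rotation,
    so the returned index identifies the retrieved rotation.
    """
    scores = [cosine(pose_code, code) for code in codebook]
    best = max(range(len(codebook)), key=scores.__getitem__)
    return best, scores[best]

# Toy codebook: three hypothetical canonical rotations with 3-dim codes.
codebook = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
query = [0.1, 0.9, 0.2]  # hypothetical latent pose code for an input image
best_index, best_score = retrieve_rotation(query, codebook)
print(best_index, round(best_score, 3))
```

In this sketch the codebook is fixed by hand. In the setting the abstract describes, the codebook would instead depend on the object's shape code, which is what lets objects with different symmetries use different pose spaces.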
Related papers
- 3D Pose Estimation And 3D Model Retrieval For Objects In The Wild (2018)
- View-invariant, Occlusion-robust Probabilistic Embedding For Human Pose (2020)
- PoseEmbroider: Towards A 3D, Visual, Semantic-aware Human Pose Representation (2024)
- Extending DeepSDF For Automatic 3D Shape Retrieval And Similarity Transform Estimation (2020)
- Dual Pose-invariant Embeddings: Learning Category And Object-specific Discriminative Representations For Recognition And Retrieval (2024)
- Hashmod: A Hashing Method For Scalable 3D Object Detection (2016)
- Multiview-consistent Semi-supervised Learning For 3D Human Pose Estimation (2019)
- DH3D: Deep Hierarchical 3D Descriptors For Robust Large-scale 6DoF Relocalization (2020)