LATFormer: Locality-Aware Point-View Fusion Transformer for 3D Shape Recognition
2021 · Xinwei He, Silin Cheng, Dingkang Liang, et al.
Abstract
Recently, 3D shape understanding has made significant progress owing to advances in deep learning models for data formats such as images, voxels, and point clouds. Among these, point clouds and multi-view images are two complementary modalities of 3D objects, and learning representations that fuse the two has proven effective. While prior works typically focus on exploiting global features of the two modalities, herein we argue that more discriminative features can be derived by modeling "where to fuse". To investigate this, we propose a novel Locality-Aware Point-View Fusion Transformer (LATFormer) for 3D shape retrieval and classification. The core component of LATFormer is a module named Locality-Aware Fusion (LAF), which integrates the local features of correlated regions across the two modalities based on co-occurrence scores. We further propose to filter out scores with low values to obtain salient local co-occurring regions, which reduces redu
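The fusion idea described in the abstract (score local regions across the two modalities, drop low scores, fuse what remains) can be illustrated with a minimal NumPy sketch. Everything below is an assumption for illustration only: the abstract does not say how co-occurrence scores are computed or how filtering is done, so this sketch uses a scaled dot product, a per-row top-k filter, and a residual sum. The function name `locality_aware_fuse` and the parameter `keep` are hypothetical and do not come from the paper.

```python
import numpy as np

def locality_aware_fuse(point_feats, view_feats, keep=2):
    """Illustrative sketch only, not the paper's LAF module.

    point_feats: (N, d) local features of N point-cloud regions.
    view_feats:  (M, d) local features of M image regions.
    keep:        number of highest-scoring view regions kept per point region.
    """
    d = point_feats.shape[1]
    # Assumed scoring: scaled dot product between local features, shape (N, M).
    scores = point_feats @ view_feats.T / np.sqrt(d)
    # Filter out low scores: keep only the `keep` largest entries in each row.
    kth_largest = np.sort(scores, axis=1)[:, -keep][:, None]
    masked = np.where(scores >= kth_largest, scores, -np.inf)
    # Softmax over the surviving entries (filtered entries get weight 0).
    masked = masked - masked.max(axis=1, keepdims=True)
    weights = np.exp(masked)
    weights /= weights.sum(axis=1, keepdims=True)
    # Assumed fusion: add the weighted view features to each point feature.
    fused = point_feats + weights @ view_feats
    return fused, weights

# Usage with random placeholder features (5 point regions, 4 view regions).
rng = np.random.default_rng(0)
points = rng.normal(size=(5, 8))
views = rng.normal(size=(4, 8))
fused, weights = locality_aware_fuse(points, views, keep=2)
print(fused.shape, weights.shape)
```

Masking before the softmax, rather than after, keeps each row of weights normalized over the retained regions only, which is one simple way to make the filtered regions contribute nothing to the fused feature.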
Related papers
- PVRNet: Point-View Relation Neural Network for 3D Shape Recognition (2018)
- Fusing Local Similarities for Retrieval-Based 3D Orientation Estimation of Unseen Objects (2022)
- MVTN: Multi-View Transformation Network for 3D Shape Recognition (2020)
- Location Field Descriptors: Single Image 3D Model Retrieval in the Wild (2019)
- ViewFormer: View Set Attention for Multi-View 3D Shape Understanding (2023)
- Joint Learning of 3D Shape Retrieval and Deformation (2021)
- DH3D: Deep Hierarchical 3D Descriptors for Robust Large-Scale 6DoF Relocalization (2020)
- Gram Regularization for Multi-View 3D Shape Retrieval (2020)