Joint Multi-view 2D Convolutional Neural Networks for 3D Object Classification

Jinglin Xu, Xiangsen Zhang, Wenbin Li, Xinwang Liu, Junwei Han

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence
Main track. Pages 3202-3208. https://doi.org/10.24963/ijcai.2020/443

Three-dimensional (3D) object classification underlies many computer vision applications, e.g., autonomous driving and simultaneous localization and mapping, and has attracted considerable attention in the community. However, solving 3D object classification by directly employing 3D convolutional neural networks (CNNs) generally incurs a high computational cost. Moreover, existing view-based methods do not fully exploit the content relationships between views. To this end, this work proposes a novel multi-view framework that jointly uses multiple 2D-CNNs to capture discriminative information together with the relationships between views, along with a new multi-view loss fusion strategy, all trained in an end-to-end manner. Specifically, we take multiple 2D views of a 3D object as input and integrate the intra-view and inter-view information of each view through a view-specific 2D-CNN and a series of modules (outer product, view pair pooling, 1D convolution, and fully connected transformation). Furthermore, we design a novel view ensemble mechanism that selects several discriminative and informative views to jointly infer the category of a 3D object. Extensive experiments demonstrate that the proposed method outperforms current state-of-the-art methods on 3D object classification. More importantly, this work offers a new way to improve 3D object classification by fully utilizing well-established 2D-CNNs.
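As a rough illustration of two of the ideas named in the abstract, the sketch below pairs view features through an outer product, pools over view pairs, and then ensembles a few selected views. It is not the authors' implementation: the abstract does not specify the pooling operator or the view selection criterion, so the element-wise max pooling, the use of the maximum softmax probability as a confidence score, and the function names `view_pair_pooling` and `ensemble_top_k` are all assumptions made for illustration. The 1D convolution and fully connected stages are omitted.

```python
import math


def outer_product(u, v):
    """Outer product of two feature vectors, returned as a nested list."""
    return [[a * b for b in v] for a in u]


def view_pair_pooling(features):
    """For each view i, combine its outer products with every other view j.

    Assumption: pooling is an element-wise max over the view pairs (i, j);
    the abstract does not state which pooling operator is used.
    """
    n = len(features)
    pooled = []
    for i in range(n):
        pairs = [outer_product(features[i], features[j])
                 for j in range(n) if j != i]
        rows, cols = len(pairs[0]), len(pairs[0][0])
        pooled.append([[max(p[r][c] for p in pairs) for c in range(cols)]
                       for r in range(rows)])
    return pooled


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_top_k(view_logits, k):
    """Select the k most confident views and average their class probabilities.

    Assumption: a view's confidence is its maximum softmax probability;
    this is a hypothetical stand-in for the paper's selection rule.
    """
    probs = [softmax(logits) for logits in view_logits]
    ranked = sorted(range(len(probs)), key=lambda i: max(probs[i]), reverse=True)
    chosen = ranked[:k]
    num_classes = len(probs[0])
    avg = [sum(probs[i][c] for i in chosen) / k for c in range(num_classes)]
    return chosen, avg.index(max(avg))
```

Calling `view_pair_pooling` on a list of per-view feature vectors returns one pooled matrix per view, and `ensemble_top_k(view_logits, k)` returns the indices of the selected views together with the predicted class index.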
Keywords:
Machine Learning: Classification
Machine Learning: Multi-instance; Multi-label; Multi-view learning