PVNAS: 3D Neural Architecture Search with Point-Voxel Convolution

Paper title

PVNAS: 3D Neural Architecture Search with Point-Voxel Convolution

Authors

Zhijian Liu, Haotian Tang, Shengyu Zhao, Kevin Shao, Song Han

Abstract

3D neural networks are widely used in real-world applications (e.g., AR/VR headsets, self-driving cars). They are required to be fast and accurate; however, limited hardware resources on edge devices make these requirements rather challenging. Previous work processes 3D data using either voxel-based or point-based neural networks, but both types of 3D models are not hardware-efficient due to the large memory footprint and random memory access. In this paper, we study 3D deep learning from the efficiency perspective. We first systematically analyze the bottlenecks of previous 3D methods. We then combine the best from point-based and voxel-based models together and propose a novel hardware-efficient 3D primitive, Point-Voxel Convolution (PVConv). We further enhance this primitive with the sparse convolution to make it more effective in processing large (outdoor) scenes. Based on our designed 3D primitive, we introduce 3D Neural Architecture Search (3D-NAS) to explore the best 3D network architecture given a resource constraint. We evaluate our proposed method on six representative benchmark datasets, achieving state-of-the-art performance with 1.8-23.7x measured speedup. Furthermore, our method has been deployed to the autonomous racing vehicle of MIT Driverless, achieving larger detection range, higher accuracy and lower latency.
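
The abstract describes PVConv as combining a point-based and a voxel-based model. A minimal sketch of that idea, in pure Python, might look like the following. It is not the authors' implementation: the function names (`voxelize`, `devoxelize`, `point_branch`, `pv_fuse`), the nearest-voxel lookup, the identity voxel branch (no 3D convolution is applied), and fusion by addition are all simplifying assumptions made for illustration.

```python
from collections import defaultdict


def voxelize(coords, feats, resolution):
    """Average the features of all points that fall into the same grid cell.

    coords: list of (x, y, z) tuples, assumed normalized to [0, 1).
    feats:  list of per-point feature lists (same length as coords).
    Returns a sparse grid {cell: mean feature} and each point's cell index.
    """
    sums = {}
    counts = defaultdict(int)
    cells = []
    for point, feat in zip(coords, feats):
        cell = tuple(min(int(c * resolution), resolution - 1) for c in point)
        cells.append(cell)
        if cell not in sums:
            sums[cell] = [0.0] * len(feat)
        sums[cell] = [s + v for s, v in zip(sums[cell], feat)]
        counts[cell] += 1
    grid = {cell: [s / counts[cell] for s in total] for cell, total in sums.items()}
    return grid, cells


def devoxelize(grid, cells):
    """Map voxel features back to points by nearest-voxel lookup (assumption)."""
    return [grid[cell] for cell in cells]


def point_branch(feats, weight):
    """Per-point linear map: out[j] = sum_i feat[i] * weight[i][j]."""
    return [
        [sum(f[i] * weight[i][j] for i in range(len(f))) for j in range(len(weight[0]))]
        for f in feats
    ]


def pv_fuse(coords, feats, weight, resolution=2):
    """Fuse coarse (voxel) and fine (point) features by element-wise addition.

    The voxel branch here is the identity; a real model would run 3D
    convolutions on the grid before devoxelizing. `weight` must be square
    so that both branches produce the same number of channels.
    """
    grid, cells = voxelize(coords, feats, resolution)
    coarse = devoxelize(grid, cells)
    fine = point_branch(feats, weight)
    return [[a + b for a, b in zip(c, f)] for c, f in zip(coarse, fine)]


coords = [(0.1, 0.1, 0.1), (0.2, 0.2, 0.2), (0.9, 0.9, 0.9)]
feats = [[1.0], [3.0], [5.0]]
fused = pv_fuse(coords, feats, weight=[[1.0]], resolution=2)
print(fused)
```

In this toy input the first two points share a voxel, so they receive the same coarse feature while keeping distinct point-branch features. That is the intuition behind pairing a low-resolution voxel grid with a per-point path.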
