Paper Title

GD-MAE: Generative Decoder for MAE Pre-training on LiDAR Point Clouds

Paper Authors

Honghui Yang, Tong He, Jiaheng Liu, Hua Chen, Boxi Wu, Binbin Lin, Xiaofei He, Wanli Ouyang

Paper Abstract


Despite the tremendous progress of Masked Autoencoders (MAE) in developing vision tasks such as image and video, exploring MAE in large-scale 3D point clouds remains challenging due to their inherent irregularity. In contrast to previous 3D MAE frameworks, which either design a complex decoder to infer masked information from maintained regions or adopt sophisticated masking strategies, we instead propose a much simpler paradigm. The core idea is to apply a \textbf{G}enerative \textbf{D}ecoder for MAE (GD-MAE) to automatically merge the surrounding context and restore the masked geometric knowledge in a hierarchical fusion manner. In doing so, our approach is free from the heuristic design of decoders and enjoys the flexibility of exploring various masking strategies. The corresponding part costs less than \textbf{12\%} of the latency of conventional methods, while achieving better performance. We demonstrate the efficacy of the proposed method on several large-scale benchmarks: Waymo, KITTI, and ONCE. Consistent improvement on downstream detection tasks illustrates strong robustness and generalization capability. Not only does our method achieve state-of-the-art results, but, remarkably, we reach comparable accuracy even with only \textbf{20\%} of the labeled data on the Waymo dataset. Code will be released at https://github.com/Nightmare-n/GD-MAE.
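For readers unfamiliar with the MAE-style pretraining pipeline the abstract refers to, the short PyTorch sketch below illustrates the general flow: grouped point-cloud tokens are randomly masked, only the visible tokens are encoded, and a lightweight decoder regenerates the geometry of the masked tokens. This is a minimal conceptual sketch, not the GD-MAE implementation; the `ToyPointMAE` class, its layer sizes, the plain L2 reconstruction loss, and the tokenization stand-in are all illustrative assumptions.

```python
# Illustrative sketch of an MAE-style pipeline for point-cloud tokens.
# NOT the GD-MAE implementation; all names, sizes, and losses are assumptions
# chosen only to make the mask -> encode -> decode flow concrete.
import torch
import torch.nn as nn


class ToyPointMAE(nn.Module):
    def __init__(self, num_tokens=256, dim=128, mask_ratio=0.75, points_per_token=32):
        super().__init__()
        self.mask_ratio = mask_ratio
        # Stand-in for a pillar/voxel tokenizer: flattened local xyz -> token embedding.
        self.token_embed = nn.Linear(3 * points_per_token, dim)
        self.pos_embed = nn.Parameter(torch.zeros(1, num_tokens, dim))
        enc_layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=4)
        # Small decoder that predicts the raw points of every token; masked
        # positions are filled with a learned mask token before decoding.
        self.mask_token = nn.Parameter(torch.zeros(1, 1, dim))
        dec_layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.decoder = nn.TransformerEncoder(dec_layer, num_layers=1)
        self.head = nn.Linear(dim, 3 * points_per_token)

    def forward(self, token_points):
        # token_points: (B, N, points_per_token * 3) grouped local point coordinates.
        B, N, _ = token_points.shape
        x = self.token_embed(token_points) + self.pos_embed[:, :N]

        # Random masking: keep only a subset of tokens for the encoder.
        num_keep = int(N * (1.0 - self.mask_ratio))
        shuffle = torch.rand(B, N, device=x.device).argsort(dim=1)
        keep_idx = shuffle[:, :num_keep]
        x_visible = torch.gather(x, 1, keep_idx.unsqueeze(-1).expand(-1, -1, x.size(-1)))
        latent = self.encoder(x_visible)

        # Decoder input: encoded visible tokens scattered back, mask tokens elsewhere.
        full = self.mask_token.expand(B, N, -1).clone()
        full.scatter_(1, keep_idx.unsqueeze(-1).expand(-1, -1, full.size(-1)), latent)
        pred_points = self.head(self.decoder(full + self.pos_embed[:, :N]))

        # Reconstruction loss on masked tokens only (plain L2 here for brevity).
        mask = torch.ones(B, N, device=x.device, dtype=torch.bool)
        mask.scatter_(1, keep_idx, False)
        loss = ((pred_points - token_points) ** 2).mean(dim=-1)[mask].mean()
        return loss


# Usage: pretrain on unlabeled LiDAR tokens, then fine-tune the encoder for detection.
model = ToyPointMAE()
dummy_tokens = torch.randn(2, 256, 32 * 3)
print(model(dummy_tokens))
```

Per the abstract, GD-MAE replaces the decoder stage above with a generative decoder that hierarchically fuses the surrounding context, which removes the need for heuristic decoder design and allows different masking strategies to be explored at low latency cost.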
